llama.cpp 0.0.6188

Destination channel: defaults

Links

Explanation of changes:

  • Downgrade to b6188 for compatibility with llama-cpp-python 0.3.16

    • The llama_get_kv_self() API and related llama_kv_self_* functions were removed in b6239 (PR #15472)
    • llama-cpp-python 0.3.16 requires these APIs (compatible range: b6173-b6238); see the probe sketch after this list
  • Regenerated patches for b6188:

    • increase-nmse-tolerance.patch - Updated line numbers (5 hunks instead of 7)
    • increase-nmse-tolerance-aarch64.patch - Updated line numbers
    • mkl.patch - Updated for unquoted CMake variable syntax
    • metal_gpu_selection.patch - Targets ggml-metal.m (b6188 structure)
  • Removed patches not needed for b6188:

    • disable-metal-bf16.patch - BF16 is OFF by default in b6188
    • disable-metal-flash-attention.patch - Not needed with BF16 disabled
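
As a quick sanity check of the constraint in the first bullet, the sketch below probes a libllama build for the symbol that llama-cpp-python 0.3.16 binds against and encodes the compatible build range. The build bounds and the llama_get_kv_self symbol come from this description; the library filename and the probe itself are illustrative and are not code from either project.

```python
import ctypes

# Assumptions (taken from this PR description, not from either codebase):
#  - llama-cpp-python 0.3.16 needs llama.cpp builds b6173 through b6238
#  - llama_get_kv_self() was removed in b6239; other llama_kv_self_* names
#    could be appended to the tuple below if needed.
COMPATIBLE_BUILDS = range(6173, 6239)      # b6173..b6238 inclusive
REQUIRED_SYMBOLS = ("llama_get_kv_self",)

def probe(lib_path: str = "libllama.so") -> None:
    """Report whether a libllama build still exports the old KV-self API."""
    lib = ctypes.CDLL(lib_path)            # adjust filename/extension per platform
    for symbol in REQUIRED_SYMBOLS:
        status = "found" if hasattr(lib, symbol) else "missing (build is b6239 or newer)"
        print(f"{symbol}: {status}")

print(6188 in COMPATIBLE_BUILDS)   # True  -> the build this PR pins
print(6239 in COMPATIBLE_BUILDS)   # False -> the build where the API was removed
```

Running probe() against the packaged library should report the symbol as found for a b6188 build; on b6239 or newer it would be missing, which is the failure this downgrade avoids.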

xkong-anaconda self-assigned this on Nov 19, 2025