llama : fix Gemma3 SWA KV cache shift #12373


Merged 2 commits into master on Mar 13, 2025

Conversation

ggerganov
Member

Fixes #12357

This should fix the KV cache shift for Gemma3 models. Testing:

make -j && ./bin/llama-cli -m ../models/gemma-3-4b/ggml-model-f16.gguf --top-k 1 -s 1 -p "I believe the meaning of life is" -c 256

Comment on lines +540 to +550
float freq_base_l = cparams.rope_freq_base;
float freq_scale_l = cparams.rope_freq_scale;

// TODO: improve
if (model.arch == LLM_ARCH_GEMMA3) {
const bool is_sliding = hparams.is_sliding(il);

freq_base_l = is_sliding ? 10000.0f : cparams.rope_freq_base;
freq_scale_l = is_sliding ? 1.0f : cparams.rope_freq_scale;
}

Member Author

Not sure how to avoid this special-casing here. It does not look great.

Collaborator

I think we can extend the llama_layer to hold this info in the near future.

Member Author

For now, I've pushed the following version, which should be a bit cleaner: #12374

Will see if there is a better way to do it with the upcoming model implementation refactoring.

@ggerganov merged commit 84d5475 into master on Mar 13, 2025. 1 check passed.
jpohhhh pushed a commit to Telosnex/llama.cpp that referenced this pull request Mar 14, 2025
* llama : fix Gemma3 SWA KV cache shift

ggml-ci

* hparams : add comment [no ci]
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Mar 19, 2025
* llama : fix Gemma3 SWA KV cache shift

ggml-ci

* hparams : add comment [no ci]
Successfully merging this pull request may close these issues.

Eval bug: Gemma 3 Outputs Gibberish After Context Shift