llama_kv_cache_seq_shift delta does not appear to be calculated properly #3825

@MrJackSpade


Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • [Y] I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • [Y] I carefully followed the README.md.
  • [Y] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • [Y] I reviewed the Discussions, and have a new bug or useful enhancement to share.

Not 100% certain if this is a bug or not, but I was playing with the kv cache shifting functionality and I was getting some weird results so I figured I'd step through it and see what was going on.

I noticed that after performing a double shift on a chunk of the KV cache, the cell delta only reflected the second shift. The cell positions are updated properly, however. It looks like the shift code treats the cell pos differently from the cell delta.

See cache.cells[i].pos += delta vs. cache.cells[i].delta = delta

The position is cumulative, but the delta keeps only the value of the last shift, which throws the position out of sync with the delta once a cell has been shifted more than once.
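To make the discrepancy concrete, here is a minimal sketch of the behavior described above. It is not the actual llama.cpp code: `Cell`, `make_cells`, `shift_overwrite`, and `shift_accumulate` are hypothetical stand-ins that only mirror the two statements quoted above (`pos += delta` and `delta = delta`). The accumulating variant is shown purely for comparison; whether it is the right behavior for the real cache is exactly the open question here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for a KV cache cell (not the real struct).
struct Cell {
    int32_t pos   = -1; // current position of the cell
    int32_t delta = 0;  // shift recorded for the cell
};

// Build n cells with consecutive positions starting at first_pos.
std::vector<Cell> make_cells(int32_t first_pos, int32_t n) {
    std::vector<Cell> cells(n);
    for (int32_t i = 0; i < n; ++i) {
        cells[i].pos = first_pos + i;
    }
    return cells;
}

// Behavior described in the issue: pos accumulates, delta is overwritten.
void shift_overwrite(std::vector<Cell> & cells, int32_t p0, int32_t p1, int32_t delta) {
    assert(p0 <= p1);
    for (auto & c : cells) {
        if (c.pos >= p0 && c.pos < p1) {
            c.pos  += delta;
            c.delta = delta; // only the most recent shift survives
        }
    }
}

// Variant for comparison (an assumption, not a confirmed fix): delta accumulates too.
void shift_accumulate(std::vector<Cell> & cells, int32_t p0, int32_t p1, int32_t delta) {
    assert(p0 <= p1);
    for (auto & c : cells) {
        if (c.pos >= p0 && c.pos < p1) {
            c.pos   += delta;
            c.delta += delta; // delta stays in sync with the total change in pos
        }
    }
}
```

With four cells starting at position 8, shifting [8, 12) by -4 and then [4, 8) by -2 moves the first cell from 8 to 2 in both variants. The overwriting version leaves its delta at -2, while the accumulating version records -6, which matches the total change in position.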

Of course, the example in main.cpp discards half the disposable context on every shift, so if my understanding is correct, this isn't something that would be noticed through normal usage of the "main.cpp" application. A block of the KV cache wouldn't be shifted twice using that code in the first place, since the second shift would dispose of the block that was moved during the first shift.

So at least superficially it looks like this might be an oversight that wouldn't have come up without directly calling the API. I can't think of why the delta would be fixed to the last shift operation instead of being cumulative like the position, either. I figured I would make a note of this here in case this is an actual issue and not intentional.


Labels: bug
