
Bug: Gemma2 Context switching forgets original input #8251


Closed
Gomez12 opened this issue Jul 2, 2024 · 2 comments
Labels
bug-unconfirmed, medium severity (used to report medium severity bugs in llama.cpp, e.g. malfunctioning features that are still usable), stale

Comments


Gomez12 commented Jul 2, 2024

What happened?

If I have a prompt like the following "<start_of_turn>user\nProductGroup: Anvil<end_of_turn><start_of_turn>user\nCan you give me the 25 most important characteristics for the previous named Productgroup? Respond in the following Json Format : [{'Characteristic':string,'Explanation':string,'ExampleValues':[string]}<end_of_turn><start_of_turn>model"
Then it starts out well, but if you set the context low (e.g. 512) the output starts breaking up after context switching.
It seems to have forgotten the initial ProductGroup, and it just continues outputting based on the examples from the JSON format, which don't mention the specific product group, so it produces output for random products.

I don't know if this is just how it is supposed to work, or whether it would be possible to re-insert the original prompt in front when context switching.
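
To reproduce, something like the following should do it (the model file name is just a placeholder, and the prompt is the one from above, shortened here):

./llama-cli -m gemma-2-9b-it-Q4_K_M.gguf -c 512 -n 2048 -p "<start_of_turn>user\nProductGroup: Anvil<end_of_turn><start_of_turn>user\nCan you give me the 25 most important characteristics ...<end_of_turn><start_of_turn>model"

Once generation goes past the 512-token context and the context shift kicks in, the ProductGroup from the first turn is gone from the output.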

I basically noticed this because the server with parallel slots splits the context size across them, so you run into this much more quickly. The quick fix is simply to increase the context window.
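
For example, roughly like this (model name is a placeholder, exact numbers are just for illustration):

./llama-server -m gemma-2-9b-it-Q4_K_M.gguf -c 4096 -np 8

If I understand the splitting correctly, each of the 8 parallel slots then only gets 4096 / 8 = 512 tokens of context, so the forgetting shows up much sooner than with a single 4096-token context; the quick fix is simply passing a larger -c.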

Name and Version

./llama-cli --version
version: 3281 (023b880)
built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu

What operating system are you seeing the problem on?

Linux

Relevant log output

No response

Gomez12 added the bug-unconfirmed and medium severity labels on Jul 2, 2024
@matteoserva
Contributor

Yes. The "forgetting" behaviour you described is the default.
You can change it with the --keep option to keep the original prompt instead of discarding it.
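
Something along these lines, assuming llama-cli and a placeholder model name:

./llama-cli -m gemma-2-9b-it-Q4_K_M.gguf -c 512 --keep -1 -p "<start_of_turn>user\nProductGroup: Anvil<end_of_turn>..."

--keep -1 keeps the whole initial prompt when the context fills up and a context shift happens; a positive value keeps only that many tokens from the start of the prompt (if I remember the default correctly, it is 0, i.e. nothing is kept).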

If I remember correctly, with the server you have to multiply the desired per-slot context by the number of parallel slots to get the value you pass on the command line.
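
As a rough illustration (numbers are just an example, not verified against the current code): if you want each slot to see about 2048 tokens of context with 4 parallel slots, you would start the server with something like

./llama-server -m model.gguf -np 4 -c 8192

since 4 slots × 2048 tokens per slot = 8192 total context.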


This issue was closed because it has been inactive for 14 days since being marked as stale.
