How to chat now, after update since b3087? #7837
-
I also want to know the proper way to run `main` now. With `-ins` gone, everything I try either gives incoherent results or the model (Llama 3) just chats with itself. I used to run it like so; this seems to work sometimes, but the model starts randomly chatting with itself or just generates garbage:

```sh
../llama.cpp/main --model ../../models/meta-llama-3-8b-instruct_q5_k_s.gguf --n-gpu-layers 35 -i -cnv --interactive-first --keep -1 --simple-io -b 2048 --ctx_size 2048 --temp 0.3 --top_k 10 --multiline-input --repeat_penalty 1.12 -t 6
```
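One possible culprit is mixing the old interactive flags with the new conversation mode. A minimal sketch of a post-b3087 invocation, assuming the Llama 3 chat template is embedded in the GGUF metadata and that `-cnv` alone should drive the turn-taking (a guess, not a confirmed fix):

```sh
# Hedged sketch: -cnv (conversation mode) replaces the removed -ins/--chatml flags;
# -i/--interactive-first are dropped since they can fight with -cnv's turn-taking.
../llama.cpp/main --model ../../models/meta-llama-3-8b-instruct_q5_k_s.gguf \
  --n-gpu-layers 35 -cnv \
  --ctx-size 2048 --temp 0.3 --top-k 10 --repeat-penalty 1.12 -t 6
```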
-
@ggerganov could we get your help here, Georgi?
-
@superchargez I think this works now. See more here: https://github.com/dspasyuk/llama.cui
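One guess at the kind of command this reply refers to: the question's original invocation with `-i` and `--interactive-first` dropped so that `-cnv` handles the chat loop on its own (every flag value here is carried over from the question above and is an assumption, not confirmed by the reply):

```sh
# Hedged reconstruction: the question's flags minus -i/--interactive-first/--keep,
# leaving -cnv to manage the conversation turns.
../llama.cpp/main --model ../../models/meta-llama-3-8b-instruct_q5_k_s.gguf \
  --n-gpu-layers 35 -cnv --simple-io -b 2048 --ctx-size 2048 \
  --temp 0.3 --top-k 10 --multiline-input --repeat-penalty 1.12 -t 6
```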
-
For ChatML you can do something along the lines of the example below. I'm not sure why…
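A minimal sketch of such a ChatML invocation, assuming the built-in `chatml` value accepted by `--chat-template` (the model path and system prompt here are placeholders):

```sh
# Hedged sketch: force the ChatML template instead of the removed --chatml flag;
# in conversation mode, -p serves as the system prompt.
./main -m ./models/model.gguf -cnv --chat-template chatml \
  -p "You are a helpful assistant."
```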
-
The removal of the instruction and ChatML params (see the issue, which got implemented in b3087) has made it difficult to use llama.cpp for chatting. However, these changes were made in main, not in server; server never supported a chat format like this (where the user writes something and llama.cpp responds, in a loop). So, what to do now?