How to chat now, after update since b3087? #7837
-
I also want to know the proper way to run `main` now. With `-ins` gone, everything I try either gives incoherent results or the model (Llama 3) just chats with itself. I used to run it like so; this seems to work sometimes, but the model starts randomly chatting with itself or just generates garbage:

```sh
../llama.cpp/main --model ../../models/meta-llama-3-8b-instruct_q5_k_s.gguf --n-gpu-layers 35 -i -cnv --interactive-first --keep -1 --simple-io -b 2048 --ctx_size 2048 --temp 0.3 --top_k 10 --multiline-input --repeat_penalty 1.12 -t 6
```
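One possible culprit is mixing the old interactive flags with the new conversation mode. A minimal sketch of a post-b3087 invocation, assuming the Llama 3 chat template is embedded in the GGUF metadata and that `-cnv` alone should drive the turn-taking (a guess, not a confirmed fix):

```sh
# Hedged sketch: -cnv (conversation mode) replaces the removed -ins/--chatml flags;
# -i/--interactive-first are dropped since they can fight with -cnv's turn-taking.
../llama.cpp/main --model ../../models/meta-llama-3-8b-instruct_q5_k_s.gguf \
  --n-gpu-layers 35 -cnv \
  --ctx-size 2048 --temp 0.3 --top-k 10 --repeat-penalty 1.12 -t 6
```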
-
@ggerganov could we get your help here, Georgi?
-
@superchargez I think this works now. See more here: https://github.com/dspasyuk/llama.cui
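One guess at the kind of command this reply refers to: the question's original invocation with `-i` and `--interactive-first` dropped so that `-cnv` handles the chat loop on its own (every flag value here is carried over from the question above and is an assumption, not confirmed by the reply):

```sh
# Hedged reconstruction: the question's flags minus -i/--interactive-first/--keep,
# leaving -cnv to manage the conversation turns.
../llama.cpp/main --model ../../models/meta-llama-3-8b-instruct_q5_k_s.gguf \
  --n-gpu-layers 35 -cnv --simple-io -b 2048 --ctx-size 2048 \
  --temp 0.3 --top-k 10 --multiline-input --repeat-penalty 1.12 -t 6
```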
-
For ChatML you can do something along the lines of the example below. I'm not sure why…
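A minimal sketch of such a ChatML invocation, assuming the built-in `chatml` value accepted by `--chat-template` (the model path and system prompt here are placeholders):

```sh
# Hedged sketch: force the ChatML template instead of the removed --chatml flag;
# in conversation mode, -p serves as the system prompt.
./main -m ./models/model.gguf -cnv --chat-template chatml \
  -p "You are a helpful assistant."
```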
-
The removal of the instruction and ChatML params (see the issue, which got implemented in b3087) has made it difficult to use llama.cpp for chatting. However, these changes were made in main, not in server; server never supported a chat format like this (where the user writes something and llama.cpp responds, in a loop). So, what to do now?