Commit 1a7cc35

algorithm authored and huggingface-web committed
Add updated llama.cpp example
Reference: ggml-org/llama.cpp#2304 :)
1 parent 501a3c8 commit 1a7cc35

1 file changed: +1 -1 lines changed

README.md

Lines changed: 1 addition & 1 deletion
@@ -111,7 +111,7 @@ Refer to the Provided Files table below to see what files use which methods, and
I use the following command line; adjust for your tastes and needs:

```
-./main -t 10 -ngl 32 -m llama-2-7b-chat.ggmlv3.q4_0.bin --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "### Instruction: Write a story about llamas\n### Response:"
+./main -t 10 -ngl 32 -m llama-2-7b-chat.ggmlv3.q4_0.bin --color -c 4096 --temp 0.7 --repeat_penalty 1.1 -n -1 --in-prefix-bos --in-prefix ' [INST] ' --in-suffix ' [/INST]' -i -p "[INST] <<SYS>> You are a helpful, respectful and honest assistant. <</SYS>> Write a story about llamas. [/INST]"
```
Change `-t 10` to the number of physical CPU cores you have. For example if your system has 8 cores/16 threads, use `-t 8`.
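
If you are unsure how many physical cores you have, one possible approach on Linux is to count the unique core/socket pairs reported by `lscpu -p`. The sketch below assumes `lscpu` from util-linux is installed; the helper name `count_physical_cores` is hypothetical and not part of llama.cpp.

```shell
#!/bin/sh
# Hypothetical helper: reads `lscpu -p=CORE,SOCKET` style output on stdin
# and prints the number of unique (core, socket) pairs, i.e. physical cores.
count_physical_cores() {
  # Drop lscpu's '#' comment header, deduplicate rows, count what is left.
  grep -v '^#' | sort -u | wc -l | tr -d ' '
}

# Usage: only runs where lscpu exists (assumption: Linux with util-linux).
if command -v lscpu >/dev/null 2>&1; then
  cores=$(lscpu -p=CORE,SOCKET | count_physical_cores)
  echo "suggested flag: -t $cores"
fi
```

Hyperthreaded siblings share the same core/socket pair, so they collapse into one row after `sort -u`, which is why an 8 core/16 thread system would yield 8.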

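The `-p` argument in the updated command wraps a system prompt and a user message in `[INST]`, `<<SYS>>`, `<</SYS>>` and `[/INST]` markers. As a minimal sketch, the helper below reproduces that single-line layout so the two pieces can be swapped out independently. The function name `build_llama2_prompt` is hypothetical, and the spacing simply mirrors the command in this commit rather than any authoritative template.

```shell
#!/bin/sh
# Hypothetical helper: wraps a system prompt ($1) and a user message ($2)
# in the single-line layout used by the -p argument of the updated command.
build_llama2_prompt() {
  printf '[INST] <<SYS>> %s <</SYS>> %s [/INST]' "$1" "$2"
}

prompt=$(build_llama2_prompt \
  "You are a helpful, respectful and honest assistant." \
  "Write a story about llamas.")
echo "$prompt"
```

The resulting string could then be passed to the command as `-p "$prompt"` in place of the literal prompt.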