
Commit ea851bc

SCheekati and tjruwase authored
Fixed mistake in readme (#933)
Co-authored-by: Olatunji Ruwase <[email protected]>
1 parent dd2cda0 commit ea851bc

File tree

1 file changed: 1 addition & 1 deletion
  • inference/huggingface/zero_inference


inference/huggingface/zero_inference/README.md

Lines changed: 1 addition & 1 deletion
@@ -90,7 +90,7 @@ deepspeed --num_gpus 1 run_model.py --model bigscience/bloom-7b1 --batch-size 8
 Here is an example of running `meta-llama/Llama-2-7b-hf` with Zero-Inference using 4-bit model weights and offloading kv cache to CPU:
 
 ```sh
-deepspeed --num_gpus 1 run_model.py --model meta-llama/Llama-2-7b-hf` --batch-size 8 --prompt-len 512 --gen-len 32 --cpu-offload --quant-bits 4 --kv-offload
+deepspeed --num_gpus 1 run_model.py --model meta-llama/Llama-2-7b-hf --batch-size 8 --prompt-len 512 --gen-len 32 --cpu-offload --quant-bits 4 --kv-offload
 ```
 
 ## Performance Tuning Tips
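
The corrected command packs several offloading options onto one line. For readability, here is the same invocation restated as a multi-line sketch; the per-flag notes are inferred from the flag names and the surrounding README text (assumptions), not from run_model.py's documented help output.

```sh
# Same invocation as the corrected README line, reflowed for readability.
deepspeed --num_gpus 1 run_model.py \
  --model meta-llama/Llama-2-7b-hf \
  --batch-size 8 \
  --prompt-len 512 \
  --gen-len 32 \
  --cpu-offload \
  --quant-bits 4 \
  --kv-offload
# --cpu-offload : offload model weights to CPU memory (assumed from flag name)
# --quant-bits 4: load weights quantized to 4 bits
# --kv-offload  : keep the attention KV cache in CPU memory
```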
