Commit 625ff14

Jian Liao authored and committed
readme : improve readme for Llava-1.6 example (ggml-org#6044)
Co-authored-by: Jian Liao <[email protected]>
1 parent 49a6a9f commit 625ff14

File tree: 1 file changed (+13, −5 lines)

examples/llava/README.md

Lines changed: 13 additions & 5 deletions
@@ -63,31 +63,39 @@ Now both the LLaMA part and the image encoder is in the `llava-v1.5-7b` directory
 ```console
 git clone https://huggingface.co/liuhaotian/llava-v1.6-vicuna-7b
 ```
-2) Use `llava-surgery-v2.py` which also supports llava-1.5 variants pytorch as well as safetensor models:
+
+2) Install the required Python packages:
+
+```sh
+pip install -r examples/llava/requirements.txt
+```
+
+3) Use `llava-surgery-v2.py` which also supports llava-1.5 variants pytorch as well as safetensor models:
 ```console
 python examples/llava/llava-surgery-v2.py -C -m ../llava-v1.6-vicuna-7b/
 ```
 - you will find a llava.projector and a llava.clip file in your model directory
-3) Copy the llava.clip file into a subdirectory (like vit), rename it to pytorch_model.bin and add a fitting vit configuration to the directory:
+
+4) Copy the llava.clip file into a subdirectory (like vit), rename it to pytorch_model.bin and add a fitting vit configuration to the directory:
 ```console
 mkdir vit
 cp ../llava-v1.6-vicuna-7b/llava.clip vit/pytorch_model.bin
 cp ../llava-v1.6-vicuna-7b/llava.projector vit/
 curl -s -q https://huggingface.co/cmp-nct/llava-1.6-gguf/raw/main/config_vit.json -o vit/config.json
 ```
 
-4) Create the visual gguf model:
+5) Create the visual gguf model:
 ```console
 python ./examples/llava/convert-image-encoder-to-gguf.py -m vit --llava-projector vit/llava.projector --output-dir vit --clip-model-is-vision
 ```
 - This is similar to llava-1.5, the difference is that we tell the encoder that we are working with the pure vision model part of CLIP
 
-5) Then convert the model to gguf format:
+6) Then convert the model to gguf format:
 ```console
 python ./convert.py ../llava-v1.6-vicuna-7b/ --skip-unknown
 ```
 
-6) And finally we can run the llava-cli using the 1.6 model version:
+7) And finally we can run the llava-cli using the 1.6 model version:
 ```console
 ./llava-cli -m ../llava-v1.6-vicuna-7b/ggml-model-f16.gguf --mmproj vit/mmproj-model-f16.gguf --image some-image.jpg -c 4096
 ```
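For convenience, the seven steps in the updated README can be chained into a single shell session. This is a minimal sketch, not part of the commit: it assumes you run it from the root of a built llama.cpp checkout (with `llava-cli` already compiled) and that the model clone lives one directory above the repository, matching the relative `../llava-v1.6-vicuna-7b/` paths the README uses.

```sh
# Sketch of the full LLaVA-1.6 conversion flow described above.
# Paths follow the README's examples; adjust them to your own layout.

# 1) fetch the model (cloned one level above llama.cpp, as the later paths assume)
#    and 2) install the Python dependencies
git clone https://huggingface.co/liuhaotian/llava-v1.6-vicuna-7b ../llava-v1.6-vicuna-7b
pip install -r examples/llava/requirements.txt

# 3) split out the projector and CLIP weights
python examples/llava/llava-surgery-v2.py -C -m ../llava-v1.6-vicuna-7b/

# 4) stage the vision tower with a fitting ViT config
mkdir vit
cp ../llava-v1.6-vicuna-7b/llava.clip vit/pytorch_model.bin
cp ../llava-v1.6-vicuna-7b/llava.projector vit/
curl -s -q https://huggingface.co/cmp-nct/llava-1.6-gguf/raw/main/config_vit.json -o vit/config.json

# 5) build the visual gguf (mmproj) and 6) the language-model gguf
python ./examples/llava/convert-image-encoder-to-gguf.py -m vit --llava-projector vit/llava.projector --output-dir vit --clip-model-is-vision
python ./convert.py ../llava-v1.6-vicuna-7b/ --skip-unknown

# 7) run inference with both gguf files
./llava-cli -m ../llava-v1.6-vicuna-7b/ggml-model-f16.gguf --mmproj vit/mmproj-model-f16.gguf --image some-image.jpg -c 4096
```

As the final command's paths indicate, the mmproj file ends up in `vit/` (because of `--output-dir vit`) while the language-model gguf is written next to the original checkpoint, which is why the `llava-cli` invocation references both locations.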
