
Converting LLaMA 4-bit GPTQ Model from HF does not work #746

Closed

Description

@xonfour

Hi! I tried to use the 13B model from https://huggingface.co/maderix/llama-65b-4bit/

I converted the model using:

python convert-gptq-to-ggml.py models/llama13b-4bit.pt models/tokenizer.model models/llama13b-4bit.bin

If I understand it correctly, I still need to migrate the model, so I tried:

python migrate-ggml-2023-03-30-pr613.py models/llama13b-4bit.bin models/llama13b-4bit-new.bin

But after a few seconds it breaks with the following error:

Processing part 1 of 1

Processing tensor b'tok_embeddings.weight' with shape: [32000, 5120] and type: F16
Traceback (most recent call last):
  File "/home/dust/llama.cpp/migrate-ggml-2023-03-30-pr613.py", line 311, in <module>
    main()
  File "/home/dust/llama.cpp/migrate-ggml-2023-03-30-pr613.py", line 306, in main
    copy_tensors(fin, fout, part_id, n_parts)
  File "/home/dust/llama.cpp/migrate-ggml-2023-03-30-pr613.py", line 169, in copy_tensors
    assert n_dims in (1, 2)
AssertionError

Is this a bug, or am I doing something wrong?

Metadata

Labels

bug (Something isn't working), high priority (Very important issue)
