
Converting LLaMA 4-bit GPTQ Model from HF does not work #746

Closed

Description

@xonfour

Hi! I tried to use the 13B model from https://huggingface.co/maderix/llama-65b-4bit/

I converted the model using:

python convert-gptq-to-ggml.py models/llama13b-4bit.pt models/tokenizer.model models/llama13b-4bit.bin

If I understand it correctly, I still need to migrate the model, so I tried:

python migrate-ggml-2023-03-30-pr613.py models/llama13b-4bit.bin models/llama13b-4bit-new.bin

But after a few seconds it breaks with the following error:

Processing part 1 of 1

Processing tensor b'tok_embeddings.weight' with shape: [32000, 5120] and type: F16
Traceback (most recent call last):
  File "/home/dust/llama.cpp/migrate-ggml-2023-03-30-pr613.py", line 311, in <module>
    main()
  File "/home/dust/llama.cpp/migrate-ggml-2023-03-30-pr613.py", line 306, in main
    copy_tensors(fin, fout, part_id, n_parts)
  File "/home/dust/llama.cpp/migrate-ggml-2023-03-30-pr613.py", line 169, in copy_tensors
    assert n_dims in (1, 2)
AssertionError

Is this a bug, or am I doing something wrong?

Metadata

Labels

bug (Something isn't working), high priority (Very important issue)
