Convert.py fails on falcon-7b #5426

@cmwilki

Description

I cloned llama.cpp today and also did a git checkout of falcon-7b from Hugging Face. Running convert.py fails in the same way as #2717, which was supposedly closed and merged to master.

Here is my python:

Python 3.11.5 (main, Sep 11 2023, 13:54:46) [GCC 11.2.0] on linux

And the error:

~/git/llama.cpp/convert.py ~/models/falcon-7b/ --outtype f16 --outfile falcon-7b.f16.bin
Loading model file /home/cwilkinson/models/falcon-7b/pytorch_model-00001-of-00002.bin
Loading model file /home/cwilkinson/models/falcon-7b/pytorch_model-00001-of-00002.bin
Loading model file /home/cwilkinson/models/falcon-7b/pytorch_model-00002-of-00002.bin
Traceback (most recent call last):
  File "/home/cwilkinson/git/llama.cpp/convert.py", line 1478, in <module>
    main()
  File "/home/cwilkinson/git/llama.cpp/convert.py", line 1414, in main
    model_plus = load_some_model(args.model)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cwilkinson/git/llama.cpp/convert.py", line 1276, in load_some_model
    model_plus = merge_multifile_models(models_plus)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cwilkinson/git/llama.cpp/convert.py", line 730, in merge_multifile_models
    model = merge_sharded([mp.model for mp in models_plus])
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cwilkinson/git/llama.cpp/convert.py", line 709, in merge_sharded
    return {name: convert(name) for name in names}
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cwilkinson/git/llama.cpp/convert.py", line 709, in <dictcomp>
    return {name: convert(name) for name in names}
                  ^^^^^^^^^^^^^
  File "/home/cwilkinson/git/llama.cpp/convert.py", line 684, in convert
    lazy_tensors: list[LazyTensor] = [model[name] for model in models]
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/cwilkinson/git/llama.cpp/convert.py", line 684, in <listcomp>
    lazy_tensors: list[LazyTensor] = [model[name] for model in models]
                                      ~~~~~^^^^^^
KeyError: 'transformer.word_embeddings.weight'
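From the traceback, merge_sharded builds the merged model by looking up every tensor name in every shard (`model[name] for model in models`), which assumes each name appears in all shard files. A checkpoint whose shards each hold a *disjoint* subset of tensors (each name in exactly one file) would break that assumption with exactly this KeyError. The sketch below is a hypothetical minimal reproduction of that failure mode, not the actual convert.py code:

```python
# Illustrative sketch only: names and logic are simplified from the
# traceback, not copied from convert.py.

def merge_sharded(models: list[dict]) -> dict:
    # Collect every tensor name seen in any shard.
    names = {name for model in models for name in model}
    # Assumes each name exists in *every* shard; raises KeyError otherwise.
    return {name: [model[name] for model in models] for name in sorted(names)}

# Hypothetical falcon-7b-style shards: each tensor lives in only one file.
shard1 = {"transformer.word_embeddings.weight": "tensor_a"}
shard2 = {"transformer.h.0.mlp.weight": "tensor_b"}

try:
    merge_sharded([shard1, shard2])
except KeyError as e:
    print("KeyError:", e)
```

If this is the cause, the shards need to be merged by taking each tensor from whichever file contains it, rather than concatenating the same name across all files.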

Any help is appreciated.
