convert.py fails for finetuned llama2 models (via HF trl library) #4896
Comments
Have you tried using the …
Hi, I am facing a similar issue: I've tried to convert this finetuned model to GGUF but got the same error (screenshot omitted here).
OK, my apologies: there is nothing wrong with convert.py or any other utils from this awesome package. The documentation on finetuning HF models is a little sparse and hard to follow for first-time users, so I am writing down some notes below for people who hit the same error.

ASSUMPTION: you are finetuning a model from HF using PEFT (I have NOT tried any other mode of finetuning).

STEPS: in short, merge the PEFT/LoRA adapter back into the base model, save the merged full model, and then run convert.py on that directory; a minimal sketch follows.
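A minimal sketch of that merge step (paths and the base-model name are placeholders; this assumes the standard peft merge_and_unload API):

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "meta-llama/Llama-2-7b-hf"   # base model the adapter was trained on (placeholder)
adapter_dir = "./sft-out"           # dir containing adapter_model.safetensors
merged_dir = "./llama-2-7b-merged"  # dir that convert.py will consume

# load the base model, attach the LoRA adapter, then fold it into the base weights
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype="auto")
model = PeftModel.from_pretrained(model, adapter_dir)
model = model.merge_and_unload()

# save a full checkpoint (model.safetensors + config.json) plus the tokenizer
model.save_pretrained(merged_dir, safe_serialization=True)
AutoTokenizer.from_pretrained(adapter_dir).save_pretrained(merged_dir)

# then run llama.cpp's converter on the merged directory, e.g.:
#   python convert.py ./llama-2-7b-merged --outtype f16
```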
Finally, a big vote of thanks to @ggerganov 💯 for his service to the community. I hope someone gets inspired by this and somehow figures out a way to even train the darned LLMs on CPUs 👍
As explained in the lengthy post above: non-issue.
Please include information about your system, the steps to reproduce the bug, and the version of llama.cpp that you are using. If possible, please provide a minimal code example that reproduces the bug.
System: NVIDIA T4 GPU with an Intel(R) Xeon(R) CPU (AWS g4dn.xlarge, 4 vCPUs)
I use the simple https://github.com/huggingface/trl/blob/main/examples/scripts/sft.py with some custom data and llama-2-7b-hf as the base model. Post training, it invokes trainer.save_model and the output dir has the following contents:
```
-rw-rw-r-- 1 ubuntu ubuntu       5100 Jan 12 14:04 README.md
-rw-rw-r-- 1 ubuntu ubuntu  134235048 Jan 12 14:04 adapter_model.safetensors
-rw-rw-r-- 1 ubuntu ubuntu        576 Jan 12 14:04 adapter_config.json
-rw-rw-r-- 1 ubuntu ubuntu       1092 Jan 12 14:04 tokenizer_config.json
-rw-rw-r-- 1 ubuntu ubuntu        552 Jan 12 14:04 special_tokens_map.json
-rw-rw-r-- 1 ubuntu ubuntu    1842948 Jan 12 14:04 tokenizer.json
-rw-rw-r-- 1 ubuntu ubuntu       4219 Jan 12 14:04 training_args.bin
-rw-rw-r-- 1 ubuntu ubuntu 4827151012 Jan 12 14:04 adapter_model.bin
```
As you can see, it has no model.safetensors as required by convert.py. I tried a bunch of other options to save the model (trainer.model.save_pretrained, for example), but the file was always adapter_model.safetensors.
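For context, this adapter-only output is expected with a PEFT setup: when trl's SFTTrainer is constructed with a peft_config, save_model serializes just the adapter weights. A rough sketch of such a run (the dataset and hyperparameters are illustrative, not those from the issue, and the exact SFTTrainer signature depends on the trl version):

```python
from datasets import load_dataset
from peft import LoraConfig
from transformers import TrainingArguments
from trl import SFTTrainer

# illustrative dataset; the issue used custom data
dataset = load_dataset("imdb", split="train")

# training through a peft_config means only LoRA adapter weights get saved
peft_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    bias="none", task_type="CAUSAL_LM",
)

trainer = SFTTrainer(
    model="meta-llama/Llama-2-7b-hf",
    args=TrainingArguments(output_dir="./sft-out", per_device_train_batch_size=2),
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=512,
    peft_config=peft_config,
)
trainer.train()

# writes adapter_model.safetensors / adapter_config.json, NOT model.safetensors
trainer.save_model("./sft-out")
```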
I tried convert-hf-to-gguf.py as well, and it too complains about the missing model.safetensors (and that only after suppressing the earlier error complaining that the causal LLaMA architecture is not supported).
Is there any other convert script that handles such adapter safetensors? (I guess all models finetuned via PEFT will be named adapter_*.) When I went through the code I also noticed that MODEL_ARCH only accommodates "LLAMA" and not "LLAMA2"; is that why it also fails to find the param names from adapter_model.safetensors in the MODEL_ARCH tmap methods?