Architecture "LlamaForCausalLM" not supported #5142
Comments
I'm loading the model with AutoModelForCausalLM and AutoTokenizer and I'm getting an error as well, since AutoTokenizer doesn't create a vocab.json file. Here's the error:
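Side note on the missing vocab.json: fast tokenizers serialize everything into a single tokenizer.json, so getting no vocab.json is expected rather than a sign of a broken save. A minimal sketch (both directory paths here are hypothetical) of which files AutoTokenizer.save_pretrained actually writes:

```python
# Hypothetical paths; requires the `transformers` package.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("/path/to/hf-model")
saved_files = tok.save_pretrained("/path/to/output-dir")
# Typically lists tokenizer_config.json, special_tokens_map.json and
# tokenizer.json -- but no vocab.json when a fast tokenizer is in use.
print(saved_files)
```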
Update: I am now running the conversion again, and I am getting the same error as @lmxin123.
@slaren, is this a real bug or am I just stupid? haha
I haven't tried converting many models, but coincidentally I tried it yesterday and got this error as well. To check if it is a regression I checked out commits from this repo going back a week or two, but the error was the same, which makes me think (a) I may be doing something stupid (likely) or (b) some support module may have changed something, breaking the conversion.
I initially tried converting using the convert script. However, I still can't rule out the possibility that this is me doing something silly :)
I've just spent a couple of hours trying to work out what this was, as I had an old PR of llama.cpp from DeepSeek that actually worked fine, so I knew it must be possible... It turns out there used to be a fallback test in the convert script, which went away when the vocab-type option got added. See if adding that option helps.
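For anyone else hitting this, a rough sketch of the kind of fallback being described — this is not the actual llama.cpp code, and the returned labels are only illustrative: prefer a SentencePiece tokenizer.model and fall back to the Hugging Face tokenizer files when it is absent.

```python
from pathlib import Path

def pick_vocab_type(model_dir: str) -> str:
    """Guess which tokenizer files a conversion could use (illustrative only)."""
    d = Path(model_dir)
    if (d / "tokenizer.model").exists():
        return "spm"   # a SentencePiece model is present
    if (d / "tokenizer.json").exists() or (d / "vocab.json").exists():
        return "hfft"  # fall back to the HF fast-tokenizer files
    raise FileNotFoundError(f"no tokenizer files found in {model_dir}")

# e.g. pick_vocab_type("/fengpo/github/Yi-34B-Chat-8bits")
```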
Yes indeed, adding that option works.
Yeah, I had to add that too. I've still not fixed my problems fully though: I managed to make the GGUF, but it crashes with an error when loading.
@jukofyork Yes, I'm seeing the same issue here as well. No clue whether that's a model problem or a conversion problem, but I guess the main issue in any case is the missing vocab-type handling you mentioned above.
The 8-bit version of the model is a GPTQ quant, while the 4-bit version is an AWQ quant [1, 2]. For reference, you can find more information on these quantized models in the Yi-34B-Chat repository [3]. I recommend trying out the original (unquantized) model instead; I plan on testing the original myself.
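To tell the two formats apart before attempting a conversion, one option is to look for the quantization_config block in config.json. This is only a sketch; the quant_method key is what recent transformers exports write, and older GPTQ/AWQ exports may store this differently.

```python
import json
from pathlib import Path

def quantization_method(model_dir: str):
    """Return e.g. 'gptq' or 'awq' if the checkpoint is quantized, else None."""
    cfg = json.loads((Path(model_dir) / "config.json").read_text())
    qcfg = cfg.get("quantization_config") or {}
    return qcfg.get("quant_method")

print(quantization_method("/fengpo/github/Yi-34B-Chat-8bits"))  # e.g. "gptq"
```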
I think my problem was just forgetting to use the --pad-vocab option, as it seems to be getting further now.
I did a bit more digging and found that this issue is two-fold. The first part would be adding the missing vocab-type handling mentioned above. The second part has a higher difficulty curve and shows up if you use the quantized model; see issue #4701.
Feel free to correct me if I'm wrong or misinterpreted anything.
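For readers wondering what the padding part means in practice, here is an illustrative sketch; the placeholder token names are made up and not what any converter actually emits. If the model declares a larger vocab_size than the tokenizer defines, the gap has to be filled with dummy entries so the token list and the tensor shapes line up.

```python
def pad_vocab(tokens: list[str], model_vocab_size: int) -> list[str]:
    """Pad a token list with dummy entries up to the model's declared vocab size."""
    padded = list(tokens)
    for i in range(len(tokens), model_vocab_size):
        padded.append(f"<dummy{i:05}>")  # placeholder for an unused slot
    return padded

# e.g. a 64,000-entry tokenizer padded up to a declared vocab_size of 64,512
print(len(pad_vocab([f"tok{i}" for i in range(64000)], 64512)))  # 64512
```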
Hi, I have a problem converting https://huggingface.co/SJ-Donald/SJ-SOLAR-10.7b-DPO. Has anybody solved this for SOLAR-based models? Using the convert script, the generation token is not correct.
Any hint?
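One way to narrow this down is to print the special tokens the source tokenizer reports before conversion and compare them with what the converted model ends up using. A small check, assuming the transformers package is installed; it inspects only the Hugging Face side, not the GGUF:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("SJ-Donald/SJ-SOLAR-10.7b-DPO")
print("BOS:", tok.bos_token, tok.bos_token_id)
print("EOS:", tok.eos_token, tok.eos_token_id)
print("base vocab size:", tok.vocab_size, "with added tokens:", len(tok))
```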
There's an open PR that fixes it.
@felladrin thanks |
This issue is stale because it has been open for 30 days with no activity. |
This issue was closed because it has been inactive for 14 days since being marked as stale. |
I use python convert-hf-to-gguf.py /fengpo/github/Yi-34B-Chat-8bits and I get this error:
File "/fengpo/github/llama.cpp/convert-hf-to-gguf.py", line 1335, in main
model_instance = model_class(dir_model, ftype_map[args.outtype], fname_out, args.bigendian)
File "/fengpo/github/llama.cpp/convert-hf-to-gguf.py", line 57, in init
self.model_arch = self._get_model_architecture()
File "/fengpo/github/llama.cpp/convert-hf-to-gguf.py", line 254, in _get_model_architecture
raise NotImplementedError(f'Architecture "{arch}" not supported!')
NotImplementedError: Architecture "LlamaForCausalLM" not supported!.
This also contains LlamaForCausalLM:
