gguf-py: Support 01.AI Yi models #3943
Conversation
Oh go on then, you twisted my arm! Working fine, thanks very much! Quants uploading now.
@KerfuffleV2 Is there any chance to support vivo's BlueLM? This model has two tensors named
Someone said this problem is similar to the Yi model one (on a Chinese forum), but I can't fix it.
@tastypear It looks like that one might need its own model architecture. Yi just had the exact same architecture except two tensors were named differently, so that was really easy to fix. I looked at the forum you linked. I have a harder time reading the non-unified diffs than the Chinese part! What kind of monster uses non-unified diffs? But it seems like they had to mess around with the C++ code as well. You can try editing it, like:

# Output norm
MODEL_TENSOR.OUTPUT_NORM: (
    "gpt_neox.final_layer_norm",  # gptneox
    "model.embed_layer_norm",     # BlueLM <-- new
),
@KerfuffleV2 Very close, but... there are 2 tensors named
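For what it's worth, the symptom described (two checkpoint tensors ending up under one mapped name) is something the converter has to reject, since GGUF tensor names must be unique. A hypothetical illustration of that check (made-up names, not the actual convert-script code):

```python
# Hypothetical illustration: if two source tensors resolve to the same
# GGUF name, conversion can't proceed. Names below are made up.
def check_unique(mapped_pairs):
    """Raise if any two (source, gguf_name) pairs share a gguf_name."""
    seen = {}
    for src, dst in mapped_pairs:
        if dst in seen:
            raise ValueError(f"{seen[dst]!r} and {src!r} both map to {dst!r}")
        seen[dst] = src

check_unique([("model.a", "blk.0.attn_norm"),
              ("model.b", "blk.0.ffn_norm")])  # fine, no collision
```

That's why a collision like this needs more than a one-line mapping edit: something has to decide which source tensor wins, or split them into different GGUF tensors.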
@tastypear Sorry for the slow reply. I kept meaning to get back to this. Unfortunately, I don't really know what would be required to fix that issue and I probably won't have the time to mess with it. Hopefully you or someone else will manage to fix it. It seems like it needs considerably more to fix than the Yi model, which was exactly the same as LLaMA except two tensors had different names. Anyway, not much of a response, but I didn't want to just ignore you.
@KerfuffleV2 I just happened to see someone mention a related model and thought it would be easy to solve, so I took the liberty to ask. Thank you for taking the time to reply to me 😉
Tiny change to support Yi model layernorm tensor names. Architecturally, it's the same as LLaMA2. See 01-ai/Yi#1
The model has impressively high MMLU results, higher than any 70B model, actually. Whether that's valid or translates to real-world results, I don't know.
I successfully converted this model: https://huggingface.co/01-ai/Yi-34B (note that there's both Safetensors and PyTorch versions in the same repo, so be careful unless you actually want to download two copies of the model).
The quantized version runs just fine. I didn't test the 6B, but looking at its tensor index it looks the same.
@TheBloke - tagging in case you're interested in trying to convert this one.