I converted and quantized glm-4-9b-chat-1m in the usual way.
When I run it with llama.cpp I get:
llm_load_print_meta: max token length = 1024
llm_load_tensors: ggml ctx size = 0.14 MiB
llama_model_load: error loading model: check_tensor_dims: tensor 'blk.0.attn_qkv.weight' has wrong shape; expected 4096, 4608, got 4096, 5120, 1, 1
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model '/content/glm-4-9b-chat-1m.q5_k.gguf'
main: error: unable to load model
The same error occurs with all other quantizations.
I got no errors during conversion or quantization.
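For what it's worth, the two shapes in the error are consistent with the two model variants using a different number of KV heads for the fused QKV projection. A minimal sketch of the arithmetic (the function name and the per-variant KV-head counts are my assumptions inferred from the error message, not values read from llama.cpp or the model config):

```python
def qkv_rows(n_embd, head_dim, n_head_kv):
    """Rows of a fused attn_qkv.weight: query dim + 2 * key/value dim."""
    return n_embd + 2 * n_head_kv * head_dim

# Shape llama.cpp expects, consistent with 2 KV groups (assumption):
print(qkv_rows(4096, 128, 2))  # 4608

# Shape actually found in the 1m checkpoint, consistent with 4 KV groups
# (assumption):
print(qkv_rows(4096, 128, 4))  # 5120
```

If that reading is right, the converter (or the model architecture mapping in llama.cpp) is using the KV-head count of the base glm-4-9b-chat model rather than the long-context 1m variant.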