
GGML_ASSERT: llama.cpp:13518: (qs.n_attention_wv == 0 || qs.n_attention_wv == (int)model.hparams.n_layer) && "n_attention_wv is unexpected" #6702

Closed
@schmorp

Description


I hit this assertion when trying to make a Q8_0 quant of https://huggingface.co/Noodlz/DolphinLake-7B.

Loading the model with main also fails:

llama_model_load: error loading model: check_tensor_dims: tensor 'blk.1.ffn_down.weight' not found

Assuming the model itself is broken (without claiming that it is), perhaps convert.py could detect missing tensors at conversion time and report them?

If it's a broken model and everything acts as it should, apologies.
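For illustration, here is a minimal sketch of the kind of missing-tensor check suggested above. It is written as a standalone helper, not against convert.py's actual internals: the `EXPECTED_SUFFIXES` list and the `find_missing_tensors` function are assumptions for a Llama-style model, based only on the `blk.{i}.*` naming that appears in the error message.

```python
# Minimal sketch (assumption, not convert.py's real API): given the list of
# tensor names produced by conversion and the layer count, report any
# per-layer tensors that are absent before the GGUF file is written.

from typing import Iterable

# Per-layer tensor suffixes typically present in a Llama-style GGUF
# (illustrative, not exhaustive).
EXPECTED_SUFFIXES = [
    "attn_norm.weight",
    "attn_q.weight",
    "attn_k.weight",
    "attn_v.weight",
    "attn_output.weight",
    "ffn_norm.weight",
    "ffn_gate.weight",
    "ffn_down.weight",
    "ffn_up.weight",
]


def find_missing_tensors(tensor_names: Iterable[str], n_layer: int) -> list[str]:
    """Return expected per-layer tensor names absent from tensor_names."""
    present = set(tensor_names)
    missing = []
    for i in range(n_layer):
        for suffix in EXPECTED_SUFFIXES:
            name = f"blk.{i}.{suffix}"
            if name not in present:
                missing.append(name)
    return missing


if __name__ == "__main__":
    # Toy example: layer 1 is missing its ffn_down weight, mirroring the
    # check_tensor_dims failure reported above.
    names = [f"blk.{i}.{s}" for i in range(2) for s in EXPECTED_SUFFIXES]
    names.remove("blk.1.ffn_down.weight")
    for name in find_missing_tensors(names, n_layer=2):
        print(f"missing tensor: {name}")
```

Run on the converted tensor name list, a check like this would surface the missing blk.1.ffn_down.weight at conversion time instead of at load or quantize time.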
