
GGML_ASSERT: llama.cpp:13518: (qs.n_attention_wv == 0 || qs.n_attention_wv == (int)model.hparams.n_layer) && "n_attention_wv is unexpected" #6702

Closed
@schmorp

Description


I hit this assertion when trying to make a Q8_0 quant of https://huggingface.co/Noodlz/DolphinLake-7B.

Loading the model with main also fails:

llama_model_load: error loading model: check_tensor_dims: tensor 'blk.1.ffn_down.weight' not found

Assuming the model itself is broken (without claiming that it is), perhaps convert.py could detect missing tensors at conversion time and report them?

If it's a broken model and everything acts as it should, apologies.
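For illustration, here is a minimal sketch of the kind of missing-tensor check suggested above. It is written as a standalone helper, not against convert.py's actual internals: the `EXPECTED_SUFFIXES` list and the `find_missing_tensors` function are assumptions for a Llama-style model, based only on the `blk.{i}.*` naming that appears in the error message.

```python
# Minimal sketch (assumption, not convert.py's real API): given the list of
# tensor names produced by conversion and the layer count, report any
# per-layer tensors that are absent before the GGUF file is written.

from typing import Iterable

# Per-layer tensor suffixes typically present in a Llama-style GGUF
# (illustrative, not exhaustive).
EXPECTED_SUFFIXES = [
    "attn_norm.weight",
    "attn_q.weight",
    "attn_k.weight",
    "attn_v.weight",
    "attn_output.weight",
    "ffn_norm.weight",
    "ffn_gate.weight",
    "ffn_down.weight",
    "ffn_up.weight",
]


def find_missing_tensors(tensor_names: Iterable[str], n_layer: int) -> list[str]:
    """Return expected per-layer tensor names absent from tensor_names."""
    present = set(tensor_names)
    missing = []
    for i in range(n_layer):
        for suffix in EXPECTED_SUFFIXES:
            name = f"blk.{i}.{suffix}"
            if name not in present:
                missing.append(name)
    return missing


if __name__ == "__main__":
    # Toy example: layer 1 is missing its ffn_down weight, mirroring the
    # check_tensor_dims failure reported above.
    names = [f"blk.{i}.{s}" for i in range(2) for s in EXPECTED_SUFFIXES]
    names.remove("blk.1.ffn_down.weight")
    for name in find_missing_tensors(names, n_layer=2):
        print(f"missing tensor: {name}")
```

Run on the converted tensor name list, a check like this would surface the missing blk.1.ffn_down.weight at conversion time instead of at load or quantize time.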
