Remove .attention from skipped tensors to match more accurately #7051

bartowski1182 · 2024-05-02T20:36:57Z

This change fixes #7046

https://huggingface.co/nvidia/ChatQA-1.5-8B has tensors called model.layers.x.self_attn.rotary_emb.inv_freq instead of model.layers.x.self_attn.attention.rotary_emb.inv_freq. Removing .attention from the skipped suffix captures both names, so both are properly skipped.
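To illustrate why the shorter suffix matches both tensor names, here is a minimal sketch. It assumes the skip check is a plain string-suffix test. The names `OLD_SKIP_SUFFIXES`, `NEW_SKIP_SUFFIXES`, and `should_skip` are hypothetical and are not taken from the conversion script; layer index 0 stands in for the `x` above.

```python
from typing import Tuple

# Hypothetical suffix lists, for illustration only (not the converter's actual code).
OLD_SKIP_SUFFIXES: Tuple[str, ...] = (".attention.rotary_emb.inv_freq",)
NEW_SKIP_SUFFIXES: Tuple[str, ...] = (".rotary_emb.inv_freq",)


def should_skip(name: str, suffixes: Tuple[str, ...]) -> bool:
    """Return True if the tensor name ends with any of the given suffixes."""
    # str.endswith accepts a tuple and checks each suffix in turn.
    return name.endswith(suffixes)


# Tensor name layout reported for ChatQA-1.5-8B (no ".attention" component).
chatqa_name = "model.layers.0.self_attn.rotary_emb.inv_freq"
# Tensor name layout that the longer suffix was written for.
nested_name = "model.layers.0.self_attn.attention.rotary_emb.inv_freq"

for tensor_name in (chatqa_name, nested_name):
    print(
        tensor_name,
        "old:", should_skip(tensor_name, OLD_SKIP_SUFFIXES),
        "new:", should_skip(tensor_name, NEW_SKIP_SUFFIXES),
    )
```

Under these assumptions, the longer suffix misses the ChatQA-style name, while the shorter suffix matches both, because every name ending in .attention.rotary_emb.inv_freq also ends in .rotary_emb.inv_freq.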

compilade

I'd like to note that this is also done in #7031, but I'm fine with this being fixed separately here.

bartowski1182 · 2024-05-02T23:05:10Z

Ah, good catch :) I'll let you know if this gets merged so you can avoid conflicts.

Remove .attention from skipped tensors to match more accurately

f700301

compilade approved these changes May 2, 2024

slaren merged commit 60325fa into ggml-org:master May 2, 2024

nopperl pushed a commit to nopperl/llama.cpp that referenced this pull request May 5, 2024

Remove .attention from skipped tensors to match more accurately (ggml-org#7051)

ee8f5ac
