Conversation

@masahi masahi commented Oct 5, 2023

I found myself repeatedly copy-pasting the cryptic code

num_key_value_heads = (
    config.num_key_value_heads is None
    and config.num_attention_heads
    or config.num_key_value_heads
) 

to get the number of KV heads correctly. I also hit a bug where config.num_key_value_heads was accessed directly but turned out to be None (vicuna).

I added a helper function to LlamaConfig to avoid these troubles.
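The PR itself isn't shown here, but the helper likely looks something like the sketch below (the method name and exact placement are assumptions). An explicit `is None` check also sidesteps the classic pitfall of the `and`/`or` idiom quoted above, which misbehaves on falsy values:

```python
class LlamaConfig:
    """Minimal stand-in for the real mlc-llm LlamaConfig (hypothetical)."""

    def __init__(self, num_attention_heads, num_key_value_heads=None):
        self.num_attention_heads = num_attention_heads
        self.num_key_value_heads = num_key_value_heads

    def get_num_key_value_heads(self):
        # Fall back to num_attention_heads when the checkpoint's config
        # omits KV heads (plain multi-head attention rather than
        # grouped-query attention, as in vicuna).
        if self.num_key_value_heads is None:
            return self.num_attention_heads
        return self.num_key_value_heads
```

Usage: `LlamaConfig(32).get_num_key_value_heads()` returns 32, while `LlamaConfig(32, 8).get_num_key_value_heads()` returns 8.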

@junrushao @LeshengJin

@junrushao

Thanks! In the new nn.Module API this could be simplified in __post_init__, but we haven't had enough resources to migrate to it yet.
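The __post_init__ approach junrushao mentions would normalize the field once at construction time, so downstream code can read it directly. A minimal sketch, assuming the config becomes a dataclass (field names taken from the snippet above; everything else is hypothetical):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LlamaConfig:
    """Hypothetical dataclass version of the config."""

    num_attention_heads: int
    num_key_value_heads: Optional[int] = None

    def __post_init__(self):
        # Resolve the default once here, so no call site ever sees None
        # and the helper method becomes unnecessary.
        if self.num_key_value_heads is None:
            self.num_key_value_heads = self.num_attention_heads
```

With this, `LlamaConfig(32).num_key_value_heads` is already 32 and the cryptic copy-pasted expression disappears entirely.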

@junrushao junrushao merged commit 46164b4 into mlc-ai:main Oct 5, 2023
@jeethu jeethu mentioned this pull request Oct 13, 2023