Conversation

@masahi masahi commented Oct 5, 2023

I found myself repeatedly copy-pasting the cryptic code

num_key_value_heads = (
    config.num_key_value_heads is None
    and config.num_attention_heads
    or config.num_key_value_heads
) 

to get the number of KV heads correctly. I also hit a bug where config.num_key_value_heads was accessed directly but turned out to be None (vicuna).

I added a helper function to LlamaConfig to avoid these troubles.
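The PR itself isn't shown here, but the helper likely looks something like the sketch below (the method name and exact placement are assumptions). An explicit `is None` check also sidesteps the classic pitfall of the `and`/`or` idiom quoted above, which misbehaves on falsy values:

```python
class LlamaConfig:
    """Minimal stand-in for the real mlc-llm LlamaConfig (hypothetical)."""

    def __init__(self, num_attention_heads, num_key_value_heads=None):
        self.num_attention_heads = num_attention_heads
        self.num_key_value_heads = num_key_value_heads

    def get_num_key_value_heads(self):
        # Fall back to num_attention_heads when the checkpoint's config
        # omits KV heads (plain multi-head attention rather than
        # grouped-query attention, as in vicuna).
        if self.num_key_value_heads is None:
            return self.num_attention_heads
        return self.num_key_value_heads
```

Usage: `LlamaConfig(32).get_num_key_value_heads()` returns 32, while `LlamaConfig(32, 8).get_num_key_value_heads()` returns 8.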

@junrushao @LeshengJin

@junrushao

Thanks! In the new nn.Module API this could be simplified in __post_init__, but we haven't had enough resources to migrate to it yet.
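The __post_init__ approach junrushao mentions would normalize the field once at construction time, so downstream code can read it directly. A minimal sketch, assuming the config becomes a dataclass (field names taken from the snippet above; everything else is hypothetical):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LlamaConfig:
    """Hypothetical dataclass version of the config."""

    num_attention_heads: int
    num_key_value_heads: Optional[int] = None

    def __post_init__(self):
        # Resolve the default once here, so no call site ever sees None
        # and the helper method becomes unnecessary.
        if self.num_key_value_heads is None:
            self.num_key_value_heads = self.num_attention_heads
```

With this, `LlamaConfig(32).num_key_value_heads` is already 32 and the cryptic copy-pasted expression disappears entirely.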

@junrushao junrushao merged commit 46164b4 into mlc-ai:main Oct 5, 2023
@jeethu jeethu mentioned this pull request Oct 13, 2023