Description
Prerequisites
- I am running the latest code. Mention the version if possible as well.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new and useful enhancement to share.
Feature Description
Llama 3.1 was just released and it is a significant leg up from the previous series of models: https://huggingface.co/blog/llama31
Whilst the overall architecture is the same, it requires some modelling updates, primarily around RoPE scaling: https://github.com/huggingface/transformers/blob/bc2adb0112b6677b0dfb4105c74570a0f92183eb/src/transformers/modeling_rope_utils.py#L298
It'd be great to add support for these updates so that the generations are more coherent and make sense.
Motivation
Note: Without the modelling changes, the generations might look coherent, but they are far from great and fall short of the model's true potential!
Possible Implementation
Here's the corresponding transformers implementation: https://github.com/huggingface/transformers/blob/bc2adb0112b6677b0dfb4105c74570a0f92183eb/src/transformers/modeling_rope_utils.py#L298
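For reference, the linked transformers code applies a frequency-dependent scaling to the standard RoPE inverse frequencies: high-frequency components are left untouched, low-frequency components are divided by a scaling factor, and the band in between is smoothly interpolated. A minimal NumPy sketch of that idea is below; the default values (`base=500000`, `factor=8`, `low_freq_factor=1`, `high_freq_factor=4`, `old_context_len=8192`) are taken from the Llama 3.1 config as used in transformers, but treat them as illustrative rather than canonical:

```python
import numpy as np

def llama3_rope_frequencies(
    head_dim: int = 128,
    base: float = 500000.0,
    factor: float = 8.0,
    low_freq_factor: float = 1.0,
    high_freq_factor: float = 4.0,
    old_context_len: int = 8192,
) -> np.ndarray:
    """Frequency-dependent RoPE scaling in the style of Llama 3.1.

    Short-wavelength (high-frequency) components are kept as-is,
    long-wavelength (low-frequency) components are divided by
    `factor`, and the mid band is linearly interpolated between
    the two regimes.
    """
    # Standard RoPE inverse frequencies.
    inv_freq = 1.0 / (base ** (np.arange(0, head_dim, 2) / head_dim))

    low_freq_wavelen = old_context_len / low_freq_factor
    high_freq_wavelen = old_context_len / high_freq_factor

    wavelen = 2 * np.pi / inv_freq
    # Interpolation weight for the mid band (0 = fully scaled, 1 = unchanged).
    smooth = (old_context_len / wavelen - low_freq_factor) / (
        high_freq_factor - low_freq_factor
    )
    return np.where(
        wavelen < high_freq_wavelen,        # high frequency: unchanged
        inv_freq,
        np.where(
            wavelen > low_freq_wavelen,     # low frequency: fully scaled
            inv_freq / factor,
            (1 - smooth) * inv_freq / factor + smooth * inv_freq,
        ),
    )
```

On the llama.cpp side, the equivalent change would presumably live wherever the RoPE frequencies are computed for the Llama architecture, gated on the new `rope_scaling` metadata from the GGUF/HF config.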