Feature Request: Proper Llama 3.1 Support in llama.cpp #8650

@Vaibhavs10

Description

Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

Llama 3.1 was just released and it is a significant leg up from the previous series of models: https://huggingface.co/blog/llama31

Whilst the overall architecture is the same, it requires some modelling updates, primarily around RoPE scaling: https://github.com/huggingface/transformers/blob/bc2adb0112b6677b0dfb4105c74570a0f92183eb/src/transformers/modeling_rope_utils.py#L298

It'd be great to add support for those so that the generations are more coherent and make sense.

Motivation

Note: Without the modelling changes, the generations might look coherent, but they are far from great and do not reflect the true potential of the model!

Possible Implementation

Here's the corresponding transformers implementation: https://github.com/huggingface/transformers/blob/bc2adb0112b6677b0dfb4105c74570a0f92183eb/src/transformers/modeling_rope_utils.py#L298
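As a rough sketch of what the linked transformers code does (parameter names and defaults below are taken from the Llama 3.1 `rope_scaling` config and may differ from what llama.cpp ultimately adopts): the RoPE inverse frequencies are rescaled so that high-frequency components are kept as-is, low-frequency components are divided by a scaling factor, and the band in between is smoothly interpolated.

```python
import math

def llama3_scale_inv_freq(inv_freq,
                          factor=8.0,
                          low_freq_factor=1.0,
                          high_freq_factor=4.0,
                          old_context_len=8192):
    """Rescale RoPE inverse frequencies in the Llama 3.1 style.

    Short wavelengths (high frequencies) are left unchanged, long
    wavelengths (low frequencies) are divided by `factor`, and the
    band in between is linearly interpolated between the two.
    """
    low_freq_wavelen = old_context_len / low_freq_factor
    high_freq_wavelen = old_context_len / high_freq_factor
    scaled = []
    for freq in inv_freq:
        wavelen = 2 * math.pi / freq
        if wavelen < high_freq_wavelen:
            # High-frequency component: keep as-is.
            scaled.append(freq)
        elif wavelen > low_freq_wavelen:
            # Low-frequency component: stretch by the scaling factor.
            scaled.append(freq / factor)
        else:
            # Mid band: interpolate smoothly between scaled and unscaled.
            smooth = (old_context_len / wavelen - low_freq_factor) / (
                high_freq_factor - low_freq_factor)
            scaled.append((1 - smooth) * freq / factor + smooth * freq)
    return scaled
```

This would be applied once to the precomputed `inv_freq` table before building the RoPE cache, rather than per token.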

Metadata

Labels: enhancement (New feature or request)