Description
Prerequisites
- I am running the latest code. Mention the version if possible as well.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new and useful enhancement to share.
Feature Description
YuE would work similarly to the OuteTTS implementation, where an LLM (in this case, two separate llama models) is involved in generating the audio. YuE does not appear to use WavTokenizer, and instead of just speech it is capable of generating music and sung vocals.
A demo page with links to all the relevant code and models can be found here: https://map-yue.github.io/
Motivation
For end users: Music generation is a use case currently missing from the llama.cpp ecosystem. Users could leverage quantized versions of the LLMs to generate songs on their own or rented hardware, and since this model is capable of singing, it is more flexible than established non-LLM audio models.
For developers: I think this is an interesting next step in llama.cpp's TTS experiments, since this is also LLM-based. We first saw how language models running in llama.cpp could be paired with WavTokenizer to produce audible speech. This would rely on the same existing llama infrastructure, paired with new implementations on the music/audio side. It seems similar to Llasa-3B, as both use xcodec, so the implementation may be shareable between the two models.
Possible Implementation
The llama models should be able to leverage the existing llama implementation. For the audio side, this open-source project can be used as a reference: https://github.com/multimodal-art-projection/YuE (paper is pending).
This seems to require xcodec, which, if compatible, would also be progress towards supporting Llasa-3B.
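To make the shape of this concrete, here is a minimal sketch of how the two-stage flow could sit on top of the existing llama.cpp C API. This is an assumption-heavy illustration, not the actual YuE inference recipe: the GGUF file names, the prompt format, the token budgets, the direct hand-off of stage-1 tokens into stage 2, and the `xcodec_decode()` helper are all hypothetical (an xcodec decoder does not exist in llama.cpp yet); only the `llama_*` calls are existing API.

```cpp
// Hypothetical sketch of a two-stage YuE pipeline on top of llama.cpp.
// Model file names, prompt format, and xcodec_decode() are placeholders;
// only the llama_* calls below are existing llama.cpp API.
#include "llama.h"

#include <cstdio>
#include <string>
#include <vector>

// The missing piece: an xcodec decoder turning stage-2 codec tokens into
// PCM samples. A real version would need a new ggml implementation.
std::vector<float> xcodec_decode(const std::vector<llama_token> & codec_tokens) {
    (void) codec_tokens; // stub only
    return {};
}

// Plain greedy generation loop using the existing sampler-chain API.
static std::vector<llama_token> generate(llama_context * ctx,
                                         std::vector<llama_token> prompt, int n_max) {
    llama_sampler * smpl = llama_sampler_chain_init(llama_sampler_chain_default_params());
    llama_sampler_chain_add(smpl, llama_sampler_init_greedy());

    std::vector<llama_token> out;
    llama_token cur = 0;
    llama_batch batch = llama_batch_get_one(prompt.data(), (int32_t) prompt.size());
    for (int i = 0; i < n_max; ++i) {
        if (llama_decode(ctx, batch) != 0) {
            break; // decode failed (e.g. context full)
        }
        cur = llama_sampler_sample(smpl, ctx, -1);
        if (llama_token_is_eog(llama_get_model(ctx), cur)) {
            break;
        }
        out.push_back(cur);
        batch = llama_batch_get_one(&cur, 1);
    }
    llama_sampler_free(smpl);
    return out;
}

int main() {
    llama_backend_init();

    llama_model_params mparams = llama_model_default_params();
    // Stage 1 maps the text prompt to coarse music tokens; stage 2 refines
    // them into codec tokens. Both are llama-architecture models, so they
    // should load through the normal GGUF path (file names are made up).
    llama_model * s1 = llama_load_model_from_file("yue-s1-7b.gguf", mparams);
    llama_model * s2 = llama_load_model_from_file("yue-s2-1b.gguf", mparams);

    llama_context_params cparams = llama_context_default_params();
    llama_context * ctx1 = llama_new_context_with_model(s1, cparams);
    llama_context * ctx2 = llama_new_context_with_model(s2, cparams);

    // Tokenize the text prompt with stage 1's vocab (prompt format is a guess).
    const std::string prompt = "[genre] pop [lyrics] ...";
    std::vector<llama_token> toks(prompt.size() + 16);
    const int n = llama_tokenize(s1, prompt.c_str(), (int32_t) prompt.size(),
                                 toks.data(), (int32_t) toks.size(), true, true);
    toks.resize(n > 0 ? n : 0);

    // Stage 1: text tokens -> music tokens. Stage 2: music tokens -> codec
    // tokens (the real model conditions per segment; this is simplified).
    std::vector<llama_token> music = generate(ctx1, toks, 2048);
    std::vector<llama_token> codec = generate(ctx2, music, 4096);

    // Final step, not implemented anywhere in llama.cpp yet:
    std::vector<float> pcm = xcodec_decode(codec);
    printf("generated %zu PCM samples\n", pcm.size());

    llama_free(ctx1);
    llama_free(ctx2);
    llama_free_model(s1);
    llama_free_model(s2);
    llama_backend_free();
    return 0;
}
```

If an xcodec decoder were implemented in ggml for the final step, the same code could presumably serve Llasa-3B as well, which is the sharing opportunity mentioned above.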