Feature Request: YuE (music gen) #11467

Closed
@henk717

Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

YuE would work similarly to the OuteTTS implementation, in that an LLM (in this case two separate llama models) is involved in generating the audio. YuE does not appear to use WavTokenizer. Instead of speech only, YuE can generate music and sung vocals.

A demo page with links to all the relevant code and models can be found here: https://map-yue.github.io/

Motivation

For end users: Music generation is a use case currently missing from the llama.cpp ecosystem. Users could run quantized versions of the LLMs to generate songs on their own or rented hardware. The model is capable of singing, which makes it more flexible than established non-LLM audio models.
For developers: I think this is an interesting next step in llama.cpp's TTS experiments, since it is also LLM-based. We first saw how language models running in llama.cpp could be paired with WavTokenizer to produce audible speech. YuE would rely on the same existing llama infrastructure, paired with new implementations on the music audio side. It seems similar to Llasa-3B in that both use xcodec, so the implementation may be shareable between the two models.

Possible Implementation

The llama models should be able to use the existing llama implementation. For the audio side, this open-source project can serve as a reference: https://github.com/multimodal-art-projection/YuE (paper is pending).

This seems to require xcodec, which, if compatible, would also be progress towards supporting Llasa-3B.
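
To make the proposed split concrete, here is a rough Python sketch of how the pieces might fit together. It is only an illustration: every name in it is hypothetical (none of these are llama.cpp, YuE, or xcodec APIs), the toy stand-ins do no real inference, and running the two llama models one after the other is my assumption about the data flow rather than something confirmed above.

```python
from typing import Callable, List, Sequence

# Hypothetical type aliases, not real llama.cpp / YuE / xcodec interfaces.
TokenModel = Callable[[Sequence[int]], List[int]]      # tokens in -> tokens out
CodecDecoder = Callable[[Sequence[int]], List[float]]  # codec tokens -> audio samples


def generate_audio(
    prompt_tokens: Sequence[int],
    first_lm: TokenModel,
    second_lm: TokenModel,
    codec_decode: CodecDecoder,
) -> List[float]:
    """Chain two language models and a codec decoder.

    Assumption: the second model consumes the output of the first,
    and the codec turns the final tokens into audio samples.
    """
    stage1_tokens = first_lm(prompt_tokens)
    stage2_tokens = second_lm(stage1_tokens)
    return codec_decode(stage2_tokens)


# Toy stand-ins so the sketch runs; real models and a real codec would go here.
def toy_first_lm(tokens: Sequence[int]) -> List[int]:
    return [t + 1 for t in tokens]


def toy_second_lm(tokens: Sequence[int]) -> List[int]:
    return [t * 2 for t in tokens]


def toy_codec_decode(tokens: Sequence[int]) -> List[float]:
    return [t / 10.0 for t in tokens]


samples = generate_audio([1, 2, 3], toy_first_lm, toy_second_lm, toy_codec_decode)
```

The point of the sketch is that only `codec_decode` would need new code on the audio side, while both language-model steps could in principle reuse the existing llama implementation.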
