Add Note2audio model #544

ArthurZucker · 2022-09-17T11:10:05Z

The note2audio model is pretty complex: it uses a T5-style encoder-decoder. During the diffusion process, conditioning can be given to the encoder in two forms: a MIDI file and the previous spectrogram. Two separate networks take care of these inputs and their concatenation, and then the Spectrogram Decoder generates a spectrogram.

Finally, SoundStream is used as a vocoder to convert the mel spectrogram to raw audio. We only need the decoder part of SoundStream.
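
To make the data flow described above easier to follow, here is a minimal sketch in plain Python. It is not the code from this PR: every function name, the toy sizes (`N_MELS`, `HOP`), and the arithmetic inside each stage are hypothetical placeholders that only mirror the order of the stages (two encoders, concatenation, spectrogram decoder, vocoder). The real decoder runs an iterative diffusion process, which is omitted here.

```python
from typing import List, Tuple

Frame = List[float]  # one spectrogram frame (N_MELS values)

N_MELS = 4  # toy number of mel bins (assumption, not the real value)
HOP = 2     # toy number of audio samples per frame (assumption)


def encode_notes(note_tokens: List[int], dim: int) -> List[List[float]]:
    """Placeholder for the note (MIDI) encoder: one vector per token."""
    return [[float(tok)] * dim for tok in note_tokens]


def encode_context(prev_spec: List[Frame], dim: int) -> List[List[float]]:
    """Placeholder for the encoder of the previous spectrogram segment."""
    return [[sum(frame) / len(frame)] * dim for frame in prev_spec]


def decode_spectrogram(cond: List[List[float]], n_frames: int) -> List[Frame]:
    """Placeholder for the spectrogram decoder (real one denoises iteratively)."""
    mean = sum(vec[0] for vec in cond) / len(cond)
    return [[mean] * N_MELS for _ in range(n_frames)]


def vocode(spec: List[Frame]) -> List[float]:
    """Placeholder for the SoundStream decoder: spectrogram to waveform."""
    return [sum(frame) / len(frame) for frame in spec for _ in range(HOP)]


def generate_segment(
    note_tokens: List[int], prev_spec: List[Frame], n_frames: int = 3, dim: int = 8
) -> Tuple[List[Frame], List[float]]:
    # Concatenate both conditioning sequences along the sequence axis.
    cond = encode_notes(note_tokens, dim) + encode_context(prev_spec, dim)
    spec = decode_spectrogram(cond, n_frames)
    return spec, vocode(spec)


spec, audio = generate_segment([60, 64], [[0.0] * N_MELS, [0.0] * N_MELS])
print(len(spec), len(spec[0]), len(audio))
```

The generated spectrogram can then be passed back in as `prev_spec` for the next segment, which is how the "previous spectrogram" conditioning would chain segments together in this sketch.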

HuggingFaceDocBuilderDev · 2022-09-17T11:14:14Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

patrickvonplaten · 2022-09-17T12:16:06Z

Super cool!

Note that we should add the vocoder in the very last step (it'll require some TF graph/ONNX hacking).

ArthurZucker · 2022-09-17T12:49:51Z

It will require the conversion from TF's SoundStream 😅

I will focus on the T5v1.1 style encoder decoder now.

BTW tell me if the file where I am putting the model is correct or if it needs changing!

patil-suraj · 2022-09-20T15:04:14Z

Very cool! Let me know if you need any help with T5X and weight conversion.

github-actions · 2022-11-07T15:03:22Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

patrickvonplaten · 2022-11-09T19:03:06Z

Closing in favor of #1044

ArthurZucker added 4 commits September 17, 2022 13:05

add vocoders

1577275

draft pipeline

f437397

README

0eab64b

draft conversion script and music transformer

b04cdbc

ArthurZucker added 3 commits September 17, 2022 17:17

add film layer

baf39aa

style

c5b1620

Update, a note tokenizer will be required

d265f70

Merge branch 'main' into note2audio

6f443ec

patrickvonplaten mentioned this pull request Oct 21, 2022

Notes2Audio #320

Closed

2 tasks

github-actions bot added the stale label (Issues that haven't received updates) Nov 7, 2022

patrickvonplaten closed this Nov 9, 2022

kashif mentioned this pull request Feb 8, 2023

Music Spectrogram diffusion pipeline #1044

Merged
