Skip to content

Music Spectrogram diffusion pipeline #1044

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 159 commits into from
Mar 23, 2023

Conversation

kashif
Copy link
Contributor

@kashif kashif commented Oct 28, 2022

For issue #320 and #544

@kashif kashif marked this pull request as draft October 28, 2022 16:30
@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented Oct 28, 2022

The documentation is not available anymore as the PR was closed or merged.

@patrickvonplaten
Copy link
Contributor

@patil-suraj @anton-l maybe you could already take a look here if you find some time :-)

@anton-l
Copy link
Member

anton-l commented Nov 2, 2022

There's no pipeline.__call__ yet, so I'll wait for @kashif's ping when it's ready 😄

@kashif
Copy link
Contributor Author

kashif commented Nov 2, 2022

yup getting there...

@patrickvonplaten
Copy link
Contributor

This looks like a great start! @kashif could you add a code snippet showing how to run the model for inference? Similar to #658 (comment) maybe? :-)

Once we can reproduce some results locally, I think we'll have a much easier time getting this PR merged :-)

Copy link
Contributor

@patrickvonplaten patrickvonplaten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, I think we're very close to merging this one.

Only thing left to do is:

  • Update notes to adhere to the newest API (e.g. don't pass file name to pipeline, but tokens
  • Update tests and docs to use "google/...." model id instead of "kashif"`

Also could you maybe update the example here: https://huggingface.co/google/music-spectrogram-diffusion#example-usage

@patrickvonplaten
Copy link
Contributor

@patrickvonplaten so the failing docs are because note_seq installs the latest protobuf which causes issues with tensorboard etc. so one solution was to also pin protobuf to 3.20.x which seems to make everyone happy?

Ahh I see - yes let's pin it :-)

@patrickvonplaten
Copy link
Contributor

Think we need a "make style" here and then we should be good to go :-)

@kashif
Copy link
Contributor Author

kashif commented Mar 21, 2023

@patrickvonplaten I could not reproduce the one failed mps test...

@patrickvonplaten
Copy link
Contributor

Think the MPS failure is unrelated - let's merge once the tests now are more or less green :-)

@patrickvonplaten patrickvonplaten merged commit 2ef9bdd into huggingface:main Mar 23, 2023
@kashif kashif deleted the spectrogram-diffusion branch March 23, 2023 13:24
w4ffl35 pushed a commit to w4ffl35/diffusers that referenced this pull request Apr 14, 2023
* initial TokenEncoder and ContinuousEncoder

* initial modules

* added ContinuousContextTransformer

* fix copy paste error

* use numpy for get_sequence_length

* initial terminal relative positional encodings

* fix weights keys

* fix assert

* cross attend style: concat encodings

* make style

* concat once

* fix formatting

* Initial SpectrogramPipeline

* fix input_tokens

* make style

* added mel output

* ignore weights for config

* move mel to numpy

* import pipeline

* fix class names and import

* moved models to models folder

* import ContinuousContextTransformer and SpectrogramDiffusionPipeline

* initial spec diffusion converstion script

* renamed config to t5config

* added weight loading

* use arguments instead of t5config

* broadcast noise time to batch dim

* fix call

* added scale_to_features

* fix weights

* transpose laynorm weight

* scale is a vector

* scale the query outputs

* added comment

* undo scaling

* undo depth_scaling

* inital get_extended_attention_mask

* attention_mask is none in self-attention

* cleanup

* manually invert attention

* nn.linear need bias=False

* added T5LayerFFCond

* remove to fix conflict

* make style and dummy

* remove unsed variables

* remove predict_epsilon

* Move accelerate to a soft-dependency (huggingface#1134)

* finish

* finish

* Update src/diffusers/modeling_utils.py

* Update src/diffusers/pipeline_utils.py

Co-authored-by: Anton Lozhkov <[email protected]>

* more fixes

* fix

Co-authored-by: Anton Lozhkov <[email protected]>

* fix order

* added initial midi to note token data pipeline

* added int to int tokenizer

* remove duplicate

* added logic for segments

* add melgan to pipeline

* move autoregressive gen into pipeline

* added note_representation_processor_chain

* fix dtypes

* remove immutabledict req

* initial doc

* use np.where

* require note_seq

* fix typo

* update dependency

* added note-seq to test

* added is_note_seq_available

* fix import

* added toc

* added example usage

* undo for now

* moved docs

* fix merge

* fix imports

* predict first segment

* avoid un-needed copy to and from cpu

* make style

* Copyright

* fix style

* add test and fix inference steps

* remove bogus files

* reorder models

* up

* remove transformers dependency

* make work with diffusers cross attention

* clean more

* remove @

* improve further

* up

* uP

* Apply suggestions from code review

* Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py

* loop over all tokens

* make style

* Added a section on the model

* fix formatting

* grammer

* formatting

* make fix-copies

* Update src/diffusers/pipelines/__init__.py

Co-authored-by: Patrick von Platen <[email protected]>

* Update src/diffusers/pipelines/spectrogram_diffusion/pipeline_spectrogram_diffusion.py

Co-authored-by: Patrick von Platen <[email protected]>

* added callback ad optional ionnx

* do not squeeze batch dim

* clean up more

* upload

* convert jax to nnumpy

* make style

* fix warning

* make fix-copies

* fix warning

* add initial fast tests

* add initial pipeline_params

* eval mode due to dropout

* skip batch tests as pipeline runs on a single file

* make style

* fix relative path

* fix doc tests

* Update src/diffusers/models/t5_film_transformer.py

Co-authored-by: Patrick von Platen <[email protected]>

* Update src/diffusers/models/t5_film_transformer.py

Co-authored-by: Patrick von Platen <[email protected]>

* Update docs/source/en/api/pipelines/spectrogram_diffusion.mdx

Co-authored-by: Patrick von Platen <[email protected]>

* Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py

Co-authored-by: Patrick von Platen <[email protected]>

* Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py

Co-authored-by: Patrick von Platen <[email protected]>

* Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py

Co-authored-by: Patrick von Platen <[email protected]>

* Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py

Co-authored-by: Patrick von Platen <[email protected]>

* add MidiProcessor

* format

* fix org

* Apply suggestions from code review

* Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py

* make style

* pin protobuf to <4

* fix formatting

* white space

* tensorboard needs protobuf

---------

Co-authored-by: Patrick von Platen <[email protected]>
Co-authored-by: Anton Lozhkov <[email protected]>
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
* initial TokenEncoder and ContinuousEncoder

* initial modules

* added ContinuousContextTransformer

* fix copy paste error

* use numpy for get_sequence_length

* initial terminal relative positional encodings

* fix weights keys

* fix assert

* cross attend style: concat encodings

* make style

* concat once

* fix formatting

* Initial SpectrogramPipeline

* fix input_tokens

* make style

* added mel output

* ignore weights for config

* move mel to numpy

* import pipeline

* fix class names and import

* moved models to models folder

* import ContinuousContextTransformer and SpectrogramDiffusionPipeline

* initial spec diffusion converstion script

* renamed config to t5config

* added weight loading

* use arguments instead of t5config

* broadcast noise time to batch dim

* fix call

* added scale_to_features

* fix weights

* transpose laynorm weight

* scale is a vector

* scale the query outputs

* added comment

* undo scaling

* undo depth_scaling

* inital get_extended_attention_mask

* attention_mask is none in self-attention

* cleanup

* manually invert attention

* nn.linear need bias=False

* added T5LayerFFCond

* remove to fix conflict

* make style and dummy

* remove unsed variables

* remove predict_epsilon

* Move accelerate to a soft-dependency (huggingface#1134)

* finish

* finish

* Update src/diffusers/modeling_utils.py

* Update src/diffusers/pipeline_utils.py

Co-authored-by: Anton Lozhkov <[email protected]>

* more fixes

* fix

Co-authored-by: Anton Lozhkov <[email protected]>

* fix order

* added initial midi to note token data pipeline

* added int to int tokenizer

* remove duplicate

* added logic for segments

* add melgan to pipeline

* move autoregressive gen into pipeline

* added note_representation_processor_chain

* fix dtypes

* remove immutabledict req

* initial doc

* use np.where

* require note_seq

* fix typo

* update dependency

* added note-seq to test

* added is_note_seq_available

* fix import

* added toc

* added example usage

* undo for now

* moved docs

* fix merge

* fix imports

* predict first segment

* avoid un-needed copy to and from cpu

* make style

* Copyright

* fix style

* add test and fix inference steps

* remove bogus files

* reorder models

* up

* remove transformers dependency

* make work with diffusers cross attention

* clean more

* remove @

* improve further

* up

* uP

* Apply suggestions from code review

* Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py

* loop over all tokens

* make style

* Added a section on the model

* fix formatting

* grammer

* formatting

* make fix-copies

* Update src/diffusers/pipelines/__init__.py

Co-authored-by: Patrick von Platen <[email protected]>

* Update src/diffusers/pipelines/spectrogram_diffusion/pipeline_spectrogram_diffusion.py

Co-authored-by: Patrick von Platen <[email protected]>

* added callback ad optional ionnx

* do not squeeze batch dim

* clean up more

* upload

* convert jax to nnumpy

* make style

* fix warning

* make fix-copies

* fix warning

* add initial fast tests

* add initial pipeline_params

* eval mode due to dropout

* skip batch tests as pipeline runs on a single file

* make style

* fix relative path

* fix doc tests

* Update src/diffusers/models/t5_film_transformer.py

Co-authored-by: Patrick von Platen <[email protected]>

* Update src/diffusers/models/t5_film_transformer.py

Co-authored-by: Patrick von Platen <[email protected]>

* Update docs/source/en/api/pipelines/spectrogram_diffusion.mdx

Co-authored-by: Patrick von Platen <[email protected]>

* Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py

Co-authored-by: Patrick von Platen <[email protected]>

* Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py

Co-authored-by: Patrick von Platen <[email protected]>

* Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py

Co-authored-by: Patrick von Platen <[email protected]>

* Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py

Co-authored-by: Patrick von Platen <[email protected]>

* add MidiProcessor

* format

* fix org

* Apply suggestions from code review

* Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py

* make style

* pin protobuf to <4

* fix formatting

* white space

* tensorboard needs protobuf

---------

Co-authored-by: Patrick von Platen <[email protected]>
Co-authored-by: Anton Lozhkov <[email protected]>
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024
* initial TokenEncoder and ContinuousEncoder

* initial modules

* added ContinuousContextTransformer

* fix copy paste error

* use numpy for get_sequence_length

* initial terminal relative positional encodings

* fix weights keys

* fix assert

* cross attend style: concat encodings

* make style

* concat once

* fix formatting

* Initial SpectrogramPipeline

* fix input_tokens

* make style

* added mel output

* ignore weights for config

* move mel to numpy

* import pipeline

* fix class names and import

* moved models to models folder

* import ContinuousContextTransformer and SpectrogramDiffusionPipeline

* initial spec diffusion converstion script

* renamed config to t5config

* added weight loading

* use arguments instead of t5config

* broadcast noise time to batch dim

* fix call

* added scale_to_features

* fix weights

* transpose laynorm weight

* scale is a vector

* scale the query outputs

* added comment

* undo scaling

* undo depth_scaling

* inital get_extended_attention_mask

* attention_mask is none in self-attention

* cleanup

* manually invert attention

* nn.linear need bias=False

* added T5LayerFFCond

* remove to fix conflict

* make style and dummy

* remove unsed variables

* remove predict_epsilon

* Move accelerate to a soft-dependency (huggingface#1134)

* finish

* finish

* Update src/diffusers/modeling_utils.py

* Update src/diffusers/pipeline_utils.py

Co-authored-by: Anton Lozhkov <[email protected]>

* more fixes

* fix

Co-authored-by: Anton Lozhkov <[email protected]>

* fix order

* added initial midi to note token data pipeline

* added int to int tokenizer

* remove duplicate

* added logic for segments

* add melgan to pipeline

* move autoregressive gen into pipeline

* added note_representation_processor_chain

* fix dtypes

* remove immutabledict req

* initial doc

* use np.where

* require note_seq

* fix typo

* update dependency

* added note-seq to test

* added is_note_seq_available

* fix import

* added toc

* added example usage

* undo for now

* moved docs

* fix merge

* fix imports

* predict first segment

* avoid un-needed copy to and from cpu

* make style

* Copyright

* fix style

* add test and fix inference steps

* remove bogus files

* reorder models

* up

* remove transformers dependency

* make work with diffusers cross attention

* clean more

* remove @

* improve further

* up

* uP

* Apply suggestions from code review

* Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py

* loop over all tokens

* make style

* Added a section on the model

* fix formatting

* grammer

* formatting

* make fix-copies

* Update src/diffusers/pipelines/__init__.py

Co-authored-by: Patrick von Platen <[email protected]>

* Update src/diffusers/pipelines/spectrogram_diffusion/pipeline_spectrogram_diffusion.py

Co-authored-by: Patrick von Platen <[email protected]>

* added callback ad optional ionnx

* do not squeeze batch dim

* clean up more

* upload

* convert jax to nnumpy

* make style

* fix warning

* make fix-copies

* fix warning

* add initial fast tests

* add initial pipeline_params

* eval mode due to dropout

* skip batch tests as pipeline runs on a single file

* make style

* fix relative path

* fix doc tests

* Update src/diffusers/models/t5_film_transformer.py

Co-authored-by: Patrick von Platen <[email protected]>

* Update src/diffusers/models/t5_film_transformer.py

Co-authored-by: Patrick von Platen <[email protected]>

* Update docs/source/en/api/pipelines/spectrogram_diffusion.mdx

Co-authored-by: Patrick von Platen <[email protected]>

* Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py

Co-authored-by: Patrick von Platen <[email protected]>

* Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py

Co-authored-by: Patrick von Platen <[email protected]>

* Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py

Co-authored-by: Patrick von Platen <[email protected]>

* Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py

Co-authored-by: Patrick von Platen <[email protected]>

* add MidiProcessor

* format

* fix org

* Apply suggestions from code review

* Update tests/pipelines/spectrogram_diffusion/test_spectrogram_diffusion.py

* make style

* pin protobuf to <4

* fix formatting

* white space

* tensorboard needs protobuf

---------

Co-authored-by: Patrick von Platen <[email protected]>
Co-authored-by: Anton Lozhkov <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants