unCLIP variant #2297

Merged
merged 16 commits into from
Feb 14, 2023

@williamberman williamberman commented Feb 9, 2023

Porting weights

  • Weights are ported with modifications to the existing convert_from_ckpt.py script

Changes to existing models

  • The unet is conditioned on the unCLIP image embedding through the existing class_embeddings. The embedding type is very similar to the existing timestep embeddings type except that the class_embeddings should not be first converted to sinusoidal embeddings and should be projected from an arbitrary input dimension. I added the new class_embed_type "projection" for this.
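
The difference between the two embedding types can be sketched as follows. This is a minimal, dependency-free illustration in plain Python rather than the PyTorch code in the PR: the helper names (`sinusoidal_features`, `EmbeddingMLP`), the dimensions, and the random untrained weights are all assumptions made for the example. It assumes the embedding MLP is a Linear, SiLU, Linear stack and that the class embedding is summed with the timestep embedding.

```python
import math
import random


def sinusoidal_features(value, dim):
    """Encode a scalar as `dim` sin/cos features, as is done for timesteps."""
    half = dim // 2
    freqs = [math.exp(-math.log(10000.0) * i / half) for i in range(half)]
    return [math.sin(value * f) for f in freqs] + [math.cos(value * f) for f in freqs]


def linear(x, weight, bias):
    """Dense layer; `weight` has shape (out_dim, in_dim)."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def silu(x):
    """Elementwise SiLU activation: v * sigmoid(v)."""
    return [v / (1.0 + math.exp(-v)) for v in x]


class EmbeddingMLP:
    """Linear -> SiLU -> Linear with randomly initialised (untrained) weights."""

    def __init__(self, in_dim, out_dim, rng):
        def init(rows, cols):
            return [[rng.gauss(0.0, 0.02) for _ in range(cols)] for _ in range(rows)]

        self.w1, self.b1 = init(out_dim, in_dim), [0.0] * out_dim
        self.w2, self.b2 = init(out_dim, out_dim), [0.0] * out_dim

    def __call__(self, x):
        return linear(silu(linear(x, self.w1, self.b1)), self.w2, self.b2)


rng = random.Random(0)
time_embed_dim = 8   # hypothetical width of the UNet's timestep embedding
image_embed_dim = 6  # hypothetical width of the image embedding

# "timestep"-style conditioning: scalar -> sinusoidal features -> MLP.
timestep_mlp = EmbeddingMLP(in_dim=4, out_dim=time_embed_dim, rng=rng)
time_emb = timestep_mlp(sinusoidal_features(10, dim=4))

# "projection"-style conditioning: the vector skips the sinusoidal step and is
# projected directly from its own (arbitrary) input dimension.
projection_mlp = EmbeddingMLP(in_dim=image_embed_dim, out_dim=time_embed_dim, rng=rng)
image_embed = [rng.gauss(0.0, 1.0) for _ in range(image_embed_dim)]
class_emb = projection_mlp(image_embed)

# In this sketch the two embeddings are summed to form the conditioning vector.
emb = [t + c for t, c in zip(time_emb, class_emb)]
```

The only structural change relative to the timestep path is that the input is already a vector, so the MLP's input width is a free parameter instead of the sinusoidal feature width.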

New models

  • I originally added a NoiseAugmentor class to handle adding noise to the image embeddings. It had to be a separate class because it needs parameters to store the CLIP mean and std vectors, and it was configured with a noise schedule just as our scheduler classes are. I considered making the class much lighter by removing the noise schedule and having it hold only the CLIP stats, so that the existing scheduler on the pipeline could noise the vector. I opted against that because it would require the noise schedule for the diffusion process and the one for augmenting the image embedding to be configured identically. That would work for the models ported here, since both use the same squaredcos_cap_v2 betas with 1000 timesteps, but we can't guarantee that will always be the case, and the noise scheduling code that would be duplicated is quite small. In the end, I instead added a DDPMScheduler to the pipeline to hold the noising schedule and used StableUnCLIPImageNormalizer to hold the CLIP statistics.
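
As a rough illustration of the noising step described above, the sketch below normalizes an image embedding with CLIP statistics, applies the standard DDPM forward-noising formula under a squaredcos_cap_v2 schedule with 1000 timesteps, and then undoes the normalization. It is written in plain Python and is not the pipeline's actual code: the function names, the `noise_level` parameter, and the placeholder mean/std values are assumptions for the example.

```python
import math
import random


def squaredcos_cap_v2_betas(num_train_timesteps=1000, max_beta=0.999):
    """Cosine beta schedule, with each beta capped at `max_beta`."""

    def alpha_bar(t):
        return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2

    betas = []
    for i in range(num_train_timesteps):
        t1 = i / num_train_timesteps
        t2 = (i + 1) / num_train_timesteps
        betas.append(min(1.0 - alpha_bar(t2) / alpha_bar(t1), max_beta))
    return betas


def cumulative_alphas(betas):
    """Running product of (1 - beta), i.e. alpha_bar at each timestep."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def noise_image_embedding(embed, clip_mean, clip_std, noise_level, alphas_cumprod, rng):
    """Noise an embedding in normalized space, then map it back."""
    # 1. Normalize with the stored CLIP statistics.
    scaled = [(e - m) / s for e, m, s in zip(embed, clip_mean, clip_std)]
    # 2. DDPM forward process: sqrt(alpha_bar) * x + sqrt(1 - alpha_bar) * noise.
    signal = math.sqrt(alphas_cumprod[noise_level])
    sigma = math.sqrt(1.0 - alphas_cumprod[noise_level])
    noised = [signal * x + sigma * rng.gauss(0.0, 1.0) for x in scaled]
    # 3. Undo the normalization.
    return [x * s + m for x, m, s in zip(noised, clip_mean, clip_std)]


rng = random.Random(0)
betas = squaredcos_cap_v2_betas(1000)
alphas_cumprod = cumulative_alphas(betas)

# Placeholder values; real CLIP statistics are per-dimension vectors loaded from a checkpoint.
image_embed = [0.3, -1.2, 0.8, 0.05]
clip_mean = [0.1, -0.2, 0.0, 0.3]
clip_std = [0.5, 0.4, 0.6, 0.7]

noised = noise_image_embedding(
    image_embed, clip_mean, clip_std, noise_level=100, alphas_cumprod=alphas_cumprod, rng=rng
)
```

Keeping the schedule (`alphas_cumprod`) separate from the statistics (`clip_mean`, `clip_std`) mirrors the split between a scheduler that holds the noising schedule and a normalizer that holds the CLIP statistics.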

HuggingFaceDocBuilderDev commented Feb 9, 2023

The documentation is not available anymore as the PR was closed or merged.

@patrickvonplaten patrickvonplaten left a comment

Great work! Let's just add docs now and we're good to merge :-)

@patrickvonplaten

Happy to merge after the docs are in :-)

@williamberman williamberman force-pushed the pipeline_variant branch 2 times, most recently from d7c18f4 to 79f1f5e Compare February 14, 2023 18:41
@@ -17,7 +17,7 @@ specific language governing permissions and limitations under the License.
The Stable Diffusion model was created by the researchers and engineers from [CompVis](https://github.com/CompVis), [Stability AI](https://stability.ai/), [runway](https://github.com/runwayml), and [LAION](https://laion.ai/). The [`StableDiffusionPipeline`] is capable of generating photo-realistic images given any text input using Stable Diffusion.

The original codebase can be found here:
-- *Stable Diffusion V1*: [CampVis/stable-diffusion](https://github.com/CompVis/stable-diffusion)
+- *Stable Diffusion V1*: [CompVis/stable-diffusion](https://github.com/CompVis/stable-diffusion)
Thanks!

@patrickvonplaten patrickvonplaten left a comment

Super! Thanks a lot for the addition

@williamberman williamberman merged commit 62b3c9e into huggingface:main Feb 14, 2023
Dango233 pushed a commit to Dango233/diffusers that referenced this pull request Mar 12, 2023
* pipeline_variant

* Add docs for when clip_stats_path is specified

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_unclip.py

Co-authored-by: Patrick von Platen <[email protected]>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_unclip.py

Co-authored-by: Patrick von Platen <[email protected]>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_unclip_img2img.py

Co-authored-by: Patrick von Platen <[email protected]>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_unclip_img2img.py

Co-authored-by: Patrick von Platen <[email protected]>

* prepare_latents # Copied from re: @patrickvonplaten

* NoiseAugmentor->ImageNormalizer

* stable_unclip_prior default to None re: @patrickvonplaten

* prepare_prior_extra_step_kwargs

* prior denoising scale model input

* {DDIM,DDPM}Scheduler -> KarrasDiffusionSchedulers re: @patrickvonplaten

* docs

* Update docs/source/en/api/pipelines/stable_unclip.mdx

Co-authored-by: Patrick von Platen <[email protected]>

---------

Co-authored-by: Patrick von Platen <[email protected]>
aluo-x commented Mar 20, 2023

It appears that the documentation currently points to an invalid checkpoint:

fusing/stable-unclip-2-1-l

so currently if you attempt to run the model according to the documentation it will fail.

The stable unclip model is also conceptually very similar to the versatile diffusion model (they have image variations and text + image conditioned image synthesis). Perhaps a note can be put into the current stable_unclip docs so people can utilize that model until the stable-unclip weights are up?
@williamberman @patrickvonplaten

@patrickvonplaten

Yeah this is a bit WIP still

mengfei25 pushed a commit to mengfei25/diffusers that referenced this pull request Mar 27, 2023
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024
4 participants