@patrickvonplaten patrickvonplaten commented Jul 13, 2023

What does this PR do?

Allow low-precision VAEs such as https://huggingface.co/madebyollin/sdxl-vae-fp16-fix

The amazing @madebyollin has fine-tuned a SD-XL VAE checkpoint that works in full fp16! You can try it out as follows:

import torch
from diffusers import StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline, AutoencoderKL

# Load the fp16-fixed VAE and turn off the forced float32 upcast
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16, force_upcast=False)
# Base pipeline, running fully in fp16 with the VAE above
pipe = StableDiffusionXLPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-0.9", vae=vae, torch_dtype=torch.float16, variant="fp16", use_safetensors=True)
pipe.to("cuda")

# Refiner pipeline reusing the same VAE
refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained("stabilityai/stable-diffusion-xl-refiner-0.9", vae=vae, torch_dtype=torch.float16, use_safetensors=True, variant="fp16")
refiner.to("cuda")

This saves 3 GB of GPU VRAM and improves speed by 5% for a 30-step run.

@patrickvonplaten
Contributor Author

For this PR to work we need to merge: https://huggingface.co/madebyollin/sdxl-vae-fp16-fix/discussions/3

@sayakpaul
Member

self.vae.to(dtype)
init_latents = init_latents.to(dtype)
if self.vae.config.force_upcast:
    self.vae.to(dtype)

If force_upcast is true, then shouldn't we use upcast_vae()?

Contributor Author

We may have to move it back here: for the decoder, not all layers are upcast, so the VAE should be moved back to fp16 at this point.
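The upcast-then-restore pattern discussed in this thread can be sketched roughly as follows. This is a minimal illustration, not the pipeline code itself: `FakeVAE` and `encode_with_optional_upcast` are hypothetical stand-ins that only track a dtype name as a string, and the actual encode call is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class FakeVAE:
    """Hypothetical stand-in for a VAE: tracks only its dtype and the config flag."""
    dtype: str = "float16"
    force_upcast: bool = True
    calls: list = field(default_factory=list)

    def to(self, dtype):
        # Record every cast so the order of moves can be inspected afterwards.
        self.dtype = dtype
        self.calls.append(dtype)
        return self


def encode_with_optional_upcast(vae, working_dtype="float16"):
    """Return the dtype the (omitted) encode step would run in.

    When force_upcast is set, the VAE is moved to float32 for the encode
    and then moved back to the working dtype so later fp16 steps still match.
    """
    if vae.force_upcast:
        vae.to("float32")
    encode_dtype = vae.dtype  # the real encode call would happen here
    if vae.force_upcast:
        vae.to(working_dtype)  # "move it back" once the sensitive step is done
    return encode_dtype
```

With `force_upcast=False` (as for the fp16-fixed VAE above) no cast happens at all, which is where the memory and speed savings would come from.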

@sayakpaul
Member

sayakpaul commented Jul 14, 2023

Looks good to me.

Will all the compatible VAEs have force_upcast in their configs? Maybe a better check would be to first verify that the flag exists in the config, no?

if "force_upcast" in self.vae.config:
    ...

Am I missing out on something? Should we also add a couple of tests here?
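A defensive version of that check might look like the sketch below. It assumes a config object that exposes its fields as attributes; `wants_upcast` is a hypothetical helper, and falling back to `True` when the flag is absent is an assumption chosen to match the constructor default shown in this PR.

```python
from types import SimpleNamespace


def wants_upcast(config, default=True):
    """Read force_upcast from a config, falling back to `default` if it is missing."""
    return bool(getattr(config, "force_upcast", default))


# Hypothetical configs: one saved before the flag existed, one that sets it.
legacy_config = SimpleNamespace(scaling_factor=0.18215)
fp16_fix_config = SimpleNamespace(scaling_factor=0.18215, force_upcast=False)

print(wants_upcast(legacy_config))    # flag absent, so the default is used
print(wants_upcast(fp16_fix_config))  # flag present, so its value is used
```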

@@ -82,6 +86,7 @@ def __init__(
        norm_num_groups: int = 32,
        sample_size: int = 32,
        scaling_factor: float = 0.18215,
        force_upcast: float = True,
Contributor Author

cc @sayakpaul every config has force_upcast

@patrickvonplaten
Contributor Author

self.vae.to(dtype)
Member

It's there by default because it's in the constructor arguments: https://github.com/huggingface/diffusers/pull/4083/files#r1263476807
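The idea that a constructor default ends up in every config can be illustrated with a small sketch. This is a simplified stand-in, not the library's actual mechanism: `config_from_init` and `TinyAutoencoder` are hypothetical names, and the merge is done with `inspect.signature` purely for illustration.

```python
import inspect
from types import SimpleNamespace


def config_from_init(cls, loaded: dict) -> SimpleNamespace:
    """Merge values loaded from a saved config with the constructor defaults."""
    params = inspect.signature(cls.__init__).parameters
    defaults = {
        name: p.default
        for name, p in params.items()
        if name != "self" and p.default is not inspect.Parameter.empty
    }
    # Loaded values win; anything missing falls back to the constructor default.
    return SimpleNamespace(**{**defaults, **loaded})


class TinyAutoencoder:
    """Hypothetical model whose __init__ mirrors a few arguments from the diff above."""

    def __init__(self, sample_size: int = 32, scaling_factor: float = 0.18215, force_upcast: bool = True):
        pass


# A config saved before the flag existed still ends up with force_upcast=True.
legacy = config_from_init(TinyAutoencoder, {"sample_size": 32})
print(legacy.force_upcast)
```

Under this assumption an explicit membership check is unnecessary, because a config without the key simply inherits the default.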

@patrickvonplaten
Contributor Author

Merging since it's also needed for inpaint

@patrickvonplaten
Contributor Author

Looks good to me.

Will all the compatible VAEs have force_upcast in their configs? Maybe a better check would be to first verify that the flag exists in the config, no?

if "force_upcast" in self.vae.config:
    ...

Am I missing out on something? Should we also add a couple of tests here?

Tests are a bit tricky here: we can't run slow tests yet because the checkpoint is behind a gate, and dummy weights don't have the fp16 issue. I'll set a reminder issue: #4100

@patrickvonplaten patrickvonplaten merged commit ad8f985 into main Jul 14, 2023
@patrickvonplaten patrickvonplaten deleted the allow_low_precision_vae_sd_xl branch July 14, 2023 12:51
@AmericanPresidentJimmyCarter
Contributor

Related: #4102

@Birch-san and I have just been using bfloat16 for the VAE in our pipelines. It works perfectly well without all the extra legwork that the upcast requires, and it is faster.
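One plausible reason bfloat16 sidesteps the upcast is numeric range: bfloat16 keeps float32's 8-bit exponent, while float16 tops out at 65504. The sketch below illustrates only that range difference using the standard library. It assumes the fp16 issue is caused by values exceeding the float16 range, which this thread does not state explicitly; `fits_in_float16` and `to_bfloat16` are hypothetical helpers, and `to_bfloat16` truncates rather than rounds, which is a simplification.

```python
import struct


def fits_in_float16(value: float) -> bool:
    """Return True if `value` can be packed as IEEE 754 half precision."""
    try:
        struct.pack("<e", value)
        return True
    except (OverflowError, struct.error):
        return False


def to_bfloat16(value: float) -> float:
    """Keep only the top 16 bits of the float32 encoding (sign, 8 exponent, 7 mantissa bits)."""
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]


big = 70000.0  # arbitrary value just above the float16 maximum of 65504
print(fits_in_float16(big))  # float16 cannot hold it
print(to_bfloat16(big))      # bfloat16 holds it, with reduced precision
```

The trade-off is precision: bfloat16 has fewer mantissa bits than float16, so values are representable but coarser.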

sayakpaul added a commit that referenced this pull request Jul 18, 2023
sayakpaul added a commit that referenced this pull request Jul 18, 2023
* add: controlnet sdxl.

* modifications to controlnet.

* run styling.

* add: __init__.pys

* incorporate #4019 changes.

* run make fix-copies.

* resize the conditioning images.

* remove autocast.

* run styling.

* disable autocast.

* debugging

* device placement.

* back to autocast.

* remove comment.

* save some memory by reusing the vae and unet in the pipeline.

* apply styling.

* Allow low precision sd xl

* finish

* finish

* changes to accommodate the improved VAE.

* modifications to how we handle vae encoding in the training.

* make style

* make existing controlnet fast tests pass.

* change vae checkpoint cli arg.

* fix: vae pretrained paths.

* fix: steps in get_scheduler().

* debugging.

* debugging./

* fix: weight conversion.

* add: docs.

* add: limited tests./

* add: datasets to the requirements.

* update docstrings and incorporate the usage of watermarking.

* incorporate fix from #4083

* fix watermarking dependency handling.

* run make-fix-copies.

* Empty-Commit

* Update requirements_sdxl.txt

* remove vae upcasting part.

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <[email protected]>

* run make style

* run make fix-copies.

* disable suppot for multicontrolnet.

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <[email protected]>

* run make fix-copies.

* dtyle/.

* fix-copies.

---------

Co-authored-by: Patrick von Platen <[email protected]>
@wilfrediscoming

Does this work for other pipelines? How do I know which VAE will work?

import torch
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",
    revision="fp16",
    torch_dtype=torch.float16,
)

@sayakpaul
Member

sayakpaul commented Jul 25, 2023

AFAIK, the VAE is specifically suited to SDXL.

orpatashnik pushed a commit to orpatashnik/diffusers that referenced this pull request Aug 1, 2023
* Allow low precision sd xl

* finish

* finish

* make style
orpatashnik pushed a commit to orpatashnik/diffusers that referenced this pull request Aug 1, 2023
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024