StableDiffusionLatentUpscalePipeline - positive/negative prompt embeds support #8947
Conversation
Test script:

from diffusers import StableDiffusionLatentUpscalePipeline, StableDiffusionPipeline
import torch

pipeline = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
pipeline.to("cuda")

upscaler = StableDiffusionLatentUpscalePipeline.from_pretrained("stabilityai/sd-x2-latent-upscaler", torch_dtype=torch.float16)
upscaler.to("cuda")

prompt = "a photo of an astronaut high resolution, unreal engine, ultra realistic"
generator = torch.manual_seed(33)

# we stay in latent space! Let's make sure that Stable Diffusion returns the image
# in latent space
low_res_latents = pipeline(prompt, generator=generator, output_type="latent").images

upscaled_image = upscaler(
    prompt=prompt,
    image=low_res_latents,
    num_inference_steps=20,
    guidance_scale=0,
    generator=generator,
).images[0]

# Let's save the upscaled image under "astronaut_1024.png"
upscaled_image.save("astronaut_1024.png")

prompt_embeds, negative_prompt_embeds, pooled_prompt_embeds, negative_pooled_prompt_embeds = upscaler.encode_prompt(
    prompt=prompt,
    device=upscaler._execution_device,
    do_classifier_free_guidance=False,
)

upscaled_image = upscaler(
    image=low_res_latents,
    num_inference_steps=20,
    guidance_scale=0,
    generator=generator,
    prompt_embeds=prompt_embeds,
    pooled_prompt_embeds=pooled_prompt_embeds,
).images[0]
upscaled_image.save("embeds_astronaut_1024.png")

# as a comparison: Let's also save the low-res image
with torch.no_grad():
    image = pipeline.decode_latents(low_res_latents)
image = pipeline.numpy_to_pil(image)[0]
image.save("astronaut_512.png")
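For the negative-embeds path that the PR title also covers, a minimal sketch could look like the block below. It is illustrative only: the `negative_prompt` string and `guidance_scale` value are made up, and it assumes `encode_prompt` accepts a `negative_prompt` argument and returns embeds in the same order as the script above; the exact signature may differ.

```python
# Sketch only: re-encode with classifier-free guidance enabled so negative
# embeddings are actually produced, then pass both positive and negative embeds.
(
    prompt_embeds,
    negative_prompt_embeds,
    pooled_prompt_embeds,
    negative_pooled_prompt_embeds,
) = upscaler.encode_prompt(
    prompt=prompt,
    device=upscaler._execution_device,
    do_classifier_free_guidance=True,
    negative_prompt="blurry, low quality",  # illustrative negative prompt
)

upscaled_image = upscaler(
    image=low_res_latents,
    num_inference_steps=20,
    guidance_scale=7.5,  # CFG must be active for the negative embeds to matter
    generator=generator,
    prompt_embeds=prompt_embeds,
    negative_prompt_embeds=negative_prompt_embeds,
    pooled_prompt_embeds=pooled_prompt_embeds,
    negative_pooled_prompt_embeds=negative_pooled_prompt_embeds,
).images[0]
upscaled_image.save("negative_embeds_astronaut_1024.png")
```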
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
oh thanks! I think it looks really nice!
LGTM! Could you also share some results stemming from this change?
Let's also add a fast test for this?
Sure, the results are not much different.
Should I open another PR for the fast test, @sayakpaul?
We can do this in this PR
…r/diffusers into latent_upscaler_prompt_embeds
…r/diffusers into latent_upscaler_prompt_embeds
Hi @sayakpaul, could you help me review this PR?
…r/diffusers into latent_upscaler_prompt_embeds
@@ -0,0 +1,218 @@
# coding=utf-8
oh actually I think the tests for the latent upscaler are here https://github.com/huggingface/diffusers/blob/main/tests/pipelines/stable_diffusion_2/test_stable_diffusion_latent_upscale.py
maybe we add new tests there? very sorry for all the additional work to create a new test from scratch
@yiyixuxu I have just updated the existing test. Could you help me review? Btw, adding a new test is kinda interesting too, so no worries
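For reference, a minimal sketch of what such a fast test could look like in that file. This is hypothetical: it assumes the `get_dummy_components`/`get_dummy_inputs` helpers used by the existing tests, and the `encode_prompt` return order shown in the script above.

```python
import numpy as np

# Hypothetical fast test: the pipeline should produce the same images whether
# it receives a raw prompt or the corresponding pre-computed embeddings.
def test_latent_upscale_prompt_embeds(self):
    device = "cpu"
    components = self.get_dummy_components()
    pipe = self.pipeline_class(**components).to(device)
    pipe.set_progress_bar_config(disable=None)

    inputs = self.get_dummy_inputs(device)
    output_with_prompt = pipe(**inputs).images

    inputs = self.get_dummy_inputs(device)
    prompt = inputs.pop("prompt")
    (
        prompt_embeds,
        negative_prompt_embeds,
        pooled_prompt_embeds,
        negative_pooled_prompt_embeds,
    ) = pipe.encode_prompt(prompt, device=device, do_classifier_free_guidance=True)
    output_with_embeds = pipe(
        **inputs,
        prompt_embeds=prompt_embeds,
        negative_prompt_embeds=negative_prompt_embeds,
        pooled_prompt_embeds=pooled_prompt_embeds,
        negative_pooled_prompt_embeds=negative_pooled_prompt_embeds,
    ).images

    # the two runs use identically seeded dummy inputs, so outputs should match
    assert np.abs(output_with_prompt - output_with_embeds).max() < 1e-4
```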
@@ -173,11 +173,51 @@ def test_inference(self):
        self.assertEqual(image.shape, (1, 256, 256, 3))
        expected_slice = np.array(
-            [0.47222412, 0.41921633, 0.44717434, 0.46874192, 0.42588258, 0.46150726, 0.4677534, 0.45583832, 0.48579055]
+            [0.3970313, 0.3768756, 0.41147298, 0.4716793, 0.5115408, 0.44601366, 0.43763855, 0.46781355, 0.46358708]
why do we need to update the expected_slice here? the results of existing tests should not change, no?
        if image.shape[1] == 3:
            # encode image if not in latent-space yet
-            image = self.vae.encode(image).latent_dist.sample() * self.vae.config.scaling_factor
+            image = retrieve_latents(self.vae.encode(image), generator=generator) * self.vae.config.scaling_factor
@yiyixuxu I think it's due to this line. The old code does not take in the generator
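For context, the point being made is that encoding an RGB image samples from the VAE's latent distribution, which is a random draw. A rough sketch of the idea (variable names such as `image_tensor` are illustrative, not the pipeline's own):

```python
import torch

# Sampling from the VAE posterior draws Gaussian noise; without a fixed
# generator the draw is not reproducible across runs.
posterior = upscaler.vae.encode(image_tensor)  # image_tensor: a (B, 3, H, W) float tensor
generator = torch.Generator(device="cuda").manual_seed(0)
latents = posterior.latent_dist.sample(generator=generator) * upscaler.vae.config.scaling_factor
```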
I think we got the order of the negative_prompt_embeds and prompt_embeds reversed, that's why the previous test wasn't passing. Let's make the change, and change the test back and make sure it passes :)
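A brief sketch of the convention being referred to (not copied from this pipeline; `unet`, `latent_input`, `t`, and `guidance_scale` stand in for the pipeline's own variables): with classifier-free guidance, diffusers pipelines typically batch the unconditional/negative embeddings first, so the chunked noise predictions line up as (uncond, text).

```python
import torch

# Negative (unconditional) embeddings go first in the batch; if the order is
# reversed, the guidance step below pushes the sample away from the prompt
# instead of toward it.
text_embeddings = torch.cat([negative_prompt_embeds, prompt_embeds])
noise_pred = unet(latent_input, t, encoder_hidden_states=text_embeddings).sample
noise_pred_uncond, noise_pred_text = noise_pred.chunk(2)
noise_pred = noise_pred_uncond + guidance_scale * (noise_pred_text - noise_pred_uncond)
```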
src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py
tests/pipelines/stable_diffusion_2/test_stable_diffusion_latent_upscale.py
…sion_latent_upscale.py Co-authored-by: YiYi Xu <[email protected]>
…sion_latent_upscale.py Co-authored-by: YiYi Xu <[email protected]>
…t_upscale.py Co-authored-by: YiYi Xu <[email protected]>
@yiyixuxu thanks for the catch! Sorry I did not check it thoroughly
thanks!
…s support (#8947) * make latent upscaler accept prompt embeds --------- Co-authored-by: Dhruv Nair <[email protected]> Co-authored-by: Sayak Paul <[email protected]> Co-authored-by: YiYi Xu <[email protected]>
What does this PR do?
Fixes #8895
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@yiyixuxu