
Conversation

@gumgood
Contributor

@gumgood gumgood commented Aug 8, 2024

What does this PR do?

Fixed issues that prevented loading and inference with the StableDiffusionXLInpaintPipeline class when the UNet has in_channels=9.

1. When loading StableDiffusionXLInpaintPipeline-class models such as diffusers/stable-diffusion-xl-1.0-inpainting-0.1, loading fails if enable_pag=True.

error:

File /hdd/repo/diffusers/src/diffusers/pipelines/auto_pipeline.py:958, in AutoPipelineForInpainting.from_pretrained(cls, pretrained_model_or_path, **kwargs)
    954     if enable_pag:
    955         orig_class_name = config["_class_name"].replace("InpaintPipeline",
    956                                                         "PAG3InpaintPipeline")
--> 958 inpainting_cls = _get_task_class(AUTO_INPAINT_PIPELINES_MAPPING, orig_class_name)
    960 kwargs = {**load_config_kwargs, **kwargs}
    961 return inpainting_cls.from_pretrained(pretrained_model_or_path, **kwargs)
...
    198 if throw_error_if_not_exist:
--> 199     raise ValueError(
    200         f"AutoPipeline can't find a pipeline linked to {pipeline_class_name} for {model_name}")

ValueError: AutoPipeline can't find a pipeline linked to StableDiffusionXLPAG3InpaintPipeline for None

Fixed so that the correct class, StableDiffusionXLPAGInpaintPipeline, is found.
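
As a minimal sketch, the failing load can be reproduced and the fix checked roughly like this (torch_dtype is just an illustrative choice, not part of the fix):

import torch
from diffusers import AutoPipelineForInpainting

pipe = AutoPipelineForInpainting.from_pretrained(
    "diffusers/stable-diffusion-xl-1.0-inpainting-0.1",
    enable_pag=True,
    torch_dtype=torch.float16,
)
# expected after the fix: StableDiffusionXLPAGInpaintPipeline
print(type(pipe).__name__)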

2. When calling __call__ with a model whose UNet has in_channels=9, latent_model_input is not constructed correctly.

error:

File /hdd/repo/diffusers/src/diffusers/pipelines/pag/pipeline_pag_sd_xl_inpaint.py:1623, in StableDiffusionXLPAGInpaintPipeline.__call__(self, prompt, prompt_2, image, mask_image, masked_image_latents, height, width, padding_mask_crop, strength, num_inference_steps, timesteps, sigmas, denoising_start, denoising_end, guidance_scale, negative_prompt, negative_prompt_2, num_images_per_prompt, eta, generator, latents, prompt_embeds, negative_prompt_embeds, pooled_prompt_embeds, negative_pooled_prompt_embeds, ip_adapter_image, ip_adapter_image_embeds, output_type, return_dict, cross_attention_kwargs, guidance_rescale, original_size, crops_coords_top_left, target_size, negative_original_size, negative_crops_coords_top_left, negative_target_size, aesthetic_score, negative_aesthetic_score, clip_skip, callback_on_step_end, callback_on_step_end_tensor_inputs, pag_scale, pag_adaptive_scale)
   1620 latent_model_input = self.scheduler.scale_model_input(latent_model_input, t)
   1622 if num_channels_unet == 9:
-> 1623     latent_model_input = torch.cat([latent_model_input, mask, masked_image_latents], dim=1)
   1625 # predict the noise residual
   1626 added_cond_kwargs = {"text_embeds": add_text_embeds, "time_ids": add_time_ids}

RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 3 but got size 2 for tensor number 1 in the list.

The batch size of mask and masked_image_latents differs from that of latent_model_input. This has been fixed.
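
For context, a toy reproduction of the shape mismatch (tensor sizes are made up for illustration; with CFG plus PAG the latents are batched 3x, while mask and masked_image_latents were only prepared for the 2x CFG case):

import torch

latent_model_input = torch.randn(3, 4, 128, 128)   # uncond + cond + perturbed
mask = torch.randn(2, 1, 128, 128)                  # only prepared for CFG (2x)
masked_image_latents = torch.randn(2, 4, 128, 128)

try:
    torch.cat([latent_model_input, mask, masked_image_latents], dim=1)
except RuntimeError as e:
    # Sizes of tensors must match except in dimension 1. Expected size 3 but got size 2 ...
    print(e)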

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@sayakpaul @yiyixuxu @DN6

@gumgood gumgood force-pushed the fix-pag-inpaint-pipeline branch from 2d7ff3b to 12bf313 on August 11, 2024 17:42
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@DN6
Collaborator

DN6 commented Aug 12, 2024

Hi @gumgood PR looks good. Could you please run make style && make quality so the QC checks pass on the CI.

@gumgood gumgood force-pushed the fix-pag-inpaint-pipeline branch from 9d7ef91 to ff5fd9d on August 13, 2024 00:16
@gumgood
Contributor Author

gumgood commented Aug 13, 2024

@DN6 Thank you for the review. I've applied the style changes. Before applying the style, I ran the tests and fixed a few additional issues:

  1. Fixed *Pipeline models to load as *PAGInpaintPipeline.
  2. Fixed the batch size of init_mask for the 4-channel UNet case.

Collaborator

@yiyixuxu yiyixuxu left a comment


thanks for the PR! I left a comment

return image_latents

# Copied from diffusers.pipelines.stable_diffusion_xl.pipeline_stable_diffusion_xl_inpaint.StableDiffusionXLInpaintPipeline.prepare_mask_latents
# Modified from diffusers.pipelines.stable_diffusion_xl.pipeline_stable_diffusion_xl_inpaint.StableDiffusionXLInpaintPipeline.prepare_mask_latents
Collaborator


thanks! is it possible to keep this method Copied from but modify the output instead? e.g. this is how we modify ip_adapter_image_embeds for PAG https://github.com/huggingface/diffusers/blob/f848febacdc54c351ed0ed23fcc4c9349828021e/src/diffusers/pipelines/pag/pipeline_pag_sd_xl_inpaint.py#L1558C1-L1559C1

it's a little bit weird and requires more code but this way it will be easier for us to maintain all the PAG pipelines

in the future, we should make sure these methods that prepare conditions for the denoiser model always return negative condition and condition instead of a combined output

enable_pag = kwargs.pop("enable_pag")
if enable_pag:
    orig_class_name = config["_class_name"].replace("Pipeline", "PAGPipeline")
    orig_class_name = (
Collaborator


if we do this, the orig_class_name for StableDiffusionXLPipeline would be StableDiffusionXLPAGInpaintPipeline too - in this case we will get the correct result regardless, but the logic is incorrect

the orig_class_name for StableDiffusionXLPipeline should be StableDiffusionXLPAGPipeline, and then we map it to the corresponding inpainting class through _get_task_class(...) to get StableDiffusionXLPAGInpaintPipeline

we can maybe do something like

to_replace = "InpaintPipeline" if "Inpaint` in config["_class_name"] else "Pipeline"
orig_class_name = config["_class_name"].replace(to_replace, "PAG" + to_replace)
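
For illustration, this is roughly how the suggested mapping behaves (a quick standalone check; the class names are just hardcoded examples):

for class_name in ["StableDiffusionXLPipeline", "StableDiffusionXLInpaintPipeline"]:
    to_replace = "InpaintPipeline" if "Inpaint" in class_name else "Pipeline"
    print(class_name.replace(to_replace, "PAG" + to_replace))
# StableDiffusionXLPAGPipeline
# StableDiffusionXLPAGInpaintPipeline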

Contributor Author

@gumgood gumgood left a comment


@yiyixuxu thank you! I have made updates according to the suggestion.

.replace("Pipeline", "PAGInpaintPipeline")
)
to_replace = "InpaintPipeline" if "Inpaint" in config["_class_name"] else "Pipeline"
orig_class_name = config["_class_name"].replace(to_replace, "PAG" + to_replace)
Contributor Author


if we do this, the orig_class_name for StableDiffusionXLPipeline would be StableDiffusionXLPAGInpaintPipeline too - in this case we will get the correct result regardless, but the logic is incorrect

the orig_class_name for StableDiffusionXLPipeline should be StableDiffusionXLPAGPipeline, and then we map it to the corresponding inpainting class through _get_task_class(...) to get StableDiffusionXLPAGInpaintPipeline

we can maybe do something like

to_replace = "InpaintPipeline" if "Inpaint` in config["_class_name"] else "Pipeline"
orig_class_name = config["_class_name"].replace(to_replace, "PAG" + to_replace)

Nice code. I have applied it and verified that it works well.

mask = self._prepare_perturbed_attention_guidance(mask, mask, self.do_classifier_free_guidance)
masked_image_latents = self._prepare_perturbed_attention_guidance(
    masked_image_latents, masked_image_latents, self.do_classifier_free_guidance
)
Contributor Author


Updated to handle batch size without modifying the prepare_mask_latents function.
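
Roughly, the idea is to keep the copied prepare_mask_latents as-is and only expand the batch dimension of its output afterwards. A hypothetical standalone sketch of the expected batching (not the actual helper implementation):

import torch

def prepare_pag_input(cond: torch.Tensor, uncond: torch.Tensor, do_cfg: bool) -> torch.Tensor:
    # duplicate the conditional batch for the perturbed-attention branch,
    # then prepend the unconditional batch when classifier-free guidance is on
    cond = torch.cat([cond] * 2, dim=0)
    if do_cfg:
        cond = torch.cat([uncond, cond], dim=0)
    return cond

mask = torch.randn(1, 1, 128, 128)
print(prepare_pag_input(mask, mask, do_cfg=True).shape)  # torch.Size([3, 1, 128, 128])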

Collaborator

@yiyixuxu yiyixuxu left a comment


thanks!

@yiyixuxu yiyixuxu merged commit 16a3dad into huggingface:main Aug 20, 2024
@gumgood gumgood deleted the fix-pag-inpaint-pipeline branch August 22, 2024 13:20
@gumgood gumgood restored the fix-pag-inpaint-pipeline branch August 22, 2024 13:20
@gumgood gumgood deleted the fix-pag-inpaint-pipeline branch August 22, 2024 13:20