Add ModelEditing pipeline #2721


Merged (16 commits) on Mar 24, 2023

Conversation

bahjat-kawar (Contributor)

This PR adds a StableDiffusionModelEditingPipeline for Editing Implicit Assumptions in Text-to-Image Diffusion Models, a method that allows users to edit the assumptions implicit in Stable Diffusion using an efficient closed-form solution. The edit should affect related prompts' generations while leaving others unaffected.

import torch
from diffusers import StableDiffusionModelEditingPipeline

model_ckpt = "CompVis/stable-diffusion-v1-4"
pipe = StableDiffusionModelEditingPipeline.from_pretrained(model_ckpt)

pipe = pipe.to("cuda")

source_prompt = "A pack of roses"
destination_prompt = "A pack of blue roses"
pipe.edit_model(source_prompt, destination_prompt)

prompt = "A field of roses"
image = pipe(prompt).images[0]
image.save("field_of_roses.png")

@sayakpaul (Member)

@bahjat-kawar thanks for your work here. Could you explain (so the community can understand it better) how it's different from https://huggingface.co/docs/diffusers/main/en/api/pipelines/semantic_stable_diffusion ?

Of course, it's fine for pipelines to do similar things but it's also important for us to be aware of the ways in which they might be different.

@bahjat-kawar (Contributor, Author)

> @bahjat-kawar thanks for your work here. Could you explain (so the community can understand it better) how it's different from https://huggingface.co/docs/diffusers/main/en/api/pipelines/semantic_stable_diffusion ?
>
> Of course, it's fine for pipelines to do similar things but it's also important for us to be aware of the ways in which they might be different.

Of course. Semantic guidance offers a way to affect specific image generations through a generalization of classifier-free guidance (e.g., editing a specific image of a car to turn into a red car). TIME (the model editing pipeline) serves a different purpose: It allows the editing of the diffusion model weights such that certain assumptions made by the model are changed for all prompts (e.g., turning all roses blue, in all prompt generations).
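For intuition, the "efficient closed-form solution" is a ridge-style least-squares update of a projection matrix. The sketch below is illustrative only: the real pipeline edits the UNet's cross-attention key/value projections over CLIP text embeddings, and the exact objective and regularization shown here are assumptions based on the paper's description, not the pipeline's code.

```python
import numpy as np

def time_style_edit(W, keys, target_values, lam=0.1):
    """Closed-form, ridge-regularized edit of a projection matrix W (d_out x d_in).

    Minimizes  sum_i ||W' k_i - v_i||^2 + lam * ||W' - W||_F^2,
    whose minimizer is
        W' = (lam * W + sum_i v_i k_i^T) @ (lam * I + sum_i k_i k_i^T)^{-1}.
    """
    d_in = W.shape[1]
    numerator = lam * W.astype(float).copy()
    denominator = lam * np.eye(d_in)
    for k, v in zip(keys, target_values):
        numerator += np.outer(v, k)
        denominator += np.outer(k, k)
    return numerator @ np.linalg.inv(denominator)

# After the edit, the source key maps (approximately) to the target value,
# while directions orthogonal to the edited keys are left alone.
W = np.zeros((3, 4))
k = np.array([1.0, 0.0, 0.0, 0.0])  # embedding of the source concept (toy)
v = np.array([1.0, 2.0, 3.0])       # desired projection after the edit (toy)
W_edited = time_style_edit(W, [k], [v], lam=1e-4)
```

With a small `lam`, `W_edited @ k` lands very close to `v`; larger `lam` trades edit fidelity for staying near the original weights, which is how unrelated prompts stay unaffected.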

@bahjat-kawar bahjat-kawar changed the title [WIP] Add ModelEditing pipeline Add ModelEditing pipeline Mar 23, 2023
@sayakpaul (Member)

Thanks for explaining this!

> It allows the editing of the diffusion model weights such that certain assumptions made by the model are changed for all prompts (e.g., turning all roses blue, in all prompt generations).

So, if I understand correctly, after the model weights have been edited to reflect that the roses are blue, all roses will be generated to have blue colors, right?

@bahjat-kawar (Contributor, Author)

> Thanks for explaining this!
>
> > It allows the editing of the diffusion model weights such that certain assumptions made by the model are changed for all prompts (e.g., turning all roses blue, in all prompt generations).
>
> So, if I understand correctly, after the model weights have been edited to reflect that the roses are blue, all roses will be generated to have blue colors, right?

Yes, that is the goal :)

* [Project Page](https://time-diffusion.github.io/).
* [Paper](https://arxiv.org/abs/2303.08084).
* [Original Code](https://github.com/bahjat-kawar/time-diffusion).
* [Demo](https://huggingface.co/spaces/bahjat-kawar/time-diffusion).
Member:

After the pipeline is merged, it would be nice to update the demo with diffusers. Maybe @hysts could open a PR :)

text_encoder: CLIPTextModel,
tokenizer: CLIPTokenizer,
unet: UNet2DConditionModel,
scheduler: DDIMScheduler,
Member:

Does it only work with DDIMScheduler? Most of the pipelines in diffusers work with different schedulers. Some, such as MultiDiffusion, don't work with stateful schedulers.

Contributor Author:

No. This was a mistake. Will fix in next commit, thanks!

Comment on lines +110 to +126
def test_stable_diffusion_model_editing_default_case(self):
device = "cpu" # ensure determinism for the device-dependent torch.Generator
components = self.get_dummy_components()
sd_pipe = StableDiffusionModelEditingPipeline(**components)
sd_pipe = sd_pipe.to(device)
sd_pipe.set_progress_bar_config(disable=None)

inputs = self.get_dummy_inputs(device)
image = sd_pipe(**inputs).images
image_slice = image[0, -3:, -3:, -1]
assert image.shape == (1, 64, 64, 3)

expected_slice = np.array(
[0.5217179, 0.50658035, 0.5003239, 0.41109088, 0.3595158, 0.46607107, 0.5323504, 0.5335255, 0.49187922]
)

assert np.abs(image_slice.flatten() - expected_slice).max() < 1e-2
Member:

Shouldn't we test out the editing capability here as well?

Contributor Author:

The current model editing code edits layers with the class name "CrossAttention" and an input_dim of 768 (like CLIP). These don't exist in the dummy model, so editing would not affect it. The slow tests make sure the editing behavior is valid.
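The selection rule described above can be sketched in plain Python. This is a hypothetical illustration, not the pipeline's code: the class and attribute names (`CrossAttention`, `to_k`, `in_features`) mirror diffusers conventions, and the dimensions are toy values.

```python
# Minimal stand-ins for the module tree a UNet would expose.
class Linear:
    def __init__(self, in_features, out_features):
        self.in_features = in_features
        self.out_features = out_features

class CrossAttention:
    def __init__(self, query_dim, context_dim):
        # key/value projections map from the context (text) dimension
        self.to_k = Linear(context_dim, query_dim)
        self.to_v = Linear(context_dim, query_dim)

def editable_layers(named_modules, clip_dim=768):
    """Return (name, module) pairs a TIME-style edit would touch:
    cross-attention layers whose context input matches CLIP's 768 dims."""
    return [
        (name, m) for name, m in named_modules
        if type(m).__name__ == "CrossAttention" and m.to_k.in_features == clip_dim
    ]

modules = [
    ("down.attn1", CrossAttention(320, 320)),  # self-attention: skipped
    ("down.attn2", CrossAttention(320, 768)),  # text cross-attention: edited
]
selected = [name for name, _ in editable_layers(modules)]
```

A dummy test UNet with tiny embedding dims matches nothing under this rule, which is why the fast test above cannot exercise the edit.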


assert image.shape == (1, 512, 512, 3)

assert np.abs(expected_slice - image_slice).max() > 1e-1
Member:

Excellent!

@sayakpaul left a comment

Excellent work! I just left some nits.

Appreciate your willingness to contribute this so quickly. Well done 🔥

I am running the slow tests locally on my end. Will also take care of resolving the merge conflicts.

@sayakpaul (Member)

Update: The slow tests are passing on my end locally for Torch 1.13.

@HuggingFaceDocBuilderDev commented Mar 23, 2023

The documentation is not available anymore as the PR was closed or merged.

@patrickvonplaten left a comment

Looks nice, I think we're almost good to go here! @bahjat-kawar could we maybe just do these last things:

  • Remove the one LMS test
  • Also add an entry to:
    | Pipeline | Paper/Repository | Tasks |
  • Think about whether we should make the prompts of the pipeline more customizable

@bahjat-kawar (Contributor, Author)

Thanks @sayakpaul and @patrickvonplaten !
I removed the LMS test, added an entry to diffusers/docs/source/en/index.mdx, and fixed the issues raised.
Fast tests will not test the editing capabilities because the code is limited to CLIP-based cross-attention. Slow tests cover this, though.
If all checks pass, we should be ready to merge. Thanks for the excellent reviews!

@sayakpaul left a comment

Thanks a lot :)

I would still like for @patrickvonplaten to give it a quick look regarding the customizability of the prompts used in the edit_model() method.

Maybe expose an option in edit_model() -- aug_prompts. If it's None (which could be made default) we use AUGS_CONST. WDYT?

@bahjat-kawar,

If, in your experience, those prompts suffice and exposing an option to customize them doesn't help with the overall performance, we can discard the possibility.

@bahjat-kawar (Contributor, Author)

> Thanks a lot :)
>
> I would still like for @patrickvonplaten to give it a quick look regarding the customizability of the prompts used in the edit_model() method.
>
> Maybe expose an option in edit_model() -- aug_prompts. If it's None (which could be made default) we use AUGS_CONST. WDYT?
>
> @bahjat-kawar,
>
> If, in your experience, those prompts suffice and exposing an option to customize them doesn't help with the overall performance, we can discard the possibility.

Thanks! While I haven't tested different augmentations, I am all for customizability. I have changed the with_augs parameter in __init__ from a bool to a list of string augmentations. The list defaults to AUGS_CONST but can be overridden.
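The augmentation mechanism amounts to expanding each prompt with a list of prefixes before the edit is computed. The sketch below is a hypothetical illustration: the exact contents of AUGS_CONST and the expansion logic shown here are assumptions, not copied from the pipeline.

```python
# Assumed example prefixes; the pipeline's actual AUGS_CONST may differ.
AUGS_CONST = ["A photo of ", "An image of ", "A picture of "]

def augment(prompt, with_augs=AUGS_CONST):
    """Return the bare prompt plus one variant per augmentation prefix."""
    return [prompt] + [aug + prompt for aug in with_augs]

prompts = augment("a pack of roses")
# prompts now holds the bare prompt and three prefixed variants;
# the closed-form edit would be computed over all of their embeddings.
```

Passing a custom list as `with_augs` simply swaps out the default prefixes, so users can match the phrasing patterns of their own prompts.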

@sayakpaul left a comment

Thanks so much. I will wait for the CI and then let's ship this! 🚀

@sayakpaul sayakpaul merged commit 37a44bb into huggingface:main Mar 24, 2023
w4ffl35 pushed a commit to w4ffl35/diffusers that referenced this pull request Apr 14, 2023
* TIME first commit

* styling.

* styling 2.

* fixes; tests

* apply styling and doc fix.

* remove sups.

* fixes

* remove temp file

* move augmentations to const

* added doc entry

* code quality

* customize augmentations

* quality

* quality

---------

Co-authored-by: Sayak Paul <[email protected]>
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024
4 participants