Add ModelEditing pipeline #2721


Merged (16 commits) on Mar 24, 2023

Conversation

bahjat-kawar (Contributor)

This PR adds a StableDiffusionModelEditingPipeline for Editing Implicit Assumptions in Text-to-Image Diffusion Models, a method that allows users to edit the assumptions implicit in Stable Diffusion using an efficient closed-form solution. The edit should affect related prompts' generations while leaving others unaffected.

import torch
from diffusers import StableDiffusionModelEditingPipeline

model_ckpt = "CompVis/stable-diffusion-v1-4"
pipe = StableDiffusionModelEditingPipeline.from_pretrained(model_ckpt)

pipe = pipe.to("cuda")

source_prompt = "A pack of roses"
destination_prompt = "A pack of blue roses"
pipe.edit_model(source_prompt, destination_prompt)

prompt = "A field of roses"
image = pipe(prompt).images[0]
image.save("field_of_roses.png")

@sayakpaul (Member)

@bahjat-kawar thanks for your work here. Could you explain (so the community can understand it better) how it's different from https://huggingface.co/docs/diffusers/main/en/api/pipelines/semantic_stable_diffusion ?

Of course, it's fine for pipelines to do similar things but it's also important for us to be aware of the ways in which they might be different.

@bahjat-kawar (Contributor, Author)

> @bahjat-kawar thanks for your work here. Could you explain (so the community can understand it better) how it's different from https://huggingface.co/docs/diffusers/main/en/api/pipelines/semantic_stable_diffusion ?
>
> Of course, it's fine for pipelines to do similar things but it's also important for us to be aware of the ways in which they might be different.

Of course. Semantic guidance offers a way to affect specific image generations through a generalization of classifier-free guidance (e.g., editing a specific image of a car to turn into a red car). TIME (the model editing pipeline) serves a different purpose: It allows the editing of the diffusion model weights such that certain assumptions made by the model are changed for all prompts (e.g., turning all roses blue, in all prompt generations).
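For intuition, the "efficient closed-form solution" is a ridge-style least-squares update of a projection matrix. The sketch below is illustrative only: the real pipeline edits the UNet's cross-attention key/value projections over CLIP text embeddings, and the exact objective and regularization shown here are assumptions based on the paper's description, not the pipeline's code.

```python
import numpy as np

def time_style_edit(W, keys, target_values, lam=0.1):
    """Closed-form, ridge-regularized edit of a projection matrix W (d_out x d_in).

    Minimizes  sum_i ||W' k_i - v_i||^2 + lam * ||W' - W||_F^2,
    whose minimizer is
        W' = (lam * W + sum_i v_i k_i^T) @ (lam * I + sum_i k_i k_i^T)^{-1}.
    """
    d_in = W.shape[1]
    numerator = lam * W.astype(float).copy()
    denominator = lam * np.eye(d_in)
    for k, v in zip(keys, target_values):
        numerator += np.outer(v, k)
        denominator += np.outer(k, k)
    return numerator @ np.linalg.inv(denominator)

# After the edit, the source key maps (approximately) to the target value,
# while directions orthogonal to the edited keys are left alone.
W = np.zeros((3, 4))
k = np.array([1.0, 0.0, 0.0, 0.0])  # embedding of the source concept (toy)
v = np.array([1.0, 2.0, 3.0])       # desired projection after the edit (toy)
W_edited = time_style_edit(W, [k], [v], lam=1e-4)
```

With a small `lam`, `W_edited @ k` lands very close to `v`; larger `lam` trades edit fidelity for staying near the original weights, which is how unrelated prompts stay unaffected.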

@bahjat-kawar bahjat-kawar changed the title [WIP] Add ModelEditing pipeline Add ModelEditing pipeline Mar 23, 2023
@sayakpaul (Member)

Thanks for explaining this!

> It allows the editing of the diffusion model weights such that certain assumptions made by the model are changed for all prompts (e.g., turning all roses blue, in all prompt generations).

So, if I understand correctly, after the model weights have been edited to reflect that the roses are blue, all roses will be generated to have blue colors, right?

@bahjat-kawar (Contributor, Author)

> Thanks for explaining this!
>
> > It allows the editing of the diffusion model weights such that certain assumptions made by the model are changed for all prompts (e.g., turning all roses blue, in all prompt generations).
>
> So, if I understand correctly, after the model weights have been edited to reflect that the roses are blue, all roses will be generated to have blue colors, right?

Yes, that is the goal :)

* [Project Page](https://time-diffusion.github.io/).
* [Paper](https://arxiv.org/abs/2303.08084).
* [Original Code](https://github.com/bahjat-kawar/time-diffusion).
* [Demo](https://huggingface.co/spaces/bahjat-kawar/time-diffusion).
Member:

After the pipeline is merged, it would be nice to update the demo with diffusers. Maybe @hysts could open a PR :)

text_encoder: CLIPTextModel,
tokenizer: CLIPTokenizer,
unet: UNet2DConditionModel,
scheduler: DDIMScheduler,
Member:

Does it only work with DDIMScheduler? Most of the pipelines in diffusers work with different schedulers. Some, such as MultiDiffusion, don't work with stateful schedulers.

Contributor Author:

No. This was a mistake. Will fix in next commit, thanks!

Comment on lines +110 to +126
def test_stable_diffusion_model_editing_default_case(self):
device = "cpu" # ensure determinism for the device-dependent torch.Generator
components = self.get_dummy_components()
sd_pipe = StableDiffusionModelEditingPipeline(**components)
sd_pipe = sd_pipe.to(device)
sd_pipe.set_progress_bar_config(disable=None)

inputs = self.get_dummy_inputs(device)
image = sd_pipe(**inputs).images
image_slice = image[0, -3:, -3:, -1]
assert image.shape == (1, 64, 64, 3)

expected_slice = np.array(
[0.5217179, 0.50658035, 0.5003239, 0.41109088, 0.3595158, 0.46607107, 0.5323504, 0.5335255, 0.49187922]
)

assert np.abs(image_slice.flatten() - expected_slice).max() < 1e-2
Member:

Shouldn't we test out the editing capability here as well?

Contributor Author:

The current model editing code edits layers with the class name "CrossAttention" and an input_dim of 768 (like CLIP). These don't exist in the dummy model, so editing would not affect it. The slow tests make sure the editing behavior is valid.
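The selection rule described above can be sketched in plain Python. This is a hypothetical illustration, not the pipeline's code: the class and attribute names (`CrossAttention`, `to_k`, `in_features`) mirror diffusers conventions, and the dimensions are toy values.

```python
# Minimal stand-ins for the module tree a UNet would expose.
class Linear:
    def __init__(self, in_features, out_features):
        self.in_features = in_features
        self.out_features = out_features

class CrossAttention:
    def __init__(self, query_dim, context_dim):
        # key/value projections map from the context (text) dimension
        self.to_k = Linear(context_dim, query_dim)
        self.to_v = Linear(context_dim, query_dim)

def editable_layers(named_modules, clip_dim=768):
    """Return (name, module) pairs a TIME-style edit would touch:
    cross-attention layers whose context input matches CLIP's 768 dims."""
    return [
        (name, m) for name, m in named_modules
        if type(m).__name__ == "CrossAttention" and m.to_k.in_features == clip_dim
    ]

modules = [
    ("down.attn1", CrossAttention(320, 320)),  # self-attention: skipped
    ("down.attn2", CrossAttention(320, 768)),  # text cross-attention: edited
]
selected = [name for name, _ in editable_layers(modules)]
```

A dummy test UNet with tiny embedding dims matches nothing under this rule, which is why the fast test above cannot exercise the edit.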


assert image.shape == (1, 512, 512, 3)

assert np.abs(expected_slice - image_slice).max() > 1e-1
Member:

Excellent!

@sayakpaul left a comment

Excellent work! I just left some nits.

Appreciate your willingness to contribute this so quickly. Well done 🔥

I am running the slow tests locally on my end. Will also take care of resolving the merge conflicts.

@sayakpaul (Member)

Update: The slow tests are passing on my end locally for Torch 1.13.

@HuggingFaceDocBuilderDev commented Mar 23, 2023

The documentation is not available anymore as the PR was closed or merged.

@patrickvonplaten left a comment

Looks nice, I think we're almost good to go here! @bahjat-kawar could we maybe just do these last things:

  • Remove the one LMS test
  • Also add an entry to:
    | Pipeline | Paper/Repository | Tasks |
  • Think about whether we should make the prompts of the pipeline more customizable

@bahjat-kawar (Contributor, Author)

Thanks @sayakpaul and @patrickvonplaten !
I removed the LMS test, added an entry to diffusers/docs/source/en/index.mdx, and fixed the issues raised.
Fast tests will not test the editing capabilities because the code is limited to CLIP-based cross-attention. Slow tests cover this, though.
If all checks pass, we should be ready to merge. Thanks for the excellent reviews!

@sayakpaul left a comment

Thanks a lot :)

I would still like for @patrickvonplaten to give it a quick look regarding the customizability of the prompts used in the edit_model() method.

Maybe expose an option in edit_model() -- aug_prompts. If it's None (which could be made default) we use AUGS_CONST. WDYT?

@bahjat-kawar,

If, in your experience, those prompts suffice and exposing an option to customize them doesn't help with the overall performance, we can discard the possibility.

@bahjat-kawar (Contributor, Author)

> Thanks a lot :)
>
> I would still like for @patrickvonplaten to give it a quick look regarding the customizability of the prompts used in the edit_model() method.
>
> Maybe expose an option in edit_model() -- aug_prompts. If it's None (which could be made default) we use AUGS_CONST. WDYT?
>
> @bahjat-kawar,
>
> If, in your experience, those prompts suffice and exposing an option to customize them doesn't help with the overall performance, we can discard the possibility.

Thanks! While I haven't tested different augmentations, I am all for customizability. I have changed the with_augs parameter in __init__ from a bool to a list of string augmentations. The list defaults to AUGS_CONST but can be overridden.
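The augmentation mechanism amounts to expanding each prompt with a list of prefixes before the edit is computed. The sketch below is a hypothetical illustration: the exact contents of AUGS_CONST and the expansion logic shown here are assumptions, not copied from the pipeline.

```python
# Assumed example prefixes; the pipeline's actual AUGS_CONST may differ.
AUGS_CONST = ["A photo of ", "An image of ", "A picture of "]

def augment(prompt, with_augs=AUGS_CONST):
    """Return the bare prompt plus one variant per augmentation prefix."""
    return [prompt] + [aug + prompt for aug in with_augs]

prompts = augment("a pack of roses")
# prompts now holds the bare prompt and three prefixed variants;
# the closed-form edit would be computed over all of their embeddings.
```

Passing a custom list as `with_augs` simply swaps out the default prefixes, so users can match the phrasing patterns of their own prompts.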

@sayakpaul left a comment

Thanks so much. I will wait for the CI and then let's ship this! 🚀

@sayakpaul sayakpaul merged commit 37a44bb into huggingface:main Mar 24, 2023
w4ffl35 pushed a commit to w4ffl35/diffusers that referenced this pull request Apr 14, 2023
* TIME first commit

* styling.

* styling 2.

* fixes; tests

* apply styling and doc fix.

* remove sups.

* fixes

* remove temp file

* move augmentations to const

* added doc entry

* code quality

* customize augmentations

* quality

* quality

---------

Co-authored-by: Sayak Paul <[email protected]>
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024
4 participants