Add ModelEditing pipeline #2721
Conversation
@bahjat-kawar thanks for your work here. Could you explain (in the interest of the community understanding it better) how it's different from https://huggingface.co/docs/diffusers/main/en/api/pipelines/semantic_stable_diffusion ? Of course, it's fine for pipelines to do similar things, but it's also important for us to be aware of the ways in which they might differ.

Of course. Semantic guidance offers a way to affect specific image generations through a generalization of classifier-free guidance (e.g., editing a specific image of a car to turn it into a red car). TIME (the model editing pipeline) serves a different purpose: it allows editing the diffusion model weights so that certain assumptions made by the model are changed for all prompts (e.g., turning all roses blue, in all prompt generations).

Thanks for explaining this!
So, if I understand correctly, after the model weights have been edited to reflect that roses are blue, all roses will be generated with blue colors, right?

Yes, that is the goal :)
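To make the distinction concrete, here is a minimal usage sketch of the TIME workflow this PR proposes. The checkpoint ID and the `edit_model()` argument names (`source_prompt`, `destination_prompt`) are assumptions based on this thread, not a definitive API reference:

```python
# Minimal sketch: edit the model weights so "rose" implies blue roses for ALL
# prompts, then generate with an ordinary prompt. Names are illustrative.
from diffusers import StableDiffusionModelEditingPipeline

pipe = StableDiffusionModelEditingPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
pipe = pipe.to("cuda")

# Rewrite the implicit assumption in the model weights (TIME).
pipe.edit_model(
    source_prompt="A pack of roses",
    destination_prompt="A pack of blue roses",
)

# Any rose-related prompt is now affected; unrelated prompts should be unchanged.
image = pipe("A field of roses").images[0]
image.save("blue_roses.png")
```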
* [Project Page](https://time-diffusion.github.io/).
* [Paper](https://arxiv.org/abs/2303.08084).
* [Original Code](https://github.com/bahjat-kawar/time-diffusion).
* [Demo](https://huggingface.co/spaces/bahjat-kawar/time-diffusion).
After the pipeline is merged, it would be nice to update the demo with `diffusers`. Maybe @hysts could open a PR :)
```python
text_encoder: CLIPTextModel,
tokenizer: CLIPTokenizer,
unet: UNet2DConditionModel,
scheduler: DDIMScheduler,
```
Does it only work with `DDIMScheduler`? Most of the pipelines in diffusers work with different schedulers. Some, such as MultiDiffusion, don't work with stateful schedulers.
No. This was a mistake. Will fix in next commit, thanks!
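Once the DDIM-only type hint is relaxed, the usual diffusers scheduler-swapping pattern applies to this pipeline too. A sketch of that pattern (not code from this PR), assuming a loaded pipeline `pipe`:

```python
# Sketch: swap in a different compatible scheduler via from_config.
from diffusers import EulerAncestralDiscreteScheduler

pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
```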
```python
def test_stable_diffusion_model_editing_default_case(self):
    device = "cpu"  # ensure determinism for the device-dependent torch.Generator
    components = self.get_dummy_components()
    sd_pipe = StableDiffusionModelEditingPipeline(**components)
    sd_pipe = sd_pipe.to(device)
    sd_pipe.set_progress_bar_config(disable=None)

    inputs = self.get_dummy_inputs(device)
    image = sd_pipe(**inputs).images
    image_slice = image[0, -3:, -3:, -1]
    assert image.shape == (1, 64, 64, 3)

    expected_slice = np.array(
        [0.5217179, 0.50658035, 0.5003239, 0.41109088, 0.3595158, 0.46607107, 0.5323504, 0.5335255, 0.49187922]
    )

    assert np.abs(image_slice.flatten() - expected_slice).max() < 1e-2
```
Shouldn't we test out the editing capability here as well?
The current model editing code edits layers with the class name `CrossAttention` and an input dim of 768 (like CLIP). These don't exist in the dummy model, so editing would not affect it. The slow tests make sure the editing behavior is valid.
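For context, a sketch of the kind of layer selection described above, assuming a loaded pipeline `pipe` (module and attribute names are illustrative, based on this thread):

```python
# Walk the UNet and collect the text-conditioned cross-attention modules,
# then target the key/value projections whose input dim matches CLIP (768).
ca_layers = []
for name, module in pipe.unet.named_modules():
    if module.__class__.__name__ == "CrossAttention" and "attn2" in name:
        ca_layers.append(module)

projection_matrices = [l.to_v for l in ca_layers] + [l.to_k for l in ca_layers]
assert all(p.in_features == 768 for p in projection_matrices)  # CLIP text dim
```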
```python
assert image.shape == (1, 512, 512, 3)

assert np.abs(expected_slice - image_slice).max() > 1e-1
```
Excellent!
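For reference, a slow test along these lines could look like the following sketch (model ID, prompts, and threshold are illustrative, not the PR's actual test):

```python
# Sketch: generate with the same seed before and after the edit and require
# the rose-related outputs to diverge.
def test_stable_diffusion_model_editing_effect(self):
    pipe = StableDiffusionModelEditingPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4"
    ).to("cuda")
    prompt = "A field of roses"

    generator = torch.Generator(device="cuda").manual_seed(0)
    image_before = pipe(prompt, generator=generator, output_type="np").images

    pipe.edit_model(source_prompt="A pack of roses", destination_prompt="A pack of blue roses")

    generator = torch.Generator(device="cuda").manual_seed(0)
    image_after = pipe(prompt, generator=generator, output_type="np").images

    # The edit should visibly change rose generations.
    assert np.abs(image_after - image_before).max() > 1e-1
```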
Excellent work! I just left some nits.
Appreciate your willingness to contribute this so quickly. Well done 🔥
I am running the slow tests locally on my end. Will also take care of resolving the merge conflicts.
Update: The slow tests are passing on my end locally for Torch 1.13.
The documentation is not available anymore as the PR was closed or merged.
Looks nice, I think we're almost good to go here! @bahjat-kawar could we maybe just do these last things:
- Remove the one LMS test
- Also add an entry to `docs/source/en/index.mdx` (the pipeline overview table, `| Pipeline | Paper/Repository | Tasks |`, line 52 in dc5b4e2)
- Think about whether we should make the prompts of the pipeline more customizable
Thanks @sayakpaul and @patrickvonplaten !
Thanks a lot :)
I would still like for @patrickvonplaten to give it a quick look regarding the customizability of the prompts used in the `edit_model()` method.

Maybe expose an option in `edit_model()` -- `aug_prompts`. If it's `None` (which could be made the default) we use `AUGS_CONST`. WDYT?
If, in your experience, those prompts suffice and exposing an option to customize them doesn't help with the overall performance, we can discard the possibility.
Thanks! While I haven't tested different augmentations, I am all for customizability. I have changed the code to make the augmentation prompts customizable.
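A sketch of what the discussed customization could look like from the user side; the argument name (`aug_prompts`) and its placement on `edit_model()` follow the suggestion above and are assumptions, not the final merged API:

```python
# Hypothetical usage: supply custom augmentation templates instead of the
# built-in AUGS_CONST defaults.
custom_augs = [
    "A photo of ",
    "An image of ",
    "A painting of ",
]
pipe.edit_model(
    source_prompt="A pack of roses",
    destination_prompt="A pack of blue roses",
    aug_prompts=custom_augs,  # assumed name from the review thread; None -> AUGS_CONST
)
```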
Thanks so much. I will wait for the CI and then let's ship this! 🚀
* TIME first commit
* styling.
* styling 2.
* fixes; tests
* apply styling and doc fix.
* remove sups.
* fixes
* remove temp file
* move augmentations to const
* added doc entry
* code quality
* customize augmentations
* quality
* quality

Co-authored-by: Sayak Paul <[email protected]>
This PR adds a StableDiffusionModelEditingPipeline for Editing Implicit Assumptions in Text-to-Image Diffusion Models, a method that allows users to edit the assumptions implicit in Stable Diffusion using an efficient closed-form solution. The edit should affect related prompts' generations while leaving others unaffected.
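To make "efficient closed-form solution" concrete: TIME updates each targeted projection matrix by minimizing a regularized least-squares objective over (source key, destination value) pairs, which has the closed form W' = (λW + Σᵢ vᵢkᵢᵀ)(λI + Σᵢ kᵢkᵢᵀ)⁻¹. A minimal sketch of that update for one matrix (shapes and names illustrative, following the TIME paper):

```python
# Closed-form TIME update for one projection matrix W_old of shape (d_out, d_in).
# keys: source-prompt embeddings k_i of shape (d_in,); values: destination
# projections v_i of shape (d_out,); lamb is the regularization strength.
import torch

def time_update(W_old: torch.Tensor, keys, values, lamb: float = 0.1) -> torch.Tensor:
    d_in = W_old.shape[1]
    mat1 = lamb * W_old                                 # accumulates lamb*W + sum_i v_i k_i^T
    mat2 = lamb * torch.eye(d_in, device=W_old.device)  # accumulates lamb*I + sum_i k_i k_i^T
    for k, v in zip(keys, values):
        mat1 = mat1 + torch.outer(v, k)
        mat2 = mat2 + torch.outer(k, k)
    return mat1 @ torch.linalg.inv(mat2)
```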