Commit 7c6e7f7

Merge branch 'main' into dev-tensorrt-txt2img-pipeline
2 parents: 8e70c0a + 7b2407f (commit 7c6e7f7)

170 files changed (+4411, -886 lines)


.github/workflows/pr_tests.yml

Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@ jobs:
 framework: pytorch_examples
 runner: docker-cpu
 image: diffusers/diffusers-pytorch-cpu
-report: torch_cpu
+report: torch_example_cpu

 name: ${{ matrix.config.name }}

.github/workflows/push_tests_fast.yml

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@ jobs:
 framework: pytorch_examples
 runner: docker-cpu
 image: diffusers/diffusers-pytorch-cpu
-report: torch_cpu
+report: torch_example_cpu

 name: ${{ matrix.config.name }}

CONTRIBUTING.md

Lines changed: 8 additions & 1 deletion
@@ -394,8 +394,15 @@ passes. You should run the tests impacted by your changes like this:
 ```bash
 $ pytest tests/<TEST_TO_RUN>.py
 ```
+
+Before you run the tests, please make sure you install the dependencies required for testing. You can do so
+with this command:

-You can also run the full suite with the following command, but it takes
+```bash
+$ pip install -e ".[test]"
+```
+
+You can run the full test suite with the following command, but it takes
 a beefy machine to produce a result in a decent amount of time now that
 Diffusers has grown a lot. Here is the command for it:

docs/source/en/_toctree.yml

Lines changed: 4 additions & 0 deletions
@@ -97,6 +97,8 @@
   title: ONNX
 - local: optimization/open_vino
   title: OpenVINO
+- local: optimization/coreml
+  title: Core ML
 - local: optimization/mps
   title: MPS
 - local: optimization/habana
@@ -204,6 +206,8 @@
   title: Stochastic Karras VE
 - local: api/pipelines/text_to_video
   title: Text-to-Video
+- local: api/pipelines/text_to_video_zero
+  title: Text-to-Video Zero
 - local: api/pipelines/unclip
   title: UnCLIP
 - local: api/pipelines/latent_diffusion_uncond

docs/source/en/api/loaders.mdx

Lines changed: 8 additions & 0 deletions
@@ -28,3 +28,11 @@ API to load such adapter neural networks via the [`loaders.py` module](https://g
 ### UNet2DConditionLoadersMixin

 [[autodoc]] loaders.UNet2DConditionLoadersMixin
+
+### TextualInversionLoaderMixin
+
+[[autodoc]] loaders.TextualInversionLoaderMixin
+
+### LoraLoaderMixin
+
+[[autodoc]] loaders.LoraLoaderMixin
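A minimal usage sketch of the two newly documented mixins, assuming a Stable Diffusion pipeline (which inherits from both) and placeholder Hub repositories for the embedding and LoRA weights:

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# TextualInversionLoaderMixin: pull a learned token embedding into the tokenizer/text encoder.
pipe.load_textual_inversion("sd-concepts-library/cat-toy")

# LoraLoaderMixin: load LoRA attention weights (this repository name is a placeholder).
pipe.load_lora_weights("your-username/your-lora-weights")

image = pipe("a <cat-toy> sitting on a bench").images[0]
```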

docs/source/en/api/pipelines/alt_diffusion.mdx

Lines changed: 2 additions & 2 deletions
@@ -28,11 +28,11 @@ The abstract of the paper is the following:

 ## Tips

-- AltDiffusion is conceptually exactly the same as [Stable Diffusion](./api/pipelines/stable_diffusion/overview).
+- AltDiffusion is conceptually exactly the same as [Stable Diffusion](./stable_diffusion/overview).

 - *Run AltDiffusion*

-AltDiffusion can be tested very easily with the [`AltDiffusionPipeline`], [`AltDiffusionImg2ImgPipeline`] and the `"BAAI/AltDiffusion-m9"` checkpoint exactly in the same way it is shown in the [Conditional Image Generation Guide](./using-diffusers/conditional_image_generation) and the [Image-to-Image Generation Guide](./using-diffusers/img2img).
+AltDiffusion can be tested very easily with the [`AltDiffusionPipeline`], [`AltDiffusionImg2ImgPipeline`] and the `"BAAI/AltDiffusion-m9"` checkpoint exactly in the same way it is shown in the [Conditional Image Generation Guide](../../using-diffusers/conditional_image_generation) and the [Image-to-Image Generation Guide](../../using-diffusers/img2img).

 - *How to load and use different schedulers.*
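As a hedged illustration of the tip above, text-to-image generation with AltDiffusion mirrors the Stable Diffusion flow (the prompt is only an example):

```python
import torch
from diffusers import AltDiffusionPipeline

pipe = AltDiffusionPipeline.from_pretrained(
    "BAAI/AltDiffusion-m9", torch_dtype=torch.float16
).to("cuda")

# AltDiffusion-m9 is multilingual, so non-English prompts work as well.
image = pipe("a photograph of an astronaut riding a horse").images[0]
image.save("astronaut.png")
```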

docs/source/en/api/pipelines/overview.mdx

Lines changed: 1 addition & 0 deletions
@@ -83,6 +83,7 @@ available a colab notebook to directly try them out.
 | [versatile_diffusion](./versatile_diffusion) | [Versatile Diffusion: Text, Images and Variations All in One Diffusion Model](https://arxiv.org/abs/2211.08332) | Image Variations Generation |
 | [versatile_diffusion](./versatile_diffusion) | [Versatile Diffusion: Text, Images and Variations All in One Diffusion Model](https://arxiv.org/abs/2211.08332) | Dual Image and Text Guided Generation |
 | [vq_diffusion](./vq_diffusion) | [Vector Quantized Diffusion Model for Text-to-Image Synthesis](https://arxiv.org/abs/2111.14822) | Text-to-Image Generation |
+| [text_to_video_zero](./text_to_video_zero) | [Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators](https://arxiv.org/abs/2303.13439) | Text-to-Video Generation |


 **Note**: Pipelines are simple examples of how to play around with the diffusion systems as described in the corresponding papers.
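For the newly listed text_to_video_zero entry, a minimal sketch, assuming a standard Stable Diffusion checkpoint as the base model and imageio for writing the frames:

```python
import imageio
import torch
from diffusers import TextToVideoZeroPipeline

pipe = TextToVideoZeroPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# The pipeline returns a list of frames as float arrays in [0, 1].
frames = pipe(prompt="A panda is playing guitar on times square").images
frames = [(frame * 255).astype("uint8") for frame in frames]
imageio.mimsave("video.mp4", frames, fps=4)
```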

docs/source/en/api/pipelines/spectrogram_diffusion.mdx

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ As depicted above the model takes as input a MIDI file and tokenizes it into a s

 | Pipeline | Tasks | Colab
 |---|---|:---:|
-| [pipeline_spectrogram_diffusion.py](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/spectrogram_diffusion/pipeline_spectrogram_diffusion) | *Unconditional Audio Generation* | - |
+| [pipeline_spectrogram_diffusion.py](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/spectrogram_diffusion/pipeline_spectrogram_diffusion.py) | *Unconditional Audio Generation* | - |


 ## Example usage
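The page's own example usage section is not part of this hunk; as a hedged sketch of the MIDI-to-audio flow the table describes (the MIDI file name is a placeholder, and `MidiProcessor` additionally requires the `note_seq` package):

```python
from diffusers import MidiProcessor, SpectrogramDiffusionPipeline

pipe = SpectrogramDiffusionPipeline.from_pretrained(
    "google/music-spectrogram-diffusion"
).to("cuda")
processor = MidiProcessor()

# Tokenize a local MIDI file into note sequences and render it to audio.
output = pipe(processor("example.mid"))
audio = output.audios[0]
```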

docs/source/en/api/pipelines/stable_diffusion/self_attention_guidance.mdx

Lines changed: 5 additions & 4 deletions
@@ -14,25 +14,26 @@ specific language governing permissions and limitations under the License.

 ## Overview

-[Self-Attention Guidance](https://arxiv.org/abs/2210.00939) by Susung Hong et al.
+[Improving Sample Quality of Diffusion Models Using Self-Attention Guidance](https://arxiv.org/abs/2210.00939) by Susung Hong et al.

 The abstract of the paper is the following:

-*Denoising diffusion models (DDMs) have been drawing much attention for their appreciable sample quality and diversity. Despite their remarkable performance, DDMs remain black boxes on which further study is necessary to take a profound step. Motivated by this, we delve into the design of conventional U-shaped diffusion models. More specifically, we investigate the self-attention modules within these models through carefully designed experiments and explore their characteristics. In addition, inspired by the studies that substantiate the effectiveness of the guidance schemes, we present plug-and-play diffusion guidance, namely Self-Attention Guidance (SAG), that can drastically boost the performance of existing diffusion models. Our method, SAG, extracts the intermediate attention map from a diffusion model at every iteration and selects tokens above a certain attention score for masking and blurring to obtain a partially blurred input. Subsequently, we measure the dissimilarity between the predicted noises obtained from feeding the blurred and original input to the diffusion model and leverage it as guidance. With this guidance, we observe apparent improvements in a wide range of diffusion models, e.g., ADM, IDDPM, and Stable Diffusion, and show that the results further improve by combining our method with the conventional guidance scheme. We provide extensive ablation studies to verify our choices.*
+*Denoising diffusion models (DDMs) have attracted attention for their exceptional generation quality and diversity. This success is largely attributed to the use of class- or text-conditional diffusion guidance methods, such as classifier and classifier-free guidance. In this paper, we present a more comprehensive perspective that goes beyond the traditional guidance methods. From this generalized perspective, we introduce novel condition- and training-free strategies to enhance the quality of generated images. As a simple solution, blur guidance improves the suitability of intermediate samples for their fine-scale information and structures, enabling diffusion models to generate higher quality samples with a moderate guidance scale. Improving upon this, Self-Attention Guidance (SAG) uses the intermediate self-attention maps of diffusion models to enhance their stability and efficacy. Specifically, SAG adversarially blurs only the regions that diffusion models attend to at each iteration and guides them accordingly. Our experimental results show that our SAG improves the performance of various diffusion models, including ADM, IDDPM, Stable Diffusion, and DiT. Moreover, combining SAG with conventional guidance methods leads to further improvement.*

 Resources:

 * [Project Page](https://ku-cvlab.github.io/Self-Attention-Guidance).
 * [Paper](https://arxiv.org/abs/2210.00939).
 * [Original Code](https://github.com/KU-CVLAB/Self-Attention-Guidance).
-* [Demo](https://colab.research.google.com/github/SusungHong/Self-Attention-Guidance/blob/main/SAG_Stable.ipynb).
+* [Hugging Face Demo](https://huggingface.co/spaces/susunghong/Self-Attention-Guidance).
+* [Colab Demo](https://colab.research.google.com/github/SusungHong/Self-Attention-Guidance/blob/main/SAG_Stable.ipynb).


 ## Available Pipelines:

 | Pipeline | Tasks | Demo
 |---|---|:---:|
-| [StableDiffusionSAGPipeline](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_sag.py) | *Text-to-Image Generation* | [Colab](https://colab.research.google.com/github/SusungHong/Self-Attention-Guidance/blob/main/SAG_Stable.ipynb) |
+| [StableDiffusionSAGPipeline](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_sag.py) | *Text-to-Image Generation* | [🤗 Space](https://huggingface.co/spaces/susunghong/Self-Attention-Guidance) |

 ## Usage example
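A short, hedged sketch of the usage the page refers to: `sag_scale` controls the strength of Self-Attention Guidance and can be combined with the usual classifier-free guidance scale.

```python
import torch
from diffusers import StableDiffusionSAGPipeline

pipe = StableDiffusionSAGPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# sag_scale=0.0 disables SAG; values around 0.75 are a common starting point.
image = pipe(
    "a photo of an astronaut riding a horse on mars",
    sag_scale=0.75,
    guidance_scale=7.5,
).images[0]
image.save("sag_astronaut.png")
```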

docs/source/en/api/pipelines/stable_diffusion_safe.mdx

Lines changed: 2 additions & 2 deletions
@@ -28,11 +28,11 @@ The abstract of the paper is the following:

 ## Tips

-- Safe Stable Diffusion may also be used with weights of [Stable Diffusion](./api/pipelines/stable_diffusion/text2img).
+- Safe Stable Diffusion may also be used with weights of [Stable Diffusion](./stable_diffusion/text2img).

 ### Run Safe Stable Diffusion

-Safe Stable Diffusion can be tested very easily with the [`StableDiffusionPipelineSafe`], and the `"AIML-TUDA/stable-diffusion-safe"` checkpoint exactly in the same way it is shown in the [Conditional Image Generation Guide](./using-diffusers/conditional_image_generation).
+Safe Stable Diffusion can be tested very easily with the [`StableDiffusionPipelineSafe`], and the `"AIML-TUDA/stable-diffusion-safe"` checkpoint exactly in the same way it is shown in the [Conditional Image Generation Guide](../../using-diffusers/conditional_image_generation).

 ### Interacting with the Safety Concept
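A hedged sketch of how the pipeline named in the tip is typically invoked; the `SafetyConfig` presets (WEAK/MEDIUM/STRONG/MAX) are assumed to follow the safe-diffusion docs rather than being confirmed by this commit:

```python
from diffusers import StableDiffusionPipelineSafe
from diffusers.pipelines.stable_diffusion_safe import SafetyConfig

pipe = StableDiffusionPipelineSafe.from_pretrained(
    "AIML-TUDA/stable-diffusion-safe"
).to("cuda")

# Apply one of the predefined safety configurations (here: MEDIUM).
image = pipe(prompt="a portrait photo of a person", **SafetyConfig.MEDIUM).images[0]
image.save("safe_portrait.png")
```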
