
Commit 1870fb0

[docs] Add Colab notebooks and Spaces (#2713)
* add colab notebook and spaces
* fix image link
1 parent: df91c44

6 files changed, +135 -77 lines changed

docs/source/en/_toctree.yml

Lines changed: 5 additions & 5 deletions
@@ -33,15 +33,15 @@
   - local: using-diffusers/pipeline_overview
     title: Overview
   - local: using-diffusers/unconditional_image_generation
-    title: Unconditional Image Generation
+    title: Unconditional image generation
   - local: using-diffusers/conditional_image_generation
-    title: Text-to-Image Generation
+    title: Text-to-image generation
   - local: using-diffusers/img2img
-    title: Text-Guided Image-to-Image
+    title: Text-guided image-to-image
   - local: using-diffusers/inpaint
-    title: Text-Guided Image-Inpainting
+    title: Text-guided image-inpainting
   - local: using-diffusers/depth2img
-    title: Text-Guided Depth-to-Image
+    title: Text-guided depth-to-image
   - local: using-diffusers/reusing_seeds
     title: Improve image quality with deterministic generation
   - local: using-diffusers/reproducibility

docs/source/en/using-diffusers/conditional_image_generation.mdx

Lines changed: 22 additions & 8 deletions
@@ -10,22 +10,27 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
 specific language governing permissions and limitations under the License.
 -->

-# Conditional Image Generation
+# Conditional image generation
+
+[[open-in-colab]]
+
+Conditional image generation allows you to generate images from a text prompt. The text is converted into embeddings which are used to condition the model to generate an image from noise.

 The [`DiffusionPipeline`] is the easiest way to use a pre-trained diffusion system for inference.

-Start by creating an instance of [`DiffusionPipeline`] and specify which pipeline checkpoint you would like to download.
-You can use the [`DiffusionPipeline`] for any [Diffusers' checkpoint](https://huggingface.co/models?library=diffusers&sort=downloads).
-In this guide though, you'll use [`DiffusionPipeline`] for text-to-image generation with [Latent Diffusion](https://huggingface.co/CompVis/ldm-text2im-large-256):
+Start by creating an instance of [`DiffusionPipeline`] and specify which pipeline [checkpoint](https://huggingface.co/models?library=diffusers&sort=downloads) you would like to download.
+
+In this guide, you'll use [`DiffusionPipeline`] for text-to-image generation with [Latent Diffusion](https://huggingface.co/CompVis/ldm-text2im-large-256):

 ```python
 >>> from diffusers import DiffusionPipeline

 >>> generator = DiffusionPipeline.from_pretrained("CompVis/ldm-text2im-large-256")
 ```
+
 The [`DiffusionPipeline`] downloads and caches all modeling, tokenization, and scheduling components.
-Because the model consists of roughly 1.4 billion parameters, we strongly recommend running it on GPU.
-You can move the generator object to GPU, just like you would in PyTorch.
+Because the model consists of roughly 1.4 billion parameters, we strongly recommend running it on a GPU.
+You can move the generator object to a GPU, just like you would in PyTorch:

 ```python
 >>> generator.to("cuda")
@@ -37,10 +42,19 @@ Now you can use the `generator` on your text prompt:
 >>> image = generator("An image of a squirrel in Picasso style").images[0]
 ```

-The output is by default wrapped into a [PIL Image object](https://pillow.readthedocs.io/en/stable/reference/Image.html?highlight=image#the-image-class).
+The output is by default wrapped into a [`PIL.Image`](https://pillow.readthedocs.io/en/stable/reference/Image.html?highlight=image#the-image-class) object.

-You can save the image by simply calling:
+You can save the image by calling:

 ```python
 >>> image.save("image_of_squirrel_painting.png")
 ```
+
+Try out the Spaces below, and feel free to play around with the guidance scale parameter to see how it affects the image quality!
+
+<iframe
+  src="https://stabilityai-stable-diffusion.hf.space"
+  frameborder="0"
+  width="850"
+  height="500"
+></iframe>
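The added paragraph points readers at the guidance scale in the Space but doesn't show it in code. As a rough sketch outside this commit, the same experiment can be run locally by passing a `guidance_scale` argument to the pipeline call; the value 7.5 below is only illustrative:

```python
>>> from diffusers import DiffusionPipeline

>>> generator = DiffusionPipeline.from_pretrained("CompVis/ldm-text2im-large-256")
>>> generator.to("cuda")
>>> # Higher guidance_scale follows the prompt more closely; lower values allow more variety.
>>> image = generator("An image of a squirrel in Picasso style", guidance_scale=7.5).images[0]
>>> image.save("image_of_squirrel_painting.png")
```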

docs/source/en/using-diffusers/depth2img.mdx

Lines changed: 23 additions & 2 deletions
@@ -10,9 +10,13 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
 specific language governing permissions and limitations under the License.
 -->

-# Text-Guided Image-to-Image Generation
+# Text-guided depth-to-image generation

-The [`StableDiffusionDepth2ImgPipeline`] lets you pass a text prompt and an initial image to condition the generation of new images as well as a `depth_map` to preserve the images' structure. If no `depth_map` is provided, the pipeline will automatically predict the depth via an integrated depth-estimation model.
+[[open-in-colab]]
+
+The [`StableDiffusionDepth2ImgPipeline`] lets you pass a text prompt and an initial image to condition the generation of new images. In addition, you can also pass a `depth_map` to preserve the image structure. If no `depth_map` is provided, the pipeline automatically predicts the depth via an integrated [depth-estimation model](https://github.com/isl-org/MiDaS).
+
+Start by creating an instance of the [`StableDiffusionDepth2ImgPipeline`]:

 ```python
 import torch
@@ -25,11 +29,28 @@ pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
     "stabilityai/stable-diffusion-2-depth",
     torch_dtype=torch.float16,
 ).to("cuda")
+```

+Now pass your prompt to the pipeline. You can also pass a `negative_prompt` to prevent certain words from guiding how an image is generated:

+```python
 url = "http://images.cocodataset.org/val2017/000000039769.jpg"
 init_image = Image.open(requests.get(url, stream=True).raw)
 prompt = "two tigers"
 n_prompt = "bad, deformed, ugly, bad anatomy"
 image = pipe(prompt=prompt, image=init_image, negative_prompt=n_prompt, strength=0.7).images[0]
+image
 ```
+
+| Input | Output |
+|---------------------------------------------------------------------------------|----------------------------------------------------------------------------------|
+| <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/coco-cats.png" width="500"/> | <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/depth2img-tigers.png" width="500"/> |
+
+Play around with the Spaces below and see if you notice a difference between generated images with and without a depth map!
+
+<iframe
+  src="https://radames-stable-diffusion-depth2img.hf.space"
+  frameborder="0"
+  width="850"
+  height="500"
+></iframe>
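Because the diff only shows changed lines, the `requests` and `PIL` imports that the snippet relies on are not visible here. Read end to end, the updated example would look roughly like this sketch (the extra import lines and the saved filename are assumptions, not part of the diff):

```python
import torch
import requests
from PIL import Image

from diffusers import StableDiffusionDepth2ImgPipeline

pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-depth",
    torch_dtype=torch.float16,
).to("cuda")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
init_image = Image.open(requests.get(url, stream=True).raw)
prompt = "two tigers"
n_prompt = "bad, deformed, ugly, bad anatomy"

# Without an explicit depth_map, the pipeline estimates depth from init_image itself.
image = pipe(prompt=prompt, image=init_image, negative_prompt=n_prompt, strength=0.7).images[0]
image.save("depth2img_tigers.png")
```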

docs/source/en/using-diffusers/img2img.mdx

Lines changed: 28 additions & 42 deletions
@@ -10,39 +10,34 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
 specific language governing permissions and limitations under the License.
 -->

-# Text-Guided Image-to-Image Generation
+# Text-guided image-to-image generation

 [[open-in-colab]]

-The [`StableDiffusionImg2ImgPipeline`] lets you pass a text prompt and an initial image to condition the generation of new images. This tutorial shows how to use it for text-guided image-to-image generation with Stable Diffusion model.
+The [`StableDiffusionImg2ImgPipeline`] lets you pass a text prompt and an initial image to condition the generation of new images.

 Before you begin, make sure you have all the necessary libraries installed:

 ```bash
 !pip install diffusers transformers ftfy accelerate
 ```

-Get started by creating a [`StableDiffusionImg2ImgPipeline`] with a pretrained Stable Diffusion model.
+Get started by creating a [`StableDiffusionImg2ImgPipeline`] with a pretrained Stable Diffusion model like [`nitrosocke/Ghibli-Diffusion`](https://huggingface.co/nitrosocke/Ghibli-Diffusion).

 ```python
 import torch
 import requests
 from PIL import Image
 from io import BytesIO
-
 from diffusers import StableDiffusionImg2ImgPipeline
-```

-Load the pipeline:
-
-```python
 device = "cuda"
-pipe = StableDiffusionImg2ImgPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16).to(
+pipe = StableDiffusionImg2ImgPipeline.from_pretrained("nitrosocke/Ghibli-Diffusion", torch_dtype=torch.float16).to(
     device
 )
 ```

-Download an initial image and preprocess it so we can pass it to the pipeline:
+Download and preprocess an initial image so you can pass it to the pipeline:

 ```python
 url = "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg"
@@ -53,61 +48,52 @@ init_image.thumbnail((768, 768))
 init_image
 ```

-![img](https://huggingface.co/datasets/YiYiXu/test-doc-assets/resolve/main/image_2_image_using_diffusers_cell_8_output_0.jpeg)
-
-Define the prompt and run the pipeline:
-
-```python
-prompt = "A fantasy landscape, trending on artstation"
-```
+<div class="flex justify-center">
+    <img src="https://huggingface.co/datasets/YiYiXu/test-doc-assets/resolve/main/image_2_image_using_diffusers_cell_8_output_0.jpeg"/>
+</div>

 <Tip>

-`strength` is a value between 0.0 and 1.0, that controls the amount of noise that is added to the input image. Values that approach 1.0 allow for lots of variations but will also produce images that are not semantically consistent with the input.
+💡 `strength` is a value between 0.0 and 1.0 that controls the amount of noise added to the input image. Values that approach 1.0 allow for lots of variations but will also produce images that are not semantically consistent with the input.

 </Tip>

-Let's generate two images with same pipeline and seed, but with different values for `strength`:
+Define the prompt (for this checkpoint finetuned on Ghibli-style art, you need to prefix the prompt with the `ghibli style` tokens) and run the pipeline:

 ```python
+prompt = "ghibli style, a fantasy landscape with castles"
 generator = torch.Generator(device=device).manual_seed(1024)
 image = pipe(prompt=prompt, image=init_image, strength=0.75, guidance_scale=7.5, generator=generator).images[0]
-```
-
-```python
 image
 ```

-![img](https://huggingface.co/datasets/YiYiXu/test-doc-assets/resolve/main/image_2_image_using_diffusers_cell_13_output_0.jpeg)
+<div class="flex justify-center">
+    <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/ghibli-castles.png"/>
+</div>

-
-```python
-image = pipe(prompt=prompt, image=init_image, strength=0.5, guidance_scale=7.5, generator=generator).images[0]
-image
-```
-
-![img](https://huggingface.co/datasets/YiYiXu/test-doc-assets/resolve/main/image_2_image_using_diffusers_cell_14_output_1.jpeg)
-
-
-As you can see, when using a lower value for `strength`, the generated image is more closer to the original `image`.
-
-Now let's use a different scheduler - [LMSDiscreteScheduler](https://huggingface.co/docs/diffusers/api/schedulers#diffusers.LMSDiscreteScheduler):
+You can also try experimenting with a different scheduler to see how that affects the output:

 ```python
 from diffusers import LMSDiscreteScheduler

 lms = LMSDiscreteScheduler.from_config(pipe.scheduler.config)
 pipe.scheduler = lms
-```
-
-```python
 generator = torch.Generator(device=device).manual_seed(1024)
 image = pipe(prompt=prompt, image=init_image, strength=0.75, guidance_scale=7.5, generator=generator).images[0]
-```
-
-```python
 image
 ```

-![img](https://huggingface.co/datasets/YiYiXu/test-doc-assets/resolve/main/image_2_image_using_diffusers_cell_19_output_0.jpeg)
+<div class="flex justify-center">
+    <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/lms-ghibli.png"/>
+</div>
+
+Check out the Spaces below, and try generating images with different values for `strength`. You'll notice that using lower values for `strength` produces images that are more similar to the original image.
+
+Feel free to also switch the scheduler to the [`LMSDiscreteScheduler`] and see how that affects the output.

+<iframe
+  src="https://stevhliu-ghibli-img2img.hf.space"
+  frameborder="0"
+  width="850"
+  height="500"
+></iframe>
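The updated page now asks readers to compare `strength` values in the Space rather than in code. If you would rather reproduce the comparison locally, a small sketch along these lines (reusing `pipe`, `device`, `init_image`, and `prompt` from the snippets above; the two strength values and filenames are only examples) would do it:

```python
import torch

# Lower strength keeps the result closer to init_image; higher strength allows more variation.
for strength in (0.5, 0.75):
    generator = torch.Generator(device=device).manual_seed(1024)
    image = pipe(
        prompt=prompt, image=init_image, strength=strength, guidance_scale=7.5, generator=generator
    ).images[0]
    image.save(f"ghibli_castle_strength_{strength}.png")
```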

docs/source/en/using-diffusers/inpaint.mdx

Lines changed: 31 additions & 11 deletions
@@ -10,9 +10,13 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
 specific language governing permissions and limitations under the License.
 -->

-# Text-Guided Image-Inpainting
+# Text-guided image-inpainting

-The [`StableDiffusionInpaintPipeline`] lets you edit specific parts of an image by providing a mask and a text prompt. It uses a version of Stable Diffusion specifically trained for in-painting tasks.
+[[open-in-colab]]
+
+The [`StableDiffusionInpaintPipeline`] allows you to edit specific parts of an image by providing a mask and a text prompt. It uses a version of Stable Diffusion, like [`runwayml/stable-diffusion-inpainting`](https://huggingface.co/runwayml/stable-diffusion-inpainting) specifically trained for inpainting tasks.
+
+Get started by loading an instance of the [`StableDiffusionInpaintPipeline`]:

 ```python
 import PIL
@@ -22,7 +26,16 @@ from io import BytesIO

 from diffusers import StableDiffusionInpaintPipeline

+pipeline = StableDiffusionInpaintPipeline.from_pretrained(
+    "runwayml/stable-diffusion-inpainting",
+    torch_dtype=torch.float16,
+)
+pipeline = pipeline.to("cuda")
+```
+
+Download an image and a mask of a dog which you'll eventually replace:

+```python
 def download_image(url):
     response = requests.get(url)
     return PIL.Image.open(BytesIO(response.content)).convert("RGB")
@@ -33,24 +46,31 @@ mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data

 init_image = download_image(img_url).resize((512, 512))
 mask_image = download_image(mask_url).resize((512, 512))
+```

-pipe = StableDiffusionInpaintPipeline.from_pretrained(
-    "runwayml/stable-diffusion-inpainting",
-    torch_dtype=torch.float16,
-)
-pipe = pipe.to("cuda")
+Now you can create a prompt to replace the mask with something else:

+```python
 prompt = "Face of a yellow cat, high resolution, sitting on a park bench"
 image = pipe(prompt=prompt, image=init_image, mask_image=mask_image).images[0]
 ```

-`image` | `mask_image` | `prompt` | **Output** |
+`image` | `mask_image` | `prompt` | output |
 :-------------------------:|:-------------------------:|:-------------------------:|-------------------------:|
 <img src="https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png" alt="drawing" width="250"/> | <img src="https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png" alt="drawing" width="250"/> | ***Face of a yellow cat, high resolution, sitting on a park bench*** | <img src="https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/in_paint/yellow_cat_sitting_on_a_park_bench.png" alt="drawing" width="250"/> |


-You can also run this example on colab [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/in_painting_with_stable_diffusion_using_diffusers.ipynb)
-
 <Tip warning={true}>
-A previous experimental implementation of in-painting used a different, lower-quality process. To ensure backwards compatibility, loading a pretrained pipeline that doesn't contain the new model will still apply the old in-painting method.
+
+A previous experimental implementation of inpainting used a different, lower-quality process. To ensure backwards compatibility, loading a pretrained pipeline that doesn't contain the new model will still apply the old inpainting method.
+
 </Tip>
+
+Check out the Spaces below to try out image inpainting yourself!
+
+<iframe
+  src="https://runwayml-stable-diffusion-inpainting.hf.space"
+  frameborder="0"
+  width="850"
+  height="500"
+></iframe>
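One thing to note when reading this diff: the new setup cell names the object `pipeline`, while the unchanged call further down still uses `pipe(...)`. A runnable end-to-end version of the updated page would use a single name throughout; here is a sketch under that assumption (the `img_url`/`mask_url` values are taken from the image table above rather than from the changed lines, and the saved filename is illustrative):

```python
import PIL
import requests
import torch
from io import BytesIO

from diffusers import StableDiffusionInpaintPipeline

pipeline = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",
    torch_dtype=torch.float16,
)
pipeline = pipeline.to("cuda")


def download_image(url):
    response = requests.get(url)
    return PIL.Image.open(BytesIO(response.content)).convert("RGB")


img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"

init_image = download_image(img_url).resize((512, 512))
mask_image = download_image(mask_url).resize((512, 512))

prompt = "Face of a yellow cat, high resolution, sitting on a park bench"
# The unchanged line in the diff still calls `pipe(...)`; the name is unified to `pipeline` here.
image = pipeline(prompt=prompt, image=init_image, mask_image=mask_image).images[0]
image.save("yellow_cat_on_park_bench.png")
```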

docs/source/en/using-diffusers/unconditional_image_generation.mdx

Lines changed: 26 additions & 9 deletions
@@ -10,43 +10,60 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
 specific language governing permissions and limitations under the License.
 -->

+# Unconditional image generation

+[[open-in-colab]]

-# Unconditional Image Generation
+Unconditional image generation is a relatively straightforward task. The model only generates images - without any additional context like text or an image - resembling the training data it was trained on.

 The [`DiffusionPipeline`] is the easiest way to use a pre-trained diffusion system for inference.

 Start by creating an instance of [`DiffusionPipeline`] and specify which pipeline checkpoint you would like to download.
-You can use the [`DiffusionPipeline`] for any [Diffusers' checkpoint](https://huggingface.co/models?library=diffusers&sort=downloads).
-In this guide though, you'll use [`DiffusionPipeline`] for unconditional image generation with [DDPM](https://arxiv.org/abs/2006.11239):
+You can use any of the 🧨 Diffusers [checkpoints](https://huggingface.co/models?library=diffusers&sort=downloads) from the Hub (the checkpoint you'll use generates images of butterflies).
+
+<Tip>
+
+💡 Want to train your own unconditional image generation model? Take a look at the training [guide](training/unconditional_training) to learn how to generate your own images.
+
+</Tip>
+
+In this guide, you'll use [`DiffusionPipeline`] for unconditional image generation with [DDPM](https://arxiv.org/abs/2006.11239):

 ```python
 >>> from diffusers import DiffusionPipeline

->>> generator = DiffusionPipeline.from_pretrained("google/ddpm-celebahq-256")
+>>> generator = DiffusionPipeline.from_pretrained("anton-l/ddpm-butterflies-128")
 ```
+
 The [`DiffusionPipeline`] downloads and caches all modeling, tokenization, and scheduling components.
-Because the model consists of roughly 1.4 billion parameters, we strongly recommend running it on GPU.
-You can move the generator object to GPU, just like you would in PyTorch.
+Because the model consists of roughly 1.4 billion parameters, we strongly recommend running it on a GPU.
+You can move the generator object to a GPU, just like you would in PyTorch:

 ```python
 >>> generator.to("cuda")
 ```

-Now you can use the `generator` on your text prompt:
+Now you can use the `generator` to generate an image:

 ```python
 >>> image = generator().images[0]
 ```

-The output is by default wrapped into a [PIL Image object](https://pillow.readthedocs.io/en/stable/reference/Image.html?highlight=image#the-image-class).
+The output is by default wrapped into a [`PIL.Image`](https://pillow.readthedocs.io/en/stable/reference/Image.html?highlight=image#the-image-class) object.

-You can save the image by simply calling:
+You can save the image by calling:

 ```python
 >>> image.save("generated_image.png")
 ```

+Try out the Spaces below, and feel free to play around with the inference steps parameter to see how it affects the image quality!

+<iframe
+  src="https://stevhliu-ddpm-butterflies-128.hf.space"
+  frameborder="0"
+  width="850"
+  height="500"
+></iframe>
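The closing paragraph suggests experimenting with the inference steps in the Space. For reference (outside this commit), the same experiment can be run in code by passing `num_inference_steps` to the pipeline call; 100 below is just an example value:

```python
>>> from diffusers import DiffusionPipeline

>>> generator = DiffusionPipeline.from_pretrained("anton-l/ddpm-butterflies-128")
>>> generator.to("cuda")
>>> # Fewer denoising steps run faster but typically produce noisier, lower-quality samples.
>>> image = generator(num_inference_steps=100).images[0]
>>> image.save("generated_image_100_steps.png")
```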
