Commit 19a2bbf

docs

1 parent c872cbc commit 19a2bbf

5 files changed: +104 −1 lines changed

docs/source/en/_toctree.yml

Lines changed: 2 additions & 0 deletions

@@ -154,6 +154,8 @@
       title: Stable Diffusion
     - local: api/pipelines/stable_diffusion_2
       title: Stable Diffusion 2
+    - local: api/pipelines/stable_unclip
+      title: Stable unCLIP
     - local: api/pipelines/stochastic_karras_ve
       title: Stochastic Karras VE
     - local: api/pipelines/unclip

docs/source/en/api/pipelines/overview.mdx

Lines changed: 2 additions & 0 deletions

@@ -64,6 +64,8 @@ available a colab notebook to directly try them out.
 | [stable_diffusion_2](./stable_diffusion_2) | [**Stable Diffusion 2**](https://stability.ai/blog/stable-diffusion-v2-release) | Text-Guided Image Inpainting |
 | [stable_diffusion_2](./stable_diffusion_2) | [**Stable Diffusion 2**](https://stability.ai/blog/stable-diffusion-v2-release) | Text-Guided Super Resolution Image-to-Image |
 | [stable_diffusion_safe](./stable_diffusion_safe) | [**Safe Stable Diffusion**](https://arxiv.org/abs/2211.05105) | Text-Guided Generation | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ml-research/safe-latent-diffusion/blob/main/examples/Safe%20Latent%20Diffusion.ipynb)
+| [stable_unclip](./stable_unclip) | **Stable unCLIP** | Text-to-Image Generation |
+| [stable_unclip](./stable_unclip) | **Stable unCLIP** | Image-to-Image Text-Guided Generation |
 | [stochastic_karras_ve](./stochastic_karras_ve) | [**Elucidating the Design Space of Diffusion-Based Generative Models**](https://arxiv.org/abs/2206.00364) | Unconditional Image Generation |
 | [unclip](./unclip) | [Hierarchical Text-Conditional Image Generation with CLIP Latents](https://arxiv.org/abs/2204.06125) | Text-to-Image Generation |
 | [versatile_diffusion](./versatile_diffusion) | [Versatile Diffusion: Text, Images and Variations All in One Diffusion Model](https://arxiv.org/abs/2211.08332) | Text-to-Image Generation |

docs/source/en/api/pipelines/stable_diffusion/text2img.mdx

Lines changed: 1 addition & 1 deletion

@@ -17,7 +17,7 @@ specific language governing permissions and limitations under the License.
 The Stable Diffusion model was created by the researchers and engineers from [CompVis](https://github.com/CompVis), [Stability AI](https://stability.ai/), [runway](https://github.com/runwayml), and [LAION](https://laion.ai/). The [`StableDiffusionPipeline`] is capable of generating photo-realistic images given any text input using Stable Diffusion.
 
 The original codebase can be found here:
-- *Stable Diffusion V1*: [CampVis/stable-diffusion](https://github.com/CompVis/stable-diffusion)
+- *Stable Diffusion V1*: [CompVis/stable-diffusion](https://github.com/CompVis/stable-diffusion)
 - *Stable Diffusion v2*: [Stability-AI/stablediffusion](https://github.com/Stability-AI/stablediffusion)
 
 Available Checkpoints are:
docs/source/en/api/pipelines/stable_unclip.mdx

Lines changed: 97 additions & 0 deletions

@@ -0,0 +1,97 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Stable unCLIP

Stable unCLIP is [Stable Diffusion 2.1](./stable_diffusion_2) finetuned to condition on CLIP image embeddings.
Stable unCLIP still conditions on text embeddings as well. Given these two separate conditionings, Stable unCLIP
can be used for text-guided image variation. When combined with an unCLIP prior, it can also be used for full
text-to-image generation.

## Tips

Stable unCLIP takes a `noise_level` argument at inference time, which determines how much noise is added to the
image embeddings. A higher `noise_level` increases variation in the final, denoised images. By default, no
additional noise is added to the image embeddings, i.e. `noise_level = 0`.
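To make the role of `noise_level` concrete, here is an illustrative sketch of the idea. The schedule below is an assumption for illustration only, not the pipeline's actual noising implementation: `noise_level` acts like a diffusion timestep index, and higher levels mix more Gaussian noise into the CLIP image embedding before it conditions the model.

```python
import torch


def noise_image_embedding(embedding, noise_level, num_levels=1000, generator=None):
    # Hypothetical linear alpha schedule standing in for the pipeline's scheduler.
    alphas_cumprod = torch.linspace(1.0, 0.0, num_levels + 1)
    alpha = alphas_cumprod[noise_level]
    noise = torch.randn(embedding.shape, generator=generator)
    # Standard diffusion-style mixing: signal scaled by sqrt(alpha),
    # noise scaled by sqrt(1 - alpha).
    return alpha.sqrt() * embedding + (1 - alpha).sqrt() * noise


emb = torch.ones(1, 768)
# noise_level=0 leaves the embedding untouched (alpha = 1, so no noise is mixed in).
unchanged = noise_image_embedding(emb, noise_level=0)
# Higher levels move the embedding toward pure noise, increasing output variation.
noisy = noise_image_embedding(emb, noise_level=500)
```

With `noise_level = 0` the conditioning embedding passes through unchanged, which is why the default produces the most faithful image variations.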
### Available checkpoints:

TODO

### Text-to-Image Generation

```python
import torch
from diffusers import StableUnCLIPPipeline

pipe = StableUnCLIPPipeline.from_pretrained(
    "fusing/stable-unclip-2-1-l", torch_dtype=torch.float16
)  # TODO update model path
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
images = pipe(prompt).images
images[0].save("astronaut_horse.png")
```

### Text-Guided Image-to-Image Variation

```python
from io import BytesIO

import requests
import torch
from PIL import Image

from diffusers import StableUnCLIPImg2ImgPipeline

pipe = StableUnCLIPImg2ImgPipeline.from_pretrained(
    "fusing/stable-unclip-2-1-l-img2img", torch_dtype=torch.float16
)  # TODO update model path
pipe = pipe.to("cuda")

url = "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg"

response = requests.get(url)
init_image = Image.open(BytesIO(response.content)).convert("RGB")
init_image = init_image.resize((768, 512))

prompt = "A fantasy landscape, trending on artstation"

# The image is the first argument; the prompt is passed by keyword.
images = pipe(init_image, prompt=prompt).images
images[0].save("fantasy_landscape.png")
```

### StableUnCLIPPipeline

[[autodoc]] StableUnCLIPPipeline
	- all
	- __call__
	- enable_attention_slicing
	- disable_attention_slicing
	- enable_vae_slicing
	- disable_vae_slicing
	- enable_xformers_memory_efficient_attention
	- disable_xformers_memory_efficient_attention

### StableUnCLIPImg2ImgPipeline

[[autodoc]] StableUnCLIPImg2ImgPipeline
	- all
	- __call__
	- enable_attention_slicing
	- disable_attention_slicing
	- enable_vae_slicing
	- disable_vae_slicing
	- enable_xformers_memory_efficient_attention
	- disable_xformers_memory_efficient_attention

docs/source/en/index.mdx

Lines changed: 2 additions & 0 deletions

@@ -54,6 +54,8 @@ available a colab notebook to directly try them out.
 | [stable_diffusion_2](./api/pipelines/stable_diffusion_2) | [**Stable Diffusion 2**](https://stability.ai/blog/stable-diffusion-v2-release) | Text-Guided Image Inpainting |
 | [stable_diffusion_2](./api/pipelines/stable_diffusion_2) | [**Stable Diffusion 2**](https://stability.ai/blog/stable-diffusion-v2-release) | Text-Guided Super Resolution Image-to-Image |
 | [stable_diffusion_safe](./api/pipelines/stable_diffusion_safe) | [**Safe Stable Diffusion**](https://arxiv.org/abs/2211.05105) | Text-Guided Generation | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ml-research/safe-latent-diffusion/blob/main/examples/Safe%20Latent%20Diffusion.ipynb)
+| [stable_unclip](./api/pipelines/stable_unclip) | **Stable unCLIP** | Text-to-Image Generation |
+| [stable_unclip](./api/pipelines/stable_unclip) | **Stable unCLIP** | Image-to-Image Text-Guided Generation |
 | [stochastic_karras_ve](./api/pipelines/stochastic_karras_ve) | [**Elucidating the Design Space of Diffusion-Based Generative Models**](https://arxiv.org/abs/2206.00364) | Unconditional Image Generation |
 | [unclip](./api/pipelines/unclip) | [Hierarchical Text-Conditional Image Generation with CLIP Latents](https://arxiv.org/abs/2204.06125) | Text-to-Image Generation |
 | [versatile_diffusion](./api/pipelines/versatile_diffusion) | [Versatile Diffusion: Text, Images and Variations All in One Diffusion Model](https://arxiv.org/abs/2211.08332) | Text-to-Image Generation |
