Merged
117 commits
f7ebe56
Warning for too long prompts in DiffusionPipelines (Resolve #447) (#472)
shirayu Sep 27, 2022
bb0c5d1
Fix docs link to train_unconditional.py (#642)
AbdullahAlfaraj Sep 27, 2022
b671cb0
Remove deprecated `torch_device` kwarg (#623)
pcuenca Sep 27, 2022
b694531
refactor: `custom_init_isort` readability fixups (#631)
ryanrussell Sep 27, 2022
c070e5f
Remove inappropriate docstrings in LMS docstrings. (#634)
pcuenca Sep 27, 2022
ab3fd67
Flax pipeline pndm (#583)
pcuenca Sep 27, 2022
d886e49
Fix `SpatialTransformer` (#578)
ydshieh Sep 27, 2022
3b747de
Add training example for DreamBooth. (#554)
Victarry Sep 27, 2022
bd8df2d
[Pytorch] Pytorch only schedulers (#534)
kashif Sep 27, 2022
ac665b6
[examples/dreambooth] don't pass tensor_format to scheduler. (#649)
patil-suraj Sep 27, 2022
e5eed52
[dreambooth] update install section (#650)
patil-suraj Sep 27, 2022
3304538
[DDIM, DDPM] fix add_noise (#648)
patil-suraj Sep 27, 2022
85494e8
[Pytorch] add dep. warning for pytorch schedulers (#651)
kashif Sep 27, 2022
c0c98df
[CLIPGuidedStableDiffusion] remove set_format from pipeline (#653)
patil-suraj Sep 27, 2022
d8572f2
Fix onnx tensor format (#654)
anton-l Sep 27, 2022
235770d
Fix `main`: stable diffusion pipelines cannot be loaded (#655)
pcuenca Sep 27, 2022
765506c
Fix the LMS pytorch regression (#664)
anton-l Sep 28, 2022
7f31142
Added script to save during textual inversion training. Issue 524 (#645)
isamu-isozaki Sep 28, 2022
c16761e
[CLIPGuidedStableDiffusion] take the correct text embeddings (#667)
patil-suraj Sep 28, 2022
f5b9bc8
Update index.mdx (#670)
tmabraham Sep 29, 2022
210be4f
[examples] update transfomers version (#665)
patil-suraj Sep 29, 2022
84b9df5
[gradient checkpointing] lower tolerance for test (#652)
patil-suraj Sep 29, 2022
f10576a
Flax `from_pretrained`: clean up `mismatched_keys`. (#630)
pcuenca Sep 29, 2022
3dacbb9
`trained_betas` ignored in some schedulers (#635)
vishnu-anirudh Sep 29, 2022
a7058f4
Renamed x -> hidden_states in resnet.py (#676)
daspartho Sep 29, 2022
9ebaea5
Optimize Stable Diffusion (#371)
NouamaneTazi Sep 30, 2022
a784be2
Allow resolutions that are not multiples of 64 (#505)
jachiam Sep 30, 2022
877bec8
refactor: update ldm-bert `config.json` url closes #675 (#680)
ryanrussell Sep 30, 2022
daa2205
[docs] fix table in fp16.mdx (#683)
NouamaneTazi Sep 30, 2022
bb0f2a0
Update README.md
patrickvonplaten Sep 30, 2022
552b967
Update README.md
patrickvonplaten Sep 30, 2022
b2cfc7a
Fix slow tests (#689)
NouamaneTazi Sep 30, 2022
5156acc
Fix BibText citation (#693)
osanseviero Oct 1, 2022
2558977
Add callback parameters for Stable Diffusion pipelines (#521)
jamestiotio Oct 2, 2022
14f4af8
[dreambooth] fix applying clip_grad_norm_ (#686)
patil-suraj Oct 3, 2022
500ca5a
Forgot to add the OG!
patrickvonplaten Oct 3, 2022
249b36c
Flax: add shape argument to `set_timesteps` (#690)
pcuenca Oct 3, 2022
7d0ba59
Fix type annotations on StableDiffusionPipeline.__call__ (#682)
tasercake Oct 3, 2022
688031c
Fix import with Flax but without PyTorch (#688)
pcuenca Oct 3, 2022
b35bac4
[Support PyTorch 1.8] Remove inference mode (#707)
patrickvonplaten Oct 3, 2022
1070e1a
[CI] Speed up slow tests (#708)
anton-l Oct 3, 2022
f1484b8
[Utils] Add deprecate function and move testing_utils under utils (#659)
patrickvonplaten Oct 3, 2022
4ff4d4d
Checkpoint conversion script from Diffusers => Stable Diffusion (Comp…
jachiam Oct 4, 2022
f1b9ee7
[Docs] fix docstring for issue #709 (#710)
kashif Oct 4, 2022
09859a3
Update schedulers README.md (#694)
tmabraham Oct 4, 2022
4d1cce2
add accelerate to load models with smaller memory footprint (#361)
piEsposito Oct 4, 2022
7e92c5b
Fix typos (#718)
shirayu Oct 4, 2022
5ac1f61
Add an argument "negative_prompt" (#549)
shirayu Oct 4, 2022
215bb40
Fix import if PyTorch is not installed (#715)
pcuenca Oct 4, 2022
6b22192
Remove comments no longer appropriate (#716)
pcuenca Oct 4, 2022
14b9754
[train_unconditional] fix applying clip_grad_norm_ (#721)
patil-suraj Oct 4, 2022
7265dd8
renamed x to meaningful variable in resnet.py (#677)
i-am-epic Oct 4, 2022
a8a3a20
[Tests] Add accelerate to testing (#729)
patrickvonplaten Oct 5, 2022
08d4fb6
[dreambooth] Using already created `Path` in dataset (#681)
DrInfiniteExplorer Oct 5, 2022
b9eea06
Include CLIPTextModel parameters in conversion (#695)
kanewallmann Oct 5, 2022
60c9634
Avoid negative strides for tensors (#717)
shirayu Oct 5, 2022
726aba0
[Pytorch] pytorch only timesteps (#724)
kashif Oct 5, 2022
6b09f37
[Scheduler design] The pragmatic approach (#719)
anton-l Oct 5, 2022
3dcc75c
Removing `autocast` for `35-25% speedup`. (`autocast` considered harm…
Narsil Oct 5, 2022
78744b6
No more use_auth_token=True (#733)
patrickvonplaten Oct 5, 2022
19e559d
remove use_auth_token from remaining places (#737)
patil-suraj Oct 5, 2022
5493524
Replace messages that have empty backquotes (#738)
pcuenca Oct 5, 2022
4deb16e
[Docs] Advertise fp16 instead of autocast (#740)
patrickvonplaten Oct 5, 2022
916754e
make style
patrickvonplaten Oct 5, 2022
367a671
remove use_auth_token from for TI test (#747)
patil-suraj Oct 6, 2022
c119dc4
allow multiple generations per prompt (#741)
patil-suraj Oct 6, 2022
df9c070
Add back-compatibility to LMS timesteps (#750)
anton-l Oct 6, 2022
3383f77
update the clip guided PR according to the new API (#751)
patil-suraj Oct 6, 2022
6c64741
Raise an error when moving an fp16 pipeline to CPU (#749)
anton-l Oct 6, 2022
0883968
Better steps deprecation for LMS (#753)
anton-l Oct 6, 2022
f3128c8
Actually fix the grad ckpt test (#734)
patil-suraj Oct 6, 2022
d9c449e
Custome Pipelines (#744)
patrickvonplaten Oct 6, 2022
6613a8c
make CI happy
patrickvonplaten Oct 6, 2022
9c9462f
Python 3.7 doesn't like keys() + keys()
patrickvonplaten Oct 6, 2022
2e209c3
[v0.4.0] Temporarily remove Flax modules from the public API (#755)
anton-l Oct 6, 2022
4581f14
Update clip_guided_stable_diffusion.py
patil-suraj Oct 6, 2022
3b1d2ca
Release: v0.4.0
anton-l Oct 6, 2022
0fe59b6
Merge remote-tracking branch 'origin/main'
anton-l Oct 6, 2022
c15cda0
Bump to v0.4.1.dev0
anton-l Oct 6, 2022
970e306
Revert "[v0.4.0] Temporarily remove Flax modules from the public API …
anton-l Oct 6, 2022
435433c
Update clip_guided_stable_diffusion.py
patil-suraj Oct 6, 2022
737195d
Created using Colaboratory
patil-suraj Oct 6, 2022
9531150
Bump to v0.5.0.dev0
anton-l Oct 6, 2022
2fa55fc
Merge remote-tracking branch 'origin/main'
anton-l Oct 6, 2022
ae672d5
[Tests] Lower required memory for clip guided and fix super edge-case…
patrickvonplaten Oct 6, 2022
d3f1a4c
Revert "Bump to v0.5.0.dev0"
anton-l Oct 6, 2022
fdfa7c8
Change fp16 error to warning (#764)
apolinario Oct 7, 2022
91ddd2a
Release: v0.4.1
patrickvonplaten Oct 7, 2022
9a95414
Bump to v0.5.0dev0
patrickvonplaten Oct 7, 2022
c93a8cc
remove bogus folder
patrickvonplaten Oct 7, 2022
7258dc4
remove bogus folder no.2
patrickvonplaten Oct 7, 2022
906e410
Fix push_to_hub for dreambooth and textual_inversion (#748)
YaYaB Oct 7, 2022
75bb6d2
Fix ONNX conversion script opset argument type (#739)
justinchuby Oct 7, 2022
e0fece2
Add final latent slice checks to SD pipeline intermediate state tests…
jamestiotio Oct 7, 2022
cb0bf0b
fix(DDIM scheduler): use correct dtype for noise (#742)
keturn Oct 7, 2022
ec831b6
[schedulers] hanlde dtype in add_noise (#767)
patil-suraj Oct 7, 2022
92d7086
[img2img, inpainting] fix fp16 inference (#769)
patil-suraj Oct 7, 2022
f3983d1
[Tests] Fix tests (#774)
patrickvonplaten Oct 7, 2022
5af6eed
debug an exception (#638)
LowinLi Oct 10, 2022
a73f8b7
Clean up resnet.py file (#780)
Oct 10, 2022
feaa732
add sigmoid betas (#777)
Oct 10, 2022
fab1752
[Low CPU memory] + device map (#772)
patrickvonplaten Oct 10, 2022
22963ed
Fix gradient checkpointing test (#797)
patrickvonplaten Oct 10, 2022
71ca10c
fix typo docstring in unet2d (#798)
Oct 10, 2022
81bdbb5
DreamBooth DeepSpeed support for under 8 GB VRAM training (#735)
Ttl Oct 10, 2022
797b290
support bf16 for stable diffusion (#792)
patil-suraj Oct 11, 2022
66a5279
stable diffusion fine-tuning (#356)
patil-suraj Oct 11, 2022
a124204
Flax: Trickle down `norm_num_groups` (#789)
akash5474 Oct 11, 2022
e895952
Eventually preserve this typo? :) (#804)
spezialspezial Oct 11, 2022
757babf
Fix indentation in the code example (#802)
osanseviero Oct 11, 2022
24b8b5c
`mps`: Alternative implementation for `repeat_interleave` (#766)
pcuenca Oct 11, 2022
c1b6ea3
Update img2img.mdx
patrickvonplaten Oct 11, 2022
e8b7396
Merge remote-tracking branch 'upstream/main' into euler_a_redesign_merge
AbdullahAlfaraj Oct 12, 2022
558ced5
remove batch_size
AbdullahAlfaraj Oct 12, 2022
3e95a3f
Merge remote-tracking branch 'upstream/main' into euler_a_redesign_merge
AbdullahAlfaraj Oct 12, 2022
8e59861
EulerAScheduler work again but break the redesign
AbdullahAlfaraj Oct 13, 2022
a659e02
passing index to step() to access t and prev_t
AbdullahAlfaraj Oct 15, 2022
2 changes: 1 addition & 1 deletion .github/workflows/pr_tests.yml
@@ -21,7 +21,7 @@ jobs:
runs-on: [ self-hosted, docker-gpu ]
container:
image: python:3.7
options: --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/

steps:
- name: Checkout diffusers
16 changes: 4 additions & 12 deletions .github/workflows/push_tests.yml
@@ -15,14 +15,10 @@ env:
jobs:
run_tests_single_gpu:
name: Diffusers tests
strategy:
fail-fast: false
matrix:
machine_type: [ single-gpu ]
runs-on: [ self-hosted, docker-gpu, '${{ matrix.machine_type }}' ]
runs-on: [ self-hosted, docker-gpu, single-gpu ]
container:
image: nvcr.io/nvidia/pytorch:22.07-py3
options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache

steps:
- name: Checkout diffusers
@@ -66,14 +62,10 @@ jobs:

run_examples_single_gpu:
name: Examples tests
strategy:
fail-fast: false
matrix:
machine_type: [ single-gpu ]
runs-on: [ self-hosted, docker-gpu, '${{ matrix.machine_type }}' ]
runs-on: [ self-hosted, docker-gpu, single-gpu ]
container:
image: nvcr.io/nvidia/pytorch:22.07-py3
options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache

steps:
- name: Checkout diffusers
79 changes: 48 additions & 31 deletions README.md
@@ -74,17 +74,18 @@ You need to accept the model license before downloading or using the Stable Diff

### Text-to-Image generation with Stable Diffusion

We recommend using the model in [half-precision (`fp16`)](https://pytorch.org/blog/accelerating-training-on-nvidia-gpus-with-pytorch-automatic-mixed-precision/), as it almost always gives the same results as full
precision while being roughly twice as fast and requiring half the amount of GPU RAM.

```python
# make sure you're logged in with `huggingface-cli login`
from torch import autocast
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", use_auth_token=True)
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16, revision="fp16")
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
with autocast("cuda"):
image = pipe(prompt).images[0]
image = pipe(prompt).images[0]
```
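
As a rough illustration of the memory claim above, the following sketch compares the bytes needed to store model weights in half and full precision. The parameter count is a hypothetical round number, not a measurement of Stable Diffusion, and the estimate covers weights only (activations and framework overhead are ignored).

```python
import struct

# Size in bytes of one floating-point value in each precision.
BYTES_FP16 = struct.calcsize("e")  # half precision
BYTES_FP32 = struct.calcsize("f")  # single (full) precision


def weight_memory_gib(num_parameters: int, bytes_per_value: int) -> float:
    """Return the memory needed to store the weights, in GiB."""
    return num_parameters * bytes_per_value / 1024**3


# Hypothetical parameter count, chosen only for illustration.
NUM_PARAMETERS = 1_000_000_000

fp16_gib = weight_memory_gib(NUM_PARAMETERS, BYTES_FP16)
fp32_gib = weight_memory_gib(NUM_PARAMETERS, BYTES_FP32)
print(f"fp16: {fp16_gib:.2f} GiB, fp32: {fp32_gib:.2f} GiB")
```

Because each `fp16` value takes two bytes instead of four, the weight storage halves regardless of the parameter count used.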

**Note**: If you don't want to use the token, you can also simply download the model weights
@@ -104,30 +105,27 @@ pipe = StableDiffusionPipeline.from_pretrained("./stable-diffusion-v1-4")
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
with autocast("cuda"):
image = pipe(prompt).images[0]
image = pipe(prompt).images[0]
```

If you are limited by GPU memory, you might want to consider using the model in `fp16` as
well as chunking the attention computation.
If you are limited by GPU memory, you might want to consider chunking the attention computation in addition
to using `fp16`.
The following snippet should use less than 4GB of VRAM.

```python
pipe = StableDiffusionPipeline.from_pretrained(
"CompVis/stable-diffusion-v1-4",
revision="fp16",
torch_dtype=torch.float16,
use_auth_token=True
)
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
pipe.enable_attention_slicing()
with autocast("cuda"):
image = pipe(prompt).images[0]
image = pipe(prompt).images[0]
```
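
The idea behind chunking the attention computation can be sketched in plain Python. This is a minimal illustration, assuming the slicing happens across attention heads; the helper names (`attention_weights`, `apply_weights`, `batched_attention`, `sliced_attention`) are hypothetical and this is not the library's implementation. The point is that only the score matrices of the current slice need to exist at the same time, while the result stays identical.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]
Head = Tuple[Matrix, Matrix, Matrix]  # (queries, keys, values) of one head


def attention_weights(q: Matrix, k: Matrix) -> Matrix:
    """Softmax of the scaled dot-product scores, one row per query."""
    scale = 1.0 / math.sqrt(len(q[0]))
    weights = []
    for query in q:
        scores = [scale * sum(a * b for a, b in zip(query, key)) for key in k]
        peak = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


def apply_weights(weights: Matrix, v: Matrix) -> Matrix:
    """Weighted sum of the value rows for every query."""
    dim = len(v[0])
    return [[sum(w * row[d] for w, row in zip(ws, v)) for d in range(dim)] for ws in weights]


def batched_attention(heads: List[Head]) -> List[Matrix]:
    """Unsliced variant: the score matrices of all heads are held at once."""
    all_weights = [attention_weights(q, k) for q, k, _ in heads]
    return [apply_weights(w, v) for w, (_, _, v) in zip(all_weights, heads)]


def sliced_attention(heads: List[Head], slice_size: int) -> List[Matrix]:
    """Sliced variant: only `slice_size` score matrices are held at once."""
    outputs: List[Matrix] = []
    for start in range(0, len(heads), slice_size):
        outputs.extend(batched_attention(heads[start:start + slice_size]))
    return outputs


# Four tiny hypothetical heads with 2x2 queries, keys and values.
heads = [
    ([[0.1 * h, 1.0], [1.0, 0.5]], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 2.0], [3.0, 4.0]])
    for h in range(4)
]
assert sliced_attention(heads, slice_size=2) == batched_attention(heads)
```

Smaller slices lower the peak memory at the cost of more sequential steps, which is the trade-off that `pipe.enable_attention_slicing()` exposes.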

Finally, if you wish to use a different scheduler, you can simply instantiate
If you wish to use a different scheduler, you can simply instantiate
it before the pipeline and pass it to `from_pretrained`.

```python
@@ -144,13 +142,29 @@ pipe = StableDiffusionPipeline.from_pretrained(
revision="fp16",
torch_dtype=torch.float16,
scheduler=lms,
use_auth_token=True
)
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
with autocast("cuda"):
image = pipe(prompt).images[0]
image = pipe(prompt).images[0]

image.save("astronaut_rides_horse.png")
```

If you want to run Stable Diffusion on CPU or you want to have maximum precision on GPU,
please run the model in the default *full-precision* setting:

```python
# make sure you're logged in with `huggingface-cli login`
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")

# disable the following line if you run on CPU
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
image = pipe(prompt).images[0]

image.save("astronaut_rides_horse.png")
```
@@ -160,7 +174,6 @@ image.save("astronaut_rides_horse.png")
The `StableDiffusionImg2ImgPipeline` lets you pass a text prompt and an initial image to condition the generation of new images.

```python
from torch import autocast
import requests
import torch
from PIL import Image
@@ -175,10 +188,9 @@ pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
model_id_or_path,
revision="fp16",
torch_dtype=torch.float16,
use_auth_token=True
)
# or download via git clone https://huggingface.co/CompVis/stable-diffusion-v1-4
# and pass `model_id_or_path="./stable-diffusion-v1-4"` without having to use `use_auth_token=True`.
# and pass `model_id_or_path="./stable-diffusion-v1-4"`.
pipe = pipe.to(device)

# let's download an initial image
@@ -190,8 +202,7 @@ init_image = init_image.resize((768, 512))

prompt = "A fantasy landscape, trending on artstation"

with autocast("cuda"):
images = pipe(prompt=prompt, init_image=init_image, strength=0.75, guidance_scale=7.5).images
images = pipe(prompt=prompt, init_image=init_image, strength=0.75, guidance_scale=7.5).images

images[0].save("fantasy_landscape.png")
```
@@ -204,7 +215,6 @@ The `StableDiffusionInpaintPipeline` lets you edit specific parts of an image by
```python
from io import BytesIO

from torch import autocast
import torch
import requests
import PIL
@@ -227,15 +237,13 @@ pipe = StableDiffusionInpaintPipeline.from_pretrained(
model_id_or_path,
revision="fp16",
torch_dtype=torch.float16,
use_auth_token=True
)
# or download via git clone https://huggingface.co/CompVis/stable-diffusion-v1-4
# and pass `model_id_or_path="./stable-diffusion-v1-4"` without having to use `use_auth_token=True`.
# and pass `model_id_or_path="./stable-diffusion-v1-4"`.
pipe = pipe.to(device)

prompt = "a cat sitting on a bench"
with autocast("cuda"):
images = pipe(prompt=prompt, init_image=init_image, mask_image=mask_image, strength=0.75).images
images = pipe(prompt=prompt, init_image=init_image, mask_image=mask_image, strength=0.75).images

images[0].save("cat_on_bench.png")
```
@@ -258,7 +266,6 @@ If you want to run the code yourself 💻, you can try out:
- [Text-to-Image Latent Diffusion](https://huggingface.co/CompVis/ldm-text2im-large-256)
```python
# !pip install diffusers transformers
from torch import autocast
from diffusers import DiffusionPipeline

device = "cuda"
@@ -270,16 +277,14 @@ ldm = ldm.to(device)

# run pipeline in inference (sample random noise and denoise)
prompt = "A painting of a squirrel eating a burger"
with autocast(device):
image = ldm([prompt], num_inference_steps=50, eta=0.3, guidance_scale=6).images[0]
image = ldm([prompt], num_inference_steps=50, eta=0.3, guidance_scale=6).images[0]

# save image
image.save("squirrel.png")
```
- [Unconditional Diffusion with discrete scheduler](https://huggingface.co/google/ddpm-celebahq-256)
```python
# !pip install diffusers
from torch import autocast
from diffusers import DDPMPipeline, DDIMPipeline, PNDMPipeline

model_id = "google/ddpm-celebahq-256"
@@ -290,8 +295,7 @@ ddpm = DDPMPipeline.from_pretrained(model_id) # you can replace DDPMPipeline wi
ddpm.to(device)

# run pipeline in inference (sample random noise and denoise)
with autocast("cuda"):
image = ddpm().images[0]
image = ddpm().images[0]

# save image
image.save("ddpm_generated_image.png")
@@ -377,3 +381,16 @@ This library concretizes previous work by many different authors and would not h
- @yang-song's Score-VE and Score-VP implementations, available [here](https://github.com/yang-song/score_sde_pytorch)

We also want to thank @heejkoo for the very helpful overview of papers, code and resources on diffusion models, available [here](https://github.com/heejkoo/Awesome-Diffusion-Models) as well as @crowsonkb and @rromb for useful discussions and insights.

## Citation

```bibtex
@misc{von-platen-etal-2022-diffusers,
author = {Patrick von Platen and Suraj Patil and Anton Lozhkov and Pedro Cuenca and Nathan Lambert and Kashif Rasul and Mishig Davaadorj and Thomas Wolf},
title = {Diffusers: State-of-the-art diffusion models},
year = {2022},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/huggingface/diffusers}}
}
```
2 changes: 2 additions & 0 deletions docs/source/_toctree.yml
@@ -12,6 +12,8 @@
title: "Loading Pipelines, Models, and Schedulers"
- local: using-diffusers/configuration
title: "Configuring Pipelines, Models, and Schedulers"
- local: using-diffusers/custom_pipelines
title: "Loading and Creating Custom Pipelines"
title: "Loading"
- sections:
- local: using-diffusers/unconditional_image_generation
18 changes: 6 additions & 12 deletions docs/source/api/pipelines/overview.mdx
@@ -98,15 +98,13 @@ logic including pre-processing, an unrolled diffusion loop, and post-processing

```python
# make sure you're logged in with `huggingface-cli login`
from torch import autocast
from diffusers import StableDiffusionPipeline, LMSDiscreteScheduler

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", use_auth_token=True)
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
with autocast("cuda"):
image = pipe(prompt).images[0]
image = pipe(prompt).images[0]

image.save("astronaut_rides_horse.png")
```
@@ -116,7 +114,6 @@ image.save("astronaut_rides_horse.png")
The `StableDiffusionImg2ImgPipeline` lets you pass a text prompt and an initial image to condition the generation of new images.

```python
from torch import autocast
import requests
from PIL import Image
from io import BytesIO
@@ -126,7 +123,7 @@ from diffusers import StableDiffusionImg2ImgPipeline
# load the pipeline
device = "cuda"
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
"CompVis/stable-diffusion-v1-4", revision="fp16", torch_dtype=torch.float16, use_auth_token=True
"CompVis/stable-diffusion-v1-4", revision="fp16", torch_dtype=torch.float16
).to(device)

# let's download an initial image
@@ -138,8 +135,7 @@ init_image = init_image.resize((768, 512))

prompt = "A fantasy landscape, trending on artstation"

with autocast("cuda"):
images = pipe(prompt=prompt, init_image=init_image, strength=0.75, guidance_scale=7.5).images
images = pipe(prompt=prompt, init_image=init_image, strength=0.75, guidance_scale=7.5).images

images[0].save("fantasy_landscape.png")
```
@@ -157,7 +153,6 @@ The `StableDiffusionInpaintPipeline` lets you edit specific parts of an image by
```python
from io import BytesIO

from torch import autocast
import requests
import PIL

@@ -177,12 +172,11 @@ mask_image = download_image(mask_url).resize((512, 512))

device = "cuda"
pipe = StableDiffusionInpaintPipeline.from_pretrained(
"CompVis/stable-diffusion-v1-4", revision="fp16", torch_dtype=torch.float16, use_auth_token=True
"CompVis/stable-diffusion-v1-4", revision="fp16", torch_dtype=torch.float16
).to(device)

prompt = "a cat sitting on a bench"
with autocast("cuda"):
images = pipe(prompt=prompt, init_image=init_image, mask_image=mask_image, strength=0.75).images
images = pipe(prompt=prompt, init_image=init_image, mask_image=mask_image, strength=0.75).images

images[0].save("cat_on_bench.png")
```
5 changes: 2 additions & 3 deletions docs/source/api/schedulers.mdx
@@ -36,16 +36,15 @@ This allows for rapid experimentation and cleaner abstractions in the code, wher
To this end, the design of schedulers is such that:

- Schedulers can be used interchangeably between diffusion models in inference to find the preferred trade-off between speed and generation quality.
- Schedulers are currently by default in PyTorch, but are designed to be framework independent (partial Numpy support currently exists).
- Schedulers are currently by default in PyTorch, but are designed to be framework independent (partial Jax support currently exists).


## API

The core API for any new scheduler must follow a limited structure.
- Schedulers should provide one or more `def step(...)` functions that should be called to update the generated sample iteratively.
- Schedulers should provide a `set_timesteps(...)` method that configures the parameters of a schedule function for a specific inference task.
- Schedulers should be framework-agnostic, but provide a simple functionality to convert the scheduler into a specific framework, such as PyTorch
with a `set_format(...)` method.
- Schedulers should be framework-specific.

The base class [`SchedulerMixin`] implements low level utilities used by multiple schedulers.
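
The structure described above can be sketched with a toy class. This is a minimal illustration under stated assumptions: `ToyScheduler` is hypothetical, does not inherit from `SchedulerMixin`, and its update rule is a stand-in rather than any real sampling algorithm. It only shows how `set_timesteps(...)` and `step(...)` fit together in a denoising loop.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyScheduler:
    """Hypothetical scheduler used only to illustrate the two-method API."""

    num_train_timesteps: int = 1000
    timesteps: List[int] = field(default_factory=list)

    def set_timesteps(self, num_inference_steps: int) -> None:
        # Pick evenly spaced training timesteps, ordered from noisiest to cleanest.
        stride = self.num_train_timesteps // num_inference_steps
        self.timesteps = [i * stride for i in range(num_inference_steps)][::-1]

    def step(self, model_output: float, timestep: int, sample: float) -> float:
        # Toy update: remove a fixed fraction of the predicted noise.
        # `timestep` is accepted for API symmetry but unused in this sketch.
        step_size = 1.0 / len(self.timesteps)
        return sample - step_size * model_output


scheduler = ToyScheduler()
scheduler.set_timesteps(4)

sample = 1.0
for t in scheduler.timesteps:
    predicted_noise = sample  # stand-in for a call to the denoising model
    sample = scheduler.step(predicted_noise, t, sample)
```

Keeping the loop outside the scheduler is what lets different schedulers be swapped in without touching the model call.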

2 changes: 1 addition & 1 deletion docs/source/index.mdx
@@ -35,7 +35,7 @@ available a colab notebook to directly try them out.
| Pipeline | Paper | Tasks | Colab |
|---|---|:---:|:---:|
| [ddpm](./api/pipelines/ddpm) | [**Denoising Diffusion Probabilistic Models**](https://arxiv.org/abs/2006.11239) | Unconditional Image Generation |
| [ddim](./api/pipelines/ddim) | [**Denoising Diffusion Implicit Models**](https://arxiv.org/abs/2010.02502) | Unconditional Image Generation | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/training_example.ipynb)
| [ddim](./api/pipelines/ddim) | [**Denoising Diffusion Implicit Models**](https://arxiv.org/abs/2010.02502) | Unconditional Image Generation |
| [latent_diffusion](./api/pipelines/latent_diffusion) | [**High-Resolution Image Synthesis with Latent Diffusion Models**](https://arxiv.org/abs/2112.10752)| Text-to-Image Generation |
| [latent_diffusion_uncond](./api/pipelines/latent_diffusion_uncond) | [**High-Resolution Image Synthesis with Latent Diffusion Models**](https://arxiv.org/abs/2112.10752) | Unconditional Image Generation |
| [pndm](./api/pipelines/pndm) | [**Pseudo Numerical Methods for Diffusion Models on Manifolds**](https://arxiv.org/abs/2202.09778) | Unconditional Image Generation |