Commit a4b233e

patrickvonplaten, sayakpaul, and pcuenca authored
Finish docs textual inversion (#3068)
* Finish docs textual inversion

* Apply suggestions from code review

Co-authored-by: Sayak Paul <[email protected]>
Co-authored-by: Pedro Cuenca <[email protected]>

---------

Co-authored-by: Sayak Paul <[email protected]>
Co-authored-by: Pedro Cuenca <[email protected]>
1 parent 524535b commit a4b233e

File tree

2 files changed: +78 -5 lines changed


docs/source/en/training/text_inversion.mdx

Lines changed: 41 additions & 4 deletions
@@ -157,24 +157,61 @@ If you're interested in following along with your model training progress, you c
 
 ## Inference
 
-Once you have trained a model, you can use it for inference with the [`StableDiffusionPipeline`]. Make sure you include the `placeholder_token` in your prompt, in this case, it is `<cat-toy>`.
+Once you have trained a model, you can use it for inference with the [`StableDiffusionPipeline`].
+
+The textual inversion script will by default only save the textual inversion embedding vector(s) that have
+been added to the text encoder embedding matrix and consequently been trained.
 
 <frameworkcontent>
 <pt>
+<Tip>
+
+💡 The community has created a large library of different textual inversion embedding vectors, called [sd-concepts-library](https://huggingface.co/sd-concepts-library).
+Instead of training textual inversion embeddings from scratch, you can also see whether a fitting textual inversion embedding has already been added to the library.
+
+</Tip>
+
+To load the textual inversion embeddings you first need to load the base model that was used when training
+your textual inversion embedding vectors. Here we assume that [`runwayml/stable-diffusion-v1-5`](runwayml/stable-diffusion-v1-5)
+was used as a base model so we load it first:
 ```python
 from diffusers import StableDiffusionPipeline
+import torch
 
-model_id = "path-to-your-trained-model"
+model_id = "runwayml/stable-diffusion-v1-5"
 pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")
+```
 
-prompt = "A <cat-toy> backpack"
+Next, we need to load the textual inversion embedding vector which can be done via the [`TextualInversionLoaderMixin.load_textual_inversion`]
+function. Here we'll load the embeddings of the "<cat-toy>" example from before.
+```python
+pipe.load_textual_inversion("sd-concepts-library/cat-toy")
+```
 
-image = pipe(prompt, num_inference_steps=50, guidance_scale=7.5).images[0]
+Now we can run the pipeline making sure that the placeholder token `<cat-toy>` is used in our prompt.
 
+```python
+prompt = "A <cat-toy> backpack"
+
+image = pipe(prompt, num_inference_steps=50).images[0]
 image.save("cat-backpack.png")
 ```
+
+The function [`TextualInversionLoaderMixin.load_textual_inversion`] can not only
+load textual embedding vectors saved in Diffusers' format, but also embedding vectors
+saved in [Automatic1111](https://github.com/AUTOMATIC1111/stable-diffusion-webui) format.
+To do so, you can first download an embedding vector from [civitAI](https://civitai.com/models/3036?modelVersionId=8387)
+and then load it locally:
+```python
+pipe.load_textual_inversion("./charturnerv2.pt")
+```
 </pt>
 <jax>
+Currently there is no `load_textual_inversion` function for Flax so one has to make sure the textual inversion
+embedding vector is saved as part of the model after training.
+
+The model can then be run just like any other Flax model:
+
 ```python
 import jax
 import numpy as np
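The Flax snippet above is cut off by the diff context. For orientation, here is a minimal sketch of what Flax inference with a checkpoint that already contains the trained embedding typically looks like; it is not part of this commit, and `path-to-your-trained-model` is a placeholder:

```python
import jax
import numpy as np
from flax.jax_utils import replicate
from flax.training.common_utils import shard

from diffusers import FlaxStableDiffusionPipeline

# Load a checkpoint that already contains the trained textual inversion embedding
# (placeholder path; replace with your own training output directory).
pipeline, params = FlaxStableDiffusionPipeline.from_pretrained("path-to-your-trained-model")

prompt = "A <cat-toy> backpack"
num_samples = jax.device_count()
prompt_ids = pipeline.prepare_inputs([prompt] * num_samples)

# Replicate the parameters and shard the inputs across all available devices.
params = replicate(params)
prng_seed = jax.random.split(jax.random.PRNGKey(0), num_samples)
prompt_ids = shard(prompt_ids)

images = pipeline(prompt_ids, params, prng_seed, num_inference_steps=50, jit=True).images
images = pipeline.numpy_to_pil(np.asarray(images.reshape((num_samples,) + images.shape[-3:])))
images[0].save("cat-backpack.png")
```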

src/diffusers/loaders.py

Lines changed: 37 additions & 1 deletion
@@ -368,7 +368,7 @@ def load_textual_inversion(
 ):
 r"""
 Load textual inversion embeddings into the text encoder of stable diffusion pipelines. Both `diffusers` and
-`Automatic1111` formats are supported.
+`Automatic1111` formats are supported (see example below).
 
 <Tip warning={true}>
 
@@ -427,6 +427,42 @@ def load_textual_inversion(
 models](https://huggingface.co/docs/hub/models-gated#gated-models).
 
 </Tip>
+
+Example:
+
+To load a textual inversion embedding vector in `diffusers` format:
+```py
+from diffusers import StableDiffusionPipeline
+import torch
+
+model_id = "runwayml/stable-diffusion-v1-5"
+pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")
+
+pipe.load_textual_inversion("sd-concepts-library/cat-toy")
+
+prompt = "A <cat-toy> backpack"
+
+image = pipe(prompt, num_inference_steps=50).images[0]
+image.save("cat-backpack.png")
+```
+
+To load a textual inversion embedding vector in Automatic1111 format, make sure to first download the vector,
+e.g. from [civitAI](https://civitai.com/models/3036?modelVersionId=9857) and then load the vector locally:
+
+```py
+from diffusers import StableDiffusionPipeline
+import torch
+
+model_id = "runwayml/stable-diffusion-v1-5"
+pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")
+
+pipe.load_textual_inversion("./charturnerv2.pt")
+
+prompt = "charturnerv2, multiple views of the same character in the same outfit, a character turnaround of a woman wearing a black jacket and red shirt, best quality, intricate details."
+
+image = pipe(prompt, num_inference_steps=50).images[0]
+image.save("character.png")
+```
 """
 if not hasattr(self, "tokenizer") or not isinstance(self.tokenizer, PreTrainedTokenizer):
 raise ValueError(
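As a side note on what loading actually does: `load_textual_inversion` adds the placeholder token(s) to the tokenizer and appends the trained vector(s) to the text encoder's input embedding matrix. A minimal sanity-check sketch (not part of this commit, and assuming the `cat-toy` concept contributes a single `<cat-toy>` token):

```python
from diffusers import StableDiffusionPipeline
import torch

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

vocab_size_before = len(pipe.tokenizer)
pipe.load_textual_inversion("sd-concepts-library/cat-toy")

# The placeholder should now map to a real token id instead of the unknown token ...
token_id = pipe.tokenizer.convert_tokens_to_ids("<cat-toy>")
assert token_id != pipe.tokenizer.unk_token_id

# ... and the text encoder's embedding matrix should have grown by one row
# (assuming the concept adds a single embedding vector).
assert len(pipe.tokenizer) == vocab_size_before + 1
assert pipe.text_encoder.get_input_embeddings().weight.shape[0] == len(pipe.tokenizer)
```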
