
Commit 0173323

[Community Pipeline] Add multilingual stable diffusion to community pipelines (#1142)
* Add multilingual_stable_diffusion.py file
* Add multilingual stable diffusion to examples README file
* Update examples/community/README.md

Co-authored-by: Patrick von Platen <[email protected]>
1 parent bcdb3d5 commit 0173323

2 files changed: +507 −1 lines changed


examples/community/README.md

Lines changed: 71 additions & 1 deletion
@@ -18,9 +18,11 @@ If a community doesn't work as expected, please open an issue and ping the autho
| Composable Stable Diffusion| Stable Diffusion Pipeline that supports prompts that contain "&#124;" in prompts (as an AND condition) and weights (separated by "&#124;" as well) to positively / negatively weight prompts. | [Composable Stable Diffusion](#composable-stable-diffusion) | - | [Mark Rich](https://github.com/MarkRich) |
| Seed Resizing Stable Diffusion| Stable Diffusion Pipeline that supports resizing an image and retaining the concepts of the 512 by 512 generation. | [Seed Resizing](#seed-resizing) | - | [Mark Rich](https://github.com/MarkRich) |
| Imagic Stable Diffusion | Stable Diffusion Pipeline that enables writing a text prompt to edit an existing image| [Imagic Stable Diffusion](#imagic-stable-diffusion) | - | [Mark Rich](https://github.com/MarkRich) |
| Multilingual Stable Diffusion| Stable Diffusion Pipeline that supports prompts in 50 different languages. | [Multilingual Stable Diffusion](#multilingual-stable-diffusion-pipeline) | - | [Juan Carlos Piñeros](https://github.com/juancopi81) |
| Image to Image Inpainting Stable Diffusion | Stable Diffusion Pipeline that enables the overlaying of two images and subsequent inpainting| [Image to Image Inpainting Stable Diffusion](#image-to-image-inpainting-stable-diffusion) | - | [Alex McKinney](https://github.com/vvvm23) |


To load a custom pipeline you just need to pass the `custom_pipeline` argument to `DiffusionPipeline`, as one of the files in `diffusers/examples/community`. Feel free to send a PR with your own pipelines, we will merge them quickly.
```py
pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", custom_pipeline="filename_in_the_community_folder")
@@ -502,6 +504,74 @@ image = res.images[0]
image.save('./seed_resize/seed_resize_{w}_{h}_image_compare.png'.format(w=width, h=height))
```

### Multilingual Stable Diffusion Pipeline

The following code can generate images from text prompts in different languages using the pre-trained [mBART-50 many-to-one multilingual machine translation model](https://huggingface.co/facebook/mbart-large-50-many-to-one-mmt) and Stable Diffusion.

```python
from PIL import Image

import torch

from diffusers import DiffusionPipeline
from transformers import (
    pipeline,
    MBart50TokenizerFast,
    MBartForConditionalGeneration,
)

device = "cuda" if torch.cuda.is_available() else "cpu"
device_dict = {"cuda": 0, "cpu": -1}

# helper function taken from: https://huggingface.co/blog/stable_diffusion
def image_grid(imgs, rows, cols):
    assert len(imgs) == rows*cols

    w, h = imgs[0].size
    grid = Image.new('RGB', size=(cols*w, rows*h))
    grid_w, grid_h = grid.size

    for i, img in enumerate(imgs):
        grid.paste(img, box=(i%cols*w, i//cols*h))
    return grid

# Add language detection pipeline
language_detection_model_ckpt = "papluca/xlm-roberta-base-language-detection"
language_detection_pipeline = pipeline("text-classification",
                                       model=language_detection_model_ckpt,
                                       device=device_dict[device])

# Add model for language translation
trans_tokenizer = MBart50TokenizerFast.from_pretrained("facebook/mbart-large-50-many-to-one-mmt")
trans_model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50-many-to-one-mmt").to(device)

diffuser_pipeline = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    custom_pipeline="multilingual_stable_diffusion",
    detection_pipeline=language_detection_pipeline,
    translation_model=trans_model,
    translation_tokenizer=trans_tokenizer,
    revision="fp16",
    torch_dtype=torch.float16,
)

diffuser_pipeline.enable_attention_slicing()
diffuser_pipeline = diffuser_pipeline.to(device)

prompt = ["a photograph of an astronaut riding a horse",
          "Una casa en la playa",
          "Ein Hund, der Orange isst",
          "Un restaurant parisien"]

output = diffuser_pipeline(prompt)

images = output.images

grid = image_grid(images, rows=2, cols=2)
```

This example produces the following images:
![image](https://user-images.githubusercontent.com/4313860/198328706-295824a4-9856-4ce5-8e66-278ceb42fd29.png)
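The detection and translation models loaded above are what the pipeline relies on to bring each prompt into English before running Stable Diffusion. The snippet below is a minimal illustrative sketch of that step, not the pipeline's actual code: the `iso_to_mbart` mapping and the `translate_to_english` helper are made-up names, and only a few language codes are shown.

```python
# Illustration only (not part of this commit): roughly how a non-English prompt
# can be detected and translated using the two models loaded above.
# The ISO-code -> mBART-50 code mapping is an assumption; the pipeline keeps its own.
iso_to_mbart = {"es": "es_XX", "de": "de_DE", "fr": "fr_XX"}

def translate_to_english(text):
    detected = language_detection_pipeline(text)[0]["label"]  # e.g. "es"
    if detected == "en":
        return text
    trans_tokenizer.src_lang = iso_to_mbart[detected]
    encoded = trans_tokenizer(text, return_tensors="pt").to(device)
    generated = trans_model.generate(**encoded)  # many-to-one model always targets English
    return trans_tokenizer.batch_decode(generated, skip_special_tokens=True)[0]

print(translate_to_english("Una casa en la playa"))  # -> an English prompt, e.g. "A house on the beach"
```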
### Image to Image Inpainting Stable Diffusion

Similar to the standard stable diffusion inpainting example, except with the addition of an `inner_image` argument.
@@ -534,4 +604,4 @@ pipe = pipe.to("cuda")

prompt = "Your prompt here!"
image = pipe(prompt=prompt, image=init_image, inner_image=inner_image, mask_image=mask_image).images[0]
```
