diff --git a/docs/source/en/training/kandinsky.md b/docs/source/en/training/kandinsky.md index 2caec1035fa9..a1854d76c492 100644 --- a/docs/source/en/training/kandinsky.md +++ b/docs/source/en/training/kandinsky.md @@ -205,7 +205,7 @@ model_pred = unet(noisy_latents, timesteps, None, added_cond_kwargs=added_cond_k Once you've made all your changes or you're okay with the default configuration, you're ready to launch the training script! 🚀 -You'll train on the [Pokémon BLIP captions](https://huggingface.co/datasets/lambdalabs/pokemon-blip-captions) dataset to generate your own Pokémon, but you can also create and train on your own dataset by following the [Create a dataset for training](create_dataset) guide. Set the environment variable `DATASET_NAME` to the name of the dataset on the Hub or if you're training on your own files, set the environment variable `TRAIN_DIR` to a path to your dataset. +You'll train on the [Naruto BLIP captions](https://huggingface.co/datasets/lambdalabs/naruto-blip-captions) dataset to generate your own Naruto characters, but you can also create and train on your own dataset by following the [Create a dataset for training](create_dataset) guide. Set the environment variable `DATASET_NAME` to the name of the dataset on the Hub or if you're training on your own files, set the environment variable `TRAIN_DIR` to a path to your dataset. If you're training on more than one GPU, add the `--multi_gpu` parameter to the `accelerate launch` command. @@ -219,7 +219,7 @@ To monitor training progress with Weights & Biases, add the `--report_to=wandb` ```bash -export DATASET_NAME="lambdalabs/pokemon-blip-captions" +export DATASET_NAME="lambdalabs/naruto-blip-captions" accelerate launch --mixed_precision="fp16" train_text_to_image_prior.py \ --dataset_name=$DATASET_NAME \ @@ -232,17 +232,17 @@ accelerate launch --mixed_precision="fp16" train_text_to_image_prior.py \ --checkpoints_total_limit=3 \ --lr_scheduler="constant" \ --lr_warmup_steps=0 \ - --validation_prompts="A robot pokemon, 4k photo" \ + --validation_prompts="A robot naruto, 4k photo" \ --report_to="wandb" \ --push_to_hub \ - --output_dir="kandi2-prior-pokemon-model" + --output_dir="kandi2-prior-naruto-model" ``` ```bash -export DATASET_NAME="lambdalabs/pokemon-blip-captions" +export DATASET_NAME="lambdalabs/naruto-blip-captions" accelerate launch --mixed_precision="fp16" train_text_to_image_decoder.py \ --dataset_name=$DATASET_NAME \ @@ -256,10 +256,10 @@ accelerate launch --mixed_precision="fp16" train_text_to_image_decoder.py \ --checkpoints_total_limit=3 \ --lr_scheduler="constant" \ --lr_warmup_steps=0 \ - --validation_prompts="A robot pokemon, 4k photo" \ + --validation_prompts="A robot naruto, 4k photo" \ --report_to="wandb" \ --push_to_hub \ - --output_dir="kandi2-decoder-pokemon-model" + --output_dir="kandi2-decoder-naruto-model" ``` @@ -279,7 +279,7 @@ prior_components = {"prior_" + k: v for k,v in prior_pipeline.components.items() pipeline = AutoPipelineForText2Image.from_pretrained("kandinsky-community/kandinsky-2-2-decoder", **prior_components, torch_dtype=torch.float16) pipe.enable_model_cpu_offload() -prompt="A robot pokemon, 4k photo" +prompt="A robot naruto, 4k photo" image = pipeline(prompt=prompt, negative_prompt=negative_prompt).images[0] ``` @@ -299,7 +299,7 @@ import torch pipeline = AutoPipelineForText2Image.from_pretrained("path/to/saved/model", torch_dtype=torch.float16) pipeline.enable_model_cpu_offload() -prompt="A robot pokemon, 4k photo"
image = pipeline(prompt=prompt).images[0] ``` @@ -313,7 +313,7 @@ unet = UNet2DConditionModel.from_pretrained("path/to/saved/model" + "/checkpoint pipeline = AutoPipelineForText2Image.from_pretrained("kandinsky-community/kandinsky-2-2-decoder", unet=unet, torch_dtype=torch.float16) pipeline.enable_model_cpu_offload() -image = pipeline(prompt="A robot pokemon, 4k photo").images[0] +image = pipeline(prompt="A robot naruto, 4k photo").images[0] ``` diff --git a/docs/source/en/training/lora.md b/docs/source/en/training/lora.md index 78ac8a140e7c..737e6f0dfc32 100644 --- a/docs/source/en/training/lora.md +++ b/docs/source/en/training/lora.md @@ -170,7 +170,7 @@ Aside from setting up the LoRA layers, the training script is more or less the s Once you've made all your changes or you're okay with the default configuration, you're ready to launch the training script! 🚀 -Let's train on the [Pokémon BLIP captions](https://huggingface.co/datasets/lambdalabs/pokemon-blip-captions) dataset to generate our own Pokémon. Set the environment variables `MODEL_NAME` and `DATASET_NAME` to the model and dataset respectively. You should also specify where to save the model in `OUTPUT_DIR`, and the name of the model to save to on the Hub with `HUB_MODEL_ID`. The script creates and saves the following files to your repository: +Let's train on the [Naruto BLIP captions](https://huggingface.co/datasets/lambdalabs/naruto-blip-captions) dataset to generate your own Naruto characters. Set the environment variables `MODEL_NAME` and `DATASET_NAME` to the model and dataset respectively. You should also specify where to save the model in `OUTPUT_DIR`, and the name of the model to save to on the Hub with `HUB_MODEL_ID`. The script creates and saves the following files to your repository: - saved model checkpoints - `pytorch_lora_weights.safetensors` (the trained LoRA weights) @@ -185,9 +185,9 @@ A full training run takes ~5 hours on a 2080 Ti GPU with 11GB of VRAM. ```bash export MODEL_NAME="runwayml/stable-diffusion-v1-5" -export OUTPUT_DIR="/sddata/finetune/lora/pokemon" -export HUB_MODEL_ID="pokemon-lora" -export DATASET_NAME="lambdalabs/pokemon-blip-captions" +export OUTPUT_DIR="/sddata/finetune/lora/naruto" +export HUB_MODEL_ID="naruto-lora" +export DATASET_NAME="lambdalabs/naruto-blip-captions" accelerate launch --mixed_precision="fp16" train_text_to_image_lora.py \ --pretrained_model_name_or_path=$MODEL_NAME \ @@ -208,7 +208,7 @@ accelerate launch --mixed_precision="fp16" train_text_to_image_lora.py \ --hub_model_id=${HUB_MODEL_ID} \ --report_to=wandb \ --checkpointing_steps=500 \ - --validation_prompt="A pokemon with blue eyes." \ + --validation_prompt="A naruto with blue eyes." \ --seed=1337 ``` @@ -220,7 +220,7 @@ import torch pipeline = AutoPipelineForText2Image.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16).to("cuda") pipeline.load_lora_weights("path/to/lora/model", weight_name="pytorch_lora_weights.safetensors") -image = pipeline("A pokemon with blue eyes").images[0] +image = pipeline("A naruto with blue eyes").images[0] ``` ## Next steps diff --git a/docs/source/en/training/sdxl.md b/docs/source/en/training/sdxl.md index 0e51e720b48c..78178047d9fd 100644 --- a/docs/source/en/training/sdxl.md +++ b/docs/source/en/training/sdxl.md @@ -176,7 +176,7 @@ If you want to learn more about how the training loop works, check out the [Unde Once you've made all your changes or you're okay with the default configuration, you're ready to launch the training script!
🚀 -Let's train on the [Pokémon BLIP captions](https://huggingface.co/datasets/lambdalabs/pokemon-blip-captions) dataset to generate your own Pokémon. Set the environment variables `MODEL_NAME` and `DATASET_NAME` to the model and the dataset (either from the Hub or a local path). You should also specify a VAE other than the SDXL VAE (either from the Hub or a local path) with `VAE_NAME` to avoid numerical instabilities. +Let's train on the [Naruto BLIP captions](https://huggingface.co/datasets/lambdalabs/naruto-blip-captions) dataset to generate your own Naruto characters. Set the environment variables `MODEL_NAME` and `DATASET_NAME` to the model and the dataset (either from the Hub or a local path). You should also specify a VAE other than the SDXL VAE (either from the Hub or a local path) with `VAE_NAME` to avoid numerical instabilities. @@ -187,7 +187,7 @@ To monitor training progress with Weights & Biases, add the `--report_to=wandb` ```bash export MODEL_NAME="stabilityai/stable-diffusion-xl-base-1.0" export VAE_NAME="madebyollin/sdxl-vae-fp16-fix" -export DATASET_NAME="lambdalabs/pokemon-blip-captions" +export DATASET_NAME="lambdalabs/naruto-blip-captions" accelerate launch train_text_to_image_sdxl.py \ --pretrained_model_name_or_path=$MODEL_NAME \ @@ -211,7 +211,7 @@ accelerate launch train_text_to_image_sdxl.py \ --validation_prompt="a cute Sundar Pichai creature" \ --validation_epochs 5 \ --checkpointing_steps=5000 \ - --output_dir="sdxl-pokemon-model" \ + --output_dir="sdxl-naruto-model" \ --push_to_hub ``` @@ -226,9 +226,9 @@ import torch pipeline = DiffusionPipeline.from_pretrained("path/to/your/model", torch_dtype=torch.float16).to("cuda") -prompt = "A pokemon with green eyes and red legs." +prompt = "A naruto with green eyes and red legs." image = pipeline(prompt, num_inference_steps=30, guidance_scale=7.5).images[0] -image.save("pokemon.png") +image.save("naruto.png") ``` @@ -244,11 +244,11 @@ import torch_xla.core.xla_model as xm device = xm.xla_device() pipeline = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0").to(device) -prompt = "A pokemon with green eyes and red legs." +prompt = "A naruto with green eyes and red legs." start = time() image = pipeline(prompt, num_inference_steps=inference_steps).images[0] print(f'Compilation time is {time()-start} sec') -image.save("pokemon.png") +image.save("naruto.png") start = time() image = pipeline(prompt, num_inference_steps=inference_steps).images[0] diff --git a/docs/source/en/training/text2image.md b/docs/source/en/training/text2image.md index d5c772c9db86..f69e9a710e8f 100644 --- a/docs/source/en/training/text2image.md +++ b/docs/source/en/training/text2image.md @@ -158,7 +158,7 @@ Once you've made all your changes or you're okay with the default configuration, -Let's train on the [Pokémon BLIP captions](https://huggingface.co/datasets/lambdalabs/pokemon-blip-captions) dataset to generate your own Pokémon. Set the environment variables `MODEL_NAME` and `dataset_name` to the model and the dataset (either from the Hub or a local path). If you're training on more than one GPU, add the `--multi_gpu` parameter to the `accelerate launch` command. +Let's train on the [Naruto BLIP captions](https://huggingface.co/datasets/lambdalabs/naruto-blip-captions) dataset to generate your own Naruto characters. Set the environment variables `MODEL_NAME` and `dataset_name` to the model and the dataset (either from the Hub or a local path).
If you're training on more than one GPU, add the `--multi_gpu` parameter to the `accelerate launch` command. @@ -168,7 +168,7 @@ To train on a local dataset, set the `TRAIN_DIR` and `OUTPUT_DIR` environment va ```bash export MODEL_NAME="runwayml/stable-diffusion-v1-5" -export dataset_name="lambdalabs/pokemon-blip-captions" +export dataset_name="lambdalabs/naruto-blip-captions" accelerate launch --mixed_precision="fp16" train_text_to_image.py \ --pretrained_model_name_or_path=$MODEL_NAME \ @@ -183,7 +183,7 @@ accelerate launch --mixed_precision="fp16" train_text_to_image.py \ --max_grad_norm=1 \ --enable_xformers_memory_efficient_attention --lr_scheduler="constant" --lr_warmup_steps=0 \ - --output_dir="sd-pokemon-model" \ + --output_dir="sd-naruto-model" \ --push_to_hub ``` @@ -202,7 +202,7 @@ To train on a local dataset, set the `TRAIN_DIR` and `OUTPUT_DIR` environment va ```bash export MODEL_NAME="runwayml/stable-diffusion-v1-5" -export dataset_name="lambdalabs/pokemon-blip-captions" +export dataset_name="lambdalabs/naruto-blip-captions" python train_text_to_image_flax.py \ --pretrained_model_name_or_path=$MODEL_NAME \ @@ -212,7 +212,7 @@ python train_text_to_image_flax.py \ --max_train_steps=15000 \ --learning_rate=1e-05 \ --max_grad_norm=1 \ - --output_dir="sd-pokemon-model" \ + --output_dir="sd-naruto-model" \ --push_to_hub ``` @@ -231,7 +231,7 @@ import torch pipeline = StableDiffusionPipeline.from_pretrained("path/to/saved_model", torch_dtype=torch.float16, use_safetensors=True).to("cuda") image = pipeline(prompt="yoda").images[0] -image.save("yoda-pokemon.png") +image.save("yoda-naruto.png") ``` @@ -246,7 +246,7 @@ from diffusers import FlaxStableDiffusionPipeline pipeline, params = FlaxStableDiffusionPipeline.from_pretrained("path/to/saved_model", dtype=jax.numpy.bfloat16) -prompt = "yoda pokemon" +prompt = "yoda naruto" prng_seed = jax.random.PRNGKey(0) num_inference_steps = 50 @@ -261,7 +261,7 @@ prompt_ids = shard(prompt_ids) images = pipeline(prompt_ids, params, prng_seed, num_inference_steps, jit=True).images images = pipeline.numpy_to_pil(np.asarray(images.reshape((num_samples,) + images.shape[-3:]))) -image.save("yoda-pokemon.png") +image.save("yoda-naruto.png") ``` diff --git a/docs/source/en/training/wuerstchen.md b/docs/source/en/training/wuerstchen.md index c8d2842eb833..cd190639b865 100644 --- a/docs/source/en/training/wuerstchen.md +++ b/docs/source/en/training/wuerstchen.md @@ -131,7 +131,7 @@ If you want to learn more about how the training loop works, check out the [Unde Once you've made all your changes or you're okay with the default configuration, you're ready to launch the training script! 🚀 -Set the `DATASET_NAME` environment variable to the dataset name from the Hub. This guide uses the [Pokémon BLIP captions](https://huggingface.co/datasets/lambdalabs/pokemon-blip-captions) dataset, but you can create and train on your own datasets as well (see the [Create a dataset for training](create_dataset) guide). +Set the `DATASET_NAME` environment variable to the dataset name from the Hub. This guide uses the [Naruto BLIP captions](https://huggingface.co/datasets/lambdalabs/naruto-blip-captions) dataset, but you can create and train on your own datasets as well (see the [Create a dataset for training](create_dataset) guide).
@@ -140,7 +140,7 @@ To monitor training progress with Weights & Biases, add the `--report_to=wandb` ```bash -export DATASET_NAME="lambdalabs/pokemon-blip-captions" +export DATASET_NAME="lambdalabs/naruto-blip-captions" accelerate launch train_text_to_image_prior.py \ --mixed_precision="fp16" \ @@ -156,10 +156,10 @@ accelerate launch train_text_to_image_prior.py \ --checkpoints_total_limit=3 \ --lr_scheduler="constant" \ --lr_warmup_steps=0 \ - --validation_prompts="A robot pokemon, 4k photo" \ + --validation_prompts="A robot naruto, 4k photo" \ --report_to="wandb" \ --push_to_hub \ - --output_dir="wuerstchen-prior-pokemon-model" + --output_dir="wuerstchen-prior-naruto-model" ``` Once training is complete, you can use your newly trained model for inference! @@ -171,7 +171,7 @@ from diffusers.pipelines.wuerstchen import DEFAULT_STAGE_C_TIMESTEPS pipeline = AutoPipelineForText2Image.from_pretrained("path/to/saved/model", torch_dtype=torch.float16).to("cuda") -caption = "A cute bird pokemon holding a shield" +caption = "A cute bird naruto holding a shield" images = pipeline( caption, width=1024, diff --git a/docs/source/ko/training/lora.md b/docs/source/ko/training/lora.md index 5bb8a1e69be4..e9c690d80652 100644 --- a/docs/source/ko/training/lora.md +++ b/docs/source/ko/training/lora.md @@ -49,15 +49,15 @@ huggingface-cli login ### 학습[[dreambooth-training]] -[Pokémon BLIP 캡션](https://huggingface.co/datasets/lambdalabs/pokemon-blip-captions) 데이터셋으로 [`stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5)를 파인튜닝해 나만의 포켓몬을 생성해 보겠습니다. +[Naruto BLIP 캡션](https://huggingface.co/datasets/lambdalabs/naruto-blip-captions) 데이터셋으로 [`stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5)를 파인튜닝해 나만의 나루토 캐릭터를 생성해 보겠습니다. 시작하려면 `MODEL_NAME` 및 `DATASET_NAME` 환경 변수가 설정되어 있는지 확인하십시오. `OUTPUT_DIR` 및 `HUB_MODEL_ID` 변수는 선택 사항이며 허브에서 모델을 저장할 위치를 지정합니다. ```bash export MODEL_NAME="runwayml/stable-diffusion-v1-5" -export OUTPUT_DIR="/sddata/finetune/lora/pokemon" -export HUB_MODEL_ID="pokemon-lora" -export DATASET_NAME="lambdalabs/pokemon-blip-captions" +export OUTPUT_DIR="/sddata/finetune/lora/naruto" +export HUB_MODEL_ID="naruto-lora" +export DATASET_NAME="lambdalabs/naruto-blip-captions" ``` 학습을 시작하기 전에 알아야 할 몇 가지 플래그가 있습니다. diff --git a/docs/source/ko/training/text2image.md b/docs/source/ko/training/text2image.md index f2ad3bb0719e..8a0463b497f4 100644 --- a/docs/source/ko/training/text2image.md +++ b/docs/source/ko/training/text2image.md @@ -73,12 +73,12 @@ xFormers는 Flax에 사용할 수 없습니다.
-๋‹ค์Œ๊ณผ ๊ฐ™์ด [Pokรฉmon BLIP ์บก์…˜](https://huggingface.co/datasets/lambdalabs/pokemon-blip-captions) ๋ฐ์ดํ„ฐ์…‹์—์„œ ํŒŒ์ธํŠœ๋‹ ์‹คํ–‰์„ ์œ„ํ•ด [PyTorch ํ•™์Šต ์Šคํฌ๋ฆฝํŠธ](https://github.com/huggingface/diffusers/blob/main/examples/text_to_image/train_text_to_image.py)๋ฅผ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค: +๋‹ค์Œ๊ณผ ๊ฐ™์ด [Naruto BLIP ์บก์…˜](https://huggingface.co/datasets/lambdalabs/naruto-blip-captions) ๋ฐ์ดํ„ฐ์…‹์—์„œ ํŒŒ์ธํŠœ๋‹ ์‹คํ–‰์„ ์œ„ํ•ด [PyTorch ํ•™์Šต ์Šคํฌ๋ฆฝํŠธ](https://github.com/huggingface/diffusers/blob/main/examples/text_to_image/train_text_to_image.py)๋ฅผ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค: ```bash export MODEL_NAME="CompVis/stable-diffusion-v1-4" -export dataset_name="lambdalabs/pokemon-blip-captions" +export dataset_name="lambdalabs/naruto-blip-captions" accelerate launch train_text_to_image.py \ --pretrained_model_name_or_path=$MODEL_NAME \ @@ -93,7 +93,7 @@ accelerate launch train_text_to_image.py \ --learning_rate=1e-05 \ --max_grad_norm=1 \ --lr_scheduler="constant" --lr_warmup_steps=0 \ - --output_dir="sd-pokemon-model" + --output_dir="sd-naruto-model" ``` ์ž์ฒด ๋ฐ์ดํ„ฐ์…‹์œผ๋กœ ํŒŒ์ธํŠœ๋‹ํ•˜๋ ค๋ฉด ๐Ÿค— [Datasets](https://huggingface.co/docs/datasets/index)์—์„œ ์š”๊ตฌํ•˜๋Š” ํ˜•์‹์— ๋”ฐ๋ผ ๋ฐ์ดํ„ฐ์…‹์„ ์ค€๋น„ํ•˜์„ธ์š”. [๋ฐ์ดํ„ฐ์…‹์„ ํ—ˆ๋ธŒ์— ์—…๋กœ๋“œ](https://huggingface.co/docs/datasets/image_dataset#upload-dataset-to-the-hub)ํ•˜๊ฑฐ๋‚˜ [ํŒŒ์ผ๋“ค์ด ์žˆ๋Š” ๋กœ์ปฌ ํด๋”๋ฅผ ์ค€๋น„](https ://huggingface.co/docs/datasets/image_dataset#imagefolder)ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. @@ -136,7 +136,7 @@ pip install -U -r requirements_flax.txt ```bash export MODEL_NAME="runwayml/stable-diffusion-v1-5" -export dataset_name="lambdalabs/pokemon-blip-captions" +export dataset_name="lambdalabs/naruto-blip-captions" python train_text_to_image_flax.py \ --pretrained_model_name_or_path=$MODEL_NAME \ @@ -146,7 +146,7 @@ python train_text_to_image_flax.py \ --max_train_steps=15000 \ --learning_rate=1e-05 \ --max_grad_norm=1 \ - --output_dir="sd-pokemon-model" + --output_dir="sd-naruto-model" ``` ์ž์ฒด ๋ฐ์ดํ„ฐ์…‹์œผ๋กœ ํŒŒ์ธํŠœ๋‹ํ•˜๋ ค๋ฉด ๐Ÿค— [Datasets](https://huggingface.co/docs/datasets/index)์—์„œ ์š”๊ตฌํ•˜๋Š” ํ˜•์‹์— ๋”ฐ๋ผ ๋ฐ์ดํ„ฐ์…‹์„ ์ค€๋น„ํ•˜์„ธ์š”. [๋ฐ์ดํ„ฐ์…‹์„ ํ—ˆ๋ธŒ์— ์—…๋กœ๋“œ](https://huggingface.co/docs/datasets/image_dataset#upload-dataset-to-the-hub)ํ•˜๊ฑฐ๋‚˜ [ํŒŒ์ผ๋“ค์ด ์žˆ๋Š” ๋กœ์ปฌ ํด๋”๋ฅผ ์ค€๋น„](https ://huggingface.co/docs/datasets/image_dataset#imagefolder)ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. 
@@ -166,7 +166,7 @@ python train_text_to_image_flax.py \ --max_train_steps=15000 \ --learning_rate=1e-05 \ --max_grad_norm=1 \ - --output_dir="sd-pokemon-model" + --output_dir="sd-naruto-model" ``` @@ -189,7 +189,7 @@ pipe = StableDiffusionPipeline.from_pretrained(model_path, torch_dtype=torch.flo pipe.to("cuda") image = pipe(prompt="yoda").images[0] -image.save("yoda-pokemon.png") +image.save("yoda-naruto.png") ``` @@ -203,7 +203,7 @@ from diffusers import FlaxStableDiffusionPipeline model_path = "path_to_saved_model" pipe, params = FlaxStableDiffusionPipeline.from_pretrained(model_path, dtype=jax.numpy.bfloat16) -prompt = "yoda pokemon" +prompt = "yoda naruto" prng_seed = jax.random.PRNGKey(0) num_inference_steps = 50 @@ -218,7 +218,7 @@ prompt_ids = shard(prompt_ids) images = pipeline(prompt_ids, params, prng_seed, num_inference_steps, jit=True).images images = pipeline.numpy_to_pil(np.asarray(images.reshape((num_samples,) + images.shape[-3:]))) -image.save("yoda-pokemon.png") +image.save("yoda-naruto.png") ``` \ No newline at end of file diff --git a/docs/source/ko/training/unconditional_training.md b/docs/source/ko/training/unconditional_training.md index d0c200ef2daa..de9ae39a7a76 100644 --- a/docs/source/ko/training/unconditional_training.md +++ b/docs/source/ko/training/unconditional_training.md @@ -103,13 +103,13 @@ accelerate launch train_unconditional.py \
-[Pokemon](https://huggingface.co/datasets/huggan/pokemon) 데이터셋을 사용할 경우: +[Naruto](https://huggingface.co/datasets/lambdalabs/naruto-blip-captions) 데이터셋을 사용할 경우: ```bash accelerate launch train_unconditional.py \ - --dataset_name="huggan/pokemon" \ + --dataset_name="lambdalabs/naruto-blip-captions" \ --resolution=64 \ - --output_dir="ddpm-ema-pokemon-64" \ + --output_dir="ddpm-ema-naruto-64" \ --train_batch_size=16 \ --num_epochs=100 \ --gradient_accumulation_steps=1 \ @@ -129,9 +129,9 @@ accelerate launch train_unconditional.py \ ```bash accelerate launch --mixed_precision="fp16" --multi_gpu train_unconditional.py \ - --dataset_name="huggan/pokemon" \ + --dataset_name="lambdalabs/naruto-blip-captions" \ --resolution=64 --center_crop --random_flip \ - --output_dir="ddpm-ema-pokemon-64" \ + --output_dir="ddpm-ema-naruto-64" \ --train_batch_size=16 \ --num_epochs=100 \ --gradient_accumulation_steps=1 \ diff --git a/examples/consistency_distillation/README_sdxl.md b/examples/consistency_distillation/README_sdxl.md index d3abaa4ce175..6bd84727cf31 100644 --- a/examples/consistency_distillation/README_sdxl.md +++ b/examples/consistency_distillation/README_sdxl.md @@ -115,11 +115,11 @@ accelerate launch train_lcm_distill_lora_sdxl_wds.py \ We provide another version for LCM LoRA SDXL that follows best practices of `peft` and leverages the `datasets` library for quick experimentation. The script doesn't load two UNets unlike `train_lcm_distill_lora_sdxl_wds.py` which reduces the memory requirements quite a bit. -Below is an example training command that trains an LCM LoRA on the [Pokemons dataset](https://huggingface.co/datasets/lambdalabs/pokemon-blip-captions): +Below is an example training command that trains an LCM LoRA on the [Naruto dataset](https://huggingface.co/datasets/lambdalabs/naruto-blip-captions): ```bash export MODEL_NAME="stabilityai/stable-diffusion-xl-base-1.0" -export DATASET_NAME="lambdalabs/pokemon-blip-captions" +export DATASET_NAME="lambdalabs/naruto-blip-captions" export VAE_PATH="madebyollin/sdxl-vae-fp16-fix" accelerate launch train_lcm_distill_lora_sdxl.py \ diff --git a/examples/consistency_distillation/train_lcm_distill_lora_sdxl.py b/examples/consistency_distillation/train_lcm_distill_lora_sdxl.py index 9405c238f937..56f83f47b84c 100644 --- a/examples/consistency_distillation/train_lcm_distill_lora_sdxl.py +++ b/examples/consistency_distillation/train_lcm_distill_lora_sdxl.py @@ -71,7 +71,7 @@ logger = get_logger(__name__) DATASET_NAME_MAPPING = { - "lambdalabs/pokemon-blip-captions": ("image", "text"), + "lambdalabs/naruto-blip-captions": ("image", "text"), } diff --git a/examples/kandinsky2_2/text_to_image/README.md b/examples/kandinsky2_2/text_to_image/README.md index 6e5a1835593f..d27ba1a21f0c 100644 --- a/examples/kandinsky2_2/text_to_image/README.md +++ b/examples/kandinsky2_2/text_to_image/README.md @@ -57,7 +57,7 @@ To disable wandb logging, remove the `--report_to=="wandb"` and `--validation_pr ```bash -export DATASET_NAME="lambdalabs/pokemon-blip-captions" +export DATASET_NAME="lambdalabs/naruto-blip-captions" accelerate launch --mixed_precision="fp16" train_text_to_image_decoder.py \ --dataset_name=$DATASET_NAME \ @@ -139,7 +139,7 @@ You can fine-tune the Kandinsky prior model with `train_text_to_image_prior.py` ```bash -export DATASET_NAME="lambdalabs/pokemon-blip-captions" +export DATASET_NAME="lambdalabs/naruto-blip-captions" accelerate launch --mixed_precision="fp16" train_text_to_image_prior.py \
--dataset_name=$DATASET_NAME \ @@ -183,7 +183,7 @@ If you want to use a fine-tuned decoder checkpoint along with your fine-tuned pr for running distributed training with `accelerate`. Here is an example command: ```bash -export DATASET_NAME="lambdalabs/pokemon-blip-captions" +export DATASET_NAME="lambdalabs/naruto-blip-captions" accelerate launch --mixed_precision="fp16" --multi_gpu train_text_to_image_decoder.py \ --dataset_name=$DATASET_NAME \ @@ -227,13 +227,13 @@ on consumer GPUs like Tesla T4, Tesla V100. ### Training -First, you need to set up your development environment as explained in the [installation](#installing-the-dependencies). Make sure to set the `MODEL_NAME` and `DATASET_NAME` environment variables. Here, we will use [Kandinsky 2.2](https://huggingface.co/kandinsky-community/kandinsky-2-2-decoder) and the [Pokemons dataset](https://huggingface.co/datasets/lambdalabs/pokemon-blip-captions). +First, you need to set up your development environment as explained in the [installation](#installing-the-dependencies). Make sure to set the `MODEL_NAME` and `DATASET_NAME` environment variables. Here, we will use [Kandinsky 2.2](https://huggingface.co/kandinsky-community/kandinsky-2-2-decoder) and the [Naruto dataset](https://huggingface.co/datasets/lambdalabs/naruto-blip-captions). #### Train decoder ```bash -export DATASET_NAME="lambdalabs/pokemon-blip-captions" +export DATASET_NAME="lambdalabs/naruto-blip-captions" accelerate launch --mixed_precision="fp16" train_text_to_image_decoder_lora.py \ --dataset_name=$DATASET_NAME --caption_column="text" \ @@ -252,7 +252,7 @@ accelerate launch --mixed_precision="fp16" train_text_to_image_decoder_lora.py \ #### Train prior ```bash -export DATASET_NAME="lambdalabs/pokemon-blip-captions" +export DATASET_NAME="lambdalabs/naruto-blip-captions" accelerate launch --mixed_precision="fp16" train_text_to_image_prior_lora.py \ --dataset_name=$DATASET_NAME --caption_column="text" \ diff --git a/examples/kandinsky2_2/text_to_image/train_text_to_image_lora_prior.py b/examples/kandinsky2_2/text_to_image/train_text_to_image_lora_prior.py index e169cf92beb9..f6f3896aaa12 100644 --- a/examples/kandinsky2_2/text_to_image/train_text_to_image_lora_prior.py +++ b/examples/kandinsky2_2/text_to_image/train_text_to_image_lora_prior.py @@ -332,7 +332,7 @@ def parse_args(): DATASET_NAME_MAPPING = { - "lambdalabs/pokemon-blip-captions": ("image", "text"), + "lambdalabs/naruto-blip-captions": ("image", "text"), } diff --git a/examples/kandinsky2_2/text_to_image/train_text_to_image_prior.py b/examples/kandinsky2_2/text_to_image/train_text_to_image_prior.py index bd95aed2939c..54a4d0a397b4 100644 --- a/examples/kandinsky2_2/text_to_image/train_text_to_image_prior.py +++ b/examples/kandinsky2_2/text_to_image/train_text_to_image_prior.py @@ -56,7 +56,7 @@ logger = get_logger(__name__, log_level="INFO") DATASET_NAME_MAPPING = { - "lambdalabs/pokemon-blip-captions": ("image", "text"), + "lambdalabs/naruto-blip-captions": ("image", "text"), } diff --git a/examples/research_projects/lora/README.md b/examples/research_projects/lora/README.md index b5d72403166f..14cd6cd9be56 100644 --- a/examples/research_projects/lora/README.md +++ b/examples/research_projects/lora/README.md @@ -19,7 +19,7 @@ on consumer GPUs like Tesla T4, Tesla V100. ### Training -First, you need to set up your development environment as is explained in the [installation section](#installing-the-dependencies). Make sure to set the `MODEL_NAME` and `DATASET_NAME` environment variables.
Here, we will use [Stable Diffusion v1-4](https://hf.co/CompVis/stable-diffusion-v1-4) and the [Pokemons dataset](https://huggingface.co/datasets/lambdalabs/pokemon-blip-captions). +First, you need to set up your development environment as is explained in the [installation section](#installing-the-dependencies). Make sure to set the `MODEL_NAME` and `DATASET_NAME` environment variables. Here, we will use [Stable Diffusion v1-4](https://hf.co/CompVis/stable-diffusion-v1-4) and the [Naruto dataset](https://huggingface.co/datasets/lambdalabs/naruto-blip-captions). **___Note: Change the `resolution` to 768 if you are using the [stable-diffusion-2](https://huggingface.co/stabilityai/stable-diffusion-2) 768x768 model.___** @@ -27,7 +27,7 @@ First, you need to set up your development environment as is explained in the [i ```bash export MODEL_NAME="CompVis/stable-diffusion-v1-4" -export DATASET_NAME="lambdalabs/pokemon-blip-captions" +export DATASET_NAME="lambdalabs/naruto-blip-captions" ``` For this example we want to directly store the trained LoRA embeddings on the Hub, so diff --git a/examples/research_projects/lora/train_text_to_image_lora.py b/examples/research_projects/lora/train_text_to_image_lora.py index cf00bf270057..1ebc1422b064 100644 --- a/examples/research_projects/lora/train_text_to_image_lora.py +++ b/examples/research_projects/lora/train_text_to_image_lora.py @@ -387,7 +387,7 @@ def parse_args(): DATASET_NAME_MAPPING = { - "lambdalabs/pokemon-blip-captions": ("image", "text"), + "lambdalabs/naruto-blip-captions": ("image", "text"), } diff --git a/examples/research_projects/onnxruntime/text_to_image/README.md b/examples/research_projects/onnxruntime/text_to_image/README.md index 48bce2065444..8b499795746c 100644 --- a/examples/research_projects/onnxruntime/text_to_image/README.md +++ b/examples/research_projects/onnxruntime/text_to_image/README.md @@ -55,7 +55,7 @@ The command to train a DDPM UNetCondition model on the Pokemon dataset with onnx ```bash export MODEL_NAME="CompVis/stable-diffusion-v1-4" -export dataset_name="lambdalabs/pokemon-blip-captions" +export dataset_name="lambdalabs/naruto-blip-captions" accelerate launch --mixed_precision="fp16" train_text_to_image.py \ --pretrained_model_name_or_path=$MODEL_NAME \ --dataset_name=$dataset_name \ diff --git a/examples/research_projects/onnxruntime/text_to_image/train_text_to_image.py b/examples/research_projects/onnxruntime/text_to_image/train_text_to_image.py index ee61f033d34d..126a10b4f9e9 100644 --- a/examples/research_projects/onnxruntime/text_to_image/train_text_to_image.py +++ b/examples/research_projects/onnxruntime/text_to_image/train_text_to_image.py @@ -59,7 +59,7 @@ logger = get_logger(__name__, log_level="INFO") DATASET_NAME_MAPPING = { - "lambdalabs/pokemon-blip-captions": ("image", "text"), + "lambdalabs/naruto-blip-captions": ("image", "text"), } diff --git a/examples/research_projects/scheduled_huber_loss_training/text_to_image/train_text_to_image.py b/examples/research_projects/scheduled_huber_loss_training/text_to_image/train_text_to_image.py index 0f4cc6c50b5e..d3bf95305dad 100644 --- a/examples/research_projects/scheduled_huber_loss_training/text_to_image/train_text_to_image.py +++ b/examples/research_projects/scheduled_huber_loss_training/text_to_image/train_text_to_image.py @@ -61,7 +61,7 @@ logger = get_logger(__name__, log_level="INFO") DATASET_NAME_MAPPING = { - "lambdalabs/pokemon-blip-captions": ("image", "text"), + "lambdalabs/naruto-blip-captions": ("image", "text"), } diff --git
a/examples/research_projects/scheduled_huber_loss_training/text_to_image/train_text_to_image_lora.py b/examples/research_projects/scheduled_huber_loss_training/text_to_image/train_text_to_image_lora.py index f22519b02e2b..a4b4d69bb892 100644 --- a/examples/research_projects/scheduled_huber_loss_training/text_to_image/train_text_to_image_lora.py +++ b/examples/research_projects/scheduled_huber_loss_training/text_to_image/train_text_to_image_lora.py @@ -406,7 +406,7 @@ def parse_args(): DATASET_NAME_MAPPING = { - "lambdalabs/pokemon-blip-captions": ("image", "text"), + "lambdalabs/naruto-blip-captions": ("image", "text"), } diff --git a/examples/research_projects/scheduled_huber_loss_training/text_to_image/train_text_to_image_lora_sdxl.py b/examples/research_projects/scheduled_huber_loss_training/text_to_image/train_text_to_image_lora_sdxl.py index e5ff9d39e8ba..d7f2dcaa3442 100644 --- a/examples/research_projects/scheduled_huber_loss_training/text_to_image/train_text_to_image_lora_sdxl.py +++ b/examples/research_projects/scheduled_huber_loss_training/text_to_image/train_text_to_image_lora_sdxl.py @@ -468,7 +468,7 @@ def parse_args(input_args=None): DATASET_NAME_MAPPING = { - "lambdalabs/pokemon-blip-captions": ("image", "text"), + "lambdalabs/naruto-blip-captions": ("image", "text"), } diff --git a/examples/research_projects/scheduled_huber_loss_training/text_to_image/train_text_to_image_sdxl.py b/examples/research_projects/scheduled_huber_loss_training/text_to_image/train_text_to_image_sdxl.py index 1dac573fce4c..a056bcfc8cb1 100644 --- a/examples/research_projects/scheduled_huber_loss_training/text_to_image/train_text_to_image_sdxl.py +++ b/examples/research_projects/scheduled_huber_loss_training/text_to_image/train_text_to_image_sdxl.py @@ -60,7 +60,7 @@ DATASET_NAME_MAPPING = { - "lambdalabs/pokemon-blip-captions": ("image", "text"), + "lambdalabs/naruto-blip-captions": ("image", "text"), } diff --git a/examples/text_to_image/README.md b/examples/text_to_image/README.md index fd6e50bc3710..9a8410604878 100644 --- a/examples/text_to_image/README.md +++ b/examples/text_to_image/README.md @@ -57,7 +57,7 @@ With `gradient_checkpointing` and `mixed_precision` it should be possible to fin ```bash export MODEL_NAME="CompVis/stable-diffusion-v1-4" -export DATASET_NAME="lambdalabs/pokemon-blip-captions" +export DATASET_NAME="lambdalabs/naruto-blip-captions" accelerate launch --mixed_precision="fp16" train_text_to_image.py \ --pretrained_model_name_or_path=$MODEL_NAME \ @@ -136,7 +136,7 @@ for running distributed training with `accelerate`. Here is an example command: ```bash export MODEL_NAME="CompVis/stable-diffusion-v1-4" -export DATASET_NAME="lambdalabs/pokemon-blip-captions" +export DATASET_NAME="lambdalabs/naruto-blip-captions" accelerate launch --mixed_precision="fp16" --multi_gpu train_text_to_image.py \ --pretrained_model_name_or_path=$MODEL_NAME \ @@ -192,7 +192,7 @@ on consumer GPUs like Tesla T4, Tesla V100. ### Training -First, you need to set up your development environment as is explained in the [installation section](#installing-the-dependencies). Make sure to set the `MODEL_NAME` and `DATASET_NAME` environment variables. Here, we will use [Stable Diffusion v1-4](https://hf.co/CompVis/stable-diffusion-v1-4) and the [Pokemons dataset](https://huggingface.co/datasets/lambdalabs/pokemon-blip-captions). +First, you need to set up your development environment as is explained in the [installation section](#installing-the-dependencies). 
Make sure to set the `MODEL_NAME` and `DATASET_NAME` environment variables. Here, we will use [Stable Diffusion v1-4](https://hf.co/CompVis/stable-diffusion-v1-4) and the [Naruto dataset](https://huggingface.co/datasets/lambdalabs/naruto-blip-captions). **___Note: Change the `resolution` to 768 if you are using the [stable-diffusion-2](https://huggingface.co/stabilityai/stable-diffusion-2) 768x768 model.___** @@ -200,7 +200,7 @@ First, you need to set up your development environment as is explained in the [i ```bash export MODEL_NAME="CompVis/stable-diffusion-v1-4" -export DATASET_NAME="lambdalabs/pokemon-blip-captions" +export DATASET_NAME="lambdalabs/naruto-blip-captions" ``` For this example we want to directly store the trained LoRA embeddings on the Hub, so @@ -282,7 +282,7 @@ pip install -U -r requirements_flax.txt ```bash export MODEL_NAME="duongna/stable-diffusion-v1-4-flax" -export DATASET_NAME="lambdalabs/pokemon-blip-captions" +export DATASET_NAME="lambdalabs/naruto-blip-captions" python train_text_to_image_flax.py \ --pretrained_model_name_or_path=$MODEL_NAME \ diff --git a/examples/text_to_image/README_sdxl.md b/examples/text_to_image/README_sdxl.md index 349feef5008e..35ea0091c4f3 100644 --- a/examples/text_to_image/README_sdxl.md +++ b/examples/text_to_image/README_sdxl.md @@ -52,7 +52,7 @@ Note also that we use PEFT library as backend for LoRA training, make sure to ha ```bash export MODEL_NAME="stabilityai/stable-diffusion-xl-base-1.0" export VAE_NAME="madebyollin/sdxl-vae-fp16-fix" -export DATASET_NAME="lambdalabs/pokemon-blip-captions" +export DATASET_NAME="lambdalabs/naruto-blip-captions" accelerate launch train_text_to_image_sdxl.py \ --pretrained_model_name_or_path=$MODEL_NAME \ @@ -76,7 +76,7 @@ accelerate launch train_text_to_image_sdxl.py \ **Notes**: -* The `train_text_to_image_sdxl.py` script pre-computes text embeddings and the VAE encodings and keeps them in memory. While for smaller datasets like [`lambdalabs/pokemon-blip-captions`](https://hf.co/datasets/lambdalabs/pokemon-blip-captions), it might not be a problem, it can definitely lead to memory problems when the script is used on a larger dataset. For those purposes, you would want to serialize these pre-computed representations to disk separately and load them during the fine-tuning process. Refer to [this PR](https://github.com/huggingface/diffusers/pull/4505) for a more in-depth discussion. +* The `train_text_to_image_sdxl.py` script pre-computes text embeddings and the VAE encodings and keeps them in memory. While for smaller datasets like [`lambdalabs/naruto-blip-captions`](https://hf.co/datasets/lambdalabs/naruto-blip-captions), it might not be a problem, it can definitely lead to memory problems when the script is used on a larger dataset. For those purposes, you would want to serialize these pre-computed representations to disk separately and load them during the fine-tuning process. Refer to [this PR](https://github.com/huggingface/diffusers/pull/4505) for a more in-depth discussion. * The training script is compute-intensive and may not run on a consumer GPU like Tesla T4. * The training command shown above performs intermediate quality validation in between the training epochs and logs the results to Weights and Biases. `--report_to`, `--validation_prompt`, and `--validation_epochs` are the relevant CLI arguments here. * SDXL's VAE is known to suffer from numerical instability issues.
This is why we also expose a CLI argument namely `--pretrained_vae_model_name_or_path` that lets you specify the location of a better VAE (such as [this one](https://huggingface.co/madebyollin/sdxl-vae-fp16-fix)). @@ -142,14 +142,14 @@ on consumer GPUs like Tesla T4, Tesla V100. ### Training -First, you need to set up your development environment as is explained in the [installation section](#installing-the-dependencies). Make sure to set the `MODEL_NAME` and `DATASET_NAME` environment variables and, optionally, the `VAE_NAME` variable. Here, we will use [Stable Diffusion XL 1.0-base](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) and the [Pokemons dataset](https://huggingface.co/datasets/lambdalabs/pokemon-blip-captions). +First, you need to set up your development environment as is explained in the [installation section](#installing-the-dependencies). Make sure to set the `MODEL_NAME` and `DATASET_NAME` environment variables and, optionally, the `VAE_NAME` variable. Here, we will use [Stable Diffusion XL 1.0-base](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) and the [Naruto dataset](https://huggingface.co/datasets/lambdalabs/naruto-blip-captions). **___Note: It is quite useful to monitor the training progress by regularly generating sample images during training. [Weights and Biases](https://docs.wandb.ai/quickstart) is a nice solution to easily see generating images during training. All you need to do is to run `pip install wandb` before training to automatically log images.___** ```bash export MODEL_NAME="stabilityai/stable-diffusion-xl-base-1.0" export VAE_NAME="madebyollin/sdxl-vae-fp16-fix" -export DATASET_NAME="lambdalabs/pokemon-blip-captions" +export DATASET_NAME="lambdalabs/naruto-blip-captions" ``` For this example we want to directly store the trained LoRA embeddings on the Hub, so @@ -219,7 +219,7 @@ You need to save the mentioned configuration as an `accelerate_config.yaml` file ```shell export MODEL_NAME="stabilityai/stable-diffusion-xl-base-1.0" export VAE_NAME="madebyollin/sdxl-vae-fp16-fix" -export DATASET_NAME="lambdalabs/pokemon-blip-captions" +export DATASET_NAME="lambdalabs/naruto-blip-captions" export ACCELERATE_CONFIG_FILE="your accelerate_config.yaml" accelerate launch --config_file $ACCELERATE_CONFIG_FILE train_text_to_image_lora_sdxl.py \ diff --git a/examples/text_to_image/train_text_to_image.py b/examples/text_to_image/train_text_to_image.py index aa704ba8ca38..13ee0f2cc4c7 100644 --- a/examples/text_to_image/train_text_to_image.py +++ b/examples/text_to_image/train_text_to_image.py @@ -62,7 +62,7 @@ logger = get_logger(__name__, log_level="INFO") DATASET_NAME_MAPPING = { - "lambdalabs/pokemon-blip-captions": ("image", "text"), + "lambdalabs/naruto-blip-captions": ("image", "text"), } diff --git a/examples/text_to_image/train_text_to_image_flax.py b/examples/text_to_image/train_text_to_image_flax.py index 557923c52e00..c3a08a90b4e5 100644 --- a/examples/text_to_image/train_text_to_image_flax.py +++ b/examples/text_to_image/train_text_to_image_flax.py @@ -250,7 +250,7 @@ def parse_args(): dataset_name_mapping = { - "lambdalabs/pokemon-blip-captions": ("image", "text"), + "lambdalabs/naruto-blip-captions": ("image", "text"), } diff --git a/examples/text_to_image/train_text_to_image_lora.py b/examples/text_to_image/train_text_to_image_lora.py index 7164ac909cb2..37b10cfd1bad 100644 --- a/examples/text_to_image/train_text_to_image_lora.py +++ b/examples/text_to_image/train_text_to_image_lora.py @@ -387,7 +387,7 @@ def
parse_args(): DATASET_NAME_MAPPING = { - "lambdalabs/pokemon-blip-captions": ("image", "text"), + "lambdalabs/naruto-blip-captions": ("image", "text"), } diff --git a/examples/text_to_image/train_text_to_image_lora_sdxl.py b/examples/text_to_image/train_text_to_image_lora_sdxl.py index 3604e755c62a..c9883252d14b 100644 --- a/examples/text_to_image/train_text_to_image_lora_sdxl.py +++ b/examples/text_to_image/train_text_to_image_lora_sdxl.py @@ -454,7 +454,7 @@ def parse_args(input_args=None): DATASET_NAME_MAPPING = { - "lambdalabs/pokemon-blip-captions": ("image", "text"), + "lambdalabs/naruto-blip-captions": ("image", "text"), } diff --git a/examples/text_to_image/train_text_to_image_sdxl.py b/examples/text_to_image/train_text_to_image_sdxl.py index 88adbb995531..90602ad597a9 100644 --- a/examples/text_to_image/train_text_to_image_sdxl.py +++ b/examples/text_to_image/train_text_to_image_sdxl.py @@ -61,7 +61,7 @@ DATASET_NAME_MAPPING = { - "lambdalabs/pokemon-blip-captions": ("image", "text"), + "lambdalabs/naruto-blip-captions": ("image", "text"), } diff --git a/examples/wuerstchen/text_to_image/README.md b/examples/wuerstchen/text_to_image/README.md index d655259088e4..7583296e66d1 100644 --- a/examples/wuerstchen/text_to_image/README.md +++ b/examples/wuerstchen/text_to_image/README.md @@ -37,7 +37,7 @@ You can fine-tune the Würstchen prior model with the `train_text_to_image_prior ```bash -export DATASET_NAME="lambdalabs/pokemon-blip-captions" +export DATASET_NAME="lambdalabs/naruto-blip-captions" accelerate launch train_text_to_image_prior.py \ --mixed_precision="fp16" \ @@ -72,10 +72,10 @@ In a nutshell, LoRA allows adapting pretrained models by adding pairs of rank-de ### Prior Training -First, you need to set up your development environment as explained in the [installation](#Running-locally-with-PyTorch) section. Make sure to set the `DATASET_NAME` environment variable. Here, we will use the [Pokemon captions dataset](https://huggingface.co/datasets/lambdalabs/pokemon-blip-captions). +First, you need to set up your development environment as explained in the [installation](#Running-locally-with-PyTorch) section. Make sure to set the `DATASET_NAME` environment variable. Here, we will use the [Naruto captions dataset](https://huggingface.co/datasets/lambdalabs/naruto-blip-captions).
```bash -export DATASET_NAME="lambdalabs/pokemon-blip-captions" +export DATASET_NAME="lambdalabs/naruto-blip-captions" accelerate launch train_text_to_image_lora_prior.py \ --mixed_precision="fp16" \ diff --git a/examples/wuerstchen/text_to_image/train_text_to_image_lora_prior.py b/examples/wuerstchen/text_to_image/train_text_to_image_lora_prior.py index 76eaf6423960..79f7d8576ff4 100644 --- a/examples/wuerstchen/text_to_image/train_text_to_image_lora_prior.py +++ b/examples/wuerstchen/text_to_image/train_text_to_image_lora_prior.py @@ -55,7 +55,7 @@ logger = get_logger(__name__, log_level="INFO") DATASET_NAME_MAPPING = { - "lambdalabs/pokemon-blip-captions": ("image", "text"), + "lambdalabs/naruto-blip-captions": ("image", "text"), } diff --git a/examples/wuerstchen/text_to_image/train_text_to_image_prior.py b/examples/wuerstchen/text_to_image/train_text_to_image_prior.py index 49cc5d26072d..3e0acfdaf519 100644 --- a/examples/wuerstchen/text_to_image/train_text_to_image_prior.py +++ b/examples/wuerstchen/text_to_image/train_text_to_image_prior.py @@ -56,7 +56,7 @@ logger = get_logger(__name__, log_level="INFO") DATASET_NAME_MAPPING = { - "lambdalabs/pokemon-blip-captions": ("image", "text"), + "lambdalabs/naruto-blip-captions": ("image", "text"), }