diff --git a/docs/source/en/_toctree.yml b/docs/source/en/_toctree.yml
index fc101347a6e9..f205046ffc90 100644
--- a/docs/source/en/_toctree.yml
+++ b/docs/source/en/_toctree.yml
@@ -60,6 +60,8 @@
 - sections:
   - local: training/overview
     title: Overview
+  - local: training/create_dataset
+    title: Create a dataset for training
   - local: training/unconditional_training
     title: Unconditional image generation
   - local: training/text_inversion
diff --git a/docs/source/en/training/controlnet.mdx b/docs/source/en/training/controlnet.mdx
index 1c91298477c7..476081c88704 100644
--- a/docs/source/en/training/controlnet.mdx
+++ b/docs/source/en/training/controlnet.mdx
@@ -69,6 +69,8 @@ The original dataset is hosted in the ControlNet [repo](https://huggingface.co/l

 Our training examples use [`runwayml/stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5) because that is what the original set of ControlNet models was trained on. However, ControlNet can be trained to augment any compatible Stable Diffusion model (such as [`CompVis/stable-diffusion-v1-4`](https://huggingface.co/CompVis/stable-diffusion-v1-4)) or [`stabilityai/stable-diffusion-2-1`](https://huggingface.co/stabilityai/stable-diffusion-2-1).

+To use your own dataset, take a look at the [Create a dataset for training](create_dataset) guide.
+
 ## Training

 Download the following images to condition our training with:
@@ -79,7 +81,9 @@ wget https://huggingface.co/datasets/huggingface/documentation-images/resolve/ma
 wget https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/conditioning_image_2.png
 ```

-Specify the `MODEL_NAME` environment variable (either a Hub model repository id or a path to the directory containing the model weights) and pass it to the [`~diffusers.DiffusionPipeline.from_pretrained.pretrained_model_name_or_path`] argument.
+Specify the `MODEL_NAME` environment variable (either a Hub model repository id or a path to the directory containing the model weights) and pass it to the [`pretrained_model_name_or_path`](https://huggingface.co/docs/diffusers/en/api/diffusion_pipeline#diffusers.DiffusionPipeline.from_pretrained.pretrained_model_name_or_path) argument.
+
+The training script creates and saves a `diffusion_pytorch_model.bin` file in your repository.

 ```bash
 export MODEL_DIR="runwayml/stable-diffusion-v1-5"
diff --git a/docs/source/en/training/create_dataset.mdx b/docs/source/en/training/create_dataset.mdx
new file mode 100644
index 000000000000..9c4f4de53904
--- /dev/null
+++ b/docs/source/en/training/create_dataset.mdx
@@ -0,0 +1,90 @@
+# Create a dataset for training
+
+There are many datasets on the [Hub](https://huggingface.co/datasets?task_categories=task_categories:text-to-image&sort=downloads) to train a model on, but if you can't find one you're interested in or want to use your own, you can create a dataset with the 🤗 [Datasets](https://huggingface.co/docs/datasets) library. The dataset structure depends on the task you want to train your model on. The most basic dataset structure is a directory of images for tasks like unconditional image generation. Another common structure is a directory of images plus a text file containing their corresponding captions for tasks like text-to-image generation.
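+
+For example, a text-to-image dataset stored as a folder typically pairs the images with a `metadata.jsonl` file that lists each image's `file_name` and its caption in a `text` column (a minimal sketch following the 🤗 Datasets `ImageFolder` metadata convention; the file names below are made up):
+
+```bash
+data_dir/metadata.jsonl
+data_dir/0001.png
+data_dir/0002.png
+```
+
+Each line of `metadata.jsonl` is a JSON object such as `{"file_name": "0001.png", "text": "a caption describing this image"}`.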
+
+This guide will show you two ways to create a dataset to finetune on:
+
+- provide a folder of images to the `--train_data_dir` argument
+- upload a dataset to the Hub and pass the dataset repository id to the `--dataset_name` argument
+
+
+💡 Learn more about how to create an image dataset for training in the [Create an image dataset](https://huggingface.co/docs/datasets/image_dataset) guide.
+
+
+## Provide a dataset as a folder
+
+For unconditional generation, you can provide your own dataset as a folder of images. The training script uses the [`ImageFolder`](https://huggingface.co/docs/datasets/en/image_dataset#imagefolder) builder from 🤗 Datasets to automatically build a dataset from the folder. Your directory structure should look like:
+
+```bash
+data_dir/xxx.png
+data_dir/xxy.png
+data_dir/[...]/xxz.png
+```
+
+Pass the path to the dataset directory to the `--train_data_dir` argument, and then you can start training:
+
+```bash
+accelerate launch train_unconditional.py \
+  --train_data_dir <path-to-train-directory> \
+  <other-arguments>
+```
+
+## Upload your data to the Hub
+
+
+
+💡 For more details and context about creating and uploading a dataset to the Hub, take a look at the [Image search with 🤗 Datasets](https://huggingface.co/blog/image-search-datasets) post.
+
+
+
+Start by creating a dataset with the [`ImageFolder`](https://huggingface.co/docs/datasets/image_load#imagefolder) feature, which creates an `image` column containing the PIL-encoded images.
+
+You can use the `data_dir` or `data_files` parameters to specify the location of the dataset. The `data_files` parameter supports mapping specific files to dataset splits like `train` or `test`:
+
+```python
+from datasets import load_dataset
+
+# example 1: local folder
+dataset = load_dataset("imagefolder", data_dir="path_to_your_folder")
+
+# example 2: local files (supported formats are tar, gzip, zip, xz, rar, zstd)
+dataset = load_dataset("imagefolder", data_files="path_to_zip_file")
+
+# example 3: remote files (supported formats are tar, gzip, zip, xz, rar, zstd)
+dataset = load_dataset(
+    "imagefolder",
+    data_files="https://download.microsoft.com/download/3/E/1/3E1C3F21-ECDB-4869-8368-6DEBA77B919F/kagglecatsanddogs_3367a.zip",
+)
+
+# example 4: providing several splits
+dataset = load_dataset(
+    "imagefolder", data_files={"train": ["path/to/file1", "path/to/file2"], "test": ["path/to/file3", "path/to/file4"]}
+)
+```
+
+Then use the [`~datasets.Dataset.push_to_hub`] method to upload the dataset to the Hub:
+
+```python
+# assuming you have run the huggingface-cli login command in a terminal
+dataset.push_to_hub("name_of_your_dataset")
+
+# if you want to push to a private repo, simply pass private=True:
+dataset.push_to_hub("name_of_your_dataset", private=True)
+```
+
+Now the dataset is available for training by passing the dataset name to the `--dataset_name` argument:
+
+```bash
+accelerate launch --mixed_precision="fp16" train_text_to_image.py \
+  --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
+  --dataset_name="name_of_your_dataset" \
+  <other-arguments>
+```
+
+## Next steps
+
+Now that you've created a dataset, you can plug it into the `train_data_dir` (if your dataset is local) or `dataset_name` (if your dataset is on the Hub) arguments of a training script.
+
+For your next steps, feel free to use your dataset to train a model for [unconditional generation](unconditional_training) or [text-to-image generation](text2image)!
\ No newline at end of file
diff --git a/docs/source/en/training/custom_diffusion.mdx b/docs/source/en/training/custom_diffusion.mdx
index ee8fb19bd18c..dda9c17c7ebc 100644
--- a/docs/source/en/training/custom_diffusion.mdx
+++ b/docs/source/en/training/custom_diffusion.mdx
@@ -67,7 +67,7 @@ write_basic_config()
 ```

 ### Cat example 😺

-Now let's get our dataset. Download dataset from [here](https://www.cs.cmu.edu/~custom-diffusion/assets/data.zip) and unzip it.
+Now let's get our dataset. Download the dataset from [here](https://www.cs.cmu.edu/~custom-diffusion/assets/data.zip) and unzip it. To use your own dataset, take a look at the [Create a dataset for training](create_dataset) guide.

 We also collect 200 real images using `clip-retrieval` which are combined with the target images in the training dataset as a regularization. This prevents overfitting to the the given target image. The following flags enable the regularization `with_prior_preservation`, `real_prior` with `prior_loss_weight=1.`. The `class_prompt` should be the category name same as target image. The collected real images are with text captions similar to the `class_prompt`. The retrieved image are saved in `class_data_dir`. You can disable `real_prior` to use generated images as regularization. To collect the real images use this command first before training.
@@ -79,6 +79,8 @@ python retrieve.py --class_prompt cat --class_data_dir real_reg/samples_cat --nu

 **___Note: Change the `resolution` to 768 if you are using the [stable-diffusion-2](https://huggingface.co/stabilityai/stable-diffusion-2) 768x768 model.___**

+The script creates and saves model checkpoints and a `pytorch_custom_diffusion_weights.bin` file in your repository.
+
 ```bash
 export MODEL_NAME="CompVis/stable-diffusion-v1-4"
 export OUTPUT_DIR="path-to-save-model"
diff --git a/docs/source/en/training/dreambooth.mdx b/docs/source/en/training/dreambooth.mdx
index 09b877c7d0cc..38a3adf9c4f1 100644
--- a/docs/source/en/training/dreambooth.mdx
+++ b/docs/source/en/training/dreambooth.mdx
@@ -64,6 +64,8 @@ snapshot_download(
 )
 ```

+To use your own dataset, take a look at the [Create a dataset for training](create_dataset) guide.
+
 ## Finetuning

@@ -76,7 +78,7 @@ DreamBooth finetuning is very sensitive to hyperparameters and easy to overfit.

 Set the `INSTANCE_DIR` environment variable to the path of the directory containing the dog images.

-Specify the `MODEL_NAME` environment variable (either a Hub model repository id or a path to the directory containing the model weights) and pass it to the [`~diffusers.DiffusionPipeline.from_pretrained.pretrained_model_name_or_path`] argument.
+Specify the `MODEL_NAME` environment variable (either a Hub model repository id or a path to the directory containing the model weights) and pass it to the [`pretrained_model_name_or_path`](https://huggingface.co/docs/diffusers/en/api/diffusion_pipeline#diffusers.DiffusionPipeline.from_pretrained.pretrained_model_name_or_path) argument. The `instance_prompt` argument is a text prompt that contains a unique identifier, such as `sks`, and the class the image belongs to, which in this example is `a photo of a sks dog`.

 ```bash
 export MODEL_NAME="CompVis/stable-diffusion-v1-4"
@@ -111,7 +113,7 @@ Before running the script, make sure you have the requirements installed:
 pip install -U -r requirements.txt
 ```

-Specify the `MODEL_NAME` environment variable (either a Hub model repository id or a path to the directory containing the model weights) and pass it to the [`~diffusers.DiffusionPipeline.from_pretrained.pretrained_model_name_or_path`] argument.
+Specify the `MODEL_NAME` environment variable (either a Hub model repository id or a path to the directory containing the model weights) and pass it to the [`pretrained_model_name_or_path`](https://huggingface.co/docs/diffusers/en/api/diffusion_pipeline#diffusers.DiffusionPipeline.from_pretrained.pretrained_model_name_or_path) argument. The `instance_prompt` argument is a text prompt that contains a unique identifier, such as `sks`, and the class the image belongs to, which in this example is `a photo of a sks dog`.

 Now you can launch the training script with the following command:

diff --git a/docs/source/en/training/instructpix2pix.mdx b/docs/source/en/training/instructpix2pix.mdx
index 6b6d4d908673..2a9e99cda1f2 100644
--- a/docs/source/en/training/instructpix2pix.mdx
+++ b/docs/source/en/training/instructpix2pix.mdx
@@ -77,16 +77,16 @@ write_basic_config()
 ### Toy example

 As mentioned before, we'll use a [small toy dataset](https://huggingface.co/datasets/fusing/instructpix2pix-1000-samples) for training. The dataset
-is a smaller version of the [original dataset](https://huggingface.co/datasets/timbrooks/instructpix2pix-clip-filtered) used in the InstructPix2Pix paper.
+is a smaller version of the [original dataset](https://huggingface.co/datasets/timbrooks/instructpix2pix-clip-filtered) used in the InstructPix2Pix paper. To use your own dataset, take a look at the [Create a dataset for training](create_dataset) guide.

-Specify the `MODEL_NAME` environment variable (either a Hub model repository id or a path to the directory containing the model weights) and pass it to the [`~diffusers.DiffusionPipeline.from_pretrained.pretrained_model_name_or_path`] argument. You'll also need to specify the dataset name in `DATASET_ID`:
+Specify the `MODEL_NAME` environment variable (either a Hub model repository id or a path to the directory containing the model weights) and pass it to the [`pretrained_model_name_or_path`](https://huggingface.co/docs/diffusers/en/api/diffusion_pipeline#diffusers.DiffusionPipeline.from_pretrained.pretrained_model_name_or_path) argument. You'll also need to specify the dataset name in `DATASET_ID`:

 ```bash
 export MODEL_NAME="runwayml/stable-diffusion-v1-5"
 export DATASET_ID="fusing/instructpix2pix-1000-samples"
 ```

-Now, we can launch training:
+Now, we can launch training. The script saves all the components (`feature_extractor`, `scheduler`, `text_encoder`, `unet`, etc.) in a subfolder in your repository.

 ```bash
 accelerate launch --mixed_precision="fp16" train_instruct_pix2pix.py \
diff --git a/docs/source/en/training/lora.mdx b/docs/source/en/training/lora.mdx
index 3c7cc7ebfeec..5c7286006afa 100644
--- a/docs/source/en/training/lora.mdx
+++ b/docs/source/en/training/lora.mdx
@@ -17,8 +17,7 @@ specific language governing permissions and limitations under the License.

 Currently, LoRA is only supported for the attention layers of the [`UNet2DConditionalModel`]. We also
-support LoRA fine-tuning of the text encoder for DreamBooth in a limited capacity. For more details on how we support
-LoRA fine-tuning of the text encoder, refer to the discussion on [this PR](https://github.com/huggingface/diffusers/pull/2918).
+support fine-tuning the text encoder for DreamBooth with LoRA in a limited capacity. Fine-tuning the text encoder for DreamBooth generally yields better results, but it can increase compute usage.
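+
+As a minimal sketch of how the LoRA weights produced by these training scripts are consumed afterwards (the repository id below is a placeholder for your own output directory or Hub repo; the inference sections later in this guide cover this in more detail):
+
+```py
+import torch
+from diffusers import StableDiffusionPipeline
+
+pipe = StableDiffusionPipeline.from_pretrained(
+    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
+).to("cuda")
+
+# load the LoRA attention weights saved by the training script
+pipe.unet.load_attn_procs("path-or-repo-of-your-lora")
+
+image = pipe("A pokemon with blue eyes.").images[0]
+```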
@@ -52,7 +51,7 @@ Finetuning a model like Stable Diffusion, which has billions of parameters, can

 Let's finetune [`stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5) on the [Pokémon BLIP captions](https://huggingface.co/datasets/lambdalabs/pokemon-blip-captions) dataset to generate your own Pokémon.

-Specify the `MODEL_NAME` environment variable (either a Hub model repository id or a path to the directory containing the model weights) and pass it to the [`~diffusers.DiffusionPipeline.from_pretrained.pretrained_model_name_or_path`] argument. You'll also need to set the `DATASET_NAME` environment variable to the name of the dataset you want to train on.
+Specify the `MODEL_NAME` environment variable (either a Hub model repository id or a path to the directory containing the model weights) and pass it to the [`pretrained_model_name_or_path`](https://huggingface.co/docs/diffusers/en/api/diffusion_pipeline#diffusers.DiffusionPipeline.from_pretrained.pretrained_model_name_or_path) argument. You'll also need to set the `DATASET_NAME` environment variable to the name of the dataset you want to train on. To use your own dataset, take a look at the [Create a dataset for training](create_dataset) guide.

 The `OUTPUT_DIR` and `HUB_MODEL_ID` variables are optional and specify where to save the model to on the Hub:

@@ -69,7 +68,7 @@ There are some flags to be aware of before you start training:
 * `--report_to=wandb` reports and logs the training results to your Weights & Biases dashboard (as an example, take a look at this [report](https://wandb.ai/pcuenq/text2image-fine-tune/runs/b4k1w0tn?workspace=user-pcuenq)).
 * `--learning_rate=1e-04`, you can afford to use a higher learning rate than you normally would with LoRA.

-Now you're ready to launch the training (you can find the full training script [here](https://github.com/huggingface/diffusers/blob/main/examples/text_to_image/train_text_to_image_lora.py)):
+Now you're ready to launch the training (you can find the full training script [here](https://github.com/huggingface/diffusers/blob/main/examples/text_to_image/train_text_to_image_lora.py)). Training takes about 5 hours on a 2080 Ti GPU with 11GB of RAM, and it'll create and save model checkpoints and the `pytorch_lora_weights.bin` file in your repository.

 ```bash
 accelerate launch --mixed_precision="fp16" train_text_to_image_lora.py \
@@ -159,9 +158,9 @@ pipe = StableDiffusionPipeline.from_pretrained(base_model_id, torch_dtype=torch.

 ### Training[[dreambooth-training]]

-Let's finetune [`stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5) with DreamBooth and LoRA with some 🐶 [dog images](https://drive.google.com/drive/folders/1BO_dyz-p65qhBRRMRA4TbZ8qW4rB99JZ). Download and save these images to a directory.
+Let's finetune [`stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5) with DreamBooth and LoRA on some 🐶 [dog images](https://drive.google.com/drive/folders/1BO_dyz-p65qhBRRMRA4TbZ8qW4rB99JZ). Download and save these images to a directory. To use your own dataset, take a look at the [Create a dataset for training](create_dataset) guide.

-To start, specify the `MODEL_NAME` environment variable (either a Hub model repository id or a path to the directory containing the model weights) and pass it to the [`~diffusers.DiffusionPipeline.from_pretrained.pretrained_model_name_or_path`] argument. You'll also need to set `INSTANCE_DIR` to the path of the directory containing the images.
+To start, specify the `MODEL_NAME` environment variable (either a Hub model repository id or a path to the directory containing the model weights) and pass it to the [`pretrained_model_name_or_path`](https://huggingface.co/docs/diffusers/en/api/diffusion_pipeline#diffusers.DiffusionPipeline.from_pretrained.pretrained_model_name_or_path) argument. You'll also need to set `INSTANCE_DIR` to the path of the directory containing the images.

 The `OUTPUT_DIR` variables is optional and specifies where to save the model to on the Hub:

@@ -177,7 +176,11 @@ There are some flags to be aware of before you start training:
 * `--report_to=wandb` reports and logs the training results to your Weights & Biases dashboard (as an example, take a look at this [report](https://wandb.ai/pcuenq/text2image-fine-tune/runs/b4k1w0tn?workspace=user-pcuenq)).
 * `--learning_rate=1e-04`, you can afford to use a higher learning rate than you normally would with LoRA.

-Now you're ready to launch the training (you can find the full training script [here](https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth_lora.py)):
+Now you're ready to launch the training (you can find the full training script [here](https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth_lora.py)). The script creates and saves model checkpoints and the `pytorch_lora_weights.bin` file in your repository.
+
+It's also possible to fine-tune the text encoder with LoRA. In most cases, this leads
+to better results with a slight increase in compute. To fine-tune the text encoder with LoRA,
+specify the `--train_text_encoder` flag when launching the `train_dreambooth_lora.py` script.

 ```bash
 accelerate launch train_dreambooth_lora.py \
@@ -198,12 +201,7 @@ accelerate launch train_dreambooth_lora.py \
   --validation_epochs=50 \
   --seed="0" \
   --push_to_hub
-```
-
-It's also possible to additionally fine-tune the text encoder with LoRA. This, in most cases, leads
-to better results with a slight increase in the compute. To allow fine-tuning the text encoder with LoRA,
-specify the `--train_text_encoder` while launching the `train_dreambooth_lora.py` script.
-
+```

 ### Inference[[dreambooth-inference]]

diff --git a/docs/source/en/training/text2image.mdx b/docs/source/en/training/text2image.mdx
index dabb68397f78..8535e6ffac70 100644
--- a/docs/source/en/training/text2image.mdx
+++ b/docs/source/en/training/text2image.mdx
@@ -74,7 +74,7 @@ To load a checkpoint to resume training, pass the argument `--resume_from_checkp
 Launch the [PyTorch training script](https://github.com/huggingface/diffusers/blob/main/examples/text_to_image/train_text_to_image.py) for a fine-tuning run on the [Pokémon BLIP captions](https://huggingface.co/datasets/lambdalabs/pokemon-blip-captions) dataset like this.

-Specify the `MODEL_NAME` environment variable (either a Hub model repository id or a path to the directory containing the model weights) and pass it to the [`~diffusers.DiffusionPipeline.from_pretrained.pretrained_model_name_or_path`] argument.
+Specify the `MODEL_NAME` environment variable (either a Hub model repository id or a path to the directory containing the model weights) and pass it to the [`pretrained_model_name_or_path`](https://huggingface.co/docs/diffusers/en/api/diffusion_pipeline#diffusers.DiffusionPipeline.from_pretrained.pretrained_model_name_or_path) argument.
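+
+For example (the model and dataset identifiers here are only illustrative; substitute your own):
+
+```bash
+export MODEL_NAME="runwayml/stable-diffusion-v1-5"
+export DATASET_NAME="lambdalabs/pokemon-blip-captions"
+```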
{"path": "../../../../examples/text_to_image/README.md", @@ -143,7 +143,7 @@ Before running the script, make sure you have the requirements installed: pip install -U -r requirements_flax.txt ``` -Specify the `MODEL_NAME` environment variable (either a Hub model repository id or a path to the directory containing the model weights) and pass it to the [`~diffusers.DiffusionPipeline.from_pretrained.pretrained_model_name_or_path`] argument. +Specify the `MODEL_NAME` environment variable (either a Hub model repository id or a path to the directory containing the model weights) and pass it to the [`pretrained_model_name_or_path`](https://huggingface.co/docs/diffusers/en/api/diffusion_pipeline#diffusers.DiffusionPipeline.from_pretrained.pretrained_model_name_or_path) argument. Now you can launch the [Flax training script](https://github.com/huggingface/diffusers/blob/main/examples/text_to_image/train_text_to_image_flax.py) like this: diff --git a/docs/source/en/training/text_inversion.mdx b/docs/source/en/training/text_inversion.mdx index 76e7f0dcc8f2..1afecc7b71bb 100644 --- a/docs/source/en/training/text_inversion.mdx +++ b/docs/source/en/training/text_inversion.mdx @@ -81,7 +81,7 @@ To resume training from a saved checkpoint, pass the following argument to the t ## Finetuning -For your training dataset, download these [images of a cat toy](https://huggingface.co/datasets/diffusers/cat_toy_example) and store them in a directory: +For your training dataset, download these [images of a cat toy](https://huggingface.co/datasets/diffusers/cat_toy_example) and store them in a directory. To use your own dataset, take a look at the [Create a dataset for training](create_dataset) guide. ```py from huggingface_hub import snapshot_download @@ -92,9 +92,9 @@ snapshot_download( ) ``` -Specify the `MODEL_NAME` environment variable (either a Hub model repository id or a path to the directory containing the model weights) and pass it to the [`~diffusers.DiffusionPipeline.from_pretrained.pretrained_model_name_or_path`] argument, and the `DATA_DIR` environment variable to the path of the directory containing the images. +Specify the `MODEL_NAME` environment variable (either a Hub model repository id or a path to the directory containing the model weights) and pass it to the [`pretrained_model_name_or_path`](https://huggingface.co/docs/diffusers/en/api/diffusion_pipeline#diffusers.DiffusionPipeline.from_pretrained.pretrained_model_name_or_path) argument, and the `DATA_DIR` environment variable to the path of the directory containing the images. -Now you can launch the [training script](https://github.com/huggingface/diffusers/blob/main/examples/textual_inversion/textual_inversion.py): +Now you can launch the [training script](https://github.com/huggingface/diffusers/blob/main/examples/textual_inversion/textual_inversion.py). The script creates and saves the following files to your repository: `learned_embeds.bin`, `token_identifier.txt`, and `type_of_concept.txt`. @@ -144,7 +144,7 @@ Before you begin, make sure you install the Flax specific dependencies: pip install -U -r requirements_flax.txt ``` -Specify the `MODEL_NAME` environment variable (either a Hub model repository id or a path to the directory containing the model weights) and pass it to the [`~diffusers.DiffusionPipeline.from_pretrained.pretrained_model_name_or_path`] argument. 
+Specify the `MODEL_NAME` environment variable (either a Hub model repository id or a path to the directory containing the model weights) and pass it to the [`pretrained_model_name_or_path`](https://huggingface.co/docs/diffusers/en/api/diffusion_pipeline#diffusers.DiffusionPipeline.from_pretrained.pretrained_model_name_or_path) argument.

 Then you can launch the [training script](https://github.com/huggingface/diffusers/blob/main/examples/textual_inversion/textual_inversion_flax.py):

diff --git a/docs/source/en/training/unconditional_training.mdx b/docs/source/en/training/unconditional_training.mdx
index 514932d4b22d..164b4f599f1e 100644
--- a/docs/source/en/training/unconditional_training.mdx
+++ b/docs/source/en/training/unconditional_training.mdx
@@ -74,7 +74,9 @@ The full training state is saved in a subfolder in the `output_dir` every 500 st
 ## Finetuning

-You're ready to launch the [training script](https://github.com/huggingface/diffusers/blob/main/examples/unconditional_image_generation/train_unconditional.py) now! Specify the dataset name to finetune on with the `--dataset_name` argument and then save it to the path in `--output_dir`.
+You're ready to launch the [training script](https://github.com/huggingface/diffusers/blob/main/examples/unconditional_image_generation/train_unconditional.py) now! Specify the dataset name to finetune on with the `--dataset_name` argument and then save the model to the path in `--output_dir`. To use your own dataset, take a look at the [Create a dataset for training](create_dataset) guide.
+
+The training script creates and saves a `diffusion_pytorch_model.bin` file in your repository.

@@ -140,82 +142,4 @@ accelerate launch --mixed_precision="fp16" --multi_gpu train_unconditional.py \
   --lr_warmup_steps=500 \
   --mixed_precision="fp16" \
   --logger="wandb"
-```
-
-## Finetuning with your own data
-
-There are two ways to finetune a model on your own dataset:
-
-- provide your own folder of images to the `--train_data_dir` argument
-- upload your dataset to the Hub and pass the dataset repository id to the `--dataset_name` argument.
-
-
-💡 Learn more about how to create an image dataset for training in the [Create an image dataset](https://huggingface.co/docs/datasets/image_dataset) guide.
-
-
-Below, we explain both in more detail.
-
-### Provide the dataset as a folder
-
-If you provide your own dataset as a folder, the script expects the following directory structure:
-
-```bash
-data_dir/xxx.png
-data_dir/xxy.png
-data_dir/[...]/xxz.png
-```
-
-Pass the path to the folder containing the images to the `--train_data_dir` argument and launch the training:
-
-```bash
-accelerate launch train_unconditional.py \
-    --train_data_dir \
-
-```
-
-Internally, the script uses the [`ImageFolder`](https://huggingface.co/docs/datasets/image_load#imagefolder) to automatically build a dataset from the folder.
-
-### Upload your data to the Hub
-
-
-
-💡 For more details and context about creating and uploading a dataset to the Hub, take a look at the [Image search with 🤗 Datasets](https://huggingface.co/blog/image-search-datasets) post.
-
-
-To upload your dataset to the Hub, you can start by creating one with the [`ImageFolder`](https://huggingface.co/docs/datasets/image_load#imagefolder) feature, which creates an `image` column containing the PIL-encoded images, from 🤗 Datasets:
-
-```python
-from datasets import load_dataset
-
-# example 1: local folder
-dataset = load_dataset("imagefolder", data_dir="path_to_your_folder")
-
-# example 2: local files (supported formats are tar, gzip, zip, xz, rar, zstd)
-dataset = load_dataset("imagefolder", data_files="path_to_zip_file")
-
-# example 3: remote files (supported formats are tar, gzip, zip, xz, rar, zstd)
-dataset = load_dataset(
-    "imagefolder",
-    data_files="https://download.microsoft.com/download/3/E/1/3E1C3F21-ECDB-4869-8368-6DEBA77B919F/kagglecatsanddogs_3367a.zip",
-)
-
-# example 4: providing several splits
-dataset = load_dataset(
-    "imagefolder", data_files={"train": ["path/to/file1", "path/to/file2"], "test": ["path/to/file3", "path/to/file4"]}
-)
-```
-
-Then you can use the [`~datasets.Dataset.push_to_hub`] method to upload it to the Hub:
-
-```python
-# assuming you have ran the huggingface-cli login command in a terminal
-dataset.push_to_hub("name_of_your_dataset")
-
-# if you want to push to a private repo, simply pass private=True:
-dataset.push_to_hub("name_of_your_dataset", private=True)
-```
-
-Now train your model by simply setting the `--dataset_name` argument to the name of your dataset on the Hub.
\ No newline at end of file
+```
\ No newline at end of file