Commit a140288

Author: mhh001
Add information about training SD models using DeepSpeed to the README.
1 parent 3b5062c commit a140288

1 file changed (+60, -0)

examples/text_to_image/README_sdxl.md

Lines changed: 60 additions & 0 deletions
@@ -183,6 +183,66 @@ The above command will also run inference as fine-tuning progresses and log the

* SDXL's VAE is known to suffer from numerical instability issues. For this reason, the script also exposes a CLI argument, `--pretrained_vae_model_name_or_path`, that lets you specify the location of a better VAE (such as [this one](https://huggingface.co/madebyollin/sdxl-vae-fp16-fix)).

### Training SDXL with DeepSpeed

DeepSpeed reduces GPU memory consumption, which makes it possible to train on GPUs with less memory, or to train larger models on the same hardware. It does this either by offloading model parameters to the machine's CPU memory, or by partitioning parameters, gradients, and optimizer states across multiple GPUs.
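
To get a rough sense of what partitioning saves, the sketch below estimates the per-GPU memory taken by model states under each ZeRO stage. It assumes the accounting used in the ZeRO paper for mixed-precision Adam: 2 bytes of fp16 weights, 2 bytes of fp16 gradients, and 12 bytes of fp32 optimizer states per trainable parameter. The function name `zero_model_state_bytes`, the 8-GPU setup, and the 2.6 billion parameter count (roughly the size of the SDXL UNet under full fine-tuning) are illustrative assumptions, not measurements. Activations, frozen weights, and framework overhead are not included.

```python
def zero_model_state_bytes(num_params: int, num_gpus: int, stage: int) -> float:
    """Estimate per-GPU bytes for model states under mixed-precision Adam.

    Per trainable parameter: 2 bytes fp16 weights, 2 bytes fp16 gradients,
    12 bytes fp32 optimizer states (master weights, momentum, variance).
    """
    if stage not in (0, 1, 2, 3):
        raise ValueError("stage must be 0, 1, 2, or 3")
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")

    params = 2.0 * num_params
    grads = 2.0 * num_params
    optim = 12.0 * num_params

    # Each ZeRO stage partitions one more kind of state across the GPUs.
    if stage >= 1:
        optim /= num_gpus  # stage 1: optimizer states
    if stage >= 2:
        grads /= num_gpus  # stage 2: plus gradients
    if stage >= 3:
        params /= num_gpus  # stage 3: plus parameters
    return params + grads + optim


if __name__ == "__main__":
    n = 2_600_000_000  # assumed trainable parameter count
    for stage in range(4):
        gib = zero_model_state_bytes(n, num_gpus=8, stage=stage) / 2**30
        print(f"ZeRO stage {stage}: {gib:.1f} GiB per GPU for model states")
```

Under this accounting, partitioning only helps when there is more than one GPU: with `num_gpus=1` every stage gives the same estimate, so on a single GPU the savings would mainly come from offloading to CPU memory. With LoRA, only the adapter weights are trainable, so the gradient and optimizer terms are far smaller than this full fine-tuning estimate suggests.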

First, enable DeepSpeed through Accelerate, either interactively by running `accelerate config` and selecting DeepSpeed, or by writing the Accelerate config file by hand.

Here is an example config file that enables DeepSpeed with ZeRO stage 2 and no offloading. For a detailed explanation of each option, see the [Accelerate DeepSpeed usage guide](https://huggingface.co/docs/accelerate/usage_guides/deepspeed).

```yaml
compute_environment: LOCAL_MACHINE
debug: true
deepspeed_config:
  gradient_accumulation_steps: 1
  gradient_clipping: 1.0
  offload_optimizer_device: none
  offload_param_device: none
  zero3_init_flag: false
  zero_stage: 2
distributed_type: DEEPSPEED
downcast_bf16: 'no'
machine_rank: 0
main_training_function: main
mixed_precision: fp16
num_machines: 1
num_processes: 1
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
```
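
For readers more familiar with DeepSpeed's own JSON configuration, the sketch below shows roughly how the `deepspeed_config` fields above line up with DeepSpeed's config keys. The helper `to_deepspeed_dict` is hypothetical and written only for illustration: Accelerate performs its own translation internally and may set additional fields (for example, automatic batch-size values), so treat this as an approximation rather than the exact config that gets passed to DeepSpeed.

```python
import json


def to_deepspeed_dict(accelerate_cfg: dict) -> dict:
    """Map the Accelerate `deepspeed_config` section onto DeepSpeed-style keys.

    Hypothetical helper for illustration only; this is not Accelerate's API.
    """
    ds = accelerate_cfg["deepspeed_config"]
    precision = accelerate_cfg.get("mixed_precision", "no")
    return {
        "gradient_accumulation_steps": ds["gradient_accumulation_steps"],
        "gradient_clipping": ds["gradient_clipping"],
        "zero_optimization": {
            "stage": ds["zero_stage"],
            "offload_optimizer": {"device": ds["offload_optimizer_device"]},
            "offload_param": {"device": ds["offload_param_device"]},
        },
        # Only one of these is enabled, depending on `mixed_precision`.
        "fp16": {"enabled": precision == "fp16"},
        "bf16": {"enabled": precision == "bf16"},
    }


# The relevant fields of the example config, written as a plain dict.
example_cfg = {
    "mixed_precision": "fp16",
    "deepspeed_config": {
        "gradient_accumulation_steps": 1,
        "gradient_clipping": 1.0,
        "offload_optimizer_device": "none",
        "offload_param_device": "none",
        "zero3_init_flag": False,
        "zero_stage": 2,
    },
}

if __name__ == "__main__":
    print(json.dumps(to_deepspeed_dict(example_cfg), indent=2))
```

To trade speed for lower GPU memory use, one would typically change `offload_optimizer_device` from `none` to `cpu` in the YAML file, which corresponds to `zero_optimization.offload_optimizer.device` in this mapping.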

Save this configuration as `accelerate_config.yaml`, then set the `ACCELERATE_CONFIG_FILE` shell variable to the path of that file. The command below then trains SDXL with LoRA using DeepSpeed. You can train other SD models with DeepSpeed in the same way.

```shell
export MODEL_NAME="stabilityai/stable-diffusion-xl-base-1.0"
export VAE_NAME="madebyollin/sdxl-vae-fp16-fix"
export DATASET_NAME="lambdalabs/pokemon-blip-captions"
export ACCELERATE_CONFIG_FILE="your accelerate_config.yaml"

accelerate launch --config_file $ACCELERATE_CONFIG_FILE train_text_to_image_lora_sdxl.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --pretrained_vae_model_name_or_path=$VAE_NAME \
  --dataset_name=$DATASET_NAME --caption_column="text" \
  --resolution=1024 \
  --train_batch_size=1 \
  --num_train_epochs=2 \
  --checkpointing_steps=2 \
  --learning_rate=1e-04 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --mixed_precision="fp16" \
  --max_train_steps=20 \
  --validation_epochs=20 \
  --seed=1234 \
  --output_dir="sd-pokemon-model-lora-sdxl" \
  --validation_prompt="cute dragon creature"
```
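
The step-related flags in this command interact, so it can help to work out what the run will actually do. The sketch below does that arithmetic. It assumes that an explicit `--max_train_steps` takes precedence over `--num_train_epochs` and that a checkpoint is written whenever the global step is a multiple of `--checkpointing_steps`. The `RunSummary` class and `summarize_run` function are hypothetical names introduced here, not part of the training script.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    effective_batch_size: int  # samples consumed per optimizer step
    samples_seen: int          # samples consumed over the whole run
    checkpoints_saved: int     # number of checkpoint directories written


def summarize_run(
    train_batch_size: int,
    num_processes: int,
    gradient_accumulation_steps: int,
    max_train_steps: int,
    checkpointing_steps: int,
) -> RunSummary:
    """Work out batch size, sample count, and checkpoint count for a run."""
    effective = train_batch_size * num_processes * gradient_accumulation_steps
    return RunSummary(
        effective_batch_size=effective,
        samples_seen=effective * max_train_steps,
        checkpoints_saved=max_train_steps // checkpointing_steps,
    )


if __name__ == "__main__":
    # Values taken from the example command and config file.
    print(summarize_run(
        train_batch_size=1,
        num_processes=1,
        gradient_accumulation_steps=1,
        max_train_steps=20,
        checkpointing_steps=2,
    ))
```

Under these assumptions the example is a short smoke test: 20 optimizer steps at an effective batch size of 1, with a checkpoint every 2 steps. For a real run you would likely raise `--max_train_steps` and `--checkpointing_steps` considerably.
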
### Finetuning the text encoder and UNet

The script also allows you to finetune the `text_encoder` along with the `unet`.
