Fix InstructPix2Pix training in multi-GPU mode #2978

sayakpaul · 2023-04-05T07:38:18Z

Should close #2966.

Command used for testing:

accelerate launch --mixed_precision="fp16" --multi_gpu train_instruct_pix2pix.py \
 --pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5 \
 --dataset_name=sayakpaul/instructpix2pix-1000-samples \
 --use_ema \
 --enable_xformers_memory_efficient_attention \
 --resolution=512 --random_flip \
 --train_batch_size=4 --gradient_accumulation_steps=4 --gradient_checkpointing \
 --max_train_steps=100 \
 --checkpointing_steps=10 --checkpoints_total_limit=1 \
 --learning_rate=5e-05 --lr_warmup_steps=0 \
 --conditioning_dropout_prob=0.05 \
 --mixed_precision=fp16 \
 --seed=42
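
For context on the batch-related flags in the command above, here is a minimal sketch of how the effective batch size per optimizer step is usually derived in multi-GPU runs with gradient accumulation. The helper name `effective_batch_size` is hypothetical, and the GPU count of 2 is an assumption, since the PR does not say how many GPUs were used.

```python
def effective_batch_size(per_device_batch_size, num_processes, gradient_accumulation_steps):
    """Samples contributing to one optimizer step across all processes."""
    return per_device_batch_size * num_processes * gradient_accumulation_steps


# --train_batch_size=4 and --gradient_accumulation_steps=4 come from the command above.
# num_processes=2 is an assumed GPU count, used only for illustration.
total = effective_batch_size(
    per_device_batch_size=4,
    num_processes=2,
    gradient_accumulation_steps=4,
)
print(total)  # 32
```

With `--max_train_steps=100`, a run under these assumptions would process 100 such optimizer steps.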

HuggingFaceDocBuilderDev · 2023-04-05T07:43:09Z

The documentation is not available anymore as the PR was closed or merged.

patrickvonplaten

Looks good!

sayakpaul · 2023-04-06T14:37:46Z

As pointed out by @whbzju, we also need to unwrap the model when running validation inference. I fixed that and tested with the following command:

accelerate launch --mixed_precision="fp16" --multi_gpu train_instruct_pix2pix.py \
--pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5 \
--dataset_name=sayakpaul/instructpix2pix-1000-samples \
--use_ema \
--enable_xformers_memory_efficient_attention \
--resolution=512 --random_flip \
--train_batch_size=2 --gradient_accumulation_steps=4 --gradient_checkpointing \
--max_train_steps=20 \
--checkpointing_steps=10 --checkpoints_total_limit=1 \
--learning_rate=5e-05 --lr_warmup_steps=0 \
--conditioning_dropout_prob=0.05 \
--mixed_precision=fp16 \
--val_image_url="https://hf.co/datasets/diffusers/diffusers-images-docs/resolve/main/mountain.png" \
--validation_prompt="make the mountains snowy" \
--seed=42 \
--report_to=wandb

@patrickvonplaten could you take a look again?
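
To illustrate why unwrapping matters, here is a minimal, dependency-free sketch. It is not the code from this PR: `Wrapper` and `TinyUNet` are hypothetical stand-ins, and `unwrap` only mimics the idea behind `accelerator.unwrap_model`. It assumes that a distributed wrapper keeps the original model on a `.module` attribute (as `torch.nn.parallel.DistributedDataParallel` does) and does not forward custom attributes.

```python
class TinyUNet:
    """Hypothetical model with a custom attribute (illustrative value)."""

    def __init__(self):
        self.config = {"in_channels": 8}

    def __call__(self, x):
        return x


class Wrapper:
    """Stand-in for a distributed wrapper: holds the real model on `.module`."""

    def __init__(self, module):
        self.module = module

    def __call__(self, *args, **kwargs):
        # Forward passes are delegated, but custom attributes are not.
        return self.module(*args, **kwargs)


def unwrap(model):
    """Strip wrapper layers until the underlying model is reached."""
    while hasattr(model, "module"):
        model = model.module
    return model


wrapped = Wrapper(TinyUNet())

# The forward pass works through the wrapper...
assert wrapped(3) == 3

# ...but custom attributes are only reachable on the unwrapped model.
try:
    wrapped.config
    reached_config = True
except AttributeError:
    reached_config = False

print(reached_config)                        # False
print(unwrap(wrapped).config["in_channels"])  # 8
```

In a real training script, the unwrapped model (rather than the wrapper) is what one would typically pass to an inference pipeline for validation.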

examples/instruct_pix2pix/train_instruct_pix2pix.py

* fix: norm group test for UNet3D.
* fix: unet rejig.
* fix: unwrapping when running validation inputs.
* unwrapping the unet too.
* fix: device.
* better unwrapping.
* unwrapping before ema.
* unwrapping.

sayakpaul added 4 commits April 4, 2023 09:06

fix: norm group test for UNet3D.

f91f6bd

Merge branch 'main' of https://github.com/huggingface/diffusers

565566f

Merge branch 'main' of https://github.com/huggingface/diffusers

bf837f5

fix: unet rejig.

3249f11

sayakpaul requested review from williamberman and patrickvonplaten April 5, 2023 07:38

sayakpaul mentioned this pull request Apr 5, 2023

train_instruct_pix2pix.py in example has problem when try train with mutil gpus? #2966

Closed

Merge branch 'main' of https://github.com/huggingface/diffusers

aa0afd4

patrickvonplaten approved these changes Apr 6, 2023

sayakpaul added 2 commits April 6, 2023 18:49

Merge branch 'main' of https://github.com/huggingface/diffusers

f13ca5d

fix: unwrapping when running validation inputs.

1065572

sayakpaul mentioned this pull request Apr 6, 2023

[Examples] Test and update READMEs on multi-GPU support #2997

Closed

sayakpaul added 5 commits April 6, 2023 19:26

unwrapping the unet too.

65ad3ae

fix: device.

6911aaa

better unwrapping.

1eb59d9

unwrapping before ema.

a61a99f

unwrapping.

06c4a65

sayakpaul requested a review from patrickvonplaten April 6, 2023 14:37

Merge branch 'main' of https://github.com/huggingface/diffusers

89e5217

williamberman reviewed Apr 10, 2023

examples/instruct_pix2pix/train_instruct_pix2pix.py (resolved)

williamberman reviewed Apr 10, 2023

examples/instruct_pix2pix/train_instruct_pix2pix.py (resolved)

williamberman approved these changes Apr 10, 2023

sayakpaul added 2 commits April 11, 2023 06:38

Merge branch 'main' of https://github.com/huggingface/diffusers

a6f66c2

Merge branch 'main' into fix/instructpix2pix-training

91854ff

sayakpaul force-pushed the fix/instructpix2pix-training branch from 9b6a6db to 91854ff April 11, 2023 01:09

patrickvonplaten merged commit 5a7d35e into main Apr 12, 2023

patrickvonplaten deleted the fix/instructpix2pix-training branch April 12, 2023 09:18

sayakpaul mentioned this pull request Apr 17, 2023

feat: verfication of multi-gpu support for select examples. #3126

Merged
