Closed
Labels: bug
Description
Describe the bug
I came across this while testing new features from #6691 (many thanks for supporting micro-conditioning!)
Running train_dreambooth_lora_sdxl_advanced.py with --with_prior_preservation fails on the first training step: the UNet forward pass raises a shape mismatch that involves the unet_added_conditions['time_ids'] tensor.
It may be related to the way the class_time_ids are computed.
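For what it's worth, the numbers in the error reported in the logs (2x2048 vs 2816x1280) look consistent with time_ids carrying a single row while the batch holds two samples (instance + class). Here is a rough sketch of that arithmetic. It assumes the SDXL base values (1280-dim pooled text embedding, 256-dim embedding per time id, 6 time ids per sample) and assumes the UNet flattens time_ids and reshapes the resulting embeddings to the text batch size. None of these values are stated in this report, and the helper name is made up for illustration:

```python
# Assumed SDXL base UNet settings (NOT taken from this report).
POOLED_TEXT_DIM = 1280  # pooled text embedding size (assumption)
TIME_EMBED_DIM = 256    # embedding size per individual time id (assumption)
IDS_PER_SAMPLE = 6      # original size (2) + crop top-left (2) + target size (2)


def add_embedding_width(text_batch: int, time_ids_rows: int) -> int:
    """Hypothetical helper: width of the tensor fed to the added-conditions MLP.

    Models a flatten-then-reshape step: every time id is embedded, the
    embeddings are spread over `text_batch` rows, and each row is joined
    with the pooled text embedding.
    """
    total_time_width = time_ids_rows * IDS_PER_SAMPLE * TIME_EMBED_DIM
    per_row_time_width, remainder = divmod(total_time_width, text_batch)
    if remainder:
        raise ValueError("time embeddings do not divide evenly over the batch")
    return POOLED_TEXT_DIM + per_row_time_width


# Two samples in the batch, one row of time ids per sample.
expected = add_embedding_width(text_batch=2, time_ids_rows=2)
# Two samples in the batch, but only one row of time ids in total.
observed = add_embedding_width(text_batch=2, time_ids_rows=1)
print(expected, observed)
```

If those assumptions hold, the first value matches the 2816 input features the linear layer expects and the second matches the 2048 it actually received, which would point at the class time ids not being concatenated with the instance ones.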
Reproduction
Follow instructions from advanced_diffusion_training README:
- Install from source
- Download dataset for testing:
from huggingface_hub import snapshot_download

local_dir = "./3d_icon"
snapshot_download(
    "LinoyTsaban/3d_icon",
    local_dir=local_dir,
    repo_type="dataset",
    ignore_patterns=".gitattributes",
)
- Execute training with prior preservation (see last arguments):
export MODEL_NAME="stabilityai/stable-diffusion-xl-base-1.0"
export DATASET_NAME="./3d_icon"
export OUTPUT_DIR="3d-icon-SDXL-LoRA"
export CLASS_DATA_DIR="./class_data_dir/icons"
export VAE_PATH="madebyollin/sdxl-vae-fp16-fix"
accelerate launch train_dreambooth_lora_sdxl_advanced.py \
--pretrained_model_name_or_path=$MODEL_NAME \
--pretrained_vae_model_name_or_path=$VAE_PATH \
--dataset_name=$DATASET_NAME \
--instance_prompt="3d icon in the style of ohwx" \
--validation_prompt="a ohwx icon of an astronaut riding a horse, in the style of ohwx" \
--output_dir=$OUTPUT_DIR \
--caption_column="prompt" \
--mixed_precision="bf16" \
--resolution=1024 \
--train_batch_size=1 \
--repeats=1 \
--gradient_accumulation_steps=1 \
--gradient_checkpointing \
--learning_rate=1.0 \
--text_encoder_lr=1.0 \
--optimizer="prodigy" \
--train_text_encoder \
--train_text_encoder_frac=0.5 \
--snr_gamma=5.0 \
--lr_scheduler="constant" \
--lr_warmup_steps=0 \
--rank=8 \
--max_train_steps=1000 \
--checkpointing_steps=2000 \
--seed="0" \
--with_prior_preservation \
--class_prompt="icon" \
--class_data_dir=$CLASS_DATA_DIR \
--num_class_images=5
Logs
02/13/2024 16:04:39 - INFO - __main__ - Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda
Mixed precision type: bf16
{'image_encoder', 'feature_extractor'} was not found in config. Values will be initialized to default values.
Loading pipeline components...: 0%| | 0/7 [00:00<?, ?it/s]{'rescale_betas_zero_snr', 'sigma_max', 'timestep_type', 'sigma_min'} was not found in config. Values will be initialized to default values.
Loaded scheduler as EulerDiscreteScheduler from `scheduler` subfolder of stabilityai/stable-diffusion-xl-base-1.0.
{'reverse_transformer_layers_per_block', 'attention_type', 'dropout'} was not found in config. Values will be initialized to default values.
Loaded unet as UNet2DConditionModel from `unet` subfolder of stabilityai/stable-diffusion-xl-base-1.0.
Loading pipeline components...: 29%|███████████████████▍ | 2/7 [00:04<00:11, 2.29s/it]Loaded text_encoder_2 as CLIPTextModelWithProjection from `text_encoder_2` subfolder of stabilityai/stable-diffusion-xl-base-1.0.
Loading pipeline components...: 43%|█████████████████████████████▏ | 3/7 [00:05<00:07, 1.89s/it]Loaded tokenizer as CLIPTokenizer from `tokenizer` subfolder of stabilityai/stable-diffusion-xl-base-1.0.
Loading pipeline components...: 57%|██████████████████████████████████████▊ | 4/7 [00:06<00:03, 1.22s/it]Loaded tokenizer_2 as CLIPTokenizer from `tokenizer_2` subfolder of stabilityai/stable-diffusion-xl-base-1.0.
Loaded vae as AutoencoderKL from `vae` subfolder of stabilityai/stable-diffusion-xl-base-1.0.
Loading pipeline components...: 86%|██████████████████████████████████████████████████████████▎ | 6/7 [00:06<00:00, 1.53it/s]Loaded text_encoder as CLIPTextModel from `text_encoder` subfolder of stabilityai/stable-diffusion-xl-base-1.0.
Loading pipeline components...: 100%|████████████████████████████████████████████████████████████████████| 7/7 [00:06<00:00, 1.06it/s]
02/13/2024 16:04:47 - INFO - __main__ - Number of class images to sample: 5.
Generating class images: 100%|███████████████████████████████████████████████████████████████████████████| 2/2 [00:27<00:00, 13.99s/it]
You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
{'clip_sample_range', 'variance_type', 'rescale_betas_zero_snr', 'thresholding', 'dynamic_thresholding_ratio'} was not found in config. Values will be initialized to default values.
{'reverse_transformer_layers_per_block', 'attention_type', 'dropout'} was not found in config. Values will be initialized to default values.
/home/thomas/code/temp/diffusers/examples/advanced_diffusion_training/train_dreambooth_lora_sdxl_advanced.py:1534: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
logger.warn(
02/13/2024 16:05:30 - WARNING - __main__ - Learning rates were provided both for the unet and the text encoder- e.g. text_encoder_lr: 1.0 and learning_rate: 1.0. When using prodigy only learning_rate is used as the initial learning rate.
Using decoupled weight decay
02/13/2024 16:05:30 - INFO - datasets - PyTorch version 2.2.0 available.
Resolving data files: 100%|█████████████████████████████████████████████████████████████████████████| 23/23 [00:00<00:00, 50773.15it/s]
Generating train split: 22 examples [00:00, 2150.22 examples/s]
/home/thomas/code/temp/venv/lib/python3.10/site-packages/PIL/Image.py:3186: DecompressionBombWarning: Image size (122880000 pixels) exceeds limit of 89478485 pixels, could be decompression bomb DOS attack.
warnings.warn(
02/13/2024 16:05:40 - INFO - __main__ - ***** Running training *****
02/13/2024 16:05:40 - INFO - __main__ - Num examples = 22
02/13/2024 16:05:40 - INFO - __main__ - Num batches each epoch = 22
02/13/2024 16:05:40 - INFO - __main__ - Num Epochs = 46
02/13/2024 16:05:40 - INFO - __main__ - Instantaneous batch size per device = 1
02/13/2024 16:05:40 - INFO - __main__ - Total train batch size (w. parallel, distributed & accumulation) = 1
02/13/2024 16:05:40 - INFO - __main__ - Gradient Accumulation steps = 1
02/13/2024 16:05:40 - INFO - __main__ - Total optimization steps = 1000
Steps: 0%| | 0/1000 [00:00<?, ?it/s]/home/thomas/code/temp/venv/lib/python3.10/site-packages/torch/utils/checkpoint.py:460: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
warnings.warn(
Traceback (most recent call last):
File "/home/thomas/code/temp/diffusers/examples/advanced_diffusion_training/train_dreambooth_lora_sdxl_advanced.py", line 2196, in <module>
main(args)
File "/home/thomas/code/temp/diffusers/examples/advanced_diffusion_training/train_dreambooth_lora_sdxl_advanced.py", line 1872, in main
model_pred = unet(
File "/home/thomas/code/temp/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/thomas/code/temp/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/home/thomas/code/temp/venv/lib/python3.10/site-packages/accelerate/utils/operations.py", line 817, in forward
return model_forward(*args, **kwargs)
File "/home/thomas/code/temp/venv/lib/python3.10/site-packages/accelerate/utils/operations.py", line 805, in __call__
return convert_to_fp32(self.model_forward(*args, **kwargs))
File "/home/thomas/code/temp/venv/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 16, in decorate_autocast
return func(*args, **kwargs)
File "/home/thomas/code/temp/diffusers/src/diffusers/models/unets/unet_2d_condition.py", line 1027, in forward
aug_emb = self.add_embedding(add_embeds)
File "/home/thomas/code/temp/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/thomas/code/temp/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/home/thomas/code/temp/diffusers/src/diffusers/models/embeddings.py", line 228, in forward
sample = self.linear_1(sample)
File "/home/thomas/code/temp/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/thomas/code/temp/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/home/thomas/code/temp/venv/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 116, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (2x2048 and 2816x1280)
Steps: 0%| | 0/1000 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/home/thomas/code/temp/venv/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/home/thomas/code/temp/venv/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 47, in main
args.func(args)
File "/home/thomas/code/temp/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1023, in launch_command
simple_launcher(args)
File "/home/thomas/code/temp/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 643, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/thomas/code/temp/venv/bin/python', 'train_dreambooth_lora_sdxl_advanced.py', '--pretrained_model_name_or_path=stabilityai/stable-diffusion-xl-base-1.0', '--pretrained_vae_model_name_or_path=madebyollin/sdxl-vae-fp16-fix', '--dataset_name=./3d_icon', '--instance_prompt=3d icon in the style of ohwx', '--validation_prompt=a ohwx icon of an astronaut riding a horse, in the style of ohwx', '--output_dir=3d-icon-SDXL-LoRA', '--caption_column=prompt', '--mixed_precision=bf16', '--resolution=1024', '--train_batch_size=1', '--repeats=1', '--gradient_accumulation_steps=1', '--gradient_checkpointing', '--learning_rate=1.0', '--text_encoder_lr=1.0', '--optimizer=prodigy', '--train_text_encoder', '--train_text_encoder_frac=0.5', '--snr_gamma=5.0', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--rank=8', '--max_train_steps=1000', '--checkpointing_steps=2000', '--seed=0', '--with_prior_preservation', '--class_prompt=icon', '--class_data_dir=./class_data_dir/icons', '--num_class_images=5']' returned non-zero exit status 1.
System Info
- Installed diffusers from source with advanced dreambooth lora sdxl script requirements.
- Python 3.10.12
Who can help?
@linoytsaban It may have been introduced with your last PR? (Thanks again!)