Closed
Labels
bug (Something isn't working), stale (Issues that haven't received updates)
Description
Describe the bug
When I train DreamBooth with stabilityai/stable-diffusion-2, the run always reports GPU 0. GPU 0 is not my device; GPU 1 is mine and it is free, so I want to train on GPU 1. With CompVis/stable-diffusion-v1-4 the same setup works, but with stabilityai/stable-diffusion-2 I get the out-of-memory error shown in the logs. How can I fix this?
This is what I tried:
export MODEL_NAME="stabilityai/stable-diffusion-2"
export INSTANCE_DIR="_xxx_"
export OUTPUT_DIR="_xxx_"
CUDA_VISIBLE_DEVICES=1 accelerate launch train_dreambooth.py \
--pretrained_model_name_or_path=$MODEL_NAME \
--instance_data_dir=$INSTANCE_DIR \
--output_dir=$OUTPUT_DIR \
--instance_prompt="a photo of sks _zzzzzzz_" \
--resolution=768 \
--train_batch_size=1 \
--gradient_accumulation_steps=1 \
--learning_rate=5e-6 \
--lr_scheduler="constant" \
--lr_warmup_steps=0 \
--max_train_steps=400
......
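A note on device numbering in the command above: CUDA_VISIBLE_DEVICES=1 hides every GPU except physical GPU 1, and the CUDA runtime renumbers the visible devices starting from 0. A process launched this way would therefore be expected to label its only device "GPU 0" even though it is running on physical GPU 1. The sketch below illustrates that mapping in plain Python. `physical_gpu_for` is a hypothetical helper written for this illustration, not part of PyTorch or accelerate, and it assumes the variable holds a comma-separated list of integer indices (UUID entries and the effect of CUDA_DEVICE_ORDER are not handled).

```python
import os


def physical_gpu_for(logical_index, visible=None):
    """Map a logical CUDA index (as printed in error messages) to the
    physical GPU id listed in CUDA_VISIBLE_DEVICES.

    Hypothetical helper for illustration only.
    """
    if visible is None:
        visible = os.environ.get("CUDA_VISIBLE_DEVICES")
    if visible is None:
        # No mask set: logical and physical indices coincide.
        return str(logical_index)
    ids = [part.strip() for part in visible.split(",") if part.strip()]
    # Raises IndexError if the logical index is not visible to the process.
    return ids[logical_index]


# With CUDA_VISIBLE_DEVICES=1, logical GPU 0 is physical GPU 1.
print(physical_gpu_for(0, visible="1"))
# With CUDA_VISIBLE_DEVICES=2,3, logical GPU 1 is physical GPU 3.
print(physical_gpu_for(1, visible="2,3"))
```

Under that reading, "GPU 0" in the traceback does not by itself show that the wrong card was used; checking nvidia-smi while the job runs would confirm which physical GPU is busy.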
Reproduction
Run the command above. Please help. What should I do?
Logs
Traceback (most recent call last):
File "/home/zhuojunjie/3-oraimo/diffusers/examples/dreambooth/train_dreambooth.py", line 713, in <module>
main(args)
File "/home/zhuojunjie/3-oraimo/diffusers/examples/dreambooth/train_dreambooth.py", line 632, in main
model_pred = unet(noisy_latents, timesteps, encoder_hidden_states).sample
File "/root/miniconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/root/miniconda3/lib/python3.9/site-packages/accelerate/utils/operations.py", line 490, in __call__
return convert_to_fp32(self.model_forward(*args, **kwargs))
File "/root/miniconda3/lib/python3.9/site-packages/torch/amp/autocast_mode.py", line 14, in decorate_autocast
return func(*args, **kwargs)
File "/root/miniconda3/lib/python3.9/site-packages/diffusers/models/unet_2d_condition.py", line 367, in forward
sample = upsample_block(
File "/root/miniconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/root/miniconda3/lib/python3.9/site-packages/diffusers/models/unet_2d_blocks.py", line 1255, in forward
hidden_states = attn(hidden_states, encoder_hidden_states=encoder_hidden_states).sample
File "/root/miniconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/root/miniconda3/lib/python3.9/site-packages/diffusers/models/attention.py", line 219, in forward
hidden_states = block(hidden_states, context=encoder_hidden_states, timestep=timestep)
File "/root/miniconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/root/miniconda3/lib/python3.9/site-packages/diffusers/models/attention.py", line 477, in forward
hidden_states = self.attn1(norm_hidden_states) + hidden_states
File "/root/miniconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/root/miniconda3/lib/python3.9/site-packages/diffusers/models/attention.py", line 572, in forward
hidden_states = self._attention(query, key, value)
File "/root/miniconda3/lib/python3.9/site-packages/diffusers/models/attention.py", line 593, in _attention
hidden_states = torch.bmm(attention_probs, value)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 810.00 MiB (GPU 0; 23.70 GiB total capacity; 21.84 GiB already allocated; 287.69 MiB free; 22.07 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
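To make the numbers in this message easier to compare, here is a small sketch that parses the figures out of the error text and converts them to MiB. `parse_oom` and `OOM_LINE` are hypothetical names introduced for this example; the regular expressions assume the exact wording of the message shown above and may not match other PyTorch versions.

```python
import re

# The relevant part of the error message from the traceback above.
OOM_LINE = (
    "CUDA out of memory. Tried to allocate 810.00 MiB (GPU 0; "
    "23.70 GiB total capacity; 21.84 GiB already allocated; "
    "287.69 MiB free; 22.07 GiB reserved in total by PyTorch)"
)

UNIT_TO_MIB = {"MiB": 1.0, "GiB": 1024.0}
NUMBER = r"([\d.]+) (MiB|GiB)"


def parse_oom(message):
    """Return the memory figures from an OOM message, all in MiB."""

    def grab(pattern):
        match = re.search(pattern, message)
        if match is None:
            raise ValueError(f"pattern not found: {pattern}")
        value, unit = match.groups()
        return float(value) * UNIT_TO_MIB[unit]

    return {
        "requested": grab(r"Tried to allocate " + NUMBER),
        "total": grab(NUMBER + r" total capacity"),
        "allocated": grab(NUMBER + r" already allocated"),
        "free": grab(NUMBER + r" free"),
        "reserved": grab(NUMBER + r" reserved"),
    }


stats = parse_oom(OOM_LINE)
for name, mib in stats.items():
    print(f"{name:>10}: {mib:9.2f} MiB")
# The request is larger than what is free on the device.
print("request exceeds free memory:", stats["requested"] > stats["free"])
```

Read this way, the 810 MiB request exceeds the 287.69 MiB that is free, and the reserved total is only slightly above the allocated total, so fragmentation (the case the max_split_size_mb hint targets) seems unlikely to be the main cause; the card appears to be nearly full from this run at resolution 768.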
Steps: 0%| | 0/400 [00:02<?, ?it/s]
Traceback (most recent call last):
File "/root/miniconda3/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/root/miniconda3/lib/python3.9/site-packages/accelerate/commands/accelerate_cli.py", line 45, in main
args.func(args)
File "/root/miniconda3/lib/python3.9/site-packages/accelerate/commands/launch.py", line 1104, in launch_command
simple_launcher(args)
File "/root/miniconda3/lib/python3.9/site-packages/accelerate/commands/launch.py", line 567, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/root/miniconda3/bin/python', 'train_dreambooth.py', '--pretrained_model_name_or_path=stabilityai/stable-diffusion-2', '--instance_data_dir=/home/zhuojunjie/datasets/origin', '--output_dir=/home/zhuojunjie/3-oraimo/diffusers/output', '--instance_prompt=a photo of sks electric shave', '--resolution=768', '--train_batch_size=1', '--gradient_accumulation_steps=1', '--learning_rate=5e-6', '--lr_scheduler=constant', '--lr_warmup_steps=0', '--max_train_steps=400']' returned non-zero exit status 1.
System Info
no
