dreambooth upscaling fix added latents #3659


Conversation

williamberman
Contributor

re: #3658

Also added some docs re: a recent, much more successful run doing full fine-tuning for stage II with larger batch sizes:
https://wandb.ai/williamberman/dreambooth/runs/4bwrpro7/overview?workspace=user-williamberman

I didn't have as much success with LoRA at larger batch sizes, but the docs note that we recommend full fine-tuning for stage II for faces anyway:
https://wandb.ai/williamberman/dreambooth-lora/runs/cdtpa4l7?workspace=user-williamberman

Unfortunately, the docs are a bit "here's all the information" rather than "here's how to do a successful training run", but I think that's the best they can be right now.

Comment on lines +1214 to +1215
if unet.config.in_channels == channels * 2:
noisy_model_input = torch.cat([noisy_model_input, noisy_model_input], dim=1)
Contributor Author

Just doubling the existing latents is sufficient, as we use the same timestep for both of the conditioning inputs!

Theoretically, we could do something along the lines of training on two separate noising amounts for each set of channels, or biasing one towards a lower amount of noise, since the pipeline's default is for a quarter of the total noise to be added to the upsampled input.

However, this is the simplest implementation for now so I think it's ok!
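
For context, here is a minimal, self-contained sketch of the shape logic in the snippet above, with dummy tensors and hypothetical shapes standing in for the training script's `unet` and `noisy_model_input`:

```python
import torch

# Hypothetical shapes for illustration only; the real script gets these from the
# UNet config and the noise scheduler.
batch_size, channels, height, width = 2, 3, 64, 64
unet_in_channels = 6  # stage II UNet also takes the upscaled conditioning image on the channel dim

noisy_model_input = torch.randn(batch_size, channels, height, width)

# Because both halves are noised with the same timestep here, duplicating the
# noisy latents along the channel dim is enough to fill the conditioning channels.
if unet_in_channels == channels * 2:
    noisy_model_input = torch.cat([noisy_model_input, noisy_model_input], dim=1)

assert noisy_model_input.shape[1] == unet_in_channels
```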


HuggingFaceDocBuilderDev commented Jun 3, 2023

The documentation is not available anymore as the PR was closed or merged.

@williamberman williamberman requested review from pcuenca, sayakpaul, patil-suraj, yiyixuxu and patrickvonplaten and removed request for patil-suraj June 3, 2023 05:40

For stage II, we find that lower learning rates are also needed.

We found experimentally that the DDPM scheduler with the default larger number of denoising steps sometimes works better than the DPM Solver scheduler.
Contributor

Nice!
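
As a hedged illustration of the scheduler swap mentioned in the quoted doc note (the checkpoint path below is a placeholder, not from this PR):

```python
from diffusers import DiffusionPipeline, DDPMScheduler

# Load the fine-tuned stage II checkpoint (placeholder path) and switch the
# pipeline to the DDPM scheduler with its default number of denoising steps.
pipe = DiffusionPipeline.from_pretrained("path/to/dreambooth_upscale_checkpoint")
pipe.scheduler = DDPMScheduler.from_config(pipe.scheduler.config)
```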

@@ -682,8 +690,8 @@ accelerate launch train_dreambooth.py \
   --instance_prompt="a sks dog" \
   --resolution=256 \
   --train_batch_size=2 \
-  --gradient_accumulation_steps=2 \
-  --learning_rate=1e-8 \
+  --gradient_accumulation_steps=6 \
Contributor

Cool, makes sense that a bigger batch size will help here.
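
For reference (assuming single-process training, which the diff doesn't state), the effective batch size here is train_batch_size × gradient_accumulation_steps = 2 × 6 = 12, up from 2 × 2 = 4 before the change.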

Contributor

@patrickvonplaten patrickvonplaten left a comment

Very nice find!

Member

@pcuenca pcuenca left a comment

Great experimentation!
