[Dreambooth Example] Attempting to unscale FP16 gradients. #1246

@jpiabrantes

Describe the bug

The training script was working fine, but after I updated diffusers to 0.7.2 I get the following error:

Traceback (most recent call last):
  File "/tmp/pycharm_project_990/train_dreambooth.py", line 938, in <module>
    main(args)
  File "/tmp/pycharm_project_990/train_dreambooth.py", line 876, in main
    optimizer.step()
  File "/opt/conda/envs/dreambooth/lib/python3.7/site-packages/accelerate/optimizer.py", line 134, in step
    self.scaler.step(self.optimizer, closure)
  File "/opt/conda/envs/dreambooth/lib/python3.7/site-packages/torch/cuda/amp/grad_scaler.py", line 337, in step
    self.unscale_(optimizer)
  File "/opt/conda/envs/dreambooth/lib/python3.7/site-packages/torch/cuda/amp/grad_scaler.py", line 282, in unscale_
    optimizer_state["found_inf_per_device"] = self._unscale_grads_(optimizer, inv_scale, found_inf, False)
  File "/opt/conda/envs/dreambooth/lib/python3.7/site-packages/torch/cuda/amp/grad_scaler.py", line 210, in _unscale_grads_
    raise ValueError("Attempting to unscale FP16 gradients.")
ValueError: Attempting to unscale FP16 gradients.
Steps:   0%|          | 0/800 [00:18<?, ?it/s]

Any ideas, or do I need to downgrade?
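
For context, PyTorch's `GradScaler` raises this `ValueError` when it is asked to unscale a gradient that is itself stored in float16. One way to narrow this down is to list the trainable parameters that are held in float16 before they are handed to the optimizer. The sketch below assumes that is the cause here, which this report does not confirm; the helper name `find_fp16_trainable_params` is hypothetical and not part of diffusers or accelerate.

```python
from typing import List

import torch


def find_fp16_trainable_params(model: torch.nn.Module) -> List[str]:
    """Return the names of parameters that require grad and are stored in float16.

    Hypothetical diagnostic helper: gradients of such parameters are also
    float16, which is the condition GradScaler refuses to unscale.
    """
    return [
        name
        for name, param in model.named_parameters()
        if param.requires_grad and param.dtype == torch.float16
    ]


if __name__ == "__main__":
    # Stand-in model for illustration; in the training script this would be
    # whichever module is passed to the optimizer.
    model = torch.nn.Linear(4, 2).half()
    print(find_fp16_trainable_params(model))  # names of the fp16 trainable params

    # Casting the trainable weights back to float32 clears the list.
    model.float()
    print(find_fp16_trainable_params(model))
```

If the list is non-empty for the model being trained, keeping those weights in float32 and leaving the half-precision casting to the mixed-precision machinery may avoid the error, though that has not been verified against this setup.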

Reproduction

No response

Logs

No response

System Info

diffusers 0.7.2
python 3.7.12
accelerate 0.14.0

Labels

bug, training
