Update transformer_flux.py. Change float64 to float32 #9133
Conversation
dtype=torch.float64 is overkill, and float64 is not defined for certain devices such as Apple Silicon mps.
just a note that macos 14 and pytorch 2.4 or greater still can't do it. but i think macos 15 can, or pytorch 2.3.1 with macos 14 - but then training uses a lot more vram. edit: no, macos 15 is still broken - don't upgrade to fix it.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
This fix should make the model runnable on MPS. Tested with https://gist.github.com/hvaara/bc8754b2aab6ef07a95c82c5e436f6d3, running macOS 14.6. @bghira Does it work for you with the patch from this PR and the code from my gist? Need ~45 GB VRAM. If not, what error are you seeing?
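(For readers without access to the gist: a minimal sketch of what such an MPS test typically looks like with diffusers. The model id, dtype, and generation settings below are assumptions for illustration, not necessarily what the gist uses.)

```python
import torch
from diffusers import FluxPipeline

# Assumed checkpoint and settings; the linked gist may differ.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)
pipe.to("mps")  # Apple Silicon GPU backend

image = pipe(
    "A cat holding a sign that says hello world",
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("flux_mps.png")
```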
oh roy 🙉 we meet again. i see no error, it's just that the image is all noise
Haha! Indeed we do 😂 Are you using the latest diffusers and transformers? Are you sure your weights are not corrupted? With the code change by OP and the script in my gist, I get great images using MPS as the accelerator. I actually came here to contribute the exact same change as OP 😅
all day every day on git branches.. latest and hot off the wire. pytorch nightly, latest diffusers/transformers, macos 15 beta 4.
The issue @bghira experienced has been identified as a bug in PyTorch. I will open a bug report and propose a fix upstream.
Follow pytorch/pytorch#133520 for updates on the noisy output image issue.
Thanks!
I would advocate for this myself, and perhaps also log a warning that we're reducing the precision here and results may be unexpected, with a reference to this PR. WDYT?
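(A hedged sketch of what such a warning could look like, using diffusers' logging utilities; the helper function and its placement are hypothetical, not part of the actual change.)

```python
import torch
from diffusers.utils import logging

logger = logging.get_logger(__name__)

def rope_freqs_dtype(device: torch.device) -> torch.dtype:
    # Hypothetical helper: choose the dtype for the rotary-embedding
    # frequency table and warn when precision has to be reduced,
    # since the mps backend does not implement float64.
    if device.type == "mps":
        logger.warning(
            "float64 is not supported on mps; computing rotary embeddings in "
            "float32 instead. Results may differ slightly (see PR #9133)."
        )
        return torch.float32
    return torch.float64
```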
Did some testing. I don't know how much of an impact it has, but overall I think the images generated with float64 look the best.
[Image grids comparing float16, float32, and float64 outputs for two prompts, "A cat" and "A landscape" (the latter prompt was cut off by CLIP).]
cc @asomoza here - my eye couldn't spot any quality difference between the float64 and float32 outputs. Context is that I'm refactoring flux to use
some models do better with fp16 rope embeds. a transformer config option maybe? |
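(Purely to illustrate that suggestion, a hypothetical config knob; no such option exists in the library.)

```python
import torch

# Hypothetical mapping from a config string to the rope-embedding dtype.
ROPE_DTYPES = {"fp16": torch.float16, "fp32": torch.float32, "fp64": torch.float64}

class FluxTransformerConfig:
    # Sketch of a transformer config exposing the rope dtype as an option.
    def __init__(self, rope_dtype: str = "fp32"):
        if rope_dtype not in ROPE_DTYPES:
            raise ValueError(f"rope_dtype must be one of {sorted(ROPE_DTYPES)}")
        self.rope_dtype = ROPE_DTYPES[rope_dtype]
```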
@yiyixuxu, I did some tests with flux dev and I also don't see a really big difference, but if I have to choose one, I also think that float64 is better if I really look into some tiny details.
* refactor rotary embeds
* adding jsmidt as co-author of this PR for #9133

Co-authored-by: Sayak Paul <[email protected]>
Co-authored-by: Joseph Smidt <[email protected]>
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed, please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
did anything remove the need for this on mps?
Yes, #9074 solved it. This PR can be closed. |
What does this PR do?
dtype=torch.float64 is overkill, and float64 is not defined for certain devices such as Apple Silicon mps. This change enables the flux pipeline to run on such devices without negative consequences.
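For context, a minimal sketch of the kind of one-line change involved. This is illustrative rather than a verbatim copy of transformer_flux.py; the function mirrors the usual Flux rotary-embedding helper, but details may differ from the actual file.

```python
import torch

def rope(pos: torch.Tensor, dim: int, theta: int) -> torch.Tensor:
    """Rotary positional embedding table (illustrative sketch)."""
    assert dim % 2 == 0, "The dimension must be even."
    # The PR changes this dtype from torch.float64 to torch.float32:
    # float64 is not implemented on the mps backend, and the extra
    # precision is unnecessary for the frequency table.
    scale = torch.arange(0, dim, 2, dtype=torch.float32, device=pos.device) / dim
    omega = 1.0 / (theta ** scale)
    out = torch.einsum("...n,d->...nd", pos.float(), omega)
    cos_out, sin_out = torch.cos(out), torch.sin(out)
    # 2x2 rotation matrices per (position, frequency) pair.
    stacked = torch.stack([cos_out, -sin_out, sin_out, cos_out], dim=-1)
    return stacked.view(*pos.shape, dim // 2, 2, 2).float()

# Example: position ids for an 8-token sequence, 16-dim rotary head.
freqs = rope(torch.arange(8).view(1, 8), dim=16, theta=10000)
```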