[Revision] Don't recommend using revision #1764


Merged: 1 commit merged into main from remove_fp16_revision on Dec 19, 2022

Conversation

@patrickvonplaten (Contributor) commented on Dec 19, 2022

We should not recommend using the "fp16" revision anymore, since most model checkpoints don't have such a branch and the behavior is quite unintuitive.
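
For context, a minimal sketch of what the change in recommendation means on the user side, assuming the runwayml/stable-diffusion-v1-5 checkpoint as an example:

    import torch
    from diffusers import StableDiffusionPipeline

    # Previously recommended: pull weights from a dedicated "fp16" branch.
    # pipe = StableDiffusionPipeline.from_pretrained(
    #     "runwayml/stable-diffusion-v1-5", revision="fp16", torch_dtype=torch.float16
    # )

    # After this PR, the docs simply recommend loading in half precision:
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    )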

@HuggingFaceDocBuilderDev commented on Dec 19, 2022

The documentation is not available anymore as the PR was closed or merged.

patrickvonplaten merged commit ce1c27a into main on Dec 19, 2022
patrickvonplaten deleted the remove_fp16_revision branch on December 19, 2022 at 15:26
sliard pushed a commit to sliard/diffusers that referenced this pull request Dec 21, 2022
@keturn (Contributor) commented on Dec 23, 2022

Err, wait, but don't all the flagship Stable Diffusion models serve fp32 weights on their default branch?

The bandwidth and storage advantages of fp16 models are significant: literally half the space. I'm pretty sure it's a significant factor in model loading speed too, at least for people using fp16, which I think is just about everyone running Stable Diffusion on CUDA.

I very much hope you continue to provide these half-precision model weights (for the models for which they are viable, anyway), and I encourage you to not underestimate how much people on commodity hardware will appreciate the lower resource requirements!

That said, you have identified a usability issue here: there's no API to specify your architecture when asking the Hub for a model, there's no well-established convention for which branches or subfolders people use for this, and I didn't have an easy time figuring out how to detect whether a repo has an fp16 branch or not.

@hafriedlander (Contributor) commented:

Agree with @keturn, the download size, disk space, and load speed advantages of fp16 versions are significant. It's not a blocker or anything, but it would make me sad if fp16 stopped being provided. (Although, caveat: it's easy to accidentally download both the fp16 and fp32 models, thereby making two of those worse :) )

@keturn Although I don't do it yet, you should be able to call huggingface_hub.model_info with a revision and catch an error if the fp16 version doesn't exist (unclear how easy it is to distinguish "revision doesn't exist" from a network / Hugging Face issue).
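
A minimal sketch of that check, assuming a huggingface_hub version that exposes RevisionNotFoundError (the helper name has_fp16_revision is made up for illustration):

    from huggingface_hub import model_info
    from huggingface_hub.utils import RevisionNotFoundError

    def has_fp16_revision(repo_id: str) -> bool:
        """Return True if the repo has an "fp16" revision on the Hub."""
        try:
            model_info(repo_id, revision="fp16")
            return True
        except RevisionNotFoundError:
            # The repo exists but has no "fp16" branch or tag.
            return False
        # Other failures (missing repo, auth, network problems) propagate,
        # which keeps "revision doesn't exist" distinguishable from outages.

    print(has_fp16_revision("runwayml/stable-diffusion-v1-5"))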

@pcuenca (Member) commented on Dec 31, 2022

I agree that we should try to download the fp16 versions if they are available and the user specifies a float16 dtype. I believe we should do this consistently with the other variants repos may contain: Flax weights, safetensors format, etc. My approach for it would be:

  • Standardize file names and extensions for the different variants.
  • Select the best pattern according to the current request.
  • If a file with the desired pattern does not exist for a given model, fall back to another option. For example, if no float16 weights are available in a given repo, we'd download the float32 ones.

We don't currently have a general solution for this, just some pieces to handle safetensors vs PyTorch files, but in my opinion it's something we should work on soon.
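
A rough sketch of what that fallback could look like; the filename patterns and the helper itself are hypothetical, not an existing diffusers API:

    from huggingface_hub import list_repo_files

    def pick_weight_file(repo_id: str, want_fp16: bool = True, want_safetensors: bool = True) -> str:
        """Pick the most preferred weight file available in a repo, falling back gradually."""
        files = set(list_repo_files(repo_id))
        # Ordered from most to least preferred.
        candidates = [
            ("diffusion_pytorch_model.fp16.safetensors", want_fp16 and want_safetensors),
            ("diffusion_pytorch_model.fp16.bin", want_fp16),
            ("diffusion_pytorch_model.safetensors", want_safetensors),
            ("diffusion_pytorch_model.bin", True),  # final fallback: float32 PyTorch weights
        ]
        for name, acceptable in candidates:
            if acceptable and name in files:
                return name
        raise FileNotFoundError(f"No known weight file found in {repo_id}")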

@patrickvonplaten (Contributor, Author) commented on Jan 2, 2023

Always one step ahead @keturn 😊

We're currently looking into a better way to support "fp16" directly for download.

Our current thoughts are:

  • Deprecate the usage of "revision" in favor of a new "version" argument, and instead of using branches on the Hub, put all files in the "main" branch.

This will be a bigger refactor/deprecation PR, but it should be better because:

  • a) It's more user-friendly: the community can't easily interact with, and often doesn't know about, branches on the Hugging Face Hub
  • b) It unifies how weights are stored - we currently use both the "main" branch (e.g. for safetensors) and separate branches for other weight versions => that's not a good long-term solution
  • c) It's arguably weird to use branches as a way to provide different versions of a model; analogous GitHub repos don't store different versions of their code on branches (branches should only be used for development)

We will soon open a bigger PR for this to discuss with the community.

Currently the idea is to create the following naming scheme:

    diffusion_model.fp16.bin
and add a "version" argument to from_pretrained
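
A sketch of how that could look from the user side; the "version" keyword is only a proposal here and does not exist in diffusers at the time of writing:

    import torch
    from diffusers import StableDiffusionPipeline

    # Hypothetical usage of the proposed "version" argument, which would resolve
    # variant files such as diffusion_model.fp16.bin from the "main" branch.
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        version="fp16",
        torch_dtype=torch.float16,
    )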

Wdyt about this @keturn @hafriedlander ?

(cc @pcuenca)
