From b0a04617c84961e11ded94ae501d5a1a90036bf4 Mon Sep 17 00:00:00 2001
From: Nicolas Patry
Date: Thu, 26 Jan 2023 15:34:35 +0100
Subject: [PATCH 1/5] Tmp.

---
 docs/source/en/_toctree.yml                |  2 ++
 .../en/using-diffusers/using_safetensors   | 19 +++++++++++++++++++
 2 files changed, 21 insertions(+)
 create mode 100644 docs/source/en/using-diffusers/using_safetensors

diff --git a/docs/source/en/_toctree.yml b/docs/source/en/_toctree.yml
index f2c99db9ab75..609cc9af7a0f 100644
--- a/docs/source/en/_toctree.yml
+++ b/docs/source/en/_toctree.yml
@@ -38,6 +38,8 @@
     title: Community Pipelines
   - local: using-diffusers/contribute_pipeline
     title: How to contribute a Pipeline
+  - local: using-diffusers/using_safetensors
+    title: Using safetensors
   title: Pipelines for Inference
 - sections:
   - local: using-diffusers/rl
diff --git a/docs/source/en/using-diffusers/using_safetensors b/docs/source/en/using-diffusers/using_safetensors
new file mode 100644
index 000000000000..b6b165dabc72
--- /dev/null
+++ b/docs/source/en/using-diffusers/using_safetensors
@@ -0,0 +1,19 @@
+# What is safetensors?
+
+[safetensors](https://github.com/huggingface/safetensors) is a different format
+from the classic `.bin` PyTorch format, which relies on pickle.
+
+Pickle is notoriously unsafe: a malicious file can execute arbitrary code when loaded.
+The Hub itself tries to prevent such issues, but it's not a silver bullet.
+
+The first and foremost goal of `safetensors` is to make loading machine learning models *safe*,
+in the sense that loading a model can never take over your computer.
+
+# Why use safetensors?
+
+**Safety** can be one reason: if you're using a model that isn't well known and
+you're not sure about the source of the file.
+
+A secondary reason is **the speed of loading**. Safetensors can load models much faster
+than regular pickle files. If you spend a lot of time switching models, this can be
+a huge time saver.
From 310617db9d7fcbe6ba6c58608eccb31e95764aab Mon Sep 17 00:00:00 2001
From: Nicolas Patry
Date: Thu, 26 Jan 2023 17:14:08 +0100
Subject: [PATCH 2/5] Adding more docs.

---
 .../en/using-diffusers/using_safetensors.mdx | 85 +++++++++++++++++++
 1 file changed, 85 insertions(+)
 create mode 100644 docs/source/en/using-diffusers/using_safetensors.mdx

diff --git a/docs/source/en/using-diffusers/using_safetensors.mdx b/docs/source/en/using-diffusers/using_safetensors.mdx
new file mode 100644
index 000000000000..eac8fbf0fd59
--- /dev/null
+++ b/docs/source/en/using-diffusers/using_safetensors.mdx
@@ -0,0 +1,85 @@
+# What is safetensors?
+
+[safetensors](https://github.com/huggingface/safetensors) is a different format
+from the classic `.bin` PyTorch format, which relies on pickle. It contains
+exactly the same data: just the model weights (or tensors).
+
+Pickle is notoriously unsafe: a malicious file can execute arbitrary code when loaded.
+The Hub itself tries to prevent such issues, but it's not a silver bullet.
+
+The first and foremost goal of `safetensors` is to make loading machine learning models *safe*,
+in the sense that loading a model can never take over your computer.
+
+Hence the name.
+
+# Why use safetensors?
+
+**Safety** can be one reason: if you're using a model that isn't well known and
+you're not sure about the source of the file.
+
+A secondary reason is **the speed of loading**. Safetensors can load models much faster
+than regular pickle files. If you spend a lot of time switching models, this can be
+a huge time saver.
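The pickle risk mentioned above is easy to demonstrate: unpickling can invoke an arbitrary callable chosen by whoever wrote the file. Here is a minimal, harmless sketch (the `run_payload` helper and the fake command are invented for illustration; a real attack would call something like `os.system` instead):

```python
import pickle

RAN = []  # records what "ran"; a real attacker would not be this polite


def run_payload(cmd):
    # Harmless stand-in for os.system: just record the command.
    RAN.append(cmd)
    return 0


class Malicious:
    """Unpickling an instance of this class executes attacker-chosen code."""

    def __reduce__(self):
        # pickle stores (callable, args) and calls callable(*args) at load time.
        return (run_payload, ("rm -rf / (simulated)",))


blob = pickle.dumps(Malicious())
result = pickle.loads(blob)  # merely loading the bytes runs run_payload(...)
print(RAN)  # ['rm -rf / (simulated)']
```

Nothing in the safetensors format has an equivalent of `__reduce__`: loading it never calls back into user code.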
TODO: Numbers are not final
```
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1", use_safetensors=True)
# Loaded in safetensors 0:00:01.998426
# Loaded in PyTorch 0:00:05.339772
#
pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1", use_safetensors=True, device=0)
# Loaded in safetensors 0:00:01.998426 <--- this requires the special environment variable SAFETENSORS_FAST_GPU=1 to be set, as this particular fast loading method hasn't been properly audited (yet).
# Loaded in PyTorch 0:00:05.339772
```

Performance in general is a tricky business, and there are a few things to understand:

- If you're using the model for the first time from the Hub, you will have to download the weights.
  Downloading is extremely likely to be much slower than any loading method, so you will not see any difference.
- If you're loading the model for the first time after, say, a reboot, then your machine will have to
  actually read from disk. That is likely to be equally slow in both cases, so again the speed difference may not be very visible (it depends on your hardware and the actual model).
- The biggest performance benefit comes when the model was already loaded on your computer and you're switching from one model to another. Since reading from disk is slow, your OS tries hard to avoid it by keeping the files around in RAM, which makes loading them again much faster. And since safetensors does zero-copy of the tensors, reloading will be faster than PyTorch, which has at least one extra copy to do.

# How to use safetensors?

If you have `safetensors` installed and all the weights are available in `safetensors` format,
then they will be used by default instead of the PyTorch weights.

If you want to make **sure** you're not using PyTorch weights, you can pass `use_safetensors=True` to your loading method.
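To see why reading this format is safe, it helps to know what is actually on disk. Per the published safetensors format description, a file is an 8-byte little-endian header length, a JSON header describing each tensor, then the raw tensor bytes. The sketch below builds and parses a tiny file by hand with only the standard library (the tensor name and values are invented for illustration):

```python
import json
import struct

# Build a minimal safetensors-style payload by hand:
# [8-byte little-endian header length][JSON header][raw tensor data].
header = {"weight": {"dtype": "F32", "shape": [2], "data_offsets": [0, 8]}}
header_bytes = json.dumps(header).encode("utf-8")
data = struct.pack("<2f", 1.0, 2.0)  # two little-endian float32 values
blob = struct.pack("<Q", len(header_bytes)) + header_bytes + data

# Reading it back needs only json + struct: no code is ever executed.
(n,) = struct.unpack("<Q", blob[:8])
parsed = json.loads(blob[8 : 8 + n])
start, end = parsed["weight"]["data_offsets"]
values = struct.unpack("<2f", blob[8 + n + start : 8 + n + end])
print(parsed["weight"]["shape"], values)  # [2] (1.0, 2.0)
```

Because the header is plain JSON and the rest is raw bytes, a loader can validate everything up front, and the zero-copy behavior mentioned above falls out naturally: tensors can be mapped straight from the data section.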
If you are really paranoid about this, the ultimate weapon would be disabling `torch.load`:
```python

import torch
def _raise():
    raise RuntimeError("I don't want to use pickle")
torch.load = lambda *args, **kwargs: _raise()
```
but in general the previous `diffusers` options should be enough.

# I want to use model X but it doesn't have safetensors weights.

Just go to this [space](https://huggingface.co/spaces/safetensors/convert).
This will create a new PR with the weights, let's say `refs/pr/22`.

This space will download the pickled version, convert it, and upload it on the Hub as a PR.
If anything malicious is contained in the file, it's the Hugging Face Hub that runs into issues, not your own computer.
And we're equipped to deal with it.

Then, in order to use the model even before the branch gets accepted by the original author, you can do:

```python
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1", use_safetensors=True, revision="refs/pr/22")
```

And that's it!

Anything unclear, any concerns, or found a bug? [Open an issue](https://github.com/huggingface/diffusers/issues/new/choose)


From 9b74cffd0d0b2dc86b3b33498154c0b16cee6ba4 Mon Sep 17 00:00:00 2001
From: Nicolas Patry
Date: Thu, 26 Jan 2023 17:22:10 +0100
Subject: [PATCH 3/5] Doc style.
--- docs/source/en/using-diffusers/using_safetensors.mdx | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/docs/source/en/using-diffusers/using_safetensors.mdx b/docs/source/en/using-diffusers/using_safetensors.mdx index eac8fbf0fd59..b42c64f43eb8 100644 --- a/docs/source/en/using-diffusers/using_safetensors.mdx +++ b/docs/source/en/using-diffusers/using_safetensors.mdx @@ -51,12 +51,13 @@ If you want to make **sure** you're not using pytorch weights, you can use `use_ If you are really paranoid about this, the ultimate weapon would be disabling `torch.load`: ```python - import torch + def _raise(): raise RuntimeError("I don't want to use pickle") + torch.load = lambda *args, **kwargs: _raise() ``` but in general `diffusers` previous options should be enough. @@ -75,7 +76,9 @@ Then in order to use the model, even before the branch gets accepted by the orig ```python from diffusers import StableDiffusionPipeline -pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1", use_safetensors=True, revision="refs/pr/22") +pipe = StableDiffusionPipeline.from_pretrained( + "stabilityai/stable-diffusion-2-1", use_safetensors=True, revision="refs/pr/22" +) ``` And that's it ! From 552b7805390885a77d61d6f66c3f91ed013d6790 Mon Sep 17 00:00:00 2001 From: Nicolas Patry Date: Fri, 27 Jan 2023 17:28:27 +0100 Subject: [PATCH 4/5] Remove the argument `use_safetensors=True`. --- .../en/using-diffusers/using_safetensors.mdx | 25 ++++++++++--------- 1 file changed, 13 insertions(+), 12 deletions(-) diff --git a/docs/source/en/using-diffusers/using_safetensors.mdx b/docs/source/en/using-diffusers/using_safetensors.mdx index b42c64f43eb8..55b27fae2f55 100644 --- a/docs/source/en/using-diffusers/using_safetensors.mdx +++ b/docs/source/en/using-diffusers/using_safetensors.mdx @@ -21,17 +21,21 @@ And a secondary reason, is **the speed of loading**. Safetensors can load models than regular pickle files. 
 If you spend a lot of time switching models, this can be
 a huge time saver.
 
-TODO: Numbers are not final
+Numbers taken on an AMD EPYC 7742 64-Core Processor
 ```
 from diffusers import StableDiffusionPipeline
 
-pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1", use_safetensors=True)
-# Loaded in safetensors 0:00:01.998426
-# Loaded in PyTorch 0:00:05.339772
-#
-pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1", use_safetensors=True, device=0)
-# Loaded in safetensors 0:00:01.998426 <--- this requires the special environment variable SAFETENSORS_FAST_GPU=1 to be set, as this particular fast loading method hasn't been properly audited (yet).
-# Loaded in PyTorch 0:00:05.339772
+pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1")
+
+# Loaded in safetensors 0:00:02.033658
+# Loaded in PyTorch 0:00:02.663379
+```
+
+This is the entire loading time; the actual time to load 500MB of weights is:
+
+```
+Safetensors: 3.4873ms
+PyTorch: 172.7537ms
 ```
 
 Performance in general is a tricky business, and there are a few things to understand:
@@ -47,8 +51,6 @@ Performance in general is a tricky business, and there are a few things to under
 If you have `safetensors` installed and all the weights are available in `safetensors` format,
 then they will be used by default instead of the PyTorch weights.
 
-If you want to make **sure** you're not using PyTorch weights, you can pass `use_safetensors=True` to your loading method.
-
 If you are really paranoid about this, the ultimate weapon would be disabling `torch.load`:
 ```python
 import torch
@@ -60,7 +62,6 @@ def _raise():
 
 torch.load = lambda *args, **kwargs: _raise()
 ```
-

# I want to use model X but it doesn't have safetensors weights.
@@ -77,7 +78,7 @@ Then in order to use the model, even before the branch gets accepted by the orig from diffusers import StableDiffusionPipeline pipe = StableDiffusionPipeline.from_pretrained( - "stabilityai/stable-diffusion-2-1", use_safetensors=True, revision="refs/pr/22" + "stabilityai/stable-diffusion-2-1", revision="refs/pr/22" ) ``` From ff9aaeee2a13e75ba506f2c13067851c60216a39 Mon Sep 17 00:00:00 2001 From: Nicolas Patry Date: Fri, 27 Jan 2023 17:41:09 +0100 Subject: [PATCH 5/5] doc-builder --- docs/source/en/using-diffusers/using_safetensors.mdx | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/docs/source/en/using-diffusers/using_safetensors.mdx b/docs/source/en/using-diffusers/using_safetensors.mdx index 55b27fae2f55..029d1e84f7d9 100644 --- a/docs/source/en/using-diffusers/using_safetensors.mdx +++ b/docs/source/en/using-diffusers/using_safetensors.mdx @@ -77,9 +77,7 @@ Then in order to use the model, even before the branch gets accepted by the orig ```python from diffusers import StableDiffusionPipeline -pipe = StableDiffusionPipeline.from_pretrained( - "stabilityai/stable-diffusion-2-1", revision="refs/pr/22" -) +pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1", revision="refs/pr/22") ``` And that's it !
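As a dependency-free footnote to the `torch.load` kill-switch shown in the patches above, the same monkeypatch pattern can be exercised against a stand-in object (`torch` below is a `types.SimpleNamespace`, not the real library, so this sketch runs without PyTorch installed):

```python
import types

# Stand-in for the real torch module, so the sketch has no dependencies.
torch = types.SimpleNamespace()


def _raise(*args, **kwargs):
    raise RuntimeError("I don't want to use pickle")


# Same idea as the doc's `torch.load = lambda *args, **kwargs: _raise()`:
# any attempt to pickle-load a file now fails loudly.
torch.load = _raise

try:
    torch.load("model.bin")
except RuntimeError as err:
    print(err)  # I don't want to use pickle
```

Applied to the real `torch` module, this guarantees that any code path silently falling back to pickle weights raises immediately instead of loading them.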