Description
Describe the bug
Discussed in #3437 (comment). It appears that memory usage increases significantly when LoRA is used, even in environments with xFormers enabled. I investigated the cause using the script linked under Reproduction. The results suggest that once LoRA is loaded, an environment with xFormers enabled effectively ends up in the same situation as if xFormers had been deactivated: the peak memory figures for xformers: ON / lora: ON match the xformers: OFF figures exactly.
As a solution, it seems good to use LoRAXFormersAttnProcessor instead of LoRAAttnProcessor when xFormers is enabled, in this part:
diffusers/src/diffusers/loaders.py, lines 275 to 286 in a94977b:

    if isinstance(
        attn_processor, (AttnAddedKVProcessor, SlicedAttnAddedKVProcessor, AttnAddedKVProcessor2_0)
    ):
        cross_attention_dim = value_dict["add_k_proj_lora.down.weight"].shape[1]
        attn_processor_class = LoRAAttnAddedKVProcessor
    else:
        cross_attention_dim = value_dict["to_k_lora.down.weight"].shape[1]
        attn_processor_class = LoRAAttnProcessor

    attn_processors[key] = attn_processor_class(
        hidden_size=hidden_size, cross_attention_dim=cross_attention_dim, rank=rank
    )
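One possible way to express the idea, written here as a hypothetical helper (pick_lora_processor_class is not part of diffusers; it only sketches the isinstance check one could add to the else branch above, assuming attn_processor is the processor currently set on the module, as in the surrounding load_attn_procs code):

    from diffusers.models.attention_processor import (
        LoRAAttnProcessor,
        LoRAXFormersAttnProcessor,
        XFormersAttnProcessor,
    )

    def pick_lora_processor_class(current_processor):
        """Hypothetical helper: choose the LoRA processor class so that
        memory-efficient attention is preserved when xFormers is active."""
        if isinstance(current_processor, (XFormersAttnProcessor, LoRAXFormersAttnProcessor)):
            return LoRAXFormersAttnProcessor
        return LoRAAttnProcessor

    # In load_attn_procs, the else branch above would then become:
    #     cross_attention_dim = value_dict["to_k_lora.down.weight"].shape[1]
    #     attn_processor_class = pick_lora_processor_class(attn_processor)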
What do you think?
Reproduction
https://gist.github.com/takuma104/e2139bda7f74cd977350e18500156683
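For reference, a minimal sketch of the kind of measurement the gist performs (not the gist's exact code; the model id and LoRA path are placeholders, and a CUDA GPU with fp16 weights is assumed):

    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")
    pipe.enable_xformers_memory_efficient_attention()  # xformers: ON
    pipe.unet.load_attn_procs("path/to/lora")          # lora: ON (replaces the attention processors)

    torch.cuda.reset_peak_memory_stats()
    pipe("a photo of a cat", width=512, height=768, num_images_per_prompt=2)
    print({"xformers": "ON", "lora": "ON", "mem_MB": torch.cuda.max_memory_allocated() // 2**20})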
Logs
{"width": 512, "height": 512, "batch": 1, "xformers": "OFF", "lora": "OFF", "mem_MB": 3837}
{"width": 512, "height": 512, "batch": 1, "xformers": "OFF", "lora": "ON", "mem_MB": 3837}
{"width": 512, "height": 768, "batch": 1, "xformers": "OFF", "lora": "OFF", "mem_MB": 5878}
{"width": 512, "height": 768, "batch": 1, "xformers": "OFF", "lora": "ON", "mem_MB": 5880}
{"width": 512, "height": 512, "batch": 2, "xformers": "OFF", "lora": "OFF", "mem_MB": 5505}
{"width": 512, "height": 512, "batch": 2, "xformers": "OFF", "lora": "ON", "mem_MB": 5507}
{"width": 512, "height": 768, "batch": 2, "xformers": "OFF", "lora": "OFF", "mem_MB": 9589}
{"width": 512, "height": 768, "batch": 2, "xformers": "OFF", "lora": "ON", "mem_MB": 9591}
{"width": 512, "height": 512, "batch": 4, "xformers": "OFF", "lora": "OFF", "mem_MB": 8842}
{"width": 512, "height": 512, "batch": 4, "xformers": "OFF", "lora": "ON", "mem_MB": 8844}
{"width": 512, "height": 768, "batch": 4, "xformers": "OFF", "lora": "OFF", "mem_MB": 17011}
{"width": 512, "height": 768, "batch": 4, "xformers": "OFF", "lora": "ON", "mem_MB": 17013}
{"width": 512, "height": 512, "batch": 1, "xformers": "ON", "lora": "OFF", "mem_MB": 2806}
{"width": 512, "height": 512, "batch": 1, "xformers": "ON", "lora": "ON", "mem_MB": 3837}
{"width": 512, "height": 768, "batch": 1, "xformers": "ON", "lora": "OFF", "mem_MB": 3125}
{"width": 512, "height": 768, "batch": 1, "xformers": "ON", "lora": "ON", "mem_MB": 5880}
{"width": 512, "height": 512, "batch": 2, "xformers": "ON", "lora": "OFF", "mem_MB": 3243}
{"width": 512, "height": 512, "batch": 2, "xformers": "ON", "lora": "ON", "mem_MB": 5507}
{"width": 512, "height": 768, "batch": 2, "xformers": "ON", "lora": "OFF", "mem_MB": 3780}
{"width": 512, "height": 768, "batch": 2, "xformers": "ON", "lora": "ON", "mem_MB": 9591}
{"width": 512, "height": 512, "batch": 4, "xformers": "ON", "lora": "OFF", "mem_MB": 4317}
{"width": 512, "height": 512, "batch": 4, "xformers": "ON", "lora": "ON", "mem_MB": 8844}
{"width": 512, "height": 768, "batch": 4, "xformers": "ON", "lora": "OFF", "mem_MB": 5392}
{"width": 512, "height": 768, "batch": 4, "xformers": "ON", "lora": "ON", "mem_MB": 17013}
System Info
- diffusers version: 0.16.1
- Platform: Linux-5.19.0-41-generic-x86_64-with-glibc2.35
- Python version: 3.10.11
- PyTorch version (GPU RTX3090): 2.0.1+cu117 (True)
- Huggingface_hub version: 0.14.1
- Transformers version: 4.29.1
- Accelerate version: 0.19.0
- xFormers version: 0.0.20
- Using GPU in script?: True
- Using distributed or parallel set-up in script?: Nope