For the last few months, we have been collaborating with our contributors to ensure we support LoRA effectively and efficiently in Diffusers:
1. Training support
✅ DreamBooth (letting users perform LoRA fine-tuning of both the UNet and the text encoder). There were some issues in the text encoder part, which are being fixed in #3437. Thanks to @takuma104.
✅ Vanilla text-to-image fine-tuning. We purposefully support LoRA fine-tuning of only the UNet here, since we assume the number of image-caption pairs is higher than what is typically used for DreamBooth, making text encoder fine-tuning probably overkill. (A sketch of how these scripts attach LoRA layers to the UNet follows below.)
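For illustration, here is a minimal sketch of the pattern these training scripts use to attach low-rank attention processors to the UNet (the base model id and rank below are placeholders; the exact setup in the scripts may differ):

```python
from diffusers import UNet2DConditionModel
from diffusers.models.attention_processor import LoRAAttnProcessor

# Placeholder base model; the training scripts take this as an argument.
unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet"
)

# Create one LoRA attention processor per attention module in the UNet.
lora_attn_procs = {}
for name in unet.attn_processors.keys():
    # Self-attention (attn1) has no cross-attention dimension.
    cross_attention_dim = (
        None if name.endswith("attn1.processor") else unet.config.cross_attention_dim
    )
    if name.startswith("mid_block"):
        hidden_size = unet.config.block_out_channels[-1]
    elif name.startswith("up_blocks"):
        block_id = int(name[len("up_blocks.")])
        hidden_size = list(reversed(unet.config.block_out_channels))[block_id]
    else:  # down_blocks
        block_id = int(name[len("down_blocks.")])
        hidden_size = unet.config.block_out_channels[block_id]
    lora_attn_procs[name] = LoRAAttnProcessor(
        hidden_size=hidden_size, cross_attention_dim=cross_attention_dim, rank=4
    )

# Only these LoRA parameters are trained; the base UNet stays frozen.
unet.set_attn_processor(lora_attn_procs)
```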
2. Interoperability
With #3437, we're introducing limited support for loading A1111 CivitAI checkpoints with `pipeline.load_lora_weights()`. This has been a widely requested feature (see #3064 as an example).
We also provide a `convert_lora_safetensor_to_diffusers.py` script that converts A1111 LoRA checkpoints (coverage is potentially non-exhaustive) and merges them into the text encoder and the UNet of a `DiffusionPipeline`. However, this doesn't allow switching the attention processors back to the defaults, unlike the native support in Diffusers. Check out https://huggingface.co/docs/diffusers/main/en/training/lora for more details. For inference-only and definitive workflows (where one doesn't need to switch attention processors), it caters to many use cases.
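As a quick illustration of the `load_lora_weights()` path, here is a minimal sketch (the model id and the LoRA filename are placeholders for a checkpoint downloaded from, e.g., CivitAI):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Load an A1111-style LoRA checkpoint from a local directory.
# "my_lora.safetensors" is a placeholder filename.
pipe.load_lora_weights(".", weight_name="my_lora.safetensors")

image = pipe("a photo of a corgi wearing a spacesuit").images[0]
```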
3. xformers support for efficient inference
Once LoRA parameters are loaded into a pipeline, xformers should work seamlessly. There was a problem with this, which is fixed in #3556.
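For reference, here is a short sketch of enabling xformers after loading LoRA weights, reusing the `pipe` object from the snippet above:

```python
# Memory-efficient attention via xformers should compose with the loaded LoRA weights.
pipe.enable_xformers_memory_efficient_attention()
image = pipe("a photo of a corgi wearing a spacesuit").images[0]
```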
4. PT 2.0 SDPA optimization
See: #3594
5. `torch.compile()` compatibility with LoRA
Once 4. is settled, we should be able to take advantage of `torch.compile()`.
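A minimal sketch of what this could look like once items 4 and 5 are settled, assuming PyTorch 2.0 (where Diffusers uses `scaled_dot_product_attention` by default); the model id, filename, and prompt are placeholders:

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.load_lora_weights(".", weight_name="my_lora.safetensors")  # placeholder file

# Compile the UNet; together with PyTorch 2.0's SDPA this should give an
# additional speedup once LoRA attention processors are compile-compatible.
pipe.unet = torch.compile(pipe.unet)

image = pipe("a photo of a corgi wearing a spacesuit").images[0]
```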
6. Introduction of `scale` for controlling the contribution of the text encoder LoRA
See #3480. We already support passing `scale` as part of `cross_attention_kwargs` for the UNet LoRA.
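For the UNet LoRA this already works roughly as follows (reusing a pipeline with LoRA weights loaded; the prompt is a placeholder, and `scale=1.0` corresponds to the full LoRA effect while `0.0` falls back to the base weights):

```python
# Dial the LoRA contribution to the UNet down to 50%.
image = pipe(
    "a photo of a corgi wearing a spacesuit",
    cross_attention_kwargs={"scale": 0.5},
).images[0]
```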
7. Supporting multiple LoRAs
@takuma104 proposed a hook-based design here: #3064 (comment)
I hope this helps to provide a consolidated view of where we're at regarding LoRA support in Diffusers.