Feature request
Implement LoRA-GA (Low-Rank Adaptation with Gradient Approximation), a novel method that uses gradient information to initialize LoRA adapters. Reference: https://arxiv.org/abs/2407.05000
Instead of random initialization, LoRA-GA:
- Estimates the gradients of the frozen weights on a small number of training batches (typically 64-128)
- Performs SVD on the gradient matrices to extract principal components
- Initializes the adapters from these components, aligning the initial update direction with that of full fine-tuning
This gradient-aligned initialization allows the model to converge considerably faster in subsequent training than the default random initialization.
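For a single linear layer, a minimal PyTorch sketch of the procedure might look like the following. It assumes the paper's SVD split (`B` from the top-r left singular vectors, `A` from the next r right singular vectors, with the frozen weight adjusted so the adapted output is unchanged at step 0); `lora_ga_init` and the toy gradient-estimation loop are illustrative, not the proposed PEFT API:

```python
import torch

@torch.no_grad()
def lora_ga_init(weight: torch.Tensor, grad: torch.Tensor, r: int, scaling: float = 1.0):
    """Build LoRA factors from the SVD of an estimated gradient (illustrative).

    weight: (out, in) frozen base weight W
    grad:   (out, in) loss gradient w.r.t. W, averaged over sampled batches
    Returns A (r, in), B (out, r), and the adjusted frozen weight.
    """
    U, S, Vh = torch.linalg.svd(grad.float(), full_matrices=False)
    B = U[:, :r].contiguous()        # top-r left singular vectors
    A = Vh[r:2 * r, :].contiguous()  # next r right singular vectors
    # Subtract the initial product so that W_adj + scaling * B @ A equals the
    # original W, keeping the model's output unchanged before training starts.
    W_adj = weight - scaling * (B @ A).to(weight.dtype)
    return A, B, W_adj

# Toy gradient estimation: accumulate an averaged gradient over a few batches.
torch.manual_seed(0)
layer = torch.nn.Linear(64, 64, bias=False)
num_batches = 4
for _ in range(num_batches):
    x = torch.randn(8, 64)
    (layer(x).pow(2).mean() / num_batches).backward()

A, B, W_adj = lora_ga_init(layer.weight, layer.weight.grad, r=8)
print(A.shape, B.shape)  # torch.Size([8, 64]) torch.Size([64, 8])
```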
Your contribution
I plan to submit a PR that integrates LoRA-GA into PEFT.
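If the integration follows the pattern PEFT already uses for alternative initializations (e.g. `init_lora_weights="pissa"`), usage could look roughly like this; the `"lora_ga"` value is hypothetical until the PR lands:

```python
# Hypothetical API sketch: the "lora_ga" init value does not exist in PEFT
# yet and is shown only to illustrate the intended integration point.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    init_lora_weights="lora_ga",  # hypothetical: gradient-aligned init
)
model = get_peft_model(base, config)
```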