- 
          
- 
        Couldn't load subscription status. 
- Fork 10.9k
[Misc] Enable vLLM to Dynamically Load LoRA from a Remote Server #10546
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| 👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these: 
 🚀 | 
| 
 Hey @khluu , this is my first time opening PR to vLLM. Do I need to pass all the test in BuildKite for the code to be reviewed and merged? I also send you email with subject  Thanks for the help! | 
| IIUC, although this PR is related to LoRA loading, it seems you haven't touched the underlying LORA logic. What you might need is to add unit tests similar to #6566. | 
| 
 Oh if it's just for merging, you can just wait until your PR is reviewed and approved, then your PR reviewer can run CI for you. Fastcheck is more like a subset of CI, running short tests and give you the flexibility to run extra specific test if needed (which you need to be in Buildkite org to do). | 
| @youkaichao do you think we can make this a plugin? | 
| This pull request has merge conflicts that must be resolved before it can be | 
| I would love to see this merged in! | 
| For LoRA specific code, @jeejeelee has more context than me to review the code. From the integration perspective, I think it's okay to create a lora resolver registry, and expose a function to register new lora resolvers. We don't need a new plugin category. General plugin should be enough to call the function to register new lora resolvers. One technical detail is, we should collect all registered resolvers, and try them one by one to see which resolver succeeds. We don't need to let users select the resolver. | 
| Thanks for the feedback @youkaichao From what I understand so far, there are three main areas I'd need to modify for LoRA plugin support: 
 
 
 
 
 
 Does this sounds like a reasonable approach @youkaichao @jeejeelee? Thanks in advance! | 
| @joerunde WDYT | 
| I’ve made the concrete change here: #15733 | 
Signed-off-by: Angky William <[email protected]>
7260dd4    to
    397ed64      
    Compare
  
    | Updated the code with: 
 | 
Signed-off-by: Angky William <[email protected]>
Signed-off-by: Angky William <[email protected]>
Signed-off-by: Angky William <[email protected]>
Signed-off-by: Angky William <[email protected]>
Signed-off-by: Angky William <[email protected]>
Signed-off-by: Angky William <[email protected]>
| @joerunde I’ve added the test for LoRA loading flow via LoRAResolver, along with the corresponding documentation. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Thanks for the plugin support, @jeejeelee @youkaichao what do y'all think about getting this merged?
…test Signed-off-by: Angky William <[email protected]>
Signed-off-by: Angky William <[email protected]>
| Can you add an example in the documentation for such plugin? so we have some examples people can leverage and knowing how to use it. | 
Signed-off-by: Angky William <[email protected]>
Signed-off-by: Angky William <[email protected]>
| @simon-mo I’ve added an example of the  | 
| also if you want to see what a resolver implementation would look like, i have my PR converted to a plugin up here: https://github.com/jberkhahn/filesystem_resolver | 
…m-project#10546) Signed-off-by: Angky William <[email protected]> Co-authored-by: Angky William <[email protected]>
…m-project#10546) Signed-off-by: Angky William <[email protected]> Co-authored-by: Angky William <[email protected]> Signed-off-by: Yang Wang <[email protected]>
…m-project#10546) Signed-off-by: Angky William <[email protected]> Co-authored-by: Angky William <[email protected]>
…m-project#10546) Signed-off-by: Angky William <[email protected]> Co-authored-by: Angky William <[email protected]>
…m-project#10546) Signed-off-by: Angky William <[email protected]> Co-authored-by: Angky William <[email protected]> Signed-off-by: Mu Huai <[email protected]>
| @angkywilliam hey! I've been trying to use LoRA resolver on local file system with no luck. I've detailed my attempt here: #18630. Any help is welcome :) After it works locally, I want to try loading from s3. | 


In the current implementation, vLLM only supports loading LoRA from local storage.
At OpenPipe, we are extending the serving engine's capabilities by introducing a LoRAResolver. LoRAResolver enables vLLM users to implement custom logic for fetching LoRA files from remote servers.
For example, in OpenPipe's case, we are dynamically loading LoRA for our customer from S3.