[V0][V1][Core] Add outlines integration for V1, and update V0 integration. #15975
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs will not trigger a full CI run by default. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
NOTE: Can't be merged until the next version of outlines_core is released.
Thank you for the PR! I will review it this week.
I reviewed the v0 code path. One ask is to add tests for the path where the cache is disabled.
And we should update requirements/common.txt to the lowest supported version of outlines-core.
This pull request has merge conflicts that must be resolved before it can be merged.
First round of review. A few things need to be addressed here, but great progress so far.
This pull request has merge conflicts that must be resolved before it can be merged.
Signed-off-by: Nathan Hoos <[email protected]>
@russellb sorry about the hold-up, I had to go to Mexico for a family situation last week. V1 tests are added and passing. The failing speculative decoding and quantization tests are currently failing on main according to this CI run. The failing LoRA test seems to be unrelated to anything modified here.
Thanks for all the hard work and perseverance! The CI failure is not relevant and will be fixed separately.
but I will still wait for @russellb to take a look once he's back
Apologies for the delay. Let's see if CI passes now if you merge from main.
Signed-off-by: Nathan Hoos <[email protected]>
We've talked about some next steps, but I don't want to risk you having to deal with large conflicts again. Some things that would be good to see:
- Some benchmarking so we know how this compares to the other backends and which use cases it does best with.
- Updates to docs/
Thank you for all of the hard work and diligence on this PR!
Adds outlines as a guided decoding backend for V1, and updates the integration for V0.
The aim of this is threefold:
- Remove the dependency on outlines, and only use outlines_core.
- Use the new write_mask_into method on Guide to write a bitmask in-place for use in logits masking.
Because the dependency on outlines will be removed, support for grammar-based decoding with the outlines backend will also be removed (CFG classes reside in the outlines package).
cc @aarnphm