Conversation

unaidedelf8777
Contributor

@unaidedelf8777 unaidedelf8777 commented Apr 3, 2025

Adds outlines as a guided decoding backend for V1, and updates the integration for V0.

The aim of this is threefold:

  1. Remove the dependency on outlines, and only use outlines_core.
  2. Performance gains for V0, using the write_mask_into method on Guide to write a bitmask in place for use in logits masking.
  3. An outlines backend for V1.

Because the dependency on outlines will be removed, support for grammar-based decoding with the outlines backend will also be removed (the CFG classes reside in the outlines package).
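
For readers unfamiliar with bitmask-based logits masking, here is a rough sketch of the general idea behind the second point. It is plain Python and does not use outlines_core; every function name, the 32-bit word size, and the allowed-token set are assumptions made up for illustration, not the Guide API or the code in this PR. The point is only that a packed mask buffer can be allocated once and overwritten in place on each decoding step, instead of building a new list of allowed token IDs every time.

```python
# Illustrative sketch only: these helpers are hypothetical and are NOT part of
# outlines_core or vLLM. They show the general shape of in-place bitmask masking.
import math

BITS_PER_WORD = 32  # assumption: one 32-bit word covers 32 token IDs


def allocate_bitmask(vocab_size: int) -> list[int]:
    """Return a zeroed bitmask with one bit per vocabulary token."""
    return [0] * ((vocab_size + BITS_PER_WORD - 1) // BITS_PER_WORD)


def write_allowed_tokens(bitmask: list[int], allowed: set[int]) -> None:
    """Overwrite `bitmask` in place so that only bits for `allowed` are set."""
    for i in range(len(bitmask)):
        bitmask[i] = 0
    for token_id in allowed:
        bitmask[token_id // BITS_PER_WORD] |= 1 << (token_id % BITS_PER_WORD)


def apply_bitmask(logits: list[float], bitmask: list[int]) -> None:
    """Set the logit of every disallowed token to -inf, in place."""
    for token_id in range(len(logits)):
        word = bitmask[token_id // BITS_PER_WORD]
        if not (word >> (token_id % BITS_PER_WORD)) & 1:
            logits[token_id] = -math.inf


vocab_size = 40
mask = allocate_bitmask(vocab_size)   # allocated once, reused on every step
logits = [0.5] * vocab_size
write_allowed_tokens(mask, {3, 35})   # hypothetical allowed set for this step
apply_bitmask(logits, mask)           # only tokens 3 and 35 keep a finite logit
```

In a real backend the mask would typically live in a tensor and be applied with vectorized operations; the loop form above is only meant to make the bit layout easy to follow.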

cc @aarnphm

github-actions bot commented Apr 3, 2025

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

@unaidedelf8777 unaidedelf8777 changed the title from "[V0][V1][Core] Add outlines integration for V1, and update V0 integration." to "[V0][V1][Core] Add outlines integration for V1, and update V0 integration. [DO NOT MERGE]" Apr 3, 2025
@mergify mergify bot added the v1 label Apr 3, 2025
@unaidedelf8777 unaidedelf8777 marked this pull request as draft April 3, 2025 00:46
@unaidedelf8777 unaidedelf8777 changed the title from "[V0][V1][Core] Add outlines integration for V1, and update V0 integration. [DO NOT MERGE]" to "[V0][V1][Core] Add outlines integration for V1, and update V0 integration." Apr 3, 2025
@unaidedelf8777
Contributor Author

NOTE: This can't be merged until the next version of outlines_core is released.

@russellb
Member

russellb commented Apr 7, 2025

Thank you for the PR! I will review it this week.

Collaborator

@aarnphm aarnphm left a comment

I reviewed the V0 code path. One ask: please add tests for the path where the cache is disabled.

We should also update requirements/common.txt to the lowest supported version of outlines-core.

@mergify mergify bot added the tpu (Related to Google TPUs) and ci/build labels and removed the tpu (Related to Google TPUs) label Apr 9, 2025
mergify bot commented Apr 11, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @unaidedelf8777.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@unaidedelf8777
Contributor Author

@russellb @aarnphm All the code is done. If you guys approve of it, I’ll go ahead and clean up all the linter complaints, and then it should be ready.

Also, the outlines-core update has been pushed to PyPI and pinned here (v0.2.9).

@unaidedelf8777 unaidedelf8777 marked this pull request as ready for review April 13, 2025 17:55
Collaborator

@aarnphm aarnphm left a comment

First round of review. A few things need to be addressed here, but great progress so far.

mergify bot commented Apr 18, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @unaidedelf8777.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@unaidedelf8777
Contributor Author

unaidedelf8777 commented Jun 22, 2025

@russellb Sorry about the holdup, I had to go to Mexico for a family situation last week. V1 tests are added and passing. The failing speculative decoding and quantization tests are currently failing on main as well, according to this CI run. The failing LoRA test seems to be unrelated to anything modified here.

@aarnphm aarnphm requested a review from russellb June 23, 2025 19:54
Collaborator

@aarnphm aarnphm left a comment

Thanks for all the hard work and perseverance! The CI failure is not relevant and will be fixed separately.

I'll still wait for @russellb to take a look once he's back.

@russellb
Member

russellb commented Jul 8, 2025

Apologies for the delay. Let's see if CI passes now if you merge from main.

Member

@russellb russellb left a comment

We've talked about some next steps, but I don't want to risk you having to deal with large conflicts again. Some things that would be good to see:

  1. Some benchmarking so we know how this compares to the other backends and which use cases it does best with.
  2. Updates to docs/

Thank you for all of the hard work and diligence on this PR!

@russellb russellb merged commit d6902ce into vllm-project:main Jul 10, 2025
98 checks passed
@github-project-automation github-project-automation bot moved this from In review to Done in Structured Output Jul 10, 2025
Chen-zexi pushed a commit to Chen-zexi/vllm that referenced this pull request Jul 13, 2025
patrickvonplaten pushed a commit to patrickvonplaten/vllm that referenced this pull request Jul 15, 2025
LyrisZhong pushed a commit to LyrisZhong/vllm that referenced this pull request Jul 23, 2025
avigny pushed a commit to avigny/vllm that referenced this pull request Jul 31, 2025
Pradyun92 pushed a commit to Pradyun92/vllm that referenced this pull request Aug 6, 2025
npanpaliya pushed a commit to odh-on-pz/vllm-upstream that referenced this pull request Aug 6, 2025
jinzhen-lin pushed a commit to jinzhen-lin/vllm that referenced this pull request Aug 9, 2025
paulpak58 pushed a commit to paulpak58/vllm that referenced this pull request Aug 13, 2025
taneem-ibrahim pushed a commit to taneem-ibrahim/vllm that referenced this pull request Aug 14, 2025
diegocastanibm pushed a commit to diegocastanibm/vllm that referenced this pull request Aug 15, 2025
epwalsh pushed a commit to epwalsh/vllm that referenced this pull request Aug 27, 2025
googlercolin pushed a commit to googlercolin/vllm that referenced this pull request Aug 29, 2025
Labels
ci/build, ready (ONLY add when PR is ready to merge/full CI is needed), structured-output, tool-calling, v1
Projects
Status: Done
4 participants