-
-
Notifications
You must be signed in to change notification settings - Fork 11.6k
[Frontend][Doc][5/N] Improve all pooling task | Polish encode (pooling) api & Document. #25524
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
We continue our previous discussion here following #25370 Split the encode task into two tasks: token_embed and token_classify
For online scenarios (/pooling):
|
|
I think for online API, the user should be able to pass the task in the request. |
|
Documentation preview: https://vllm--25524.org.readthedocs.build/en/25524/ |
18b0002 to
b6b2e12
Compare
Signed-off-by: wang.yuqi <[email protected]>
976793c to
d98bf46
Compare
Signed-off-by: wang.yuqi <[email protected]>
e326cde to
dd06fe1
Compare
Signed-off-by: wang.yuqi <[email protected]>
Signed-off-by: wang.yuqi <[email protected]>
Signed-off-by: wang.yuqi <[email protected]>
|
/gemini review |
Signed-off-by: wang.yuqi <[email protected]>
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request refactors the pooling API by splitting the encode task into more specific tasks (token_embed, token_classify) and renaming the activation parameter to use_activation for clarity. The documentation, examples, and tests are updated to reflect these changes. The changes are generally well-executed and improve the API's usability. However, I've identified a critical issue in the task inference logic for the new generic /pooling endpoint that could lead to unexpected failures. My review includes a suggestion to fix this logic.
Signed-off-by: wang.yuqi <[email protected]>
Signed-off-by: wang.yuqi <[email protected]>
hmellor
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since activation was user facing, we should probably deprecate it when changing it to use_activation
Anyway, I’d like to merge this PR quickly before the release. It’s mainly documentation changes. This feature has been a bit of a mess, and it’s been changing with every recent release. It should finally be stable now after this latest change. |
DarkLight1337
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @hmellor , we should still keep the old field with a deprecation
|
Otherwise we would break back-compatibility, which is definitely not what a "doc PR" should do |
Signed-off-by: wang.yuqi <[email protected]>
|
Let’s land this. |
…g) api & Document. (vllm-project#25524) Signed-off-by: wang.yuqi <[email protected]> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Cyrus Leung <[email protected]>
…g) api & Document. (vllm-project#25524) Signed-off-by: wang.yuqi <[email protected]> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Cyrus Leung <[email protected]>
…g) api & Document. (vllm-project#25524) Signed-off-by: wang.yuqi <[email protected]> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Cyrus Leung <[email protected]>
…g) api & Document. (vllm-project#25524) Signed-off-by: wang.yuqi <[email protected]> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Cyrus Leung <[email protected]>
…g) api & Document. (vllm-project#25524) Signed-off-by: wang.yuqi <[email protected]> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Cyrus Leung <[email protected]> Signed-off-by: Eldar Kurtic <[email protected]>
…g) api & Document. (vllm-project#25524) Signed-off-by: wang.yuqi <[email protected]> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Co-authored-by: Cyrus Leung <[email protected]>
TL;DR
Improve all pooling task (0.11.1 cut)
These PRs are mostly conflicting with each other, so combining them into a series would better inform reviewers about what happened. And what else needs to be done after that?
Purpose
Following #25370
Address: #27413 (comment)
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.