[Misc] Clean up flags in vllm bench serve
#25138
Conversation
Signed-off-by: Roger Wang <[email protected]>
Code Review
This pull request aims to clean up redundant CLI flags by deprecating --backend in favor of --endpoint-type. The changes are in the right direction, but I've found a critical issue that will cause the script to crash, and another issue related to the deprecation logic that could lead to silent misconfigurations. Please see my detailed comments for suggestions on how to fix these.
/gemini review
Code Review
This pull request aims to clean up CLI flags in vllm bench serve by deprecating --backend in favor of --endpoint-type. The changes are in the right direction, replacing usages of the old flag and adding deprecation warnings. However, I've found a critical bug in the deprecation logic that causes the user-provided --endpoint-type to be incorrectly ignored. My review includes a comment with a detailed explanation and a code suggestion to fix this issue, ensuring that flag precedence is handled correctly while maintaining backward compatibility for the default behavior.
/gemini review
Code Review
This pull request does a good job of cleaning up the command-line flags for vllm bench serve by deprecating --backend in favor of --endpoint-type. The changes are consistent and improve clarity.
I have one suggestion to make the deprecation handling more robust. Currently, if both flags are provided, --backend unconditionally overrides --endpoint-type, which could be surprising. I've proposed a change to raise an error in case of conflicting values to avoid ambiguity and prevent users from running benchmarks against an unintended endpoint.
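A minimal sketch of the conflict check suggested above, assuming both flags default to `None` so that "not provided" can be told apart from an explicit value. The function name `resolve_flag` and its parameters are hypothetical and are not taken from the PR:

```python
from typing import Optional


def resolve_flag(preferred: Optional[str],
                 deprecated: Optional[str],
                 default: str) -> str:
    """Pick one value from a preferred flag and its deprecated alias.

    Hypothetical helper: `None` means the flag was not given on the
    command line.
    """
    # Both flags given with different values: refuse to guess.
    if (preferred is not None and deprecated is not None
            and preferred != deprecated):
        raise ValueError(
            f"Conflicting values: {preferred!r} vs. {deprecated!r}. "
            "Pass only one of the two flags.")
    # The preferred flag wins whenever it was given.
    if preferred is not None:
        return preferred
    # Otherwise fall back to the deprecated alias, then to the default.
    if deprecated is not None:
        return deprecated
    return default
```

Failing loudly on a mismatch means a benchmark cannot silently run against an endpoint the user did not intend, while a single flag (old or new) keeps working.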
On a second thought - it seems that
vllm/benchmarks/serve.py
Outdated
  current_dt = datetime.now().strftime("%Y%m%d-%H%M%S")
  result_json["date"] = current_dt
- result_json["endpoint_type"] = args.endpoint_type
+ result_json["endpoint_type"] = args.backend
The "endpoint_type" key here is not modified on purpose.
maybe you wanna dump it to "backend" too. i guess this result is only used by @huydhn ?
Yea I'm not sure if this will break PyTorch's dashboard - but let me also dump it to "backend" too
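A rough sketch of what writing the value under both keys could look like, so that consumers reading either "endpoint_type" or "backend" keep working. The helper name `build_result_header` is hypothetical; only the key names and the date format come from the diff above:

```python
from datetime import datetime


def build_result_header(backend: str) -> dict:
    """Hypothetical helper that builds the metadata part of the result JSON."""
    result_json = {}
    result_json["date"] = datetime.now().strftime("%Y%m%d-%H%M%S")
    # Keep the old key so existing dashboards that read it do not break.
    result_json["endpoint_type"] = backend
    # Also expose the value under the new flag's name.
    result_json["backend"] = backend
    return result_json
```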
/gemini review
Code Review
This pull request cleans up the command-line flags for vllm bench serve by deprecating --endpoint-type in favor of --backend. The changes are applied across documentation, tests, and the benchmark implementation. My review identifies a critical issue in the deprecation logic that could cause user-provided --endpoint-type values to be ignored. I've also suggested a refactoring to improve code clarity by renaming a variable to align with its new purpose.
type=str,
default=None,
choices=list(ASYNC_REQUEST_FUNCS.keys()),
help="'--endpoint-type' is deprecated and will be removed in v0.11.0. "
nit: you can throw a warning with customized action
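For reference, one way to do this with a custom `argparse.Action` is sketched below. It is only an illustration of the idea, not the code in this PR: the class name `DeprecatedFlagAction`, the `build_parser` helper, and the `"openai"` default are assumptions.

```python
import argparse
import warnings


class DeprecatedFlagAction(argparse.Action):
    """Store the value as usual, but warn that the flag is deprecated."""

    def __call__(self, parser, namespace, values, option_string=None):
        # Only runs when the flag is actually present on the command line.
        warnings.warn(
            f"{option_string} is deprecated; use --backend instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        setattr(namespace, self.dest, values)


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser with just the two flags discussed here.
    parser = argparse.ArgumentParser(prog="bench-sketch")
    parser.add_argument("--backend", type=str, default="openai")
    parser.add_argument(
        "--endpoint-type",
        type=str,
        default=None,
        action=DeprecatedFlagAction,
        help="Deprecated alias for --backend.",
    )
    return parser
```

Because the action is invoked only when the option appears in the arguments, users who never pass the old flag see no warning.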
See if you're okay with the current version
yeqcharlotte left a comment
other than the warning LGTM
Signed-off-by: Roger Wang <[email protected]> Signed-off-by: charlifu <[email protected]>
Signed-off-by: Roger Wang <[email protected]> Signed-off-by: xuebwang-amd <[email protected]>
Purpose
For some reason we have two flags (--backend and --endpoint-type) to specify the backend/endpoint type, and this can cause confusion since we're going to deprecate the benchmark scripts. This PR cleans them up.
Test Plan
Test Result