[CI/Build][Doc] Move existing benchmark scripts in CI/document/example to vllm bench CLI #21355
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
Code Review

This pull request refactors the benchmark invocation by replacing direct script calls (`python benchmarks/benchmark_*.py`) with the new `vllm bench` command-line interface. The changes are applied consistently across CI scripts, documentation, and examples. Additionally, the old benchmark scripts are now marked as deprecated, which is a good practice for a smooth transition. The changes improve maintainability by centralizing the benchmark entry points. I've reviewed the changes and found no high or critical issues.
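As a rough illustration of the migration this PR performs, here is a small shell sketch mapping the old standalone script names to their `vllm bench` subcommands. The helper function and the exact script list are hypothetical; the real set of migrated scripts is whatever the PR touched.

```shell
#!/bin/sh
# Hypothetical helper: map a deprecated standalone benchmark script to the
# consolidated `vllm bench` subcommand that replaces it.
bench_subcommand() {
  case "$1" in
    benchmark_serving.py)    echo "vllm bench serve" ;;
    benchmark_latency.py)    echo "vllm bench latency" ;;
    benchmark_throughput.py) echo "vllm bench throughput" ;;
    *)                       echo "unknown" ;;
  esac
}
```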
This pull request has merge conflicts that must be resolved before it can be merged.
I would even recommend just nuking this file, or leaving some sort of tombstone; maybe too aggressive 😓
We can do that as a next step. Wait for 1 week to verify I don't break CI/CD in any weird ways lol.
Yeah, one more PR is good.
benchmarks/auto_tune/auto_tune.sh
Outdated

```diff
 prefix_len=$(( INPUT_LEN * MIN_CACHE_HIT_PCT / 100 ))
 adjusted_input_len=$(( INPUT_LEN - prefix_len ))
-python3 benchmarks/benchmark_serving.py \
+vllm3 bench serve \
```
`vllm`
```python
@deprecated(
    "benchmark_serving.py is deprecated and will be removed in a future "
    "version. Please use 'vllm bench serve' instead.",
)
```
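For context, here is a minimal, self-contained sketch of what such a `deprecated` decorator could look like. This is hypothetical: the PR may use an existing helper (e.g. `typing_extensions.deprecated`) rather than a hand-rolled one. The sketch emits a `DeprecationWarning` at call time and then runs the original entry point, matching the "warn, don't crash" behavior discussed in this thread.

```python
import functools
import warnings

def deprecated(message: str):
    """Hypothetical minimal deprecation decorator (illustrative only)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Warn first, then run the original entry point unchanged.
            warnings.warn(message, DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)
        return wrapper
    return decorator

@deprecated(
    "benchmark_serving.py is deprecated and will be removed in a future "
    "version. Please use 'vllm bench serve' instead."
)
def main():
    return "legacy entry point still runs"
```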
@yeqcharlotte you are going to break all of our benchmark scripts 😆
I didn't realize that `vllm serve` used different code from `benchmark_serving.py` -- one downside of this change is that it will require installing vllm in order to run benchmarks IIUC.
How did you use it? This should just throw a warning at the beginning instead of crashing everything instantly.
For most single-host benchmarks, before running `benchmark_serving.py` you would probably have already run `vllm serve`, which requires installing vllm anyway?
If you are benchmarking against a remote, then … this client script itself is probably not good enough anyway 😅
Potentially we can split the packaging later to allow the bench script to be installed separately, e.g. `pip install vllm[bench]`, so that it won't pull in all the dependencies of `vllm serve`?
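The packaging split mentioned here could be expressed as an optional extra. The sketch below is purely hypothetical, a `pyproject.toml` fragment with an illustrative dependency list; the real split (if done) would live in vLLM's build configuration with whatever the bench client actually needs.

```toml
# Hypothetical sketch: a lightweight extra so the bench client can be
# installed without the full serving stack (dependencies are illustrative).
[project.optional-dependencies]
bench = [
    "aiohttp",
    "datasets",
    "numpy",
    "tqdm",
]
```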
I'm personally okay with this change: right now it's a bit weird that we have two versions of benchmark code and I do want to get rid of one as soon as possible to avoid confusion and maintenance overhead, and I think Charlotte made some good points on why we should move them into vllm.
It's very difficult to guard the standalone scripts to prevent them from importing vllm. Also, I was able to install and run the bench commands on a CPU node as well.
@tlrmchlsmth given the discussion, are you ok if we merge this?
> it's very difficult to guard the standalone scripts to prevent it from importing vllm. also, i was able to install and run the bench commands on CPU node as well.

This is true. Even though `benchmark_serving.py` tries to support this case, it's broken 😆

> If you are benchmarking against remote, then … this client script itself is probably not good enough anyway 😅

@yeqcharlotte the use case I'm thinking of is in-cluster benchmarking from a container that doesn't have the CUDA runtime or GPUs. Do you see any problems with using it in that case?
Signed-off-by: Ye (Charlotte) Qi <[email protected]>
Looks good. I like this move, consolidated code path.
…e to vllm bench CLI (vllm-project#21355) Signed-off-by: Ye (Charlotte) Qi <[email protected]>
…e to vllm bench CLI (vllm-project#21355) Signed-off-by: Ye (Charlotte) Qi <[email protected]> Signed-off-by: x22x22 <[email protected]>
…e to vllm bench CLI (vllm-project#21355) Signed-off-by: Ye (Charlotte) Qi <[email protected]> Signed-off-by: Jinzhen Lin <[email protected]>
…e to vllm bench CLI (vllm-project#21355) Signed-off-by: Ye (Charlotte) Qi <[email protected]> Signed-off-by: Paul Pak <[email protected]>
…e to vllm bench CLI (vllm-project#21355) Signed-off-by: Ye (Charlotte) Qi <[email protected]> Signed-off-by: Diego-Castan <[email protected]>
Essential Elements of an Effective PR Description Checklist
- Update `supported_models.md` and `examples` for a new model.

Purpose
We are planning a few more changes to the benchmarking scripts. Before doing that, move the existing references to `benchmark_.*.py` in documentation, CI, and examples so we don't have to add them in 2 places. FIX #21206.
Test Plan
Rely on CI for the rest.
Test Result
It ran
(Optional) Documentation Update