
Conversation

@yeqcharlotte (Collaborator) commented Jul 22, 2025

Essential Elements of an Effective PR Description Checklist

  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing a test command.
  • The test results, such as pasting a before/after results comparison, or e2e results.
  • (Optional) Any necessary documentation updates, such as updating supported_models.md and examples for a new model.

Purpose

We are planning a few more changes to the benchmarking scripts. Before doing that, migrate the existing references to benchmark_*.py in documentation, CI, and examples so we don't have to add them in two places. FIX #21206.

Test Plan

sh .buildkite/scripts/run-benchmarks.sh

Rely on CI for the rest.
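The migration replaces each standalone-script invocation with the corresponding vllm bench subcommand. As a rough sketch of the mapping (the helper function below is hypothetical, written only to illustrate the rename; it is not part of the codebase):

```shell
# Hypothetical helper: map a legacy benchmark script name to the
# "vllm bench" subcommand that is assumed to replace it in this PR.
legacy_to_cli() {
  case "$1" in
    benchmark_serving.py)    echo "vllm bench serve" ;;
    benchmark_latency.py)    echo "vllm bench latency" ;;
    benchmark_throughput.py) echo "vllm bench throughput" ;;
    *)                       echo "unknown" ;;
  esac
}

# Example: the old "python3 benchmarks/benchmark_serving.py ..." call
# becomes "vllm bench serve ..." with the same flags.
legacy_to_cli benchmark_serving.py   # prints: vllm bench serve
```

The same flags carry over unchanged; only the entry point differs.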

Test Result

It ran.

(Optional) Documentation Update

@yeqcharlotte yeqcharlotte requested a review from hmellor as a code owner July 22, 2025 07:23
@mergify mergify bot added the documentation, ci/build, performance, and tpu labels Jul 22, 2025

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors. You can run additional CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run full CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@gemini-code-assist gemini-code-assist bot (Contributor) left a comment


Code Review

This pull request refactors the benchmark invocation by replacing direct script calls (python benchmarks/benchmark_*.py) with the new vllm bench command-line interface. The changes are applied consistently across CI scripts, documentation, and examples. Additionally, the old benchmark scripts are now marked as deprecated, which is a good practice for a smooth transition. The changes improve maintainability by centralizing the benchmark entry points. I've reviewed the changes and found no high or critical issues.

@ywang96 ywang96 self-assigned this Jul 22, 2025

mergify bot commented Jul 23, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @yeqcharlotte.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

Collaborator


I would even recommend just nuking this file, or leaving some sort of tombstone; maybe too aggressive 😓

Collaborator Author


We can do that as a next step. Wait for 1 week to verify I don't break CI/CD in any weird ways lol.

Collaborator


Yeah, one more PR is good.

@hmellor hmellor enabled auto-merge (squash) July 24, 2025 11:42
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Jul 24, 2025
  prefix_len=$(( INPUT_LEN * MIN_CACHE_HIT_PCT / 100 ))
  adjusted_input_len=$(( INPUT_LEN - prefix_len ))
- python3 benchmarks/benchmark_serving.py \
+ vllm3 bench serve \
Collaborator


Typo: this should be vllm, not vllm3.

Comment on lines +597 to +600
@deprecated(
"benchmark_serving.py is deprecated and will be removed in a future "
"version. Please use 'vllm bench serve' instead.",
)
Member


@yeqcharlotte you are going to break all of our benchmark scripts 😆

I didn't realize that vllm serve used different code from benchmark_serving.py. One downside of this change is that it will require installing vllm in order to run benchmarks, IIUC.

Collaborator Author

@yeqcharlotte yeqcharlotte Jul 24, 2025


How did you use it? This should just throw a warning at the beginning instead of crashing everything instantly.

For most single-host benchmarks, before running benchmark_serving.py you would probably have already run vllm serve, which requires installing vllm anyway?

If you are benchmarking against remote, then … this client script itself is probably not good enough anyway 😅

Potentially we can split the packaging later so the bench scripts can be installed separately via pip install vllm[bench], which won't install all the dependencies of vllm serve?
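The "warn at the beginning instead of crashing" behavior can be sketched with a minimal decorator. This is a hypothetical stand-in for the @deprecated helper shown in the diff, written here only to illustrate the intended semantics; vLLM's actual implementation may differ:

```python
import warnings
from functools import wraps


def deprecated(message):
    """Minimal sketch: emit a DeprecationWarning when the wrapped
    entry point is called, then run it normally (no crash)."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            warnings.warn(message, DeprecationWarning, stacklevel=2)
            return fn(*args, **kwargs)
        return wrapper
    return decorator


@deprecated(
    "benchmark_serving.py is deprecated and will be removed in a future "
    "version. Please use 'vllm bench serve' instead."
)
def main():
    # The legacy script body would go here; it still runs after the warning.
    return "ok"
```

Existing automation that shells out to the legacy scripts keeps working; it just sees a DeprecationWarning on stderr.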

Member


I'm personally okay with this change: right now it's a bit weird that we have two versions of the benchmark code, and I want to get rid of one as soon as possible to avoid confusion and maintenance overhead. I think Charlotte made some good points on why we should move them into vllm.

Collaborator


It's very difficult to guard the standalone scripts to prevent them from importing vllm. Also, I was able to install and run the bench commands on a CPU node as well.

Member


@tlrmchlsmth given the discussion, are you ok if we merge this?

Member


It's very difficult to guard the standalone scripts to prevent them from importing vllm. Also, I was able to install and run the bench commands on a CPU node as well.

This is true. Even though benchmark_serving.py tries to support this case, it's broken 😆

If you are benchmarking against remote, then … this client script itself is probably not good enough anyway 😅

@yeqcharlotte the use case I'm thinking of is in-cluster benchmarking from a container that doesn't have the CUDA runtime or GPUs. Do you see any problems with using it in that case?

@hmellor hmellor disabled auto-merge July 24, 2025 16:07
Signed-off-by: Ye (Charlotte) Qi <[email protected]>
Collaborator

@houseroad houseroad left a comment


Looks good. I like this move: a consolidated code path.

@houseroad houseroad enabled auto-merge (squash) July 26, 2025 04:39
@vllm-bot vllm-bot merged commit e7c4f9e into vllm-project:main Jul 26, 2025
45 of 47 checks passed
liuyumoye pushed a commit to liuyumoye/vllm that referenced this pull request Jul 31, 2025
HsChen-sys pushed a commit to HsChen-sys/vllm that referenced this pull request Aug 1, 2025
x22x22 pushed a commit to x22x22/vllm that referenced this pull request Aug 5, 2025
Pradyun92 pushed a commit to Pradyun92/vllm that referenced this pull request Aug 6, 2025
npanpaliya pushed a commit to odh-on-pz/vllm-upstream that referenced this pull request Aug 6, 2025
jinzhen-lin pushed a commit to jinzhen-lin/vllm that referenced this pull request Aug 9, 2025
paulpak58 pushed a commit to paulpak58/vllm that referenced this pull request Aug 13, 2025
diegocastanibm pushed a commit to diegocastanibm/vllm that referenced this pull request Aug 15, 2025
epwalsh pushed a commit to epwalsh/vllm that referenced this pull request Aug 28, 2025
zhewenl pushed a commit to zhewenl/vllm that referenced this pull request Aug 28, 2025

Labels

  • ci/build
  • documentation: Improvements or additions to documentation
  • performance: Performance-related issues
  • ready: ONLY add when PR is ready to merge/full CI is needed
  • tpu: Related to Google TPUs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature]: Consolidate benchmark_serving.py and serve.py to avoid code duplication and usage confusions

7 participants