
Conversation

@randyjhc (Contributor) commented on Feb 27, 2025

Part of #13840

This PR adds the vllm bench CLI commands. We only support vllm bench serve in this PR to align on the interface; the following benchmark modes will be added in follow-up PRs (contributions are welcome):

vllm bench latency
vllm bench throughput

What has been covered in this PR:

  • All metrics in vllm bench serve.
  • OpenAI endpoint request function.
  • Random dataset (see the sketch after the lists below).

Future work:

  • Serve
    • Support other request functions (e.g., TGI).
  • Support other datasets (e.g., ShareGPT, sonnet).
  • Latency
    • Support vllm bench latency.
  • Throughput
    • Support vllm bench throughput.
  • Refactor benchmarks/ to use this CLI.
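
As a companion to the "Random dataset" item above, here is a minimal sketch of how random benchmark prompts could be sampled from a tokenizer's vocabulary. It is illustrative only: the function and field names are assumptions, not the actual code added by this PR.

```python
# Illustrative sketch only: names are hypothetical, not the vllm.benchmarks API.
# Builds benchmark requests by decoding randomly sampled token IDs into prompts.
import random
from dataclasses import dataclass


@dataclass
class SampleRequest:
    prompt: str
    prompt_len: int
    expected_output_len: int


def sample_random_requests(tokenizer, num_prompts: int, input_len: int,
                           output_len: int) -> list[SampleRequest]:
    vocab_size = tokenizer.vocab_size
    requests = []
    for _ in range(num_prompts):
        # Random token IDs; the decoded text may re-tokenize to a slightly
        # different length, which is acceptable for a synthetic workload.
        token_ids = [random.randint(0, vocab_size - 1) for _ in range(input_len)]
        requests.append(
            SampleRequest(prompt=tokenizer.decode(token_ids),
                          prompt_len=input_len,
                          expected_output_len=output_len))
    return requests
```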


👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, covering a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@randyjhc changed the title from "[Draft][Feature] Add CLI Commands for Benchmarking #13840" to "[Draft][Feature] Add CLI Commands for Benchmarking" on Feb 27, 2025
@comaniac (Collaborator) commented:

Thanks for the PR! My two cents:

  1. We could still use vllm benchmark [throughput|latency|serving] with hierarchical CLIs. For example, it would show different arguments/options for vllm benchmark serve --help and vllm benchmark latency --help. If the current CLISubcommand cannot support this, we should consider improving it (a sketch follows this list).
  2. I agree with that. We could move all common utilities under vllm/benchmark. Ideally, once this CLI lands, the current benchmarks folder should only contain shell scripts that use the CLI.
  3. We should not put datasets into the package. Instead, we could upload them to the vLLM S3 buckets and use the link as the default value of the dataset argument.
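
For point 1, a nested layout can be expressed with standard argparse subparsers; the sketch below only illustrates the idea, using the vllm bench spelling this PR eventually adopted. All names and options are assumptions, not vLLM's actual CLISubcommand code.

```python
# Illustrative sketch of a nested "vllm bench <mode>" CLI built on argparse
# subparsers. All names and options here are hypothetical; this is not the
# actual CLISubcommand implementation in vLLM.
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="vllm")
    subparsers = parser.add_subparsers(dest="command", required=True)

    # "vllm bench" gets its own nested subparsers so each mode can expose a
    # different option set in --help.
    bench = subparsers.add_parser("bench", help="Run vLLM benchmarks.")
    modes = bench.add_subparsers(dest="mode", required=True)

    serve = modes.add_parser("serve", help="Benchmark online serving.")
    serve.add_argument("--backend", default="vllm")
    serve.add_argument("--dataset-name", default="random")

    latency = modes.add_parser("latency", help="Benchmark single-batch latency.")
    latency.add_argument("--batch-size", type=int, default=1)

    throughput = modes.add_parser("throughput", help="Benchmark offline throughput.")
    throughput.add_argument("--num-prompts", type=int, default=1000)

    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

With this shape, vllm bench serve --help and vllm bench latency --help each print only their own options, matching the hierarchy suggested above.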

cc @ywang96 @simon-mo

@comaniac comaniac marked this pull request as draft February 27, 2025 22:58
@randyjhc (Contributor, Author) commented:

Thanks for the suggestions! I’ll keep working on improving points 1 and 3.

@randyjhc force-pushed the feature-cli-for-benchmarking branch from 6e27b04 to 49267b5 on March 1, 2025
@randyjhc (Contributor, Author) commented on Mar 1, 2025

The new commits introduce the nested version of the benchmark CLI.

  • To invoke the benchmark:
vllm bench [throughput|latency|serving] --opts
  • To view help messages:
vllm bench --help
vllm bench [throughput|latency|serving] --help

mergify bot commented on Mar 1, 2025

This pull request has merge conflicts that must be resolved before it can be merged. Please rebase the PR, @randyjhc.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Mar 1, 2025
@comaniac (Collaborator) commented on Mar 1, 2025

@khluu do you know why the changes made by local pre-commit differ from those in CI, and how we should fix it?

@khluu (Collaborator) commented on Mar 2, 2025

> @khluu do you know why the changes made by local pre-commit differ from those in CI, and how we should fix it?

A bug fix related to pre-commit was merged not too long ago... maybe merge this branch with main and try again?

@randyjhc randyjhc closed this Mar 3, 2025
@randyjhc force-pushed the feature-cli-for-benchmarking branch from 80b1aed to 872db2b on March 3, 2025
@randyjhc (Contributor, Author) commented on Mar 3, 2025

Reframing the PR

  1. This PR only supports vllm bench serving --opts at this time, but it can be easily extended to support other benchmark types, e.g.:
vllm bench [bench_type] --opts
  2. At this stage, the vllm bench serving command only supports random requests, leaving sonnet.txt in place. In addition, the supported backends are limited to the following (a sketch of the backend-to-request-function mapping follows this list):
vllm
lmdeploy
openai
scalellm
sglang
  3. Instead of moving the entire benchmarks folder, I have copied only the necessary functions into vllm/benchmarks. This approach lets us merge new changes to benchmarks in subsequent PRs while avoiding conflicts for now.
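
To illustrate the limited backend list in point 2, here is a rough sketch of how backend names could be mapped to request functions. The names below are placeholders rather than the actual vllm/benchmarks code, and the placeholder request function is left unimplemented.

```python
# Illustrative sketch only: dispatch a benchmark request function by backend
# name. The function and dict names are placeholders, not the actual
# vllm/benchmarks code.
from typing import Awaitable, Callable

RequestFunc = Callable[[str, str], Awaitable[dict]]


async def request_openai_completions(api_url: str, prompt: str) -> dict:
    """Placeholder for an OpenAI-compatible /v1/completions request."""
    raise NotImplementedError


# Backends that speak an OpenAI-compatible API can share one request function;
# backends such as tgi or mii would get their own entries in a follow-up.
REQUEST_FUNCS: dict[str, RequestFunc] = {
    "vllm": request_openai_completions,
    "openai": request_openai_completions,
    "lmdeploy": request_openai_completions,
    "scalellm": request_openai_completions,
    "sglang": request_openai_completions,
}


def get_request_func(backend: str) -> RequestFunc:
    try:
        return REQUEST_FUNCS[backend]
    except KeyError:
        raise ValueError(f"Unsupported backend: {backend!r}") from None
```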

Follow-up Items

  1. Extending support for additional benchmark command types within this framework, such as:
vllm bench [latency|throughput]
  2. Enabling full support for options in vllm bench serving --opts, including:
  • Existing datasets such as sonnet, sharegpt, etc.
  • Backend options, like tgi, mii, etc.

@randyjhc randyjhc reopened this Mar 3, 2025
@mergify mergify bot removed the needs-rebase label Mar 3, 2025
@comaniac (Collaborator) left a comment:

It's in good shape in general. I'll take a deeper look tomorrow. Meanwhile, please update the PR description with the latest scope of this PR and the remaining follow-up tasks.

@comaniac comaniac changed the title [Draft][Feature] Add CLI Commands for Benchmarking [Feature] Add vllm bench CLI Mar 4, 2025
@comaniac comaniac self-assigned this Mar 4, 2025
@comaniac comaniac marked this pull request as ready for review March 4, 2025 16:53
@comaniac comaniac requested review from ywang96 and simon-mo March 4, 2025 16:56
Signed-off-by: Cody Yu <[email protected]>
@ywang96 ywang96 self-assigned this Mar 5, 2025
@ywang96 (Member) left a comment:

Hey @randyjhc! Thanks for working on this PR!

Now that #14036 has landed, could you include the dataset file into vllm/benchmarks too and update this PR accordingly?

@comaniac (Collaborator) commented:

> Hey @randyjhc! Thanks for working on this PR!
>
> Now that #14036 has landed, could you include the dataset file into vllm/benchmarks too and update this PR accordingly?

We have two options to move forward:

  1. We only support random sampling in this PR to keep it short, and focus on the interface/CLI.
  2. We support datasets in this PR too, to deliver a fully functional benchmark serving capability.

I actually prefer (1) because it should take less time to merge this PR, but if you feel supporting datasets won't postpone the review process (because the logic is mostly copied), I'm also OK with it.

@ywang96 (Member) commented on Mar 10, 2025

> > Hey @randyjhc! Thanks for working on this PR!
> > Now that #14036 has landed, could you include the dataset file into vllm/benchmarks too and update this PR accordingly?
>
> We have two options to move forward:
>
>   1. We only support random sampling in this PR to keep it short, and focus on the interface/CLI.
>   2. We support datasets in this PR too, to deliver a fully functional benchmark serving capability.
>
> I actually prefer (1) because it should take less time to merge this PR, but if you feel supporting datasets won't postpone the review process (because the logic is mostly copied), I'm also OK with it.

yea I'm also okay with (1) just to get this PR in!

@ywang96 (Member) left a comment:

LGTM!

@comaniac added the ready label (ONLY add when PR is ready to merge/full CI is needed) on Mar 11, 2025
@comaniac comaniac enabled auto-merge (squash) March 11, 2025 22:29
@comaniac comaniac merged commit 36e0c8f into vllm-project:main Mar 12, 2025
32 checks passed
richardsliu pushed a commit to richardsliu/vllm that referenced this pull request Mar 14, 2025
Signed-off-by: Randy Chen <[email protected]>
Signed-off-by: Cody Yu <[email protected]>
Co-authored-by: Cody Yu <[email protected]>
Signed-off-by: Richard Liu <[email protected]>
lulmer pushed a commit to lulmer/vllm that referenced this pull request Apr 7, 2025
Signed-off-by: Randy Chen <[email protected]>
Signed-off-by: Cody Yu <[email protected]>
Co-authored-by: Cody Yu <[email protected]>
Signed-off-by: Louis Ulmer <[email protected]>
shreyankg pushed a commit to shreyankg/vllm that referenced this pull request May 3, 2025
Signed-off-by: Randy Chen <[email protected]>
Signed-off-by: Cody Yu <[email protected]>
Co-authored-by: Cody Yu <[email protected]>
RichardoMrMu pushed a commit to RichardoMrMu/vllm that referenced this pull request May 12, 2025
Signed-off-by: Randy Chen <[email protected]>
Signed-off-by: Cody Yu <[email protected]>
Co-authored-by: Cody Yu <[email protected]>
Signed-off-by: Mu Huai <[email protected]>
Labels: ci/build, frontend, ready (ONLY add when PR is ready to merge/full CI is needed)