
Conversation

@randyjhc (Contributor) commented on Feb 27, 2025

Part of #13840

This PR adds the vllm bench CLI commands. We only support vllm bench serve in this PR to align on the interface; the following benchmark modes will be added in follow-up PRs (contributions are welcome):

vllm bench latency
vllm bench throughput

What has been covered in this PR:

  • All metrics in vllm bench serve.
  • OpenAI endpoint request function.
  • Random dataset (see the sketch after the lists below).

Future work:

  • Serve
    • Support other request functions (e.g., TGI).
  • Support other datasets (e.g., ShareGPT, sonnet).
  • Latency
    • Support vllm bench latency.
  • Throughput
    • Support vllm bench throughput.
  • Refactor benchmarks/ to use this CLI.
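
As a companion to the "Random dataset" item above, here is a minimal sketch of how random benchmark prompts could be sampled from a tokenizer's vocabulary. It is illustrative only: the function and field names are assumptions, not the actual code added by this PR.

```python
# Illustrative sketch only: names are hypothetical, not the vllm.benchmarks API.
# Builds benchmark requests by decoding randomly sampled token IDs into prompts.
import random
from dataclasses import dataclass


@dataclass
class SampleRequest:
    prompt: str
    prompt_len: int
    expected_output_len: int


def sample_random_requests(tokenizer, num_prompts: int, input_len: int,
                           output_len: int) -> list[SampleRequest]:
    vocab_size = tokenizer.vocab_size
    requests = []
    for _ in range(num_prompts):
        # Random token IDs; the decoded text may re-tokenize to a slightly
        # different length, which is acceptable for a synthetic workload.
        token_ids = [random.randint(0, vocab_size - 1) for _ in range(input_len)]
        requests.append(
            SampleRequest(prompt=tokenizer.decode(token_ids),
                          prompt_len=input_len,
                          expected_output_len=output_len))
    return requests
```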


👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, covering a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@randyjhc changed the title from "[Draft][Feature] Add CLI Commands for Benchmarking #13840" to "[Draft][Feature] Add CLI Commands for Benchmarking" on Feb 27, 2025
@comaniac (Collaborator) commented:

Thanks for the PR! My two cents:

  1. We could still use vllm benchmark [throughput|latency|serving] with hierarchical CLIs. For example, it would show different arguments/options for vllm benchmark serve --help and vllm benchmark latency --help. If the current CLISubcommand cannot support this, we should consider improving it (a sketch follows this list).
  2. I agree with that. We could move all common utilities under vllm/benchmark. Ideally, once this CLI lands, the current benchmarks folder should only contain shell scripts that use the CLI.
  3. We should not put datasets into the package. Instead, we could upload them to the vLLM S3 buckets and use the link as the default value of the dataset argument.
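
For point 1, a nested layout can be expressed with standard argparse subparsers; the sketch below only illustrates the idea, using the vllm bench spelling this PR eventually adopted. All names and options are assumptions, not vLLM's actual CLISubcommand code.

```python
# Illustrative sketch of a nested "vllm bench <mode>" CLI built on argparse
# subparsers. All names and options here are hypothetical; this is not the
# actual CLISubcommand implementation in vLLM.
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="vllm")
    subparsers = parser.add_subparsers(dest="command", required=True)

    # "vllm bench" gets its own nested subparsers so each mode can expose a
    # different option set in --help.
    bench = subparsers.add_parser("bench", help="Run vLLM benchmarks.")
    modes = bench.add_subparsers(dest="mode", required=True)

    serve = modes.add_parser("serve", help="Benchmark online serving.")
    serve.add_argument("--backend", default="vllm")
    serve.add_argument("--dataset-name", default="random")

    latency = modes.add_parser("latency", help="Benchmark single-batch latency.")
    latency.add_argument("--batch-size", type=int, default=1)

    throughput = modes.add_parser("throughput", help="Benchmark offline throughput.")
    throughput.add_argument("--num-prompts", type=int, default=1000)

    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

With this shape, vllm bench serve --help and vllm bench latency --help each print only their own options, matching the hierarchy suggested above.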

cc @ywang96 @simon-mo

@comaniac comaniac marked this pull request as draft February 27, 2025 22:58
@randyjhc (Contributor, Author) commented:

Thanks for the suggestions! I’ll keep working on improving points 1 and 3.

@randyjhc force-pushed the feature-cli-for-benchmarking branch from 6e27b04 to 49267b5 on March 1, 2025
@randyjhc (Contributor, Author) commented on Mar 1, 2025

The new commits introduce the nested version of the benchmark CLI.

  • To invoke the benchmark:
vllm bench [throughput|latency|serving] --opts
  • To view help messages:
vllm bench --help
vllm bench [throughput|latency|serving] --help

mergify bot commented on Mar 1, 2025

This pull request has merge conflicts that must be resolved before it can be merged. Please rebase the PR, @randyjhc.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Mar 1, 2025
@comaniac (Collaborator) commented on Mar 1, 2025

@khluu do you know why the changes made by local pre-commit differ from those in CI, and how we should fix it?

@khluu (Collaborator) commented on Mar 2, 2025

> @khluu do you know why the changes made by local pre-commit differ from those in CI, and how we should fix it?

A bug fix related to pre-commit was merged not too long ago... maybe merge this branch with main and try again?

@randyjhc randyjhc closed this Mar 3, 2025
@randyjhc force-pushed the feature-cli-for-benchmarking branch from 80b1aed to 872db2b on March 3, 2025
@randyjhc (Contributor, Author) commented on Mar 3, 2025

Reframing the PR

  1. This PR only supports vllm bench serving --opts at this time, but it can be easily extended to support other benchmark types, e.g.:
vllm bench [bench_type] --opts
  2. At this stage, the vllm bench serving command only supports random requests, leaving sonnet.txt in place. In addition, the supported backends are limited to the following (a sketch of the backend-to-request-function mapping follows this list):
vllm
lmdeploy
openai
scalellm
sglang
  3. Instead of moving the entire benchmarks folder, I have copied only the necessary functions into vllm/benchmarks. This approach lets us merge new changes to benchmarks in subsequent PRs while avoiding conflicts for now.
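
To illustrate the limited backend list in point 2, here is a rough sketch of how backend names could be mapped to request functions. The names below are placeholders rather than the actual vllm/benchmarks code, and the placeholder request function is left unimplemented.

```python
# Illustrative sketch only: dispatch a benchmark request function by backend
# name. The function and dict names are placeholders, not the actual
# vllm/benchmarks code.
from typing import Awaitable, Callable

RequestFunc = Callable[[str, str], Awaitable[dict]]


async def request_openai_completions(api_url: str, prompt: str) -> dict:
    """Placeholder for an OpenAI-compatible /v1/completions request."""
    raise NotImplementedError


# Backends that speak an OpenAI-compatible API can share one request function;
# backends such as tgi or mii would get their own entries in a follow-up.
REQUEST_FUNCS: dict[str, RequestFunc] = {
    "vllm": request_openai_completions,
    "openai": request_openai_completions,
    "lmdeploy": request_openai_completions,
    "scalellm": request_openai_completions,
    "sglang": request_openai_completions,
}


def get_request_func(backend: str) -> RequestFunc:
    try:
        return REQUEST_FUNCS[backend]
    except KeyError:
        raise ValueError(f"Unsupported backend: {backend!r}") from None
```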

Follow-up Items

  1. Extending support for additional benchmark command types within this framework, such as:
vllm bench [latency|throughput]
  2. Enabling full support for options in vllm bench serving --opts, including:
  • Existing datasets such as sonnet, sharegpt, etc.
  • Backend options, like tgi, mii, etc.

@randyjhc randyjhc reopened this Mar 3, 2025
@mergify mergify bot removed the needs-rebase label Mar 3, 2025
@comaniac (Collaborator) left a comment:

It's in good shape in general. I'll take a deeper look tomorrow. Meanwhile, please update the PR description with the latest scope of this PR and the remaining follow-up tasks.

@comaniac comaniac changed the title [Draft][Feature] Add CLI Commands for Benchmarking [Feature] Add vllm bench CLI Mar 4, 2025
@comaniac comaniac self-assigned this Mar 4, 2025
@comaniac comaniac marked this pull request as ready for review March 4, 2025 16:53
@comaniac comaniac requested review from ywang96 and simon-mo March 4, 2025 16:56
Signed-off-by: Cody Yu <[email protected]>
@ywang96 ywang96 self-assigned this Mar 5, 2025
@ywang96 (Member) left a comment:

Hey @randyjhc! Thanks for working on this PR!

Now that #14036 has landed, could you include the dataset file into vllm/benchmarks too and update this PR accordingly?

@comaniac (Collaborator) commented:

> Hey @randyjhc! Thanks for working on this PR!
>
> Now that #14036 has landed, could you include the dataset file into vllm/benchmarks too and update this PR accordingly?

We have two options to move forward:

  1. We only support random sampling in this PR to keep it short, and focus on the interface/CLI.
  2. We support datasets in this PR too, to deliver a fully functional benchmark serving capability.

I actually prefer (1) because it should take less time to merge this PR, but if you feel supporting datasets won't postpone the review process (because the logic is mostly copied), I'm also OK with it.

@ywang96 (Member) commented on Mar 10, 2025

> > Hey @randyjhc! Thanks for working on this PR!
> > Now that #14036 has landed, could you include the dataset file into vllm/benchmarks too and update this PR accordingly?
>
> We have two options to move forward:
>
>   1. We only support random sampling in this PR to keep it short, and focus on the interface/CLI.
>   2. We support datasets in this PR too, to deliver a fully functional benchmark serving capability.
>
> I actually prefer (1) because it should take less time to merge this PR, but if you feel supporting datasets won't postpone the review process (because the logic is mostly copied), I'm also OK with it.

yea I'm also okay with (1) just to get this PR in!

@ywang96 (Member) left a comment:

LGTM!

@comaniac added the ready label (ONLY add when PR is ready to merge/full CI is needed) on Mar 11, 2025
@comaniac comaniac enabled auto-merge (squash) March 11, 2025 22:29
@comaniac comaniac merged commit 36e0c8f into vllm-project:main Mar 12, 2025
32 checks passed
richardsliu pushed a commit to richardsliu/vllm that referenced this pull request Mar 14, 2025
Signed-off-by: Randy Chen <[email protected]>
Signed-off-by: Cody Yu <[email protected]>
Co-authored-by: Cody Yu <[email protected]>
Signed-off-by: Richard Liu <[email protected]>
lulmer pushed a commit to lulmer/vllm that referenced this pull request Apr 7, 2025
Signed-off-by: Randy Chen <[email protected]>
Signed-off-by: Cody Yu <[email protected]>
Co-authored-by: Cody Yu <[email protected]>
Signed-off-by: Louis Ulmer <[email protected]>
shreyankg pushed a commit to shreyankg/vllm that referenced this pull request May 3, 2025
Signed-off-by: Randy Chen <[email protected]>
Signed-off-by: Cody Yu <[email protected]>
Co-authored-by: Cody Yu <[email protected]>
RichardoMrMu pushed a commit to RichardoMrMu/vllm that referenced this pull request May 12, 2025
Signed-off-by: Randy Chen <[email protected]>
Signed-off-by: Cody Yu <[email protected]>
Co-authored-by: Cody Yu <[email protected]>
Signed-off-by: Mu Huai <[email protected]>
Labels: ci/build, frontend, ready (ONLY add when PR is ready to merge/full CI is needed)