[Feature]: Ensure benchmark serving does not import vLLM #14923

@simon-mo

🚀 The feature, motivation and pitch

vLLM's benchmark serving script is expected to be a standalone inference client with minimal dependencies. Currently, it still imports vllm conditionally.

The task is as follows:

  1. Clearly define a requirements.txt file for the benchmark serving client:
       numpy
       pandas
       Pillow
       tqdm
       transformers
       datasets

  2. Add a CI test that creates a new uv environment and executes the script, ensuring that vLLM is not present. This can be part of the existing tests for benchmark scripts: https://github.com/vllm-project/vllm/blob/main/.buildkite/run-benchmarks.sh

  3. Replace the existing uses of vLLM by inlining whatever utility methods are required.
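
As a complement to the clean-environment CI test, a static check could flag any remaining vllm import, including conditional ones inside try/except blocks that a runtime test would silently skip over. The following is a minimal sketch, not the project's actual tooling: the function name `find_forbidden_imports`, the `SAMPLE` script body, and the `something_hypothetical` symbol are all made up for illustration.

```python
import ast

FORBIDDEN = "vllm"  # top-level package the client must not import


def find_forbidden_imports(source: str, forbidden: str = FORBIDDEN) -> list[tuple[int, str]]:
    """Return (line number, module name) for every import of `forbidden`.

    ast.walk visits nested nodes too, so imports hidden inside
    try/except or if blocks (conditional imports) are reported as well.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import a, b.c" yields one alias per module
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # absolute "from x.y import z"; relative imports are skipped
            names = [node.module]
        else:
            continue
        for name in names:
            # match the package itself or any submodule, but not e.g. "vllm_client"
            if name == forbidden or name.startswith(forbidden + "."):
                hits.append((node.lineno, name))
    return sorted(hits)


# Hypothetical script body used only to exercise the checker.
SAMPLE = '''\
import numpy as np
try:
    from vllm.utils import something_hypothetical
except ImportError:
    something_hypothetical = None
'''

if __name__ == "__main__":
    for lineno, name in find_forbidden_imports(SAMPLE):
        print(f"line {lineno}: imports {name}")
```

In CI, the same function could be applied to the text of the benchmark script and the job failed if the returned list is non-empty. This sketch only sees `import` statements; dynamic imports such as `importlib.import_module("vllm")` would still need the clean-environment test to catch them.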

Alternatives

No response

Additional context

See #14879 for discussion, cc @houseroad @ywang96


Status: Done