Proposal: Adding more Prometheus metrics #2650

@ronensc

Once #2316 is merged, I'm willing to contribute the following metrics, which I believe would be helpful for monitoring vLLM usage.

| # | Metric | Type | Labels | Description |
|---|--------|------|--------|-------------|
| 1 | `vllm:request_success` | Counter | `finish_reason=stop\|length` | Count of successfully processed requests. |
| 2 | `vllm:request_params_max_tokens` | Histogram | | Value of the `max_tokens` request parameter. |
| 3 | `vllm:request_params_n` | Histogram | | Value of the `n` request parameter. |
| 4 | `vllm:request_total_tokens` | Histogram | | Total sequence length of the request (input tokens + generated tokens). |
| 5 | `vllm:request_prompt_tokens` | Histogram | | Number of prefill tokens processed. |
| 6 | `vllm:request_generation_tokens` | Histogram | | Number of generation tokens processed. |
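
To make the proposal concrete, here is a minimal sketch of how the six metrics in the table above could be declared with the `prometheus_client` Python library. The bucket boundaries, the private registry, and the `record_finished_request` helper are assumptions for illustration only; none of them come from vLLM or from #2316.

```python
from prometheus_client import CollectorRegistry, Counter, Histogram

# A private registry keeps this sketch isolated from the process-wide default.
registry = CollectorRegistry()

# Hypothetical bucket boundaries; the proposal does not specify any.
TOKEN_BUCKETS = (1, 16, 64, 256, 1024, 4096, 16384)
N_BUCKETS = (1, 2, 4, 8, 16)

request_success = Counter(
    "vllm:request_success",
    "Count of successfully processed requests.",
    labelnames=["finish_reason"],
    registry=registry,
)
request_params_max_tokens = Histogram(
    "vllm:request_params_max_tokens",
    "Value of max_tokens request parameter.",
    buckets=TOKEN_BUCKETS,
    registry=registry,
)
request_params_n = Histogram(
    "vllm:request_params_n",
    "Value of n request parameter.",
    buckets=N_BUCKETS,
    registry=registry,
)
request_total_tokens = Histogram(
    "vllm:request_total_tokens",
    "Total sequence length of request (input tokens + generated tokens).",
    buckets=TOKEN_BUCKETS,
    registry=registry,
)
request_prompt_tokens = Histogram(
    "vllm:request_prompt_tokens",
    "Number of prefill tokens processed.",
    buckets=TOKEN_BUCKETS,
    registry=registry,
)
request_generation_tokens = Histogram(
    "vllm:request_generation_tokens",
    "Number of generation tokens processed.",
    buckets=TOKEN_BUCKETS,
    registry=registry,
)


def record_finished_request(prompt_tokens, generation_tokens, max_tokens, n, finish_reason):
    """Hypothetical hook: update all six metrics once a request completes."""
    request_success.labels(finish_reason=finish_reason).inc()
    request_params_max_tokens.observe(max_tokens)
    request_params_n.observe(n)
    request_total_tokens.observe(prompt_tokens + generation_tokens)
    request_prompt_tokens.observe(prompt_tokens)
    request_generation_tokens.observe(generation_tokens)
```

Calling `record_finished_request(100, 20, 64, 1, "stop")` would increment the `finish_reason="stop"` counter and add one observation to each histogram. Where such a hook would actually live in the engine is left open here.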

Notes:
Metrics 5 and 6 already exist, but as counters (`vllm:prompt_tokens_total` and `vllm:generation_tokens_total`). I think a histogram is more meaningful, since it exposes the per-request distribution rather than only a running total. For backward compatibility, we can keep both types (counters and histograms).
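
As a rough illustration of the counter-versus-histogram point in the note above, the following standard-library sketch compares two made-up workloads. A counter reports the same total for both, while Prometheus-style cumulative buckets tell them apart. The workloads, bucket bounds, and function name are all hypothetical.

```python
from bisect import bisect_left


def cumulative_bucket_counts(values, upper_bounds):
    """Return Prometheus-style cumulative counts: one per upper bound, then +Inf."""
    per_bucket = [0] * (len(upper_bounds) + 1)
    for v in values:
        # bisect_left finds the first bound >= v, matching `le` (less-or-equal) buckets.
        per_bucket[bisect_left(upper_bounds, v)] += 1
    cumulative, running = [], 0
    for count in per_bucket:
        running += count
        cumulative.append(running)
    return cumulative


# Made-up prompt lengths for two workloads with the same total token count.
workload_a = [500, 500, 500, 500]
workload_b = [10, 10, 10, 1970]
bounds = [16, 128, 1024]

# What a counter such as vllm:prompt_tokens_total would report.
counter_a = sum(workload_a)
counter_b = sum(workload_b)

# What a histogram would report: counts for le=16, le=128, le=1024, le=+Inf.
histogram_a = cumulative_bucket_counts(workload_a, bounds)
histogram_b = cumulative_bucket_counts(workload_b, bounds)
```

Both counters end at 2000, but `histogram_a` is `[0, 0, 4, 4]` and `histogram_b` is `[3, 3, 3, 4]`, so only the histogram shows that the second workload is dominated by one very long prompt.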

Please let me know what you think.
