Closed
Once #2316 is merged, I'm willing to contribute the following metrics, which I believe would be helpful for monitoring vLLM usage.
| # | Metric | Type | Labels | Description |
|---|---|---|---|---|
| 1 | `vllm:request_success` | Counter | `finish_reason=stop\|length` | Count of successfully processed requests. |
| 2 | `vllm:request_params_max_tokens` | Histogram | | Value of the `max_tokens` request parameter. |
| 3 | `vllm:request_params_n` | Histogram | | Value of the `n` request parameter. |
| 4 | `vllm:request_total_tokens` | Histogram | | Total sequence length of the request (input tokens + generated tokens). |
| 5 | `vllm:request_prompt_tokens` | Histogram | | Number of prefill tokens processed. |
| 6 | `vllm:request_generation_tokens` | Histogram | | Number of generation tokens processed. |
Notes:
Metrics 5 and 6 already exist, but as counters (`vllm:prompt_tokens_total` and `vllm:generation_tokens_total`). I think a histogram is more meaningful, because it records the per-request distribution rather than only a running total. For backward compatibility, we can keep both types (counters and histograms).
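To illustrate the counter-versus-histogram point for metrics 5 and 6, here is a minimal, standard-library-only sketch. It does not use the Prometheus client or any vLLM code; the bucket bounds, the `summarize` helper, and the sample workloads are all hypothetical and chosen only for illustration. It shows two workloads that a token counter cannot tell apart but cumulative histogram buckets can.

```python
from bisect import bisect_left
from itertools import accumulate

# Hypothetical bucket upper bounds in tokens; the proposal does not specify any.
BUCKETS = [16, 64, 256, 1024, 4096]


def summarize(prompt_token_counts):
    """Return (counter_total, cumulative_bucket_counts) for a list of requests.

    counter_total mimics a counter such as vllm:prompt_tokens_total.
    cumulative_bucket_counts mimics Prometheus-style cumulative buckets,
    where each bucket counts observations <= its upper bound and the
    final entry is the +Inf bucket.
    """
    counter_total = sum(prompt_token_counts)

    # One slot per finite bucket plus one for +Inf.
    per_bucket = [0] * (len(BUCKETS) + 1)
    for n in prompt_token_counts:
        # bisect_left finds the first bound >= n, i.e. an inclusive upper bound.
        per_bucket[bisect_left(BUCKETS, n)] += 1

    return counter_total, list(accumulate(per_bucket))


# Ten uniform requests of 100 prompt tokens each.
uniform = [100] * 10
# Nine tiny requests and one large one, with the same token total.
skewed = [10] * 9 + [910]

uniform_total, uniform_buckets = summarize(uniform)
skewed_total, skewed_buckets = summarize(skewed)

# Both totals are equal, so a counter alone cannot distinguish the workloads.
print(uniform_total, skewed_total)
# The bucket counts differ, exposing the single large request.
print(uniform_buckets)
print(skewed_buckets)
```

Keeping the existing counters alongside the histograms, as suggested above, would let dashboards built on the running totals continue to work.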
Please let me know what you think.