Closed as not planned
Description
Your current environment
I'm trying to measure the performance (e.g., latency and throughput) of the prefill and decode phases separately. Is there a way to do this? The benchmarks I have found measure end-to-end throughput and latency but do not report separate metrics for each phase.
I would greatly appreciate any guidance on profiling these two phases separately.
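To make the question concrete, here is a minimal client-side sketch of the kind of split I have in mind. It does not use any vLLM API: it only times an arbitrary token stream, treating time to first token (TTFT) as an approximation of prefill latency and the mean gap between later tokens as decode latency. The names `PhaseTimings` and `time_phases` are hypothetical, and the approximation is rough, since TTFT also includes queueing and scheduling time and the two phases can interleave when requests are batched.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class PhaseTimings:
    """Hypothetical container for per-request timings (not a vLLM type)."""
    ttft_s: float      # time to first token; approximates prefill latency
    decode_s: float    # time between the first and the last token
    num_tokens: int    # number of items the stream produced

    @property
    def tpot_s(self) -> float:
        """Mean time per output token after the first one (decode latency)."""
        if self.num_tokens < 2:
            return 0.0
        return self.decode_s / (self.num_tokens - 1)


def time_phases(
    stream: Iterable[object],
    clock: Callable[[], float] = time.perf_counter,
) -> PhaseTimings:
    """Consume a token stream and record when each item arrives.

    `stream` is assumed to yield one item per generated token (or chunk).
    """
    start = clock()
    first = None
    last = start
    count = 0
    for _ in stream:
        now = clock()
        if first is None:
            first = now  # arrival of the first token ends the "prefill" window
        last = now
        count += 1
    if first is None:
        raise ValueError("stream produced no tokens")
    return PhaseTimings(ttft_s=first - start, decode_s=last - first, num_tokens=count)
```

With a streaming client, the iterator of streamed chunks would be passed as `stream`. The injectable `clock` is there only so the helper can be checked with fixed timestamps instead of real delays.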
How would you like to use vllm
No response