Clean up JSON output #76

Closed
sjmonson opened this issue Feb 25, 2025 · 3 comments · Fixed by #91

@sjmonson
Collaborator

There are a number of minor improvements that could be made to the JSON output of guidellm. For example:

  1. Deduplicate the prompt field for each request
  2. Flatten some field structures (e.g. "decode_times": { "data": [] } -> "decode_times": [] ).
  3. Label percentiles (e.g. "request_latency_percentiles": [ 1, 2, ... ] -> "request_latency_percentiles": { "p01": 1, "p05": 2, ... })
  4. Include max and min alongside all percentiles
  5. Drop concurrency timestamps (possibly replace with percentiles/min/max/mean)
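To make items 2 and 3 concrete, here is a minimal sketch of the two transformations; the field names mirror the examples above, while the percentile labels and function names are assumptions for illustration, not guidellm's actual API:

```python
# Hypothetical helpers sketching items 2 and 3 above.
# The percentile-label ordering is an assumed convention.
PERCENTILE_LABELS = ["p01", "p05", "p10", "p25", "p50", "p75", "p90", "p95", "p99"]


def flatten_data_fields(metrics: dict) -> dict:
    """Collapse {"field": {"data": [...]}} wrappers into {"field": [...]}."""
    out = {}
    for key, value in metrics.items():
        if isinstance(value, dict) and set(value) == {"data"}:
            out[key] = value["data"]
        else:
            out[key] = value
    return out


def label_percentiles(values: list) -> dict:
    """Turn a positional percentile list into a labeled mapping."""
    return dict(zip(PERCENTILE_LABELS, values))
```

For example, `flatten_data_fields({"decode_times": {"data": [0.1, 0.2]}})` yields `{"decode_times": [0.1, 0.2]}`, and `label_percentiles` pairs each value with its label in order.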
@markurtz
Member

Feel free to dive in, but I wanted to call out that there will be some restructuring along these lines as part of the output standardization, in case there is duplicate work. Will share something out a bit later.

@rgreenberg1
Collaborator

Let's also make sure that the output JSON stores all of the metadata we may want to know:
Model name
Quantization (None, INT4, INT8, FP8)
Hardware
Inference scenario
vLLM version
vLLM config file (need the file)

GuideLLM results:
Tokens per Second 
Time to First Token (TTFT) 
Inter-token Latency (ITL)
End-to-End Request Latency (e2e_latency)
Requests Per Second (RPS) Profiles/Sweeps
Cost to generate a million output tokens (Internal) - Future
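The metadata block suggested above could be sketched as a simple top-level section of the output; every key name and example value here is an assumption for illustration, not an existing guidellm schema:

```python
import json

# Hypothetical metadata section for the JSON output; key names and the
# example values below are illustrative assumptions, not guidellm's schema.
run_metadata = {
    "model_name": "example/model-name",       # example value
    "quantization": None,                     # one of None, "INT4", "INT8", "FP8"
    "hardware": "example-gpu",                # example value
    "inference_scenario": "chat",             # example value
    "vllm_version": "0.0.0",                  # example value
    "vllm_config": {},                        # contents of the vLLM config file
}

# Verify the section serializes cleanly alongside the results.
serialized = json.dumps({"metadata": run_metadata}, indent=2)
```

Keeping these fields in a dedicated `metadata` section, rather than mixed in with the results, would make the report self-describing without complicating the metrics structure.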

@markurtz
Member

#91 fixes the issues @sjmonson brought up in this issue. @rgreenberg1, can we move the extra pieces you included here into its own issue?
