Closed
Labels: documentation, good first issue, help wanted, kind/bug, triage/accepted
Description
This doc states:
If you want to include usage metrics for vLLM model server streaming request, send the request with include_usage:
curl -i ${IP}:${PORT}/v1/completions -H 'Content-Type: application/json' -d '{
"model": "food-review",
"prompt": "whats your fav movie?",
"max_tokens": 10,
"temperature": 0,
"stream": true,
"stream_options": {"include_usage": "true"}
}'
According to the OpenAI docs, the value of include_usage must be a boolean, not a string. Here is an example error case, using the string form from the doc:
curl -i http://localhost:8885/v1/completions -H 'Content-Type: application/json' -d '{
"model": "food-review-0",
"prompt": "whats your fav movie?",
"max_tokens": 10,
"temperature": 0,
"stream": true,
"stream_options": {"include_usage": "true"}
}'
HTTP/1.1 400 Bad Request
server: fasthttp
date: Mon, 20 Oct 2025 15:26:44 GMT
content-type: text/plain; charset=utf-8
x-went-into-resp-headers: true
transfer-encoding: chunked
Failed to read and parse request body, json: cannot unmarshal string into Go struct field StreamOptions.baseCompletionRequest.stream_options.include_usage of type bool
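For comparison, here is a sketch of what the corrected request in the doc could look like, with include_usage written as the JSON boolean true (no quotes). The BODY variable and the send_request helper are hypothetical names introduced only for illustration; IP and PORT are assumed to be set as in the doc's original example.

```shell
# Corrected request body: include_usage is the JSON boolean true,
# not the string "true" that triggers the 400 above.
BODY='{
"model": "food-review",
"prompt": "whats your fav movie?",
"max_tokens": 10,
"temperature": 0,
"stream": true,
"stream_options": {"include_usage": true}
}'

# Hypothetical helper wrapping the curl call from the doc.
# Assumes IP and PORT point at the endpoint, as in the original example.
send_request() {
  curl -i "${IP}:${PORT}/v1/completions" \
    -H 'Content-Type: application/json' \
    -d "${BODY}"
}
```

With IP and PORT exported, calling send_request sends the corrected body. The only change from the doc's example is the unquoted true in stream_options.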