Commit ec9cd68

Update formatting
1 parent fa9bd06 commit ec9cd68

1 file changed: +2 -1 lines changed

site-src/implementations/model-servers.md

Lines changed: 2 additions & 1 deletion
@@ -23,7 +23,7 @@ Use `--set inferencePool.modelServerType=triton-tensorrt-llm` to install the `in
 
 Add the following to the `flags` in the helm chart as [flags to EPP](https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/29ea29028496a638b162ff287c62c0087211bbe5/config/charts/inferencepool/values.yaml#L36)
 
-```
+```
 - --total-queued-requests-metric
 - "nv_trt_llm_request_metrics{request_type=waiting}"
 - --kv-cache-usage-percentage-metric
@@ -36,6 +36,7 @@ Use `--set inferencePool.modelServerType=triton-tensorrt-llm` to install the `in
 
 Add the following `flags` while deploying using helm charts in the [EPP deployment](https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/29ea29028496a638b162ff287c62c0087211bbe5/config/charts/inferencepool/values.yaml#L36)
 
+
 ```
 - --totalQueuedRequestsMetric
 - "sglang:num_queue_reqs"
