Commit 9f28aa7

Update formatting
1 parent fa9bd06 commit 9f28aa7

1 file changed, +6 -6 lines changed

site-src/implementations/model-servers.md

Lines changed: 6 additions & 6 deletions
@@ -24,12 +24,12 @@ Use `--set inferencePool.modelServerType=triton-tensorrt-llm` to install the `in
 Add the following to the `flags` in the helm chart as [flags to EPP](https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/29ea29028496a638b162ff287c62c0087211bbe5/config/charts/inferencepool/values.yaml#L36)
 
 ```
-- --total-queued-requests-metric
-- "nv_trt_llm_request_metrics{request_type=waiting}"
-- --kv-cache-usage-percentage-metric
-- "nv_trt_llm_kv_cache_block_metrics{kv_cache_block_type=fraction}"
-- --lora-info-metric
-- "" # Set an empty metric to disable LoRA metric scraping as they are not supported by Triton yet.
+--total-queued-requests-metric
+"nv_trt_llm_request_metrics{request_type=waiting}"
+--kv-cache-usage-percentage-metric
+"nv_trt_llm_kv_cache_block_metrics{kv_cache_block_type=fraction}"
+--lora-info-metric
+"" # Set an empty metric to disable LoRA metric scraping as they are not supported by Triton yet.
 ```
 
 ## SGLang
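
The Triton metric flag values in this hunk follow a `metric_name{label=value}` shape, with an empty string used to disable a metric. As a rough illustration of how such a value breaks down, here is a minimal Python sketch. The helper `parse_metric_spec` is hypothetical and is not part of EPP or the Helm chart; the accepted grammar (one optional brace group of comma-separated `key=value` pairs) is an assumption based only on the values shown in this diff.

```python
import re

# Hypothetical helper for illustration only; EPP's real parsing may differ.
# Assumed shape: metric_name, optionally followed by {key=value,key=value}.
_SPEC = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?$'
)


def parse_metric_spec(spec):
    """Split a metric spec into (name, labels).

    Returns None for an empty spec, mirroring the diff's use of "" to
    disable scraping for a metric.
    """
    spec = spec.strip()
    if not spec:
        return None
    match = _SPEC.match(spec)
    if match is None:
        raise ValueError(f"unrecognized metric spec: {spec!r}")
    labels = {}
    if match.group("labels"):
        for pair in match.group("labels").split(","):
            key, _, value = pair.partition("=")
            labels[key.strip()] = value.strip().strip('"')
    return match.group("name"), labels


if __name__ == "__main__":
    for value in (
        "nv_trt_llm_request_metrics{request_type=waiting}",
        "nv_trt_llm_kv_cache_block_metrics{kv_cache_block_type=fraction}",
        "",
    ):
        print(repr(value), "->", parse_metric_spec(value))
```

Under these assumptions, the queue-depth and KV-cache flags each select one labeled series from a Triton metric family, while the empty `--lora-info-metric` value yields no metric at all.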
