site-src/implementations/model-servers.md
3 additions & 0 deletions
@@ -23,6 +23,7 @@ Use `--set inferencePool.modelServerType=triton-tensorrt-llm` to install the `in
Add the following to the `flags` in the helm chart as [flags to EPP](https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/29ea29028496a638b162ff287c62c0087211bbe5/config/charts/inferencepool/values.yaml#L36)
@@ -32,10 +33,12 @@ Use `--set inferencePool.modelServerType=triton-tensorrt-llm` to install the `in
- "" # Set an empty metric to disable LoRA metric scraping as they are not supported by Triton yet.
```
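
For orientation, a minimal sketch of how these flags might look in the chart values is shown below. It assumes the flags live under `inferenceExtension.flags` in the linked values.yaml and that the Triton metric names match the TensorRT-LLM backend's Prometheus output; only the empty LoRA metric line is taken from the diff above, so verify the rest against the chart before using it.

```yaml
# Hypothetical values.yaml fragment: flag and metric names are assumptions,
# verify against config/charts/inferencepool/values.yaml at the linked commit.
inferenceExtension:
  flags:
    - name: "total-queued-requests-metric"
      value: "nv_trt_llm_request_metrics{request_type=waiting}"   # assumed queue-depth metric
    - name: "kv-cache-usage-percentage-metric"
      value: "nv_trt_llm_kv_cache_block_metrics{kv_cache_block_type=fraction}"   # assumed KV-cache usage metric
    - name: "lora-info-metric"
      value: "" # Empty metric disables LoRA metric scraping; LoRA metrics are not yet supported by Triton.
```

Combined with the `--set inferencePool.modelServerType=triton-tensorrt-llm` install option from the hunk above, such a file would be passed via `helm install ... -f values.yaml`.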
## SGLang
Add the following `flags` when deploying with the helm chart, as flags to the [EPP deployment](https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/29ea29028496a638b162ff287c62c0087211bbe5/config/charts/inferencepool/values.yaml#L36)
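
As a rough illustration, a hedged sketch of what those SGLang flags could look like follows. The metric names (`sglang:num_queue_reqs`, `sglang:token_usage`) are assumptions drawn from SGLang's Prometheus metrics rather than from this diff, and the `inferenceExtension.flags` location mirrors the Triton sketch above; confirm both against the linked values.yaml and SGLang's `/metrics` output.

```yaml
# Hypothetical values.yaml fragment for SGLang; metric and flag names are
# assumptions and should be verified before use.
inferenceExtension:
  flags:
    - name: "total-queued-requests-metric"
      value: "sglang:num_queue_reqs"   # assumed SGLang queue-depth gauge
    - name: "kv-cache-usage-percentage-metric"
      value: "sglang:token_usage"      # assumed KV-cache / token utilization gauge
```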