|`inferencePool.targetPortNumber`| Target port number for the vllm backends, will be used to scrape metrics by the inference extension. Defaults to 8000. |
-|`inferencePool.modelServerType`| Type of the model servers in the pool, valid options are [vllm, triton-tensorrt-llm], default is vllm |
+|`inferencePool.modelServerType`| Type of the model servers in the pool, valid options are [vllm, triton-tensorrt-llm], default is vllm.|
|`inferencePool.modelServers.matchLabels`| Label selector to match vllm backends managed by the inference pool. |
|`inferenceExtension.replicas`| Number of replicas for the endpoint picker extension service. Defaults to `1`. |
|`inferenceExtension.image.name`| Name of the container image used for the endpoint picker. |
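
As a rough illustration of how the parameters in these rows might fit together, here is a minimal values sketch. The key paths, the `8000` port default, the `vllm` server type, and the `1` replica default come from the table; the label `app: my-vllm-backend` and the image name `epp` are hypothetical placeholders, not values confirmed by the chart.

```yaml
# Sketch of a values override, assuming the key layout implied by the table.
inferencePool:
  targetPortNumber: 8000        # documented default; also used to scrape metrics
  modelServerType: vllm         # or triton-tensorrt-llm
  modelServers:
    matchLabels:
      app: my-vllm-backend      # hypothetical label; match your own backend pods

inferenceExtension:
  replicas: 1                   # documented default
  image:
    name: epp                   # hypothetical image name for the endpoint picker
```

A file like this would typically be passed to `helm install` with `-f`, leaving any omitted keys at the chart's defaults.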