Commit 7fbfcf2

Adds status information to describe the state of Inference Pools (#3970)
Update the inference extension design doc to specify the status conditions that must be set on Inference Pools to describe their state
1 parent afc7381 commit 7fbfcf2

1 file changed: +8 -0 lines changed

docs/proposals/gateway-inference-extension.md

Lines changed: 8 additions & 0 deletions
@@ -104,6 +104,14 @@ InferenceObjective represents the desired state of a specific model use case. As
 
 It is my impression that this API is purely for the EPP to handle, and does not need to be handled by NGINX Gateway Fabric.
 
+### Inference Status
+
+Each InferencePool publishes two conditions that together describe its overall state. The first is the `Accepted` condition, which communicates whether the pool is referenced by an HTTPRoute that the Gateway has accepted. When the route is not accepted, this condition is explicitly set to `False` with the reason `InferencePoolReasonHTTPRouteNotAccepted`, making it clear that the Gateway rejected the route referencing the pool.
+
+The second is the `ResolvedRefs` condition, which reflects whether the `EndpointPickerRef` associated with the pool is valid. If it is misconfigured, such as being an unsupported kind, left undefined, or pointing to a non-existent Service, this condition is set to `False` with the reason `InferencePoolReasonInvalidExtensionRef`.
+
+The status of an InferencePool records the Gateway as its parent reference and associates it with the relevant conditions; when all conditions are `True`, the pool is valid and traffic can be directed to it.
+
 ### Personas and Processes
 
 Two new personas are introduced, the `Inference Platform Owner/Admin` and `Inference Workload Owner`.
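
The Inference Status section added in this commit can be illustrated with a minimal Go sketch. It is not the NGINX Gateway Fabric implementation: the `Condition` and `ParentStatus` types, the `buildConditions` and `poolIsValid` helpers, the string values of the reason constants, and the success reasons are all assumptions made for illustration. Only the condition types (`Accepted`, `ResolvedRefs`) and the two failure reason names come from the proposal text.

```go
package main

import "fmt"

// Condition is a simplified, hypothetical stand-in for a Kubernetes
// status condition; it is not the real API type.
type Condition struct {
	Type   string
	Status string // "True" or "False"
	Reason string
}

// Condition types and failure reasons named in the proposal.
// The string values here are assumptions made for this sketch.
const (
	ConditionAccepted     = "Accepted"
	ConditionResolvedRefs = "ResolvedRefs"

	InferencePoolReasonHTTPRouteNotAccepted = "HTTPRouteNotAccepted"
	InferencePoolReasonInvalidExtensionRef  = "InvalidExtensionRef"
)

// ParentStatus pairs the parent Gateway with the conditions reported
// for the pool (hypothetical shape).
type ParentStatus struct {
	GatewayName string
	Conditions  []Condition
}

// buildConditions derives both conditions from two facts that are
// assumed to be computed elsewhere: whether the referencing HTTPRoute
// was accepted, and whether the EndpointPickerRef is valid.
// The success reasons ("Accepted", "ResolvedRefs") are assumptions.
func buildConditions(routeAccepted, endpointPickerRefValid bool) []Condition {
	accepted := Condition{Type: ConditionAccepted, Status: "True", Reason: "Accepted"}
	if !routeAccepted {
		accepted.Status = "False"
		accepted.Reason = InferencePoolReasonHTTPRouteNotAccepted
	}

	resolved := Condition{Type: ConditionResolvedRefs, Status: "True", Reason: "ResolvedRefs"}
	if !endpointPickerRefValid {
		resolved.Status = "False"
		resolved.Reason = InferencePoolReasonInvalidExtensionRef
	}

	return []Condition{accepted, resolved}
}

// poolIsValid reports whether every condition is "True". Treating an
// empty condition list as invalid is an assumption of this sketch.
func poolIsValid(ps ParentStatus) bool {
	if len(ps.Conditions) == 0 {
		return false
	}
	for _, c := range ps.Conditions {
		if c.Status != "True" {
			return false
		}
	}
	return true
}

func main() {
	// Route accepted, but the EndpointPickerRef is invalid.
	ps := ParentStatus{
		GatewayName: "example-gateway", // hypothetical name
		Conditions:  buildConditions(true, false),
	}
	for _, c := range ps.Conditions {
		fmt.Printf("%s=%s (%s)\n", c.Type, c.Status, c.Reason)
	}
	fmt.Println("valid:", poolIsValid(ps))
}
```

In this sketch a single `False` condition is enough to mark the pool as not valid, which mirrors the proposal's statement that traffic can be directed to the pool only when all conditions are `True`.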
