1212
1313## **Steps**
1414
15+ ### Install the Inference Extension CRDs
16+
17+ ``` bash
18+ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extension/config/crd
19+ ```
20+
1521### Deploy Sample Model Server
1622
1723--8<-- "site-src/_includes/model-server-intro.md"
3541 kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/vllm/sim-deployment.yaml
3642 ```
3743
38- ### Install the Inference Extension CRDs
39-
40- ``` bash
41- kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extension/config/crd
42- ```
44+ --8<-- "site-src/_includes/model-rollout.md"
4345
4446### Deploy the InferencePool and Endpoint Picker Extension
4547
@@ -69,20 +71,19 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
6971 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gke/gateway.yaml
7072 ```
7173
72- Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
74+ 3. Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
7375
7476 ```bash
75- $ kubectl get gateway inference-gateway
76- NAME CLASS ADDRESS PROGRAMMED AGE
77- inference-gateway inference-gateway <MY_ADDRESS> True 22s
77+ kubectl get gateway inference-gateway
7878 ```
79- 3. Deploy the HTTPRoute
79+
80+ 4. Deploy the HTTPRoute:
8081
8182 ```bash
8283 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gke/httproute.yaml
8384 ```
8485
85- 4. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
86+ 5. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
8687
8788 ```bash
8889 kubectl get httproute llm-route -o yaml
@@ -93,11 +94,11 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
9394 Please note that this feature is currently in an experimental phase and is not intended for production use.
9495 The implementation and user experience are subject to changes as we continue to iterate on this project.
9596
96- 1. Requirements
97+ 1. Requirements:
9798
9899 - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
99100
100- 2. Install Istio
101+ 2. Install Istio:
101102
102103 ```
103104 TAG=$(curl https://storage.googleapis.com/istio-build/dev/1.28-dev)
@@ -120,26 +121,25 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
120121 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/destination-rule.yaml
121122 ```
122123
123- 4. Deploy Gateway
124+ 4. Deploy the Gateway:
124125
125126 ```bash
126127 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/gateway.yaml
127128 ```
128129
129- Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
130+ 5. Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
131+
130132 ```bash
131- $ kubectl get gateway inference-gateway
132- NAME CLASS ADDRESS PROGRAMMED AGE
133- inference-gateway inference-gateway <MY_ADDRESS> True 22s
133+ kubectl get gateway inference-gateway
134134 ```
135135
136- 5. Deploy the HTTPRoute
136+ 6. Deploy the HTTPRoute:
137137
138138 ```bash
139139 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/httproute.yaml
140140 ```
141141
142- 6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
142+ 7. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
143143
144144 ```bash
145145 kubectl get httproute llm-route -o yaml
@@ -150,44 +150,49 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
150150 [Kgateway](https://kgateway.dev/) added Inference Gateway support as a **technical preview** in the
151151 [v2.0.0 release](https://github.com/kgateway-dev/kgateway/releases/tag/v2.0.0). InferencePool v1.0.1 is currently supported in the latest [rolling release](https://github.com/kgateway-dev/kgateway/releases/tag/v2.1.0-main), which includes the latest changes but may be unstable until the [v2.1.0 release](https://github.com/kgateway-dev/kgateway/milestone/58) is published.
152152
153- 1. Requirements
153+ 1. Requirements:
154154
155155 - [Helm](https://helm.sh/docs/intro/install/) installed.
156156 - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
157157
158- 2. Set the Kgateway version and install the Kgateway CRDs.
158+ 2. Set the Kgateway version and install the Kgateway CRDs:
159159
160160 ```bash
161161 KGTW_VERSION=v2.1.0-main
162162 helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds
163163 ```
164164
165- 3. Install Kgateway
165+ 3. Install Kgateway:
166166
167167 ```bash
168168 helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true
169169 ```
170170
171- 4. Deploy the Gateway
171+ 4. Wait for the Kgateway deployment to be successfully rolled out:
172+
173+ ```bash
174+ kubectl rollout status deployment kgateway -n kgateway-system
175+ ```
176+
177+ 5. Deploy the Gateway:
172178
173179 ```bash
174180 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/kgateway/gateway.yaml
175181 ```
176182
177- Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
183+ 6. Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
184+
178185 ```bash
179- $ kubectl get gateway inference-gateway
180- NAME CLASS ADDRESS PROGRAMMED AGE
181- inference-gateway kgateway <MY_ADDRESS> True 22s
186+ kubectl get gateway inference-gateway
182187 ```
183188
184- 5. Deploy the HTTPRoute
189+ 7. Deploy the HTTPRoute:
185190
186191 ```bash
187192 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/kgateway/httproute.yaml
188193 ```
189194
190- 6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
195+ 8. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
191196
192197 ```bash
193198 kubectl get httproute llm-route -o yaml
@@ -197,52 +202,57 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
197202
198203 [Agentgateway](https://agentgateway.dev/) is a purpose-built proxy designed for AI workloads, and comes with native support for Inference Gateway. Agentgateway integrates with [Kgateway](https://kgateway.dev/) as its control plane. InferencePool v1.0.0 is currently supported in the latest [rolling release](https://github.com/kgateway-dev/kgateway/releases/tag/v2.1.0-main), which includes the latest changes but may be unstable until the [v2.1.0 release](https://github.com/kgateway-dev/kgateway/milestone/58) is published.
199204
200- 1. Requirements
205+ 1. Requirements:
201206
202207 - [Helm](https://helm.sh/docs/intro/install/) installed.
203208 - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
204209
205- 2. Set the Kgateway version and install the Kgateway CRDs.
210+ 2. Set the Kgateway version and install the Kgateway CRDs:
206211
207212 ```bash
208213 KGTW_VERSION=v2.1.0-main
209214 helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds
210215 ```
211216
212- 3. Install Kgateway
217+ 3. Install Kgateway:
213218
214219 ```bash
215220 helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true --set agentGateway.enabled=true
216221 ```
217222
218- 4. Deploy the Gateway
223+ 4. Wait for the Kgateway deployment to be successfully rolled out:
224+
225+ ```bash
226+ kubectl rollout status deployment kgateway -n kgateway-system
227+ ```
228+
229+ 5. Deploy the Gateway:
219230
220231 ```bash
221232 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/agentgateway/gateway.yaml
222233 ```
223234
224- Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
235+ 6. Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
236+
225237 ```bash
226- $ kubectl get gateway inference-gateway
227- NAME CLASS ADDRESS PROGRAMMED AGE
228- inference-gateway agentgateway <MY_ADDRESS> True 22s
238+ kubectl get gateway inference-gateway
229239 ```
230240
231- 5. Deploy the HTTPRoute
241+ 7. Deploy the HTTPRoute:
232242
233243 ```bash
234244 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/agentgateway/httproute.yaml
235245 ```
236246
237- 6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
247+ 8. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
238248
239249 ```bash
240250 kubectl get httproute llm-route -o yaml
241251 ```
242252
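The two confirmation steps above (Gateway `Programmed=True`, HTTPRoute `Accepted=True` and `ResolvedRefs=True`) can also be checked from a script instead of by reading the output. The following is a minimal sketch, not part of the quickstart: the function names `wait_for_gateway` and `route_conditions_ok` and the 120s default timeout are assumptions made for illustration. It relies only on `kubectl wait` and `kubectl get -o jsonpath`, and assumes the resource names `inference-gateway` and `llm-route` used in this guide.

```bash
# Hypothetical helpers (names are illustrative, not from the quickstart).

# Block until the Gateway reports the Programmed condition, then print
# the first address it was assigned.
wait_for_gateway() {
  name="${1:-inference-gateway}"
  timeout="${2:-120s}"   # assumed default; adjust for your cluster
  kubectl wait --for=condition=Programmed "gateway/${name}" \
    --timeout="${timeout}" || return 1
  kubectl get gateway "${name}" -o jsonpath='{.status.addresses[0].value}'
}

# Succeed only if each named condition is reported as True on the HTTPRoute.
# HTTPRoute conditions are nested under .status.parents[*].conditions, so
# they are flattened to "Type=Status" lines first. Note: with several
# parents, a condition passes if any one parent reports it as True.
route_conditions_ok() {
  route="$1"
  shift
  conditions="$(kubectl get httproute "${route}" \
    -o jsonpath='{range .status.parents[*].conditions[*]}{.type}={.status}{"\n"}{end}')" || return 1
  for want in "$@"; do
    if ! printf '%s\n' "${conditions}" | grep -qx "${want}=True"; then
      echo "condition ${want} is not True on httproute/${route}" >&2
      return 1
    fi
  done
}
```

After sourcing the functions, `wait_for_gateway && route_conditions_ok llm-route Accepted ResolvedRefs` would return a non-zero exit status if either check fails, which may be convenient in CI.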
243253### Deploy InferenceObjective (Optional)
244254
245- Deploy the sample InferenceObjective which allows you to specify priority of requests.
255+ Deploy the sample InferenceObjective which allows you to specify priority of inference requests:
246256
247257 ``` bash
248258 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/inferenceobjective.yaml
@@ -257,7 +267,7 @@ Deploy the sample InferenceObjective which allows you to specify priority of req
257267 The following instructions assume you would like to clean up ALL resources that were created in this quickstart guide.
258268 Please be careful not to delete resources you'd like to keep.
259269
260- 1. Uninstall the InferencePool, InferenceObjective and model server resources
270+ 1. Uninstall the InferencePool, InferenceObjective and model server resources:
261271
262272 ``` bash
263273 helm uninstall vllm-llama3-8b-instruct
@@ -268,13 +278,13 @@ Deploy the sample InferenceObjective which allows you to specify priority of req
268278 kubectl delete secret hf-token --ignore-not-found
269279 ```
270280
271- 1. Uninstall the Gateway API Inference Extension CRDs
281+ 1. Uninstall the Gateway API Inference Extension CRDs:
272282
273283 ```bash
274284 kubectl delete -k https://github.com/kubernetes-sigs/gateway-api-inference-extension/config/crd --ignore-not-found
275285 ```
276286
277- 1. Choose one of the following options to cleanup the Inference Gateway.
287+ 1. Choose one of the following options to clean up the Inference Gateway:
278288
279289=== "GKE"
280290
@@ -294,13 +304,13 @@ Deploy the sample InferenceObjective which allows you to specify priority of req
294304
295305 The following steps assume you would like to clean up ALL Istio resources that were created in this quickstart guide.
296306
297- 1. Uninstall All Istio resources
307+ 1. Uninstall all Istio resources:
298308
299309 ```bash
300310 istioctl uninstall -y --purge
301311 ```
302312
303- 2. Remove the Istio namespace
313+ 2. Remove the Istio namespace:
304314
305315 ```bash
306316 kubectl delete ns istio-system
@@ -315,19 +325,19 @@ Deploy the sample InferenceObjective which allows you to specify priority of req
315325
316326 The following steps assume you would like to clean up ALL Kgateway resources that were created in this quickstart guide.
317327
318- 1. Uninstall Kgateway
328+ 1. Uninstall Kgateway:
319329
320330 ```bash
321331 helm uninstall kgateway -n kgateway-system
322332 ```
323333
324- 2. Uninstall the Kgateway CRDs.
334+ 2. Uninstall the Kgateway CRDs:
325335
326336 ```bash
327337 helm uninstall kgateway-crds -n kgateway-system
328338 ```
329339
330- 3. Remove the Kgateway namespace.
340+ 3. Remove the Kgateway namespace:
331341
332342 ```bash
333343 kubectl delete ns kgateway-system
@@ -342,19 +352,19 @@ Deploy the sample InferenceObjective which allows you to specify priority of req
342352
343353 The following steps assume you would like to clean up ALL Kgateway resources that were created in this quickstart guide.
344354
345- 1. Uninstall Kgateway
355+ 1. Uninstall Kgateway:
346356
347357 ```bash
348358 helm uninstall kgateway -n kgateway-system
349359 ```
350360
351- 2. Uninstall the Kgateway CRDs.
361+ 2. Uninstall the Kgateway CRDs:
352362
353363 ```bash
354364 helm uninstall kgateway-crds -n kgateway-system
355365 ```
356366
357- 3. Remove the Kgateway namespace.
367+ 3. Remove the Kgateway namespace:
358368
359369 ```bash
360370 kubectl delete ns kgateway-system