Commit 94c7879

Docs: Minor Quickstart Improvements
Signed-off-by: Daneyon Hansen <[email protected]>
1 parent c7f41ce commit 94c7879

4 files changed: +133 -107 lines changed

site-src/_includes/model-rollout.md

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
1+
Wait for the model server deployment to be successfully rolled out:
2+
3+
```bash
4+
kubectl rollout status deployment vllm-llama3-8b-instruct
5+
```
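
Note: `kubectl rollout status` blocks until the rollout finishes, so a bounded wait may be useful in scripts. The following is a minimal sketch, not part of this commit. It assumes the deployment name used above; the `wait_for_rollout` function and the `KUBECTL` override variable are hypothetical names introduced only for illustration.

```bash
# KUBECTL is a hypothetical override (for example KUBECTL=echo) that lets you
# preview the command without a cluster; it defaults to the real kubectl.
KUBECTL=${KUBECTL:-kubectl}

# wait_for_rollout NAME [TIMEOUT]
# Waits for the named deployment to finish rolling out and gives up after
# TIMEOUT (default 5m), returning a non-zero status on failure or timeout.
wait_for_rollout() {
  $KUBECTL rollout status "deployment/$1" --timeout="${2:-5m}"
}

# Example (requires cluster access):
# wait_for_rollout vllm-llama3-8b-instruct 10m
```

A timeout turns a stuck rollout into a visible failure rather than an indefinite hang.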

site-src/_includes/test.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
### Try it out
22

3-
Wait until the gateway is ready.
3+
Use cURL to send a request to the vLLM model servers through the inference gateway:
44

55
```bash
66
IP=$(kubectl get gateway/inference-gateway -o jsonpath='{.status.addresses[0].value}')

site-src/guides/getting-started-latest.md

Lines changed: 62 additions & 52 deletions
Original file line numberDiff line numberDiff line change
@@ -12,6 +12,12 @@
1212

1313
## **Steps**
1414

15+
### Install the Inference Extension CRDs
16+
17+
```bash
18+
kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extension/config/crd
19+
```
20+
1521
### Deploy Sample Model Server
1622

1723
--8<-- "site-src/_includes/model-server-intro.md"
@@ -35,11 +41,7 @@
3541
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api-inference-extension/refs/tags/v1.0.0/config/manifests/vllm/sim-deployment.yaml
3642
```
3743

38-
### Install the Inference Extension CRDs
39-
40-
```bash
41-
kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extension/config/crd
42-
```
44+
--8<-- "site-src/_includes/model-rollout.md"
4345

4446
### Deploy the InferencePool and Endpoint Picker Extension
4547

@@ -69,20 +71,19 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
6971
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gke/gateway.yaml
7072
```
7173

72-
Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
74+
3. Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
7375

7476
```bash
75-
$ kubectl get gateway inference-gateway
76-
NAME CLASS ADDRESS PROGRAMMED AGE
77-
inference-gateway inference-gateway <MY_ADDRESS> True 22s
77+
kubectl get gateway inference-gateway
7878
```
79-
3. Deploy the HTTPRoute
79+
80+
4. Deploy the HTTPRoute:
8081

8182
```bash
8283
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gke/httproute.yaml
8384
```
8485

85-
4. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
86+
5. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
8687

8788
```bash
8889
kubectl get httproute llm-route -o yaml
@@ -93,11 +94,11 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
9394
Please note that this feature is currently in an experimental phase and is not intended for production use.
9495
The implementation and user experience are subject to changes as we continue to iterate on this project.
9596

96-
1. Requirements
97+
1. Requirements:
9798

9899
- Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
99100

100-
2. Install Istio
101+
2. Install Istio:
101102

102103
```
103104
TAG=$(curl https://storage.googleapis.com/istio-build/dev/1.28-dev)
@@ -120,26 +121,25 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
120121
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/destination-rule.yaml
121122
```
122123

123-
4. Deploy Gateway
124+
4. Deploy the Gateway:
124125

125126
```bash
126127
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/gateway.yaml
127128
```
128129

129-
Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
130+
5. Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
131+
130132
```bash
131-
$ kubectl get gateway inference-gateway
132-
NAME CLASS ADDRESS PROGRAMMED AGE
133-
inference-gateway inference-gateway <MY_ADDRESS> True 22s
133+
kubectl get gateway inference-gateway
134134
```
135135

136-
5. Deploy the HTTPRoute
136+
6. Deploy the HTTPRoute:
137137

138138
```bash
139139
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/httproute.yaml
140140
```
141141

142-
6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
142+
7. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
143143

144144
```bash
145145
kubectl get httproute llm-route -o yaml
@@ -150,44 +150,49 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
150150
[Kgateway](https://kgateway.dev/) added Inference Gateway support as a **technical preview** in the
151151
[v2.0.0 release](https://github.com/kgateway-dev/kgateway/releases/tag/v2.0.0). InferencePool v1.0.1 is currently supported in the latest [rolling release](https://github.com/kgateway-dev/kgateway/releases/tag/v2.1.0-main), which includes the latest changes but may be unstable until the [v2.1.0 release](https://github.com/kgateway-dev/kgateway/milestone/58) is published.
152152

153-
1. Requirements
153+
1. Requirements:
154154

155155
- [Helm](https://helm.sh/docs/intro/install/) installed.
156156
- Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
157157

158-
2. Set the Kgateway version and install the Kgateway CRDs.
158+
2. Set the Kgateway version and install the Kgateway CRDs:
159159

160160
```bash
161161
KGTW_VERSION=v2.1.0-main
162162
helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds
163163
```
164164

165-
3. Install Kgateway
165+
3. Install Kgateway:
166166

167167
```bash
168168
helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true
169169
```
170170

171-
4. Deploy the Gateway
171+
4. Wait for the Kgateway deployment to be successfully rolled out:
172+
173+
```bash
174+
kubectl rollout status deployment kgateway -n kgateway-system
175+
```
176+
177+
5. Deploy the Gateway:
172178

173179
```bash
174180
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/kgateway/gateway.yaml
175181
```
176182

177-
Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
183+
6. Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
184+
178185
```bash
179-
$ kubectl get gateway inference-gateway
180-
NAME CLASS ADDRESS PROGRAMMED AGE
181-
inference-gateway kgateway <MY_ADDRESS> True 22s
186+
kubectl get gateway inference-gateway
182187
```
183188

184-
5. Deploy the HTTPRoute
189+
7. Deploy the HTTPRoute:
185190

186191
```bash
187192
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/kgateway/httproute.yaml
188193
```
189194

190-
6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
195+
8. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
191196

192197
```bash
193198
kubectl get httproute llm-route -o yaml
@@ -197,52 +202,57 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
197202

198203
[Agentgateway](https://agentgateway.dev/) is a purpose-built proxy designed for AI workloads, and comes with native support for Inference Gateway. Agentgateway integrates with [Kgateway](https://kgateway.dev/) as its control plane. InferencePool v1.0.0 is currently supported in the latest [rolling release](https://github.com/kgateway-dev/kgateway/releases/tag/v2.1.0-main), which includes the latest changes but may be unstable until the [v2.1.0 release](https://github.com/kgateway-dev/kgateway/milestone/58) is published.
199204

200-
1. Requirements
205+
1. Requirements:
201206

202207
- [Helm](https://helm.sh/docs/intro/install/) installed.
203208
- Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
204209

205-
2. Set the Kgateway version and install the Kgateway CRDs.
210+
2. Set the Kgateway version and install the Kgateway CRDs:
206211

207212
```bash
208213
KGTW_VERSION=v2.1.0-main
209214
helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds
210215
```
211216

212-
3. Install Kgateway
217+
3. Install Kgateway:
213218

214219
```bash
215220
helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true --set agentGateway.enabled=true
216221
```
217222

218-
4. Deploy the Gateway
223+
4. Wait for the Kgateway deployment to be successfully rolled out:
224+
225+
```bash
226+
kubectl rollout status deployment kgateway -n kgateway-system
227+
```
228+
229+
5. Deploy the Gateway:
219230

220231
```bash
221232
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/agentgateway/gateway.yaml
222233
```
223234

224-
Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
235+
6. Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
236+
225237
```bash
226-
$ kubectl get gateway inference-gateway
227-
NAME CLASS ADDRESS PROGRAMMED AGE
228-
inference-gateway agentgateway <MY_ADDRESS> True 22s
238+
kubectl get gateway inference-gateway
229239
```
230240

231-
5. Deploy the HTTPRoute
241+
7. Deploy the HTTPRoute:
232242

233243
```bash
234244
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/agentgateway/httproute.yaml
235245
```
236246

237-
6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
247+
8. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
238248

239249
```bash
240250
kubectl get httproute llm-route -o yaml
241251
```
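
Note: the "Confirm ..." steps above leave the status check to the reader. A rough sketch of scripting that check follows; it is not part of this commit. The `conditions_true` helper is a hypothetical name, and the commented usage assumes the `inference-gateway` and `llm-route` resources from this guide.

```bash
# conditions_true TYPE...
# Reads "Type=Status" pairs (one per line) on stdin and succeeds only if each
# requested condition type appears at least once with status True.
# The body runs in a subshell so its variables do not leak into the caller.
conditions_true() (
  input=$(cat)
  for want in "$@"; do
    printf '%s\n' "$input" | grep -qx "${want}=True" || exit 1
  done
)

# Example (requires cluster access):
# kubectl get gateway inference-gateway \
#   -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}' \
#   | conditions_true Programmed && echo "gateway programmed"
#
# kubectl get httproute llm-route \
#   -o jsonpath='{range .status.parents[*].conditions[*]}{.type}={.status}{"\n"}{end}' \
#   | conditions_true Accepted ResolvedRefs && echo "route accepted"
```

If the HTTPRoute has several parents, this check passes when any one of them reports the condition as `True`, so inspect the full YAML when in doubt.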
242252

243253
### Deploy InferenceObjective (Optional)
244254

245-
Deploy the sample InferenceObjective which allows you to specify priority of requests.
255+
Deploy the sample InferenceObjective which allows you to specify priority of inference requests:
246256

247257
```bash
248258
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/inferenceobjective.yaml
@@ -257,7 +267,7 @@ Deploy the sample InferenceObjective which allows you to specify priority of req
257267
The following instructions assume you would like to cleanup ALL resources that were created in this quickstart guide.
258268
Please be careful not to delete resources you'd like to keep.
259269

260-
1. Uninstall the InferencePool, InferenceObjective and model server resources
270+
1. Uninstall the InferencePool, InferenceObjective and model server resources:
261271

262272
```bash
263273
helm uninstall vllm-llama3-8b-instruct
@@ -268,13 +278,13 @@ Deploy the sample InferenceObjective which allows you to specify priority of req
268278
kubectl delete secret hf-token --ignore-not-found
269279
```
270280

271-
1. Uninstall the Gateway API Inference Extension CRDs
281+
1. Uninstall the Gateway API Inference Extension CRDs:
272282

273283
```bash
274284
kubectl delete -k https://github.com/kubernetes-sigs/gateway-api-inference-extension/config/crd --ignore-not-found
275285
```
276286

277-
1. Choose one of the following options to cleanup the Inference Gateway.
287+
1. Choose one of the following options to cleanup the Inference Gateway:
278288

279289
=== "GKE"
280290

@@ -294,13 +304,13 @@ Deploy the sample InferenceObjective which allows you to specify priority of req
294304

295305
The following steps assume you would like to clean up ALL Istio resources that were created in this quickstart guide.
296306

297-
1. Uninstall All Istio resources
307+
1. Uninstall All Istio resources:
298308

299309
```bash
300310
istioctl uninstall -y --purge
301311
```
302312

303-
2. Remove the Istio namespace
313+
2. Remove the Istio namespace:
304314

305315
```bash
306316
kubectl delete ns istio-system
@@ -315,19 +325,19 @@ Deploy the sample InferenceObjective which allows you to specify priority of req
315325

316326
The following steps assume you would like to cleanup ALL Kgateway resources that were created in this quickstart guide.
317327

318-
1. Uninstall Kgateway
328+
1. Uninstall Kgateway:
319329

320330
```bash
321331
helm uninstall kgateway -n kgateway-system
322332
```
323333

324-
2. Uninstall the Kgateway CRDs.
334+
2. Uninstall the Kgateway CRDs:
325335

326336
```bash
327337
helm uninstall kgateway-crds -n kgateway-system
328338
```
329339

330-
3. Remove the Kgateway namespace.
340+
3. Remove the Kgateway namespace:
331341

332342
```bash
333343
kubectl delete ns kgateway-system
@@ -342,19 +352,19 @@ Deploy the sample InferenceObjective which allows you to specify priority of req
342352

343353
The following steps assume you would like to cleanup ALL Kgateway resources that were created in this quickstart guide.
344354

345-
1. Uninstall Kgateway
355+
1. Uninstall Kgateway:
346356

347357
```bash
348358
helm uninstall kgateway -n kgateway-system
349359
```
350360

351-
2. Uninstall the Kgateway CRDs.
361+
2. Uninstall the Kgateway CRDs:
352362

353363
```bash
354364
helm uninstall kgateway-crds -n kgateway-system
355365
```
356366

357-
3. Remove the Kgateway namespace.
367+
3. Remove the Kgateway namespace:
358368

359369
```bash
360370
kubectl delete ns kgateway-system
