1+ <!-- If you are updating this getting-started-latest.md guide, please make sure to update the index.md as well -->
2+
13# Getting started with an Inference Gateway
24
35!!! warning "Unreleased/main branch"
4143kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extension/config/crd
4244```
4345
44- ### Deploy the InferencePool and Endpoint Picker Extension
46+ ### Install the Gateway
4547
46- Install an InferencePool named `vllm-llama3-8b-instruct` that selects from endpoints with label `app: vllm-llama3-8b-instruct` and listening on port 8000. The Helm install command automatically installs the endpoint-picker, InferencePool along with provider specific resources.
48+ Choose one of the following options to install Gateway.
4749
48- Set the chart version and then select a tab to follow the provider-specific instructions.
50+ === "GKE"
4951
50- ```bash
51- export IGW_CHART_VERSION=v0
52- ```
52+ Nothing to install here, you can move to the next [section](#deploy-the-inferencepool-and-endpoint-picker-extension)
5353
54- --8<-- "site-src/_includes/epp-latest.md"
54+ === "Istio"
5555
56- ### Deploy an Inference Gateway
56+ 1. Requirements
57+ - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
5758
58- Choose one of the following options to deploy an Inference Gateway.
59+ 2. Install Istio
5960
60- === "GKE"
61+ ```
62+ TAG=$(curl https://storage.googleapis.com/istio-build/dev/1.28-dev)
63+ # on Linux
64+ wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-linux-amd64.tar.gz
65+ tar -xvf istioctl-$TAG-linux-amd64.tar.gz
66+ # on macOS
67+ wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-osx.tar.gz
68+ tar -xvf istioctl-$TAG-osx.tar.gz
69+ # on Windows
70+ wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-win.zip
71+ unzip istioctl-$TAG-win.zip
6172
62- 1. Enable the Google Kubernetes Engine API, Compute Engine API, the Network Services API and configure proxy-only subnets when necessary.
63- See [Deploy Inference Gateways](https://cloud.google.com/kubernetes-engine/docs/how-to/deploy-gke-inference-gateway)
64- for detailed instructions.
73+ ./istioctl install --set tag=$TAG --set hub=gcr.io/istio-testing --set values.pilot.env.ENABLE_GATEWAY_API_INFERENCE_EXTENSION=true
74+ ```
75+
76+ === "Kgateway"
6577
66- 2. Deploy Inference Gateway:
78+ 1. Requirements
79+
80+ - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
81+ - [Helm](https://helm.sh/docs/intro/install/) installed.
82+
83+ 2. Set the Kgateway version and install the Kgateway CRDs.
6784
6885 ```bash
69- kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gke/gateway.yaml
86+ KGTW_VERSION=v2.1.0
87+ helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds
7088 ```
7189
72- Confirm that the Gateway was assigned an IP address and reports a `Programmed=True` status:
90+ 3. Install Kgateway
7391
7492 ```bash
75- $ kubectl get gateway inference-gateway
76- NAME CLASS ADDRESS PROGRAMMED AGE
77- inference-gateway inference-gateway <MY_ADDRESS> True 22s
93+ helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true
7894 ```
79- 3. Deploy the HTTPRoute
95+
96+ === "Agentgateway"
97+
98+ 1. Requirements
99+
100+ - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
101+ - [Helm](https://helm.sh/docs/intro/install/) installed.
102+
103+ 2. Set the Kgateway version and install the Kgateway CRDs.
80104
81105 ```bash
82- kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/gke/httproute.yaml
106+ KGTW_VERSION=v2.1.0
107+ helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds
83108 ```
84109
85- 4. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
110+ 3. Install Kgateway
86111
87112 ```bash
88- kubectl get httproute llm-route -o yaml
113+ helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true --set agentgateway.enabled=true
89114 ```
90115
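The Kgateway and Agentgateway tabs above list Helm and the Gateway API CRDs as requirements. A minimal pre-flight sketch is shown below; `require_tools` is a hypothetical helper name that is not part of the guide, and the CRD names in the usage comment are the standard Gateway API ones, which you should confirm against your cluster.

```bash
# Hypothetical helper (not part of the guide): report every CLI in the
# argument list that cannot be found on PATH. The body runs in a subshell,
# so its variables do not leak into the calling shell.
require_tools() (
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing required tool: $tool" >&2
      missing=1
    fi
  done
  exit "$missing"
)

# Usage against a live cluster (left commented out so the sketch has no side effects):
#   require_tools kubectl helm &&
#     kubectl get crd gateways.gateway.networking.k8s.io httproutes.gateway.networking.k8s.io
```

The function exits non-zero if any tool is missing, so it can gate the install commands with `&&`.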
91- === "Istio"
116+ ### Deploy the InferencePool and Endpoint Picker Extension
92117
93- Please note that this feature is currently in an experimental phase and is not intended for production use.
94- The implementation and user experience are subject to changes as we continue to iterate on this project.
118+ Install an InferencePool named `vllm-llama3-8b-instruct` that selects from endpoints with label `app: vllm-llama3-8b-instruct` and listening on port 8000. The Helm install command automatically installs the endpoint-picker, InferencePool along with provider specific resources.
119+
120+ Set the chart version and then select a tab to follow the provider-specific instructions.
95121
96- 1. Requirements
122+ ```bash
123+ export IGW_CHART_VERSION=v0
124+ ```
97125
98- - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
126+ --8<-- "site-src/_includes/epp-latest.md"
99127
100- 2. Install Istio
128+ ### Deploy an Inference Gateway
101129
102- ```
103- TAG=$(curl https://storage.googleapis.com/istio-build/dev/1.28-dev)
104- # on Linux
105- wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-linux-amd64.tar.gz
106- tar -xvf istioctl-$TAG-linux-amd64.tar.gz
107- # on macOS
108- wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-osx.tar.gz
109- tar -xvf istioctl-$TAG-osx.tar.gz
110- # on Windows
111- wget https://storage.googleapis.com/istio-build/dev/$TAG/istioctl-$TAG-win.zip
112- unzip istioctl-$TAG-win.zip
130+ Choose one of the following options to deploy an Inference Gateway.
113131
114- ./istioctl install --set tag=$TAG --set hub=gcr.io/istio-testing --set values.pilot.env.ENABLE_GATEWAY_API_INFERENCE_EXTENSION=true
115- ```
132+ === "GKE"
116133
117- 3. If your EPP uses secure serving with self-signed certs (default), temporarily bypass TLS verification:
134+ 1. Enable the Google Kubernetes Engine API, Compute Engine API, the Network Services API and configure proxy-only subnets when necessary.
135+ See [Deploy Inference Gateways](https://cloud.google.com/kubernetes-engine/docs/how-to/deploy-gke-inference-gateway)
136+ for detailed instructions.
137+
138+ === "Istio"
139+
140+ Please note that this feature is currently in an experimental phase and is not intended for production use.
141+ The implementation and user experience are subject to changes as we continue to iterate on this project.
142+
143+ 1. If your EPP uses secure serving with self-signed certs (default), temporarily bypass TLS verification:
118144
119145 ```bash
120146 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/destination-rule.yaml
121147 ```
122148
123- 4. Deploy Gateway
149+ 2. Deploy the Inference Gateway
124150
125151 ```bash
126152 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/gateway.yaml
@@ -133,13 +159,13 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
133159 inference-gateway inference-gateway <MY_ADDRESS> True 22s
134160 ```
135161
136- 5. Deploy the HTTPRoute
162+ 3. Deploy the HTTPRoute
137163
138164 ```bash
139165 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/istio/httproute.yaml
140166 ```
141167
142- 6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
168+ 4. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
143169
144170 ```bash
145171 kubectl get httproute llm-route -o yaml
@@ -151,25 +177,7 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
151177 [conformant](https://github.com/kubernetes-sigs/gateway-api-inference-extension/tree/main/conformance/reports/v1.0.0/gateway/kgateway)
152178 gateway. Follow these steps to run Kgateway:
153179
154- 1. Requirements
155-
156- - [Helm](https://helm.sh/docs/intro/install/) installed.
157- - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
158-
159- 2. Set the Kgateway version and install the Kgateway CRDs.
160-
161- ```bash
162- KGTW_VERSION=v2.2.0-main
163- helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds
164- ```
165-
166- 3. Install Kgateway
167-
168- ```bash
169- helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true
170- ```
171-
172- 4. Deploy the Gateway
180+ 1. Deploy the Inference Gateway
173181
174182 ```bash
175183 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/kgateway/gateway.yaml
@@ -182,13 +190,13 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
182190 inference-gateway kgateway <MY_ADDRESS> True 22s
183191 ```
184192
185- 5. Deploy the HTTPRoute
193+ 2. Deploy the HTTPRoute
186194
187195 ```bash
188196 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/kgateway/httproute.yaml
189197 ```
190198
191- 6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
199+ 3. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
192200
193201 ```bash
194202 kubectl get httproute llm-route -o yaml
@@ -200,25 +208,7 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
200208 Agentgateway integrates with [Kgateway](https://kgateway.dev/) as its control plane. Follow these steps to run Kgateway with the agentgateway
201209 data plane:
202210
203- 1. Requirements
204-
205- - [Helm](https://helm.sh/docs/intro/install/) installed.
206- - Gateway API [CRDs](https://gateway-api.sigs.k8s.io/guides/#installing-gateway-api) installed.
207-
208- 2. Set the Kgateway version and install the Kgateway CRDs.
209-
210- ```bash
211- KGTW_VERSION=v2.2.0-main
212- helm upgrade -i --create-namespace --namespace kgateway-system --version $KGTW_VERSION kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds
213- ```
214-
215- 3. Install Kgateway
216-
217- ```bash
218- helm upgrade -i --namespace kgateway-system --version $KGTW_VERSION kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --set inferenceExtension.enabled=true --set agentgateway.enabled=true
219- ```
220-
221- 4. Deploy the Gateway
211+ 1. Deploy the Inference Gateway
222212
223213 ```bash
224214 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/agentgateway/gateway.yaml
@@ -231,13 +221,13 @@ kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extens
231221 inference-gateway agentgateway <MY_ADDRESS> True 22s
232222 ```
233223
234- 5. Deploy the HTTPRoute
224+ 2. Deploy the HTTPRoute
235225
236226 ```bash
237227 kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/gateway/agentgateway/httproute.yaml
238228 ```
239229
240- 6. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
230+ 3. Confirm that the HTTPRoute status conditions include `Accepted=True` and `ResolvedRefs=True`:
241231
242232 ```bash
243233 kubectl get httproute llm-route -o yaml