First iteration of development deployments & environments #4

Merged

@shaneutt
Collaborator

shaneutt commented Apr 16, 2025

This is a first pass at adding deployment manifests and scripts for development environments, in service of #3.

This adds deployments for various components:

  • Istio Sail Operator
  • Istio Control Plane (for Gateways)
  • vLLM Simulator
  • Inference Gateway (Gateway + Endpoint Picker (EPP) via ext_proc) exposed via HTTPRoute

This also adds an initial deployment for a kind-based development environment, and a script which will set it all up automatically.

I've tested this on Linux with Podman: I was able to make changes to the EPP, test them, and confirm that things work.

This is very much the first iteration, and there's a lot more to do. Next we'll need to adapt these pieces for dev envs on OpenShift. For now, though, it's functional at a basic level and provides several building blocks for later, so it should be a reasonable starting point.

Fixes #11

@shaneutt shaneutt force-pushed the shaneutt/initial-dev-deployments branch from 86e4726 to c679724 April 16, 2025 23:05
@clubanderson
Collaborator

/retest

@clubanderson
Collaborator

/lgtm

@nirrozenbaum
Collaborator

@shaneutt I'm not sure I got the point.
This is a PR to the NM fork of GIE, and inside this PR there is a deployment of the deprecated inference-router.
I thought we wanted to have this exact automation and deployment for our fork of GIE, not for the deprecated router.
(The kvcache implementation is based on our GIE fork, which includes the notion of a scorer.)
Maybe I missed something, since I had a day off yesterday.

@elevran
Collaborator

elevran commented Apr 17, 2025

@nirrozenbaum your comment is correct - we need to validate this on our GIE fork and not the inference-router images. I think you're seeing the references as @shaneutt was porting the code over from that repo.

I'm going through this to:

  • test with Docker (instead of Podman)
  • make edits to ensure it works with GIE images

@clubanderson
Collaborator

FYI: once approved and merged, it will be deployed in the cluster for all to use.

@clubanderson
Collaborator

There is a target in your Makefile to deploy to kube or Docker using Podman or Docker.

@clubanderson
Collaborator

/approve

@clubanderson clubanderson merged commit dad8db2 into neuralmagic:dev Apr 17, 2025
1 check passed
@shaneutt
Collaborator Author

Yes, you're right @nirrozenbaum, but that was intentional: this was just a first step that I did late last night and needed to share so that @elevran could grab it and start the next iteration in his morning. So, as he mentions, the next step is swapping in the GIE images and making any adaptations needed for that. After that, we're going to take all these building blocks and tweak them for vanilla Kubernetes and OpenShift, and we intend to package them up in a way that people can use on a shared OpenShift cluster (see #3 for the full goal).

@shaneutt shaneutt deleted the shaneutt/initial-dev-deployments branch April 17, 2025 13:38
@shaneutt shaneutt linked an issue Apr 18, 2025 that may be closed by this pull request
Development

Successfully merging this pull request may close these issues.

  • Add GIE builds to development environment deployments
  • Kind Dev Environment - Initial Components & Env
4 participants