For AI vLLM example, add more cloud provider specific information #568
Conversation
/assign @janetkuo
```yaml
# - GKE
# nodeSelector:
#   cloud.google.com/gke-accelerator: nvidia-l4
#   cloud.google.com/gke-gpu-driver-version: latest
```
Open question: Is it recommended to use default or latest?
Good catch. I am changing it to `default`. While the documentation below recommends `latest`, I believe `default` makes more sense in this case because it should be more stable.
Both `latest` and `default` can be correct, but they specify different driver installation behaviors. For most cases, `latest` is the recommended choice.

**Understanding the difference:** when you create a GPU node pool in GKE, you choose how the NVIDIA drivers are managed. The `nodeSelector` labels in your workload must match the configuration on the node.

🚀 `latest`: targets nodes where GKE automatically installs and updates to the latest stable driver version available for your GKE version. This is the best option for most users because it picks up recent performance improvements and security patches without manual intervention.

🛡️ `default`: targets nodes that use the default driver version for your GKE version. This version is more static and will not change automatically, providing a more stable target if your application has a strict dependency on a specific driver version. Use this to prevent unexpected driver updates from affecting your workload.
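As a concrete sketch, pinning the workload to the version-stable driver would look like the following. The Deployment name and image are hypothetical placeholders; only the two `cloud.google.com/...` labels come from this PR's diff:

```yaml
# Sketch: schedule vLLM pods onto GKE L4 nodes that run the
# "default" (version-stable) NVIDIA driver instead of "latest".
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllm                # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vllm
  template:
    metadata:
      labels:
        app: vllm
    spec:
      nodeSelector:
        cloud.google.com/gke-accelerator: nvidia-l4
        cloud.google.com/gke-gpu-driver-version: default
      containers:
      - name: vllm
        image: vllm/vllm-openai:latest   # hypothetical image reference
        resources:
          limits:
            nvidia.com/gpu: "1"          # request one GPU
```

With `default`, the pods keep landing on nodes whose driver version only changes when the GKE version does, which is the stability trade-off described above.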
Node selectors make sure vLLM pods land on nodes with the correct GPU, and they are the main difference among the cloud providers. The following are `nodeSelector` examples for three cloud providers.

- GKE

  This `nodeSelector` uses labels that are specific to Google Kubernetes Engine.
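The per-provider differences can be sketched as alternative `nodeSelector` blocks. The GKE labels match this PR's diff; the EKS and AKS label keys and values are assumptions you would adapt to your own node groups or node pools:

```yaml
# GKE: GKE-managed accelerator labels (from this PR's diff).
nodeSelector:
  cloud.google.com/gke-accelerator: nvidia-l4
  cloud.google.com/gke-gpu-driver-version: latest

# EKS (assumption): select GPU nodes by the standard
# instance-type label; replace g6.xlarge with your GPU
# instance type.
# nodeSelector:
#   node.kubernetes.io/instance-type: g6.xlarge

# AKS (assumption): select nodes by agent pool; replace
# "gpu" with the name of your GPU node pool.
# nodeSelector:
#   kubernetes.azure.com/agentpool: gpu
```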
This isn't rendered well (the newline isn't rendered)
I think this is fixed. Please let me know what you think.
ai/vllm-deployment/README.md (Outdated)
```markdown
---

## Cloud Provider Differences
```
There are platforms that aren't public clouds, such as on-prem vendors. Suggest making the title more inclusive, such as "Platform-Specific Configuration" or "GPU Node Selection on Different Platforms"
```diff
- ## Cloud Provider Differences
+ ## Platform-Specific Configuration
```
I have updated this (as well as the link). Please let me know what you think.
ai/vllm-deployment/README.md (Outdated)
```yaml
  cloud.google.com/gke-accelerator: nvidia-l4
  cloud.google.com/gke-gpu-driver-version: latest
```
- EKS
This is a valuable addition for making this example useful on other clouds. My one question is about long-term maintenance. Since our team's expertise is primarily with GKE, how can we ensure the configurations for other platforms stay up-to-date? Perhaps we could add a note welcoming community contributions to maintain them?
This is a good question that I don't have an answer to at the moment. In a separate PR, I could add a CONTRIBUTING.md, which lays out our expectations for maintenance from each cloud provider (and/or bare metal). Let's discuss this in person.
Force-pushed from df17639 to e78f5a1, then from e78f5a1 to a5b001f.
janetkuo left a comment:
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: janetkuo, seans3. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
- Adds a `Platform-Specific Configuration` section with relevant information to README.md
- Adds `nodeSelector` examples for three cloud providers to vllm-deployment.yaml