@seans3 (Contributor) commented Aug 19, 2025

  • For the AI vLLM inference example, add more cloud-provider-specific information.
    • Add Platform-Specific Configuration section with relevant information to README.md
    • Add nodeSelector examples for three cloud providers to vllm-deployment.yaml

@k8s-ci-robot added the "cncf-cla: yes" label (indicates the PR's author has signed the CNCF CLA) on Aug 19, 2025
@k8s-ci-robot requested review from kow3ns and soltysh on August 19, 2025 22:08
@k8s-ci-robot added the "size/M" label (denotes a PR that changes 30-99 lines, ignoring generated files) on Aug 19, 2025
@seans3 (Contributor, Author) commented Aug 19, 2025

/assign @janetkuo

# - GKE
# nodeSelector:
#   cloud.google.com/gke-accelerator: nvidia-l4
#   cloud.google.com/gke-gpu-driver-version: latest
Member

Open question: Is it recommended to use default or latest?

Contributor Author

Good catch. I am changing to default. While the following documentation recommends latest, I believe default makes more sense in this case because it should be more stable.

Both latest and default can be correct, but they specify different driver installation behaviors. For most cases, latest is the recommended choice.

Understanding the Difference
When you create a GPU node pool in GKE, you can choose how the NVIDIA drivers are managed. The nodeSelector label in your workload must match the configuration on the node.

🚀 latest: This label targets nodes where GKE automatically installs and updates to the latest stable driver version available for your GKE version. This is the best option for most users as it ensures you have the most recent performance improvements and security patches without manual intervention.

🛡️ default: This label targets nodes that use the default driver version for your GKE version. This version is more static and will not change automatically, providing a more stable target if your application has a strict dependency on a specific driver version. You would use this to prevent unexpected driver updates from affecting your workload.
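For reference, a minimal sketch of how the GKE selector would sit in a Deployment once it is switched to `default`. The two label keys are the ones from the snippet under review; the Deployment name, the `app` label, the container name, and the image are placeholders assumed for illustration, and the node pool is assumed to have been created with the matching driver setting.

```yaml
# Sketch only: name, labels, container name, and image are hypothetical placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllm-example              # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vllm-example
  template:
    metadata:
      labels:
        app: vllm-example
    spec:
      # Schedule only onto GKE nodes that carry both of these labels.
      nodeSelector:
        cloud.google.com/gke-accelerator: nvidia-l4
        cloud.google.com/gke-gpu-driver-version: default
      containers:
      - name: vllm                # hypothetical container name
        image: vllm/vllm-openai:latest   # assumed image; substitute the one the example uses
        resources:
          limits:
            nvidia.com/gpu: 1     # request a single GPU for the container
```

Because `nodeSelector` is an exact label match, a pod that asks for `default` would stay Pending on a node pool labeled `latest`, and vice versa, so the value has to agree with how the node pool was created.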

Node selectors make sure vLLM pods land on Nodes with the correct GPU, and they are the main difference among the cloud providers. The following are node selector examples for three cloud providers.

- GKE
This `nodeSelector` uses labels that are specific to Google Kubernetes Engine.
Member

This isn't rendered well (the newline isn't rendered)

Contributor Author

I think this is fixed. Please let me know what you think.


---

## Cloud Provider Differences
@janetkuo (Member) commented Aug 19, 2025

There are platforms that aren't public clouds, such as on-prem vendors. Suggest making the title more inclusive, such as "Platform-Specific Configuration" or "GPU Node Selection on Different Platforms"

Suggested change
- ## Cloud Provider Differences
+ ## Platform-Specific Configuration

Contributor Author

I have updated this (as well as the link). Please let me know what you think.

cloud.google.com/gke-accelerator: nvidia-l4
cloud.google.com/gke-gpu-driver-version: latest
```
- EKS
Member

This is a valuable addition for making this example useful on other clouds. My one question is about long-term maintenance. Since our team's expertise is primarily with GKE, how can we ensure the configurations for other platforms stay up-to-date? Perhaps we could add a note welcoming community contributions to maintain them?

Contributor Author

This is a good question that I don't have an answer to at the moment. In a separate PR, I could add a CONTRIBUTING.md, which lays out our expectations for maintenance from each cloud provider (and/or bare metal). Let's discuss this in person.

@seans3 force-pushed the vllm-deployment-update branch 4 times, most recently from df17639 to e78f5a1, on August 20, 2025 18:35
@seans3 force-pushed the vllm-deployment-update branch from e78f5a1 to a5b001f on August 20, 2025 18:37
@janetkuo (Member) left a comment

/lgtm

@k8s-ci-robot added the "lgtm" label ("Looks good to me", indicates that a PR is ready to be merged) on Aug 22, 2025
@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: janetkuo, seans3

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the "approved" label (indicates a PR has been approved by an approver from all required OWNERS files) on Aug 22, 2025
@k8s-ci-robot merged commit 5f7960f into kubernetes:master on Aug 22, 2025
2 checks passed