
Conversation

@pwschuurman (Contributor) commented on Oct 9, 2025

Purpose

Add the gcs extra to the runai-model-streamer install in the Dockerfile so the production vllm-openai image can stream models from Google Cloud Storage; the Dockerfile change was missed in #24909.

Test Result

Validated by running the Dockerfile build locally.
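
For reference, a local check along these lines might look like the sketch below; the Dockerfile path, build target, and image tag are illustrative assumptions rather than details from this PR.

    # Build the OpenAI-compatible server image (path and target assumed from the vLLM repo layout)
    docker build -f docker/Dockerfile --target vllm-openai -t vllm-openai:gcs-test .
    # Confirm the runai-model-streamer packages (including the gcs extra) are present in the image
    docker run --rm --entrypoint bash vllm-openai:gcs-test -c "pip list | grep -i runai"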

@mergify mergify bot added the ci/build label Oct 9, 2025
@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request adds support for Google Cloud Storage (GCS) in the runai-model-streamer by including the gcs extra during installation in the Dockerfile. The change is straightforward and aligns with the goal of enabling GCS support in the production Docker image. I've identified one high-severity issue related to dependency pinning that should be addressed to improve the stability and reproducibility of the Docker build.

  BITSANDBYTES_VERSION="0.46.1"; \
  fi; \
- uv pip install --system accelerate hf_transfer modelscope "bitsandbytes>=${BITSANDBYTES_VERSION}" 'timm>=1.0.17' 'runai-model-streamer[s3]>=0.14.0'
+ uv pip install --system accelerate hf_transfer modelscope "bitsandbytes>=${BITSANDBYTES_VERSION}" 'timm>=1.0.17' 'runai-model-streamer[s3,gcs]>=0.14.0'
Inline review comment (severity: high)

For reproducible builds, especially in a production Docker image (vllm-openai-base), it's crucial to pin dependency versions. The packages accelerate, hf_transfer, and modelscope are not pinned to a specific version. This could lead to unexpected issues or breakages in the future if a new version of these packages is released with breaking changes.

I recommend pinning these packages to known working versions using ==. For example:

    uv pip install --system \
        accelerate==<known_good_version> \
        hf_transfer==<known_good_version> \
        modelscope==<known_good_version> \
        "bitsandbytes>=${BITSANDBYTES_VERSION}" \
        'timm>=1.0.17' \
        'runai-model-streamer[s3,gcs]>=0.14.0'

While other packages in this line have minimum versions (>=), using exact versions (==) is generally safer for production images to ensure build stability.
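
If exact pins are wanted, one possible approach (file names here are hypothetical) is to compile a loose requirements file into fully pinned versions and install that instead:

    # requirements-streamer.in would list the loose requirements: accelerate, hf_transfer, modelscope, ...
    # Resolve them into exact, reproducible pins
    uv pip compile requirements-streamer.in -o requirements-streamer.txt
    # Install the pinned set during the image build
    uv pip install --system -r requirements-streamer.txt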

@pwschuurman (Contributor, Author) commented:
@22quinn Would you be able to review and approve? I missed including the Dockerfile in #24909

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) October 9, 2025 04:29
@github-actions github-actions bot added the ready label (ONLY add when PR is ready to merge/full CI is needed) on Oct 9, 2025
@vllm-bot vllm-bot merged commit 0d7c3cb into vllm-project:main Oct 9, 2025
85 of 89 checks passed
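
For context, once the gcs extra is in the image, model weights can be streamed from a GCS bucket through the Run:ai streamer load format; the bucket path below is purely illustrative and assumes gs:// URIs are handled analogously to s3:// ones.

    # Hypothetical invocation: serve a model streamed directly from Google Cloud Storage
    vllm serve gs://example-bucket/path/to/model --load-format runai_streamer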
845473182 pushed a commit to dsxsteven/vllm_splitPR that referenced this pull request Oct 10, 2025
…to loader

* 'loader' of https://github.com/dsxsteven/vllm_splitPR: (778 commits)
  [torchao] Add support for ModuleFqnToConfig using regex (vllm-project#26001)
  Add: Support for multiple hidden layers in Eagle3 (vllm-project#26164)
  Enable `RMSNorm` substitution for Transformers backend (vllm-project#26353)
  [Model] Gemma3: Fix GGUF loading and quantization (vllm-project#26189)
  Bump Flashinfer to v0.4.0 (vllm-project#26326)
  Update Dockerfile and install runai-model-streamer[gcs] package (vllm-project#26464)
  [Core] Relax the LoRA  max rank (vllm-project#26461)
  [CI/Build] Fix model nightly tests (vllm-project#26466)
  [Hybrid]: Decouple Kernel Block Size from KV Page Size (vllm-project#24486)
  [Core][KVConnector] Propagate all tokens on resumed preemptions (vllm-project#24926)
  [MM][Doc] Add documentation for configurable mm profiling (vllm-project#26200)
  [Hardware][AMD] Enable FlexAttention backend on ROCm (vllm-project#26439)
  [Bugfix] Incorrect another MM data format in vllm bench throughput (vllm-project#26462)
  [Bugfix] Catch and log invalid token ids in detokenizer #2 (vllm-project#26445)
  [Minor] Change warning->warning_once in preprocess (vllm-project#26455)
  [Bugfix] Set the minimum python version for gpt-oss (vllm-project#26392)
  [Misc] Redact ray runtime env before logging (vllm-project#26302)
  Separate MLAAttention class from Attention (vllm-project#25103)
  [Attention] Register FLASHMLA_SPARSE (vllm-project#26441)
  [Kernels] Modular kernel refactor (vllm-project#24812)
  ...
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
Dhruvilbhatt pushed a commit to Dhruvilbhatt/vllm that referenced this pull request Oct 14, 2025
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025

Labels: ci/build, ready
