Update Dockerfile and install runai-model-streamer[gcs] package #26464
Conversation
Signed-off-by: Peter Schuurman <[email protected]>
Code Review
This pull request adds support for Google Cloud Storage (GCS) in the runai-model-streamer by including the gcs extra during installation in the Dockerfile. The change is straightforward and aligns with the goal of enabling GCS support in the production Docker image. I've identified one high-severity issue related to dependency pinning that should be addressed to improve the stability and reproducibility of the Docker build.
        BITSANDBYTES_VERSION="0.46.1"; \
    fi; \
-   uv pip install --system accelerate hf_transfer modelscope "bitsandbytes>=${BITSANDBYTES_VERSION}" 'timm>=1.0.17' 'runai-model-streamer[s3]>=0.14.0'
+   uv pip install --system accelerate hf_transfer modelscope "bitsandbytes>=${BITSANDBYTES_VERSION}" 'timm>=1.0.17' 'runai-model-streamer[s3,gcs]>=0.14.0'
For reproducible builds, especially in a production Docker image (vllm-openai-base), it's crucial to pin dependency versions. The packages accelerate, hf_transfer, and modelscope are not pinned to a specific version. This could lead to unexpected issues or breakages in the future if a new version of these packages is released with breaking changes.
I recommend pinning these packages to known working versions using ==. For example:
uv pip install --system \
accelerate==<known_good_version> \
hf_transfer==<known_good_version> \
modelscope==<known_good_version> \
"bitsandbytes>=${BITSANDBYTES_VERSION}" \
'timm>=1.0.17' \
'runai-model-streamer[s3,gcs]>=0.14.0'
While other packages in this line have minimum versions (>=), using exact versions (==) is generally safer for production images to ensure build stability.
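If you prefer not to hard-code pins on the install line itself, a constraints file is another option. The sketch below is illustrative only: the file name is arbitrary and the version placeholders would need to be replaced with vetted pins.

# constraints.txt (hypothetical file; fill in vetted versions)
accelerate==<known_good_version>
hf_transfer==<known_good_version>
modelscope==<known_good_version>

# Pass the constraints file to uv so the unpinned names resolve to the pinned versions
uv pip install --system --constraint constraints.txt \
    accelerate hf_transfer modelscope \
    "bitsandbytes>=${BITSANDBYTES_VERSION}" \
    'timm>=1.0.17' \
    'runai-model-streamer[s3,gcs]>=0.14.0'

This keeps the Dockerfile install command stable across version bumps; only the constraints file needs to change.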
Purpose
Install the runai-model-streamer package with the gcs extra in the Dockerfile so the production image can stream models from Google Cloud Storage.
Test Result
Validated by building the updated Dockerfile locally.
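As a rough sketch of how the GCS path could be exercised against a locally built image (the image tag, Dockerfile path, bucket URI, and credential mount below are assumptions for illustration, not part of this PR's test setup):

# Build the image from the updated Dockerfile
docker build -t vllm-openai-gcs -f docker/Dockerfile .

# Serve a model stored in a GCS bucket via the Run:ai Model Streamer load format.
# gs://my-bucket/my-model is a placeholder; GOOGLE_APPLICATION_CREDENTIALS points at a
# mounted service-account key so the gcs extra can authenticate.
docker run --gpus all \
    -v "$HOME/gcp-key.json:/secrets/gcp-key.json:ro" \
    -e GOOGLE_APPLICATION_CREDENTIALS=/secrets/gcp-key.json \
    vllm-openai-gcs \
    --model gs://my-bucket/my-model \
    --load-format runai_streamer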