[Model] Add smolvlm support #16017

chaunceyjiang · 2025-04-03T14:30:44Z

Add smolvlm support

github-actions · 2025-04-03T14:30:55Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

chaunceyjiang · 2025-04-03T15:46:20Z

test

vllm serve HuggingFaceTB/SmolVLM2-2.2B-Instruct --limit-mm-per-prompt image=4

multi-image

# python examples/online_serving/openai_chat_completion_client_for_multimodal.py --chat-type multi-image 
INFO 04-03 13:37:43 [__init__.py:239] Automatically detected platform cuda.
Chat completion output:  In the center of this image, the majestic lion commands attention. Its fur, a rich, full-grown orange, is bathed in the warm glow of the sun, reflecting off the tall grass it stands in. The lion's black and brown mane, full and well-groomed, cascades over

text-only

#  python examples/online_serving/openai_chat_completion_client_for_multimodal.py --chat-type text-only
INFO 04-03 15:41:36 [__init__.py:239] Automatically detected platform cuda.
Chat completion output:  The capital of France is Paris. It is the country's largest city, cultural center, and a global hub for fashion, gastronomy, and art. Paris is located in northern France on the Seine River and is renowned globally for its beautiful architecture, historical landmarks, and iconic landmarks such as the Eiffel Tower, Notre

single-image

# python examples/online_serving/openai_chat_completion_client_for_multimodal.py   
INFO 04-03 13:30:35 [__init__.py:239] Automatically detected platform cuda.
Chat completion output from image url:  This image shows a vibrant scene of a wooden boardwalk path that extends through a lush, green grassy field. The boardwalk, made of wooden planks, is prominently displayed in the foreground, inviting viewers to imagine themselves walking along its length. The grassy field underfoot is a vibrant green, dotted with occasional wildflowers that add
Chat completion output from base64 encoded image:  This image captures the serene beauty of a wooden boardwalk that cuts through a verdant, sloping field adorned with vibrant green grass. The boardwalk, constructed from wooden planks, gently curves from the bottom left to the center of the image, inviting viewers to step into this pastoral scene. The grass, lush and ver

docs/source/models/supported_models.md

vllm/model_executor/models/smolvlm.py

Signed-off-by: chaunceyjiang <[email protected]>

tests/models/decoder_only/vision_language/test_models.py

Signed-off-by: chaunceyjiang <[email protected]>

tests/models/decoder_only/vision_language/vlm_utils/model_utils.py

Signed-off-by: chaunceyjiang <[email protected]>

DarkLight1337

Otherwise looks good, assuming that you have run the example scripts already

chaunceyjiang · 2025-04-08T09:43:52Z

test

Signed-off-by: chaunceyjiang <[email protected]>

chaunceyjiang · 2025-04-08T10:09:36Z

test

python examples/offline_inference/vision_language.py --model-type smolvlm
...
INFO 04-08 10:08:51 [kv_cache_utils.py:577] GPU KV cache size: 353,952 tokens
INFO 04-08 10:08:51 [kv_cache_utils.py:580] Maximum concurrency for 8,192 tokens per request: 43.21x
DEBUG 04-08 10:08:52 [core_client.py:421] Waiting for 1 core engine proc(s) to start: {0}
INFO 04-08 10:08:55 [core.py:162] init engine (profile, create kv cache, warmup model) took 5.39 seconds
DEBUG 04-08 10:08:55 [core.py:410] EngineCore waiting for work.
INFO 04-08 10:08:55 [core_client.py:435] Core engine process 0 ready.
DEBUG 04-08 10:08:55 [decorators.py:109] Inferred dynamic dimensions for forward method of <class 'vllm.model_executor.models.llama.LlamaModel'>: ['input_ids', 'positions', 'intermediate_tensors', 'inputs_embeds']
DEBUG 04-08 10:08:58 [core.py:416] EngineCore loop active - local unfinished: True, finished: False.
Processed prompts:  75%|██████████████████████████████████████████              | 3/4 [00:01<00:00,  2.47it/s, est. speed input: 2553.39 toks/s, output: 148.69 toks/s]DEBUG 04-08 10:08:59 [core.py:410] EngineCore waiting for work.
Processed prompts: 100%|████████████████████████████████████████████████████████| 4/4 [00:01<00:00,  3.03it/s, est. speed input: 3327.80 toks/s, output: 193.79 toks/s]
 The image captures the iconic Tokyo Tower, a renowned landmark in Japan, standing tall against the backdrop of a clear blue sky. The tower, painted in a pristine white, is adorned with a lattice structure that adds an element of architectural interest. The perspective of the image is particularly striking, as it is taken from a low
 The image captures a breathtaking view of the Tokyo Skytree, the tallest structure in Japan, standing tall against the backdrop of a clear blue sky. The Skytree, a modern marvel of engineering, is adorned with a white lattice structure that adds a unique aesthetic appeal to its towering presence. The perspective of the image is particularly
 The image captures a breathtaking view of the Tokyo Tower, a renowned landmark in Japan. The tower, painted in a pristine white, stands tall against the backdrop of a clear blue sky. Its unique lattice structure adds an architectural marvel to the scene.

In the foreground, cherry blossoms are in full bloom, their delicate
 The image captures a breathtaking view of the Tokyo Skytree, the tallest structure in Japan, standing majestically against a backdrop of a clear blue sky. The Skytree, a modern marvel of architecture, is adorned with a white dome at its center, surrounded by a lattice of steel beams that add a touch of industrial
DEBUG 04-08 10:08:59 [core.py:382] EngineCore interrupted.

tests/models/multimodal/processing/test_smolvlm.py

Signed-off-by: chaunceyjiang <[email protected]>

vllm/model_executor/models/smolvlm.py

Signed-off-by: chaunceyjiang <[email protected]>

chaunceyjiang · 2025-04-08T11:20:11Z

test

python examples/offline_inference/vision_language.py --model-type smolvlm
...
DEBUG 04-08 11:18:46 [core.py:410] EngineCore waiting for work.
Processed prompts: 100%|████████████████████████████████████████████████████████| 4/4 [00:01<00:00,  3.14it/s, est. speed input: 3449.77 toks/s, output: 200.89 toks/s]
 The image captures a breathtaking view of the Tokyo Skytree, the tallest structure in Japan, standing tall against the backdrop of a clear blue sky. The Skytree, a modern marvel of engineering, is a towering structure with a unique spiral design. It's a testament to human ingenuity and the beauty of architecture.


 The image captures a breathtaking view of the Tokyo Skytree, the tallest structure in Japan, standing majestically against the backdrop of a clear blue sky. The Skytree, a modern marvel of architecture, is adorned with a lattice structure that adds a unique charm to its appearance. The perspective of the image is from below
 The image captures a breathtaking view of the Tokyo Skytree, the tallest structure in Japan, standing tall against the backdrop of a clear blue sky. The Skytree, a white tower with a unique spiral design, is prominently featured in the center of the image. It's surrounded by a sea of pink cherry blossoms, their
 The image captures a breathtaking view of the Tokyo Tower, a renowned landmark in Japan. The tower, painted in white, stands tall against the backdrop of a clear blue sky. It's surrounded by a sea of pink cherry blossoms, their delicate petals adding a touch of softness to the scene. The perspective of the image

chaunceyjiang · 2025-04-08T14:32:57Z

ImportError: Package `num2words` is required to run SmolVLM processor. Install it with `pip install num2words`.
--
  | [2025-04-08T12:21:46Z] FAILED models/multimodal/processing/test_smolvlm.py::test_processor_override[True-1-mm_processor_kwargs1-845-HuggingFaceTB/SmolVLM2-2.2B-Instruct] - ImportError: Package `num2words` is required to run SmolVLM processor. Install it with `pip install num2words`.

Hi, @DarkLight1337 Should the num2words dependency be added to common.txt?

Signed-off-by: chaunceyjiang <[email protected]>

DarkLight1337 · 2025-04-08T15:00:33Z

ImportError: Package `num2words` is required to run SmolVLM processor. Install it with `pip install num2words`.
--
  | [2025-04-08T12:21:46Z] FAILED models/multimodal/processing/test_smolvlm.py::test_processor_override[True-1-mm_processor_kwargs1-845-HuggingFaceTB/SmolVLM2-2.2B-Instruct] - ImportError: Package `num2words` is required to run SmolVLM processor. Install it with `pip install num2words`.

Hi, @DarkLight1337 Should the num2words dependency be added to common.txt?

Let's just add it to the test requirements, not common

Signed-off-by: chaunceyjiang <[email protected]>

chaunceyjiang · 2025-04-09T01:35:58Z

@DarkLight1337 The e2e test failures seem unrelated to my code.
The tests I added have already passed.

Signed-off-by: chaunceyjiang <[email protected]> Signed-off-by: Yang Wang <[email protected]>

Signed-off-by: chaunceyjiang <[email protected]>

Signed-off-by: chaunceyjiang <[email protected]> Signed-off-by: Mu Huai <[email protected]>

mergify bot added documentation Improvements or additions to documentation frontend v1 labels Apr 3, 2025

chaunceyjiang force-pushed the smolvlm branch 3 times, most recently from bc8df55 to 88f116b Compare April 3, 2025 15:38

chaunceyjiang force-pushed the smolvlm branch from 88f116b to 0c77d72 Compare April 3, 2025 15:52

chaunceyjiang marked this pull request as ready for review April 4, 2025 04:33

chaunceyjiang requested review from DarkLight1337 and ywang96 as code owners April 4, 2025 04:33

DarkLight1337 reviewed Apr 4, 2025

View reviewed changes

docs/source/models/supported_models.md Outdated Show resolved Hide resolved

vllm/model_executor/models/smolvlm.py Outdated Show resolved Hide resolved

chaunceyjiang commented Apr 5, 2025

View reviewed changes

vllm/model_executor/models/smolvlm.py Outdated Show resolved Hide resolved

chaunceyjiang added 4 commits April 8, 2025 07:54

[Model] Add smolvlm suppor

ef1c17b

Signed-off-by: chaunceyjiang <[email protected]>

[Model] Add smolvlm suppor

0f2d5ca

Signed-off-by: chaunceyjiang <[email protected]>

[Model] Add smolvlm suppor

da15194

Signed-off-by: chaunceyjiang <[email protected]>

[Model] Add smolvlm suppor

ab1de97

Signed-off-by: chaunceyjiang <[email protected]>

chaunceyjiang force-pushed the smolvlm branch from ab5e958 to ab1de97 Compare April 8, 2025 07:55

[Model] Add smolvlm suppor

3c942bb

Signed-off-by: chaunceyjiang <[email protected]>

DarkLight1337 reviewed Apr 8, 2025

View reviewed changes

tests/models/decoder_only/vision_language/test_models.py Outdated Show resolved Hide resolved

chaunceyjiang added 3 commits April 8, 2025 09:04

[Model] Add smolvlm suppor

dcfa65d

Signed-off-by: chaunceyjiang <[email protected]>

[Model] Add smolvlm suppor

890aa1b

Signed-off-by: chaunceyjiang <[email protected]>

[Model] Add smolvlm suppor

e235868

Signed-off-by: chaunceyjiang <[email protected]>

mergify bot added the multi-modality Related to multi-modality (#4194) label Apr 8, 2025

[Model] Add smolvlm suppor

e2e5976

Signed-off-by: chaunceyjiang <[email protected]>

DarkLight1337 reviewed Apr 8, 2025

View reviewed changes

tests/models/decoder_only/vision_language/vlm_utils/model_utils.py Show resolved Hide resolved

[Model] Add smolvlm suppor

65baca8

Signed-off-by: chaunceyjiang <[email protected]>

DarkLight1337 approved these changes Apr 8, 2025

View reviewed changes

auto-merge was automatically disabled April 8, 2025 09:48
Head branch was pushed to by a user without write access

chaunceyjiang added 2 commits April 8, 2025 09:48

[Model] Add smolvlm suppor

b992e60

Signed-off-by: chaunceyjiang <[email protected]>

[Model] Add smolvlm suppor

a4bae41

Signed-off-by: chaunceyjiang <[email protected]>

DarkLight1337 reviewed Apr 8, 2025

View reviewed changes

tests/models/multimodal/processing/test_smolvlm.py Outdated Show resolved Hide resolved

chaunceyjiang added 2 commits April 8, 2025 10:50

[Model] Add smolvlm suppor

17da1e4

Signed-off-by: chaunceyjiang <[email protected]>

[Model] Add smolvlm suppor

06cbcc4

Signed-off-by: chaunceyjiang <[email protected]>

DarkLight1337 reviewed Apr 8, 2025

View reviewed changes

vllm/model_executor/models/smolvlm.py Outdated Show resolved Hide resolved

[Model] Add smolvlm suppor

6b2cefd

Signed-off-by: chaunceyjiang <[email protected]>

DarkLight1337 approved these changes Apr 8, 2025

View reviewed changes

DarkLight1337 enabled auto-merge (squash) April 8, 2025 11:22

[Model] Add smolvlm suppor

55d7c63

Signed-off-by: chaunceyjiang <[email protected]>

auto-merge was automatically disabled April 8, 2025 14:34
Head branch was pushed to by a user without write access

chaunceyjiang requested a review from DarkLight1337 April 8, 2025 14:36

[Model] Add smolvlm suppor

be8665f

Signed-off-by: chaunceyjiang <[email protected]>

mergify bot added the ci/build label Apr 8, 2025

[Model] Add smolvlm suppor

23e91d8

Signed-off-by: chaunceyjiang <[email protected]>

vllm-bot merged commit 102bf96 into vllm-project:main Apr 9, 2025
64 of 67 checks passed

chaunceyjiang deleted the smolvlm branch April 9, 2025 02:14

yangw-dev pushed a commit to yangw-dev/vllm that referenced this pull request Apr 21, 2025

[Model] Add smolvlm support (vllm-project#16017)

007b8da

Signed-off-by: chaunceyjiang <[email protected]> Signed-off-by: Yang Wang <[email protected]>

jikunshang pushed a commit to jikunshang/vllm that referenced this pull request Apr 29, 2025

[Model] Add smolvlm support (vllm-project#16017)

79015cd

Signed-off-by: chaunceyjiang <[email protected]>

lk-chen pushed a commit to lk-chen/vllm that referenced this pull request Apr 29, 2025

[Model] Add smolvlm support (vllm-project#16017)

6f93dde

Signed-off-by: chaunceyjiang <[email protected]>

RichardoMrMu pushed a commit to RichardoMrMu/vllm that referenced this pull request May 12, 2025

[Model] Add smolvlm support (vllm-project#16017)

4978d6a

Signed-off-by: chaunceyjiang <[email protected]> Signed-off-by: Mu Huai <[email protected]>

Uh oh!

[Model] Add smolvlm support #16017

[Model] Add smolvlm support #16017

Uh oh!

Conversation

chaunceyjiang commented Apr 3, 2025 • edited by github-actions bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Apr 3, 2025

Uh oh!

chaunceyjiang commented Apr 3, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

DarkLight1337 left a comment

Choose a reason for hiding this comment

Uh oh!

chaunceyjiang commented Apr 8, 2025

Uh oh!

chaunceyjiang commented Apr 8, 2025

Uh oh!

Uh oh!

Uh oh!

chaunceyjiang commented Apr 8, 2025

Uh oh!

chaunceyjiang commented Apr 8, 2025

Uh oh!

DarkLight1337 commented Apr 8, 2025

Uh oh!

chaunceyjiang commented Apr 9, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

chaunceyjiang commented Apr 3, 2025 •

edited by github-actions bot

Loading