[Bugfix] Merge MM embeddings by index instead of token IDs #16229

DarkLight1337 · 2025-04-08T03:20:04Z

This PR fixes a mismatch when merging multimodal embeddings that arises when the model itself generates embedding placeholder tokens such as <image>. The error occurs mainly in V1 but can also occur in V0; this PR focuses on the V1 case.
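To make the failure mode concrete, here is a minimal sketch in plain Python, with lists standing in for tensors. It is not the vLLM implementation: `IMAGE_TOKEN_ID`, `merge_by_token_id`, and `merge_by_mask` are hypothetical names chosen for illustration. It assumes a request whose prompt holds two image placeholders and whose generated output happens to contain one more placeholder token.

```python
IMAGE_TOKEN_ID = 32000  # hypothetical placeholder token ID


def merge_by_token_id(token_ids, text_embeds, mm_embeds, placeholder_id):
    """Old approach (sketch): find placeholders by matching token IDs."""
    positions = [i for i, t in enumerate(token_ids) if t == placeholder_id]
    if len(positions) != len(mm_embeds):
        raise ValueError(
            f"Attempted to assign {len(mm_embeds)} multimodal tokens "
            f"to {len(positions)} placeholders"
        )
    merged = list(text_embeds)
    for pos, emb in zip(positions, mm_embeds):
        merged[pos] = emb
    return merged


def merge_by_mask(text_embeds, mm_embeds, is_multimodal):
    """New approach (sketch): positions come from a mask built from the
    known placeholder ranges of the prompt, not from the token values."""
    if sum(is_multimodal) != len(mm_embeds):
        raise ValueError("mask and multimodal embeddings disagree in length")
    merged = list(text_embeds)
    mm_iter = iter(mm_embeds)
    for pos, flag in enumerate(is_multimodal):
        if flag:
            merged[pos] = next(mm_iter)
    return merged


# Prompt [1, <image>, <image>, 5], then the model generates <image> itself.
token_ids = [1, IMAGE_TOKEN_ID, IMAGE_TOKEN_ID, 5, IMAGE_TOKEN_ID]
text_embeds = [f"text{i}" for i in range(len(token_ids))]
mm_embeds = ["img0", "img1"]
# Only the prompt's placeholder positions are marked.
is_multimodal = [False, True, True, False, False]

print(merge_by_mask(text_embeds, mm_embeds, is_multimodal))
# -> ['text0', 'img0', 'img1', 'text3', 'text4']
```

Calling `merge_by_token_id(token_ids, text_embeds, mm_embeds, IMAGE_TOKEN_ID)` on the same data raises `ValueError`, because matching by token ID counts three placeholders for two embeddings.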

V0 users can work around this by setting top_p so that the model has no chance of generating such tokens.
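As a rough illustration of why top_p can help, the sketch below applies nucleus (top-p) filtering to a toy distribution in plain Python. The function `top_p_filter` and the probabilities are hypothetical, and the workaround only holds under the assumption that the placeholder token sits in the low-probability tail that the cutoff removes.

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of highest-probability tokens whose cumulative
    probability reaches top_p, then renormalize over that set."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


# Toy vocabulary of four tokens; index 3 plays the role of <image>.
PLACEHOLDER = 3
probs = [0.6, 0.3, 0.09, 0.01]

filtered = top_p_filter(probs, top_p=0.95)
print(PLACEHOLDER in filtered)  # -> False
```

With `top_p=1.0` the placeholder stays in the candidate set, so it can still be sampled.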

FIX #15677
FIX #15764
FIX #23891
FIX #23954
FIX #24456

Breaking change for model developers

This PR updates SupportsMultiModal.get_input_embeddings to accept an is_multimodal mask and adds a default implementation, so most models no longer need to override it. Out-of-tree (OOT) and work-in-progress (WIP) models should either remove their override to use the default implementation, or update their override to accept the is_multimodal and do_language_embed_multimodal arguments.

Developers of text-only models should ensure that their models implement get_input_embeddings so that they continue to work in vLLM.
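For model developers updating an override, the following toy sketch shows one possible shape for a mask-aware get_input_embeddings. It uses plain Python lists rather than tensors and is not the default implementation shipped in vLLM. `ToyMultiModalModel` and `embed_text` are hypothetical. The argument names is_multimodal and do_language_embed_multimodal come from the description above, but the behavior given to do_language_embed_multimodal here (skipping the text-embedding lookup at multimodal positions, which helps when placeholder IDs fall outside the vocabulary) is an assumption.

```python
class ToyMultiModalModel:
    """Hypothetical stand-in for a model with a text embedding table."""

    def __init__(self, vocab_size, hidden_size):
        self.hidden_size = hidden_size
        # Token t embeds to a vector filled with float(t), for easy checking.
        self.embed_table = [[float(t)] * hidden_size for t in range(vocab_size)]

    def embed_text(self, token_id):
        return list(self.embed_table[token_id])

    def get_input_embeddings(
        self,
        input_ids,
        multimodal_embeddings=None,
        *,
        is_multimodal=None,
        do_language_embed_multimodal=True,
    ):
        if is_multimodal is None:
            is_multimodal = [False] * len(input_ids)
        if len(is_multimodal) != len(input_ids):
            raise ValueError("is_multimodal must have one entry per token")

        embeds = []
        for token_id, is_mm in zip(input_ids, is_multimodal):
            if is_mm and not do_language_embed_multimodal:
                # Assumed semantics: skip the lookup at multimodal positions.
                embeds.append([0.0] * self.hidden_size)
            else:
                embeds.append(self.embed_text(token_id))

        if not multimodal_embeddings:
            return embeds
        if sum(is_multimodal) != len(multimodal_embeddings):
            raise ValueError("mask and multimodal embeddings disagree in length")

        # Fill masked positions in order; token values are never inspected.
        mm_iter = iter(multimodal_embeddings)
        return [
            next(mm_iter) if is_mm else emb
            for emb, is_mm in zip(embeds, is_multimodal)
        ]


model = ToyMultiModalModel(vocab_size=4, hidden_size=2)
merged = model.get_input_embeddings(
    [1, 99, 99, 2],  # 99 is an out-of-vocabulary placeholder ID
    [[9.0, 9.0], [8.0, 8.0]],
    is_multimodal=[False, True, True, False],
    do_language_embed_multimodal=False,
)
print(merged)  # -> [[1.0, 1.0], [9.0, 9.0], [8.0, 8.0], [2.0, 2.0]]
```

A text-only call such as `model.get_input_embeddings([0, 3])` skips the merge entirely and returns the plain text embeddings.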

Breaking change for model runner plugins

To continue supporting multimodal models, update the _gather_mm_embeddings method to build and return the is_mm_embed mask, then pass that mask to the model.
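The mask-building step might look roughly like the sketch below. It is a simplified, list-based illustration rather than the actual runner code: `PlaceholderRange`, `ScheduledRequest`, and `gather_mm_embeddings` are hypothetical, and it assumes each request knows its placeholder offsets in the prompt, has its placeholders sorted by offset, and may be scheduled in chunks.

```python
from dataclasses import dataclass


@dataclass
class PlaceholderRange:  # hypothetical: where one multimodal item sits
    offset: int
    length: int


@dataclass
class ScheduledRequest:  # hypothetical: one request in the current step
    num_computed_tokens: int
    num_scheduled_tokens: int
    placeholders: list  # list of PlaceholderRange, sorted by offset
    encoder_outputs: list  # one list of embeddings per placeholder


def gather_mm_embeddings(requests):
    """Return the multimodal embeddings needed this step, plus a flat mask
    over all scheduled tokens marking where they belong."""
    total = sum(r.num_scheduled_tokens for r in requests)
    is_mm_embed = [False] * total
    mm_embeds = []
    batch_start = 0
    for req in requests:
        win_start = req.num_computed_tokens
        win_end = win_start + req.num_scheduled_tokens
        for ph, enc in zip(req.placeholders, req.encoder_outputs):
            # Overlap between the placeholder and this step's token window.
            start = max(ph.offset, win_start)
            end = min(ph.offset + ph.length, win_end)
            if start >= end:
                continue
            lo = batch_start + start - win_start
            hi = batch_start + end - win_start
            is_mm_embed[lo:hi] = [True] * (end - start)
            mm_embeds.extend(enc[start - ph.offset:end - ph.offset])
        batch_start += req.num_scheduled_tokens
    return mm_embeds, is_mm_embed


requests = [
    ScheduledRequest(0, 4, [PlaceholderRange(1, 2)], [["a0", "a1"]]),
    # Second request resumes at token 2, partway through its placeholder.
    ScheduledRequest(2, 3, [PlaceholderRange(1, 3)], [["b0", "b1", "b2"]]),
]
mm_embeds, is_mm_embed = gather_mm_embeddings(requests)
print(mm_embeds)    # -> ['a0', 'a1', 'b1', 'b2']
print(is_mm_embed)  # -> [False, True, True, False, True, True, False]
```

The mask depends only on the recorded placeholder ranges, so a generated token that happens to equal a placeholder ID never becomes a merge target.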

github-actions · 2025-04-08T03:20:14Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

NickLucche

Fixed the tpu issues, thanks @DarkLight1337 !

mergify · 2025-09-22T10:57:14Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @DarkLight1337.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify · 2025-09-27T04:47:37Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @DarkLight1337.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

DarkLight1337 · 2025-09-27T06:56:58Z

We will not include this in the v0.11 release because of the breaking change. But it should be fine to merge this into main branch since the release branch has been cut already.

upstream PR: vllm-project/vllm#16229 Fix is still in progress, don't merge yet --------- Signed-off-by: Agata Dobrzyniewicz <[email protected]> Signed-off-by: Chendi Xue <[email protected]> Co-authored-by: Chendi Xue <[email protected]>

[Bugfix] Merge multimodal embeddings by is_embed mask instead of token ID

dfebf51

Signed-off-by: DarkLight1337 <[email protected]>

mergify bot added the v1 and tpu (Related to Google TPUs) labels Apr 8, 2025

DarkLight1337 changed the title ~~[Bugfix] Merge multimodal embeddings by mask instead of token ID~~ [Bugfix][V1] Merge multimodal embeddings by mask instead of token ID Apr 8, 2025

Rename

437dacd

Signed-off-by: DarkLight1337 <[email protected]>

DarkLight1337 changed the title ~~[Bugfix][V1] Merge multimodal embeddings by mask instead of token ID~~ [Bugfix][V1] Merge multimodal embeddings by index instead of token ID Apr 8, 2025

DarkLight1337 changed the title ~~[Bugfix][V1] Merge multimodal embeddings by index instead of token ID~~ [Bugfix][V1] Merge multimodal embeddings by index instead of matching token ID Apr 8, 2025

DarkLight1337 changed the title ~~[Bugfix][V1] Merge multimodal embeddings by index instead of matching token ID~~ [Bugfix][V1] Merge multimodal embeddings by index instead of matching tokens Apr 8, 2025

DarkLight1337 added 2 commits April 9, 2025 11:13

Merge branch 'main' into rm-merge-mm-embeddings

bbe7096

Use vllm-project#16007

57e9f03

Signed-off-by: DarkLight1337 <[email protected]>

DarkLight1337 added this to Multi-modality Core Apr 12, 2025

DarkLight1337 moved this to In Progress in Multi-modality Core Apr 12, 2025

FerryHuang mentioned this pull request Apr 12, 2025

[Bug]: qwen2-vl 7b, on vllm 0.8.1 & 0.8.2, sometimes (not deterministically but depends on data) I got: ValueError: Attempted to assign 702 = 702 multimodal tokens to 703 placeholders #15764

Closed

1 task

DarkLight1337 added 6 commits August 27, 2025 13:32

Merge branch 'main' into rm-merge-mm-embeddings

d5c9555

Signed-off-by: DarkLight1337 <[email protected]>

Fix

e08deaa

Signed-off-by: DarkLight1337 <[email protected]>

Merge branch 'main' into rm-merge-mm-embeddings

302b2c5

Update

6a1307f

Signed-off-by: DarkLight1337 <[email protected]>

Fix

3a4740a

Signed-off-by: DarkLight1337 <[email protected]>

Draft

68c54d8

Signed-off-by: DarkLight1337 <[email protected]>

mergify bot added the speculative-decoding label Aug 28, 2025

DarkLight1337 added 2 commits August 28, 2025 10:41

Fix device

6ddc91e

Signed-off-by: DarkLight1337 <[email protected]>

Persistent buffer

28cc8cb

Signed-off-by: DarkLight1337 <[email protected]>

DarkLight1337 commented Aug 28, 2025

vllm/v1/worker/tpu_model_runner.py (outdated)

Avoid unnecessary initialization

c335908

Signed-off-by: DarkLight1337 <[email protected]>

DarkLight1337 commented Aug 28, 2025

vllm/v1/worker/tpu_model_runner.py (outdated)

DarkLight1337 added 2 commits August 28, 2025 14:15

Fix reset

cbb70ea

Signed-off-by: DarkLight1337 <[email protected]>

Update

76f2925

Signed-off-by: DarkLight1337 <[email protected]>

DarkLight1337 commented Aug 28, 2025

vllm/v1/worker/tpu_model_runner.py (outdated)

Simplify

b6e8775

Signed-off-by: DarkLight1337 <[email protected]>

ywang96 and others added 3 commits September 22, 2025 00:09

fix qwen3-vl

8a6fb1b

Signed-off-by: Roger Wang <[email protected]>

Fix wrong condition

2eefc2d

Signed-off-by: DarkLight1337 <[email protected]>

Merge branch 'main' into rm-merge-mm-embeddings

b79860e

NickLucche approved these changes Sep 22, 2025

mergify bot added the needs-rebase label Sep 22, 2025

Merge branch 'main' into rm-merge-mm-embeddings

7769ec1

Signed-off-by: DarkLight1337 <[email protected]>

mergify bot removed the needs-rebase label Sep 24, 2025

DarkLight1337 added 4 commits September 24, 2025 08:53

Reduce diff

aa67033

Signed-off-by: DarkLight1337 <[email protected]>

Merge branch 'main' into rm-merge-mm-embeddings

3656239

Signed-off-by: DarkLight1337 <[email protected]>

Simplify

9260170

Signed-off-by: DarkLight1337 <[email protected]>

Fix doc

2ac91b6

Signed-off-by: DarkLight1337 <[email protected]>

mergify bot added the needs-rebase label Sep 27, 2025

Merge branch 'main' into rm-merge-mm-embeddings

3033297

Signed-off-by: DarkLight1337 <[email protected]>

mergify bot removed the needs-rebase label Sep 27, 2025

DarkLight1337 enabled auto-merge (squash) September 27, 2025 06:56

DarkLight1337 merged commit 27d7638 into vllm-project:main Sep 27, 2025
52 checks passed

DarkLight1337 deleted the rm-merge-mm-embeddings branch September 27, 2025 08:15

DarkLight1337 added a commit to wangxiongts/vllm-dev that referenced this pull request Sep 27, 2025

Update w.r.t. vllm-project#16229

f0d057a

Signed-off-by: DarkLight1337 <[email protected]>

xuechendi mentioned this pull request Oct 1, 2025

Fix after #16229, mm vllm-project/vllm-gaudi#286

Merged
