Conversation

@nvjullin nvjullin commented Sep 17, 2025

Purpose

The blocking H2D memcpys break the overlap scheduler (#23569); making them non-blocking fixes this.
Correctness is ensured by the event synchronization in vllm/v1/worker/gpu_model_runner.py:2112:

            if self.prepare_inputs_event is not None:
                # Ensure prior step has finished with reused CPU tensors.
                self.prepare_inputs_event.synchronize()
            try:
                # Prepare the decoder inputs.
                (attn_metadata, logits_indices, spec_decode_metadata,
                 num_scheduled_tokens_np, spec_decode_common_attn_metadata,
                 max_query_len, ubatch_slices, num_tokens_after_padding
                 ) = self._prepare_inputs(scheduler_output)
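
For illustration only, here is a minimal, self-contained sketch of the pattern this fix relies on (not the vLLM implementation): issue the H2D copy with non_blocking=True so the host thread is not stalled, record a CUDA event after the copy, and synchronize on that event before the pinned CPU buffer is reused. The names cpu_buf, gpu_buf, and run_step are hypothetical.

    import torch

    device = torch.device("cuda")
    # Pinned host memory is required for the H2D copy to be truly asynchronous.
    cpu_buf = torch.empty(1024, dtype=torch.int32, pin_memory=True)
    gpu_buf = torch.empty(1024, dtype=torch.int32, device=device)
    copy_done = torch.cuda.Event()

    def run_step(step: int) -> None:
        # Wait until the previous step's H2D copy has consumed cpu_buf before
        # overwriting it (analogous to prepare_inputs_event.synchronize() above).
        copy_done.synchronize()
        cpu_buf.fill_(step)                          # reuse the pinned CPU tensor
        gpu_buf.copy_(cpu_buf, non_blocking=True)    # async H2D copy, no host stall
        copy_done.record()                           # marks completion of the copy

    for i in range(3):
        run_step(i)
    torch.cuda.synchronize()

Without the event, reusing cpu_buf on the next step would be a race; with it, the copy can stay non-blocking and still be correct, which is what lets the overlap scheduler keep running ahead.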

Test Plan

Test Result


@nvjullin nvjullin changed the title Make FlashInferMetadataBuilder non-blocking [BugFix] Make FlashInferMetadataBuilder non-blocking Sep 17, 2025
@mergify mergify bot added the v1 label Sep 17, 2025
@nvjullin nvjullin marked this pull request as ready for review September 17, 2025 06:03
@nvjullin nvjullin requested a review from mgoin as a code owner September 17, 2025 06:03
@nvjullin nvjullin marked this pull request as draft September 17, 2025 06:48
@nvjullin nvjullin marked this pull request as ready for review September 17, 2025 08:32
@nvpohanh

@benchislett Please check if this fix is correct. Thanks!

@mgoin mgoin left a comment

Looks reasonable to me, but @LucasWilkinson or @benchislett should validate before merge

@mgoin mgoin added the ready (ONLY add when PR is ready to merge/full CI is needed) label Sep 18, 2025

@LucasWilkinson LucasWilkinson left a comment

LGTM! Thanks for the contribution!

@mgoin mgoin enabled auto-merge (squash) September 19, 2025 18:21
@mgoin mgoin merged commit b1a63d1 into vllm-project:main Sep 19, 2025
44 checks passed
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
charlifu pushed a commit to ROCm/vllm that referenced this pull request Sep 25, 2025
yewentao256 pushed a commit that referenced this pull request Oct 3, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
choprahetarth pushed a commit to Tandemn-Labs/vllm that referenced this pull request Oct 11, 2025
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025