@alexm-redhat
Collaborator

Tests showed that limiting kv_split to 2 is not enough; it must be limited to 1 to ensure there are no hangs (a value of 1 disables the SM100 CUTLASS reduction kernel).

Contributor

@gemini-code-assist bot left a comment

Code Review

This pull request addresses a critical bug that causes hangs on SM100 architectures when the batch size is greater than one. The fix involves limiting max_splits to 1 in such cases, which correctly disables the problematic reduction kernel as described. The code change is accurate, and the corresponding comment has been updated to reflect the new logic. This is a solid workaround for a significant stability issue.
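The clamping rule described above can be sketched as follows. This is only an illustration of the logic as stated in this conversation, not the code changed by the PR: the function name `clamp_max_splits` and its parameters `requested_splits`, `batch_size`, and `is_sm100` are hypothetical and do not come from the vLLM source.

```python
def clamp_max_splits(requested_splits: int, batch_size: int, is_sm100: bool) -> int:
    """Hypothetical helper: pick the number of KV splits to use.

    Assumption (from the PR discussion): on SM100 with batch size > 1,
    any split count above 1 can hang, so the count is forced to 1.
    """
    if requested_splits < 1:
        raise ValueError("requested_splits must be >= 1")

    # With a single split there are no partial results to combine, so the
    # reduction kernel named in the PR description is never needed.
    if is_sm100 and batch_size > 1:
        return 1

    # All other cases keep the requested split count unchanged.
    return requested_splits
```

For example, under these assumptions `clamp_max_splits(8, 4, True)` yields 1, while a batch size of 1 or a non-SM100 device leaves the requested value untouched.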

@mgoin added the bug and ready labels Sep 23, 2025
@vllm-bot merged commit 1210e4d into main Sep 23, 2025
93 of 94 checks passed
@vllm-bot deleted the fix_kv_split_again branch September 23, 2025 23:57
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
yewentao256 pushed a commit that referenced this pull request Oct 3, 2025
gjc0824 pushed a commit to gjc0824/vllm that referenced this pull request Oct 10, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
choprahetarth pushed a commit to Tandemn-Labs/vllm that referenced this pull request Oct 11, 2025
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
rtourgeman pushed a commit to rtourgeman/vllm that referenced this pull request Nov 10, 2025