[ROCm] Remove unnecessary assertion of max_model_len in ROCM_AITER_MLA attention backend. #18938

vllmellm · 2025-05-30T05:16:50Z

This PR removes the unnecessary constraint that asserted a specific max_model_len value in the ROCM_AITER_MLA attention backend, for both the vLLM v1 and v0 engines.
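To illustrate the kind of change described, here is a minimal sketch. It is not the actual vLLM code: `ModelConfig`, `build_metadata_strict`, `build_metadata_relaxed`, and the placeholder value `ASSUMED_REQUIRED_LEN` are all hypothetical names introduced only for this example.

```python
from dataclasses import dataclass

# Placeholder value, assumed purely for illustration; the PR text does not
# state which max_model_len the backend used to require.
ASSUMED_REQUIRED_LEN = 32768


@dataclass
class ModelConfig:
    """Hypothetical stand-in for an engine's model configuration."""
    max_model_len: int


def build_metadata_strict(config: ModelConfig) -> dict:
    """Illustrative 'before' behaviour: reject any other context length."""
    assert config.max_model_len == ASSUMED_REQUIRED_LEN, (
        f"backend requires max_model_len={ASSUMED_REQUIRED_LEN}"
    )
    return {"max_model_len": config.max_model_len}


def build_metadata_relaxed(config: ModelConfig) -> dict:
    """Illustrative 'after' behaviour: accept whatever length is configured."""
    return {"max_model_len": config.max_model_len}
```

With the assertion gone, a configuration such as `ModelConfig(max_model_len=4096)` passes through the relaxed builder, whereas the strict one raises `AssertionError`.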

lm_eval results (gsm8k, 5-shot) on DeepSeek-V2-Lite-Chat with default engine args, for the vLLM v1 and v0 engines:

VLLM_USE_V1=1 VLLM_ROCM_USE_AITER=1 VLLM_ROCM_USE_AITER_MOE=0 VLLM_ROCM_USE_AITER_RMSNORM=0 VLLM_ROCM_USE_AITER_LINEAR=0 lm_eval --model vllm --model_args pretrained=deepseek-ai/DeepSeek-V2-Lite-Chat,tensor_parallel_size=1,block_size=1 --trust_remote_code --tasks gsm8k --num_fewshot 5 --batch_size auto

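|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.6657|±  |0.0130|
|     |       |strict-match    |     5|exact_match|↑  |0.6543|±  |0.0131|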

VLLM_USE_V1=0 VLLM_ROCM_USE_AITER=1 VLLM_ROCM_USE_AITER_MOE=0 VLLM_ROCM_USE_AITER_RMSNORM=0 VLLM_ROCM_USE_AITER_LINEAR=0 lm_eval --model vllm --model_args pretrained=deepseek-ai/DeepSeek-V2-Lite-Chat,tensor_parallel_size=1,block_size=1 --trust_remote_code --tasks gsm8k --num_fewshot 5 --batch_size auto

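|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.5974|±  |0.0135|
|     |       |strict-match    |     5|exact_match|↑  |0.5876|±  |0.0136|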
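The v1 and v0 scores reported above differ by several points. As a rough sanity check only, the sketch below computes a z statistic for each filter from the reported values and standard errors. It assumes the two runs are independent, which is an approximation because both evaluate the same gsm8k test set; the helper name `z_score` is introduced here for illustration.

```python
import math


def z_score(value_a, stderr_a, value_b, stderr_b):
    """z statistic for the difference of two estimates, assuming independence."""
    combined = math.sqrt(stderr_a ** 2 + stderr_b ** 2)
    return (value_a - value_b) / combined


# gsm8k exact_match (value, stderr) pairs as reported in this PR description.
results = {
    "flexible-extract": {"v1": (0.6657, 0.0130), "v0": (0.5974, 0.0135)},
    "strict-match": {"v1": (0.6543, 0.0131), "v0": (0.5876, 0.0136)},
}

z_scores = {name: z_score(*r["v1"], *r["v0"]) for name, r in results.items()}
for name, z in z_scores.items():
    print(f"{name}: z = {z:.2f}")
```

Both values come out at roughly 3.5 or above, so under these assumptions the gap between engines is larger than the reported standard errors alone would suggest. The PR description does not comment on this difference.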

Signed-off-by: vllmellm <[email protected]>

github-actions · 2025-05-30T05:17:00Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

…A attention backend. (vllm-project#18938) Signed-off-by: vllmellm <[email protected]> Signed-off-by: amit <[email protected]>

remove unnecessary assertion of max_model_len in rocm_aiter_mla

3853544

Signed-off-by: vllmellm <[email protected]>

vllmellm requested review from WoosukKwon, alexm-redhat, comaniac, njhill, robertgshaw2-redhat and ywang96 as code owners May 30, 2025 05:16

mergify bot added the v1 label May 30, 2025

DarkLight1337 approved these changes May 30, 2025

vllm-bot merged commit 77b6e74 into vllm-project:main May 30, 2025
9 of 12 checks passed

tjtanaa mentioned this pull request Jun 2, 2025

[Feature] [ROCm]: AITER Kernel Integration #14964

Open

61 tasks

vllmellm mentioned this pull request Aug 26, 2025

[Feature] [ROCm]: AITER Kernel Integration vllmellm/vllm#51

Open

61 tasks

3 participants
