
Conversation

@zouyida2002
Contributor

What this PR does / why we need it?

Optimize the Qwen2.5 VL ViT with PTA.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

We've tested it on the benchmark and the results are equal to those of the original implementation.

@wangxiyuan wangxiyuan changed the title optimize Qwen2.5 vl vit [0.7.3] optimize Qwen2.5 vl vit Apr 23, 2025
@wangxiyuan
Collaborator

I'm fine with this change; it improves Qwen2.5 VL performance. @ganyi1996ppo please double check it. Thanks.



RotaryEmbedding.forward_oot = rope_forward_oot
MRotaryEmbedding.forward = mrope_forward
Collaborator


Basically, we don't want to change anything in vllm except forward_oot. For this kind of change, we should ask vllm to meet our requirement upstream. Let's do that in the future. Thanks.
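
For reference, a minimal sketch of the dispatch this comment refers to — a hypothetical, simplified stand-in for vllm's CustomOp, not the actual vllm source:

import torch

class SimplifiedCustomOp(torch.nn.Module):
    # vllm routes forward() to a platform-specific implementation, so an
    # out-of-tree backend like vllm-ascend only has to supply forward_oot.
    def forward(self, *args, **kwargs):
        if hasattr(self, "forward_oot"):
            return self.forward_oot(*args, **kwargs)  # out-of-tree hook
        return self.forward_native(*args, **kwargs)   # default path

Patching RotaryEmbedding.forward_oot stays within this hook; replacing MRotaryEmbedding.forward outright bypasses the dispatch, which is why the reviewer wants it handled upstream eventually.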


q, k, v = (rearrange(x, "s b ... -> b s ...").contiguous()
           for x in (q, k, v))
q = torch_npu.npu_rotary_mul(q, cos, sin)
Collaborator


Do we have any custom op here? For Qwen2.5 VL, what are the shapes of these q, k, cos and sin tensors?

Contributor


The function's functionality is as follows:

x1, x2 = torch.chunk(q, 2, -1)
x_new = torch.cat((-x2, x1), dim=-1)
output = cos * q + sin * x_new

I can't find any custom op that meets my expectations.
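
As a sanity check, the rotate-half sequence above appears to match the fused semantics of the torch_npu.npu_rotary_mul call in the diff, i.e. x * cos + rotate_half(x) * sin. A sketch one could run on an Ascend NPU — the shapes here are illustrative, not the actual ViT layout in this PR:

import torch
import torch_npu  # assumes an Ascend NPU runtime

def rotate_half_ref(x, cos, sin):
    # Reference computation from the snippet above.
    x1, x2 = torch.chunk(x, 2, -1)
    x_new = torch.cat((-x2, x1), dim=-1)
    return cos * x + sin * x_new

q = torch.randn(1, 16, 8, 128).npu()
cos = torch.randn(1, 16, 1, 128).npu()
sin = torch.randn(1, 16, 1, 128).npu()

ref = rotate_half_ref(q, cos, sin)
fused = torch_npu.npu_rotary_mul(q, cos, sin)
assert torch.allclose(ref, fused, atol=1e-3)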

Collaborator


Looks exactly like what normal rotary embedding does...

Collaborator


We can merge this PR first and optimize in the next PR.

Contributor Author


OK, thanks for your advice. I will optimize it soon.

@ganyi1996ppo ganyi1996ppo merged commit 1e56aae into vllm-project:v0.7.3-dev Apr 23, 2025
11 checks passed
@Yikun
Collaborator

Yikun commented Apr 25, 2025

This should also be merged to main before v0.8.4.rc2.
