Qwen3 hpu support #656

kaixuanliu · 2025-06-25T08:25:57Z

No description provided.

kaixuanliu · 2025-06-25T08:29:57Z

@regisss please help review. We ran a benchmark with our own scripts: with this PR, throughput for the model Qwen3/Qwen3-Embedding-8B improved from 121.2 seq/s to 126.93 seq/s.

regisss

LGTM

Signed-off-by: Liu, Kaixuan <[email protected]>

kaixuanliu marked this pull request as ready for review June 25, 2025 08:26

regisss approved these changes Jun 25, 2025

regisss merged commit f1df357 into huggingface:main Jun 25, 2025

kaixuanliu added 2 commits June 25, 2025 10:34

add customized qwen3 model support for HPU

434e878

Signed-off-by: Liu, Kaixuan <[email protected]>

fix bug

33d2d0d

Signed-off-by: Liu, Kaixuan <[email protected]>

BrewTestBot mentioned this pull request Jun 30, 2025

text-embeddings-inference 1.7.3 Homebrew/homebrew-core#228576 (Merged)

kaixuanliu deleted the qwen3-hpu branch July 1, 2025 01:51

BrewTestBot mentioned this pull request Aug 5, 2025

text-embeddings-inference 1.8.0 Homebrew/homebrew-core#232408 (Closed)
