HabanaAI / vllm-fork Public

forked from vllm-project/vllm

Fork 134
Star 84

Pull requests: HabanaAI/vllm-fork

75 Open 1,931 Closed

Pull requests list

Optimized hpu_graph of bert models to improve the embedding performance.

#2111 opened Nov 1, 2025 by gyou2021

Fix ray init params

#2110 opened Oct 30, 2025 by michalkuligowski

Improve weights loading and fp8 range conversion on Gaudi2

#2108 opened Oct 30, 2025 by yangulei

Multiple model engine and launch script to support qwen3 reranker

#2106 opened Oct 30, 2025 by tinafengfun

Add support for MiniMax-M2

#2104 opened Oct 29, 2025 by Wei-Lin-Intel

compose small seqlen sdpa for qwen2vl and qwen2.5vl

#2102 opened Oct 29, 2025 by yingjie-han

Fix no attr enable_server_load_tracking error

#2097 opened Oct 28, 2025 by shepark

Add max_pixels option.

#2094 opened Oct 28, 2025 by wenbinc-Bin

Workaround for Assertion error when embedding with bge-m3 in lazy mode

#2093 opened Oct 28, 2025 by slokesha

Move only the quantized model and tensors to HPU

#2091 opened Oct 27, 2025 by yangulei

add draft version of vllm inference document for v1.22.0

#2082 opened Oct 24, 2025 by heyuanliu-intel

3 tasks

Add dotsocr

#2077 opened Oct 23, 2025 by tianyuan211

fix wrong section for Qwen series doc

#2074 opened Oct 23, 2025 by heyuanliu-intel

3 tasks

[deepseek_r1]fix weight loading on 1.23

#2073 opened Oct 23, 2025 by ccrhx4

Enable KEYE_VL on hpu v1.22

#2072 opened Oct 23, 2025 by yingjie-han

Enable chunked prefill on aice 1.22

#2070 opened Oct 23, 2025 by YuJiankang

refactor(hpu_model_runner): restructure multimodal-related code

#2066 opened Oct 22, 2025 by Jing1Ling • Draft

3 tasks

Slokesha port ovis

#2063 opened Oct 21, 2025 by slokesha • Draft

3 tasks

[CS-1549] Enable function call DeepSeek-V3.1

#2047 opened Oct 19, 2025 by JianyuLi01

Porting_ovis

#2044 opened Oct 16, 2025 by SupreetSinghPalne • Draft

3 tasks

Spalne/porting ovis

#2038 opened Oct 16, 2025 by SupreetSinghPalne • Draft

3 tasks

replaced apply_rotary_emb_torch() with rotary_embedding imp

#2037 opened Oct 15, 2025 by slokesha • Draft

3 tasks

fix bug that VLLM_SKIP_WARMUP=1 is not recognized in vision_bucket

#2036 opened Oct 15, 2025 by yingjie-han

Fix cache miss for Ovis2.5

#2035 opened Oct 15, 2025 by Jianhong-Zhang • Draft

Fix cache miss for InternVL

#2034 opened Oct 15, 2025 by Jianhong-Zhang • Draft
