forked from vllm-project/vllm
Pull requests: HabanaAI/vllm-fork
Pull requests list
Optimized hpu_graph of bert models to improve the embedding performance.
#2111 opened Nov 1, 2025 by gyou2021
Improve weights loading and fp8 range conversion on Gaudi2
#2108 opened Oct 30, 2025 by yangulei
Multiple model engine and launch script to support qwen3 reranker
#2106 opened Oct 30, 2025 by tinafengfun
compose small seqlen sdpa for qwen2vl and qwen2.5vl
#2102 opened Oct 29, 2025 by yingjie-han
Workaround for Assertion error when embedding with bge-m3 in lazy mode
#2093 opened Oct 28, 2025 by slokesha
add draft version of vllm inference document for v1.22.0
#2082 opened Oct 24, 2025 by heyuanliu-intel
3 tasks
fix bug that VLLM_SKIP_WARMUP=1 is not recognized in vision_bucket
#2036 opened Oct 15, 2025 by yingjie-han