Commit 7b8fb4a

fix hanging issue when cudagraph is enabled
Signed-off-by: elvischenv <[email protected]>
1 parent 5a9da1d commit 7b8fb4a

1 file changed (+1 -1 lines changed)

vllm/utils/flashinfer.py

Lines changed: 1 addition & 1 deletion
@@ -228,7 +228,7 @@ def use_trtllm_attention(

     if force_use_trtllm is None:
         # Environment variable not set - use auto-detection
-        use_trtllm = (num_tokens <= 256 and max_seq_len < 131072
+        use_trtllm = (num_tokens <= 256 and max_seq_len <= 131072
                       and kv_cache_dtype == "auto")
         if use_trtllm:
             logger.warning_once("Using TRTLLM attention (auto-detected).")
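
The one-character change only affects the boundary case where max_seq_len is exactly 131072 (128 * 1024). The sketch below isolates that condition so the difference is easy to see. The function names `auto_detect_old` and `auto_detect_new` and the constant `MAX_SEQ_LEN_LIMIT` are hypothetical, introduced for illustration only; the real `use_trtllm_attention` in vllm/utils/flashinfer.py takes additional arguments and performs other checks not shown in this diff.

```python
# Hypothetical standalone mirror of the auto-detection condition from the diff.
# This is NOT the actual vLLM function, only the boolean expression it changes.

MAX_SEQ_LEN_LIMIT = 131072  # 128 * 1024, the literal used in the diff


def auto_detect_old(num_tokens: int, max_seq_len: int, kv_cache_dtype: str) -> bool:
    """Condition before the commit: strict '<' on max_seq_len."""
    return (num_tokens <= 256 and max_seq_len < MAX_SEQ_LEN_LIMIT
            and kv_cache_dtype == "auto")


def auto_detect_new(num_tokens: int, max_seq_len: int, kv_cache_dtype: str) -> bool:
    """Condition after the commit: inclusive '<=' on max_seq_len."""
    return (num_tokens <= 256 and max_seq_len <= MAX_SEQ_LEN_LIMIT
            and kv_cache_dtype == "auto")


if __name__ == "__main__":
    # Compare both conditions just below, at, and just above the limit.
    for max_seq_len in (131071, 131072, 131073):
        old = auto_detect_old(8, max_seq_len, "auto")
        new = auto_detect_new(8, max_seq_len, "auto")
        print(f"max_seq_len={max_seq_len}: old={old}, new={new}")
```

The two conditions agree everywhere except at max_seq_len == 131072, where the old condition rejects TRTLLM attention and the new one selects it.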
