Commit 0b51c9b

[Core] Early return in SlidingWindowManager.remove_skipped_blocks (#27673)
Signed-off-by: Jialin Ouyang <[email protected]>
1 parent d3ab240 commit 0b51c9b

1 file changed, +6 -0 lines changed

vllm/v1/core/single_type_kv_cache_manager.py

Lines changed: 6 additions & 0 deletions
@@ -394,7 +394,13 @@ def remove_skipped_blocks(self, request_id: str, num_computed_tokens: int) -> No
         # skipped during the attention computation.
         last_useful_token = num_computed_tokens - self.sliding_window + 1
         last_useful_block = last_useful_token // self.block_size
+        if last_useful_block <= 0:
+            # Early return if tokens are not enough to fill the sliding window
+            return
         blocks = self.req_to_blocks[request_id]
+        if blocks[last_useful_block - 1] == self._null_block:
+            # Early return if there are no blocks to remove
+            return
         removed_blocks: list[KVCacheBlock] = []
         for i in range(last_useful_block - 1, -1, -1):
             if blocks[i] == self._null_block:
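
The two early returns can be tried out in isolation with a minimal standalone sketch. This is not the vLLM implementation: the function signature, the `NULL_BLOCK` sentinel, and the use of a plain list of integer block IDs are hypothetical stand-ins, and the loop body after the null-block check is an assumption, since the diff is cut off at that line.

```python
NULL_BLOCK = -1  # hypothetical sentinel standing in for self._null_block


def remove_skipped_blocks(blocks, num_computed_tokens, sliding_window, block_size):
    """Null out blocks that lie entirely before the sliding window.

    Mutates `blocks` in place and returns the list of removed block IDs.
    """
    last_useful_token = num_computed_tokens - sliding_window + 1
    last_useful_block = last_useful_token // block_size
    if last_useful_block <= 0:
        # Early return: not enough tokens to fill the sliding window,
        # so no block can have fallen out of it yet.
        return []
    if blocks[last_useful_block - 1] == NULL_BLOCK:
        # Early return: the newest candidate block is already null,
        # so there is nothing to remove.
        return []
    removed = []
    for i in range(last_useful_block - 1, -1, -1):
        if blocks[i] == NULL_BLOCK:
            # Assumption: blocks before a null block were already
            # nulled by an earlier call, so the scan can stop here.
            break
        removed.append(blocks[i])
        blocks[i] = NULL_BLOCK
    return removed


blocks = [10, 11, 12, 13]
# Window not yet full: first early return.
first = remove_skipped_blocks(blocks, 3, sliding_window=4, block_size=2)
# Two blocks have left the window and are nulled, newest first.
second = remove_skipped_blocks(blocks, 7, sliding_window=4, block_size=2)
# Same position again: second early return, nothing left to remove.
third = remove_skipped_blocks(blocks, 7, sliding_window=4, block_size=2)
```

In this sketch both early returns are behavior-preserving: when `last_useful_block <= 0` the `range(...)` would be empty anyway, and when the newest candidate block is already null the loop would break on its first iteration. What they skip is the per-call setup work, such as the dictionary lookup and list allocation in the original method.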