
Commit 74dd803

njhill authored and adobrzyn committed

[BugFix] Fix incremental detokenization perf issue (vllm-project#16963)

Signed-off-by: Nick Hill <[email protected]>
Signed-off-by: Agata Dobrzyniewicz <[email protected]>

1 parent eaed4f9

File tree

1 file changed (+1, -1)


vllm/v1/engine/detokenizer.py (1 addition, 1 deletion)

@@ -161,7 +161,7 @@ def __init__(self, tokenizer: PreTrainedTokenizerFast,
         prompt_suffix = request.prompt_token_ids
         prompt_len = len(prompt_suffix)
         if prompt_len > 4:
-            for i in range(4, max(prompt_len + 1, 32)):
+            for i in range(4, min(prompt_len + 1, 24)):
                 suffix = request.prompt_token_ids[-i:]
                 if '�' not in self.tokenizer.decode(suffix):
                     prompt_suffix = suffix
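The changed loop searches for a short prompt suffix that decodes cleanly (no '�' replacement character from a token sequence cut mid-character), so incremental detokenization can resume from that suffix instead of re-decoding the whole prompt. The original `max(prompt_len + 1, 32)` bound made the number of candidate suffixes grow with prompt length; the fix's `min(prompt_len + 1, 24)` caps the search at roughly 24 attempts. A minimal sketch of this logic, using a hypothetical stand-in tokenizer (the real code uses a HF `PreTrainedTokenizerFast`):

```python
class StubTokenizer:
    """Hypothetical stand-in for illustration only, not vLLM's tokenizer.

    Decodes token ids to letters, but simulates a suffix that was cut
    mid-character: any suffix whose first token id is 25 decodes to a
    leading '\ufffd' replacement character.
    """

    def decode(self, token_ids):
        text = "".join(chr(ord("a") + t % 26) for t in token_ids)
        if token_ids and token_ids[0] == 25:
            return "\ufffd" + text[1:]
        return text


def find_prompt_suffix(prompt_token_ids, tokenizer, max_suffix_len=24):
    """Return the shortest cleanly-decodable prompt suffix (>= 4 tokens),
    falling back to the full prompt if none is found within the cap."""
    prompt_suffix = prompt_token_ids
    prompt_len = len(prompt_token_ids)
    if prompt_len > 4:
        # The fix: cap the search with min(...) instead of max(...), so at
        # most ~24 suffix lengths are tried regardless of prompt length.
        for i in range(4, min(prompt_len + 1, max_suffix_len)):
            suffix = prompt_token_ids[-i:]
            if "\ufffd" not in tokenizer.decode(suffix):
                prompt_suffix = suffix
                break
    return prompt_suffix
```

With the buggy `max(...)` bound, a 100k-token prompt could scan up to ~100k candidate suffixes in the worst case; the cap bounds that cost to a small constant, at the price of falling back to the full prompt when no clean suffix exists within 24 tokens.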
