Commit b80a1b6

Try fetching stop_reason from EngineOutput before checking the request
Signed-off-by: Bill Nell <[email protected]>
1 parent 565c1ef commit b80a1b6

1 file changed: +5 -2 lines changed

vllm/v1/engine/output_processor.py

Lines changed: 5 additions & 2 deletions
@@ -179,11 +179,14 @@ def process_outputs(
             # in the EngineCore.
             req_state.is_prefilling = not new_token_ids
 
+            stop_reason = engine_core_output.stop_reason
+
             # 2) Detokenize the token ids into text and check for stop
             #    strings.
-            stop_reason = req_state.detokenizer.update(new_token_ids)
-            if stop_reason:
+            stop_string = req_state.detokenizer.update(new_token_ids)
+            if stop_string and finish_reason != FinishReason.STOP:
                 finish_reason = FinishReason.STOP
+                stop_reason = stop_string
 
             # 3) Compute sample and prompt logprobs for request,
             #    if required.
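
The precedence rule this hunk introduces can be sketched in isolation. This is a minimal illustration, not the vLLM implementation: `resolve_stop` is a hypothetical helper, the `FinishReason` enum is a local stand-in with only two members, and the sketch assumes `finish_reason` was already populated from the engine core output earlier in `process_outputs` (the diff does not show that line).

```python
from enum import Enum
from typing import Optional, Tuple, Union


class FinishReason(Enum):
    # Local stand-in for vLLM's FinishReason (assumption: only these
    # two members are needed for the illustration).
    STOP = "stop"
    LENGTH = "length"


def resolve_stop(
    core_finish_reason: Optional[FinishReason],
    core_stop_reason: Union[int, str, None],
    stop_string: Optional[str],
) -> Tuple[Optional[FinishReason], Union[int, str, None]]:
    """Hypothetical helper mirroring the post-commit logic.

    core_finish_reason / core_stop_reason: values reported by the engine
    core for this step.
    stop_string: what the detokenizer returned (None if no stop string
    matched).
    """
    # Start from whatever the engine core reported.
    finish_reason = core_finish_reason
    stop_reason = core_stop_reason

    # A detokenizer stop string only takes effect when the engine core
    # has not already finished the request with STOP.
    if stop_string and finish_reason != FinishReason.STOP:
        finish_reason = FinishReason.STOP
        stop_reason = stop_string

    return finish_reason, stop_reason


# The engine core already stopped (here with an illustrative token id):
# its stop_reason survives even though a stop string also matched.
print(resolve_stop(FinishReason.STOP, 42, "END"))
# The engine core reported nothing: the stop string becomes the stop_reason.
print(resolve_stop(None, None, "END"))
```

Before this commit the detokenizer result overwrote `stop_reason` unconditionally, so a reason set by the engine core was lost whenever no stop string matched.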
