Motivation.
Overview
As we transition to vLLM V1, we plan to discontinue support for the best_of sampling parameter. This decision is driven by a combination of low usage, alignment with industry trends, and a desire for system simplicity and performance.
Background: What is best_of?
The best_of parameter was originally part of the earlier OpenAI completion API. It generated multiple candidate completions, then selected the “best” n of them based on the cumulative log probability of each result.
Reasons for Deprecation
- Limited Usage and Industry Trends:
  - Low Adoption: To the best of our knowledge, the best_of feature is used by very few users. Users have observed that the quality of an output isn’t reliably correlated with its log probability in most cases.
  - Evolving Standards: Major AI providers such as OpenAI (in its current API), Claude, and Gemini have moved away from including the best_of option.
- Alternative Methods:
  - Users can implement best_of by leveraging the n parameter to obtain multiple completions and the logprobs parameter for the log probability of each generated token. This method effectively replicates the behavior of best_of without requiring dedicated support.
- System Simplification and Performance:
  - Supporting best_of introduces additional complexity, as it necessitates tracking cumulative log probabilities for each generated completion. This extra overhead runs counter to our focus on performance and streamlined design.
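The client-side alternative described under Alternative Methods could look roughly like the sketch below. It is only an illustration, not vLLM code: the Candidate class and the select_best helper are hypothetical names, and the log probability values are made up. It assumes the caller has already requested several completions with n and logprobs, and has collected the per-token log probabilities of each completion from the response.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """One sampled completion plus the log probability of each generated token.

    Hypothetical container for illustration; not a vLLM type.
    """
    text: str
    token_logprobs: List[float]

    @property
    def cumulative_logprob(self) -> float:
        # Sum of per-token log probabilities, the score best_of ranked by.
        return sum(self.token_logprobs)


def select_best(candidates: List[Candidate], n: int = 1) -> List[Candidate]:
    """Return the n candidates with the highest cumulative log probability."""
    if n < 1 or n > len(candidates):
        raise ValueError("n must be between 1 and the number of candidates")
    ranked = sorted(candidates, key=lambda c: c.cumulative_logprob, reverse=True)
    return ranked[:n]


# Made-up values standing in for a request sent with n=3 and logprobs enabled.
candidates = [
    Candidate("Paris.", [-0.25, -0.5]),
    Candidate("It is Paris.", [-1.0, -0.5, -0.25, -0.5]),
    Candidate("Lyon.", [-2.0, -0.5]),
]
best = select_best(candidates, n=1)
print(best[0].text)
```

Because the ranking happens entirely on the client, the server only has to return the n completions and their token log probabilities.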
Proposed Change.
In light of the minimal usage, the availability of alternative methods, and our commitment to a simpler, more efficient system, we plan to phase out the best_of parameter with vLLM V1. Users who wish to mimic its functionality can continue to do so by generating multiple completions and comparing their log probabilities directly.
Please let us know if this change impacts your usage or if you have any other concerns.
Feedback Period.
2 weeks.