[RFC]: Deprecation of the best_of Sampling Parameter in vLLM V1 #13361

@WoosukKwon

Motivation.

Overview

As we transition to vLLM V1, we plan to discontinue support for the best_of sampling parameter. This decision is driven by a combination of low usage, alignment with industry trends, and a desire for system simplicity and performance.

Background: What is best_of?

The best_of parameter was originally part of the earlier OpenAI completion API. It generated best_of candidate completions server-side and returned only the top n of them, where the “best” completions were those with the highest cumulative log probabilities.

Reasons for Deprecation

  1. Limited Usage and Industry Trends:

    • Low Adoption: To the best of our knowledge, very few users rely on the best_of feature. Users have also observed that, in most cases, the quality of a completion is not reliably correlated with its log probability.
    • Evolving Standards: Major AI providers, including OpenAI (in its current API), Anthropic (Claude), and Google (Gemini), do not offer a best_of option.
  2. Alternative Methods:

    • Users can implement best_of by leveraging the n parameter to obtain multiple completions and the logprobs parameter for the log probability of each generated token. This method effectively replicates the behavior of best_of without requiring dedicated support.
  3. System Simplification and Performance:

    • Supporting best_of introduces additional complexity, as it necessitates tracking cumulative log probabilities for each generated completion. This extra overhead runs counter to our focus on performance and streamlined design.

Proposed Change.

In light of the minimal usage, the availability of alternative methods, and our commitment to a simpler, more efficient system, we plan to phase out the best_of parameter with vLLM V1. Users who wish to mimic its functionality can continue to do so by generating multiple completions and comparing their log probabilities directly.
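
The client-side alternative might look like the sketch below. It assumes the caller has already requested several completions (via n) together with per-token log probabilities (via logprobs) and collected them into a plain Python structure. The Candidate class and the select_best helper are hypothetical names used only for illustration; they are not part of vLLM or any other API.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Candidate:
    """Hypothetical container for one sampled completion (not a vLLM type)."""
    text: str
    token_logprobs: List[float]  # log probability of each sampled token

    @property
    def cumulative_logprob(self) -> float:
        # Cumulative log probability is the sum of the per-token values.
        return sum(self.token_logprobs)


def select_best(candidates: Sequence[Candidate], n: int = 1) -> List[Candidate]:
    """Return the n candidates with the highest cumulative log probability."""
    if not 1 <= n <= len(candidates):
        raise ValueError("n must be between 1 and the number of candidates")
    ranked = sorted(candidates, key=lambda c: c.cumulative_logprob, reverse=True)
    return ranked[:n]


# Made-up example data standing in for completions returned by the server.
candidates = [
    Candidate("Paris.", [-0.1, -0.2]),
    Candidate("It is Paris.", [-0.5, -0.4, -0.3, -0.2]),
    Candidate("Lyon.", [-2.0, -0.25]),
]

best = select_best(candidates, n=1)
print(best[0].text, best[0].cumulative_logprob)
```

Because the score is a plain sum of negative values, this ranking tends to favor shorter completions. Callers who want a different criterion (for example, the mean log probability per token) can change the sort key without any server-side support.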

Please let us know if this change impacts your usage or if you have any other concerns.

Feedback Period.

2 weeks.

CC List.

No response

Any Other Things.

No response
