
[Doc]: Update the vLLM Distributed Inference and Serving docs with the new MultiprocessingGPUExecutor #5221

@rcarrata

Description


📚 The doc issue

The vLLM documentation only describes using Ray for Distributed Inference and Serving, even though #4539 has been merged and v0.4.3 has been released with the MultiprocessingGPUExecutor feature included as an alternative to Ray for single-node inference.
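
For reference, a minimal sketch of what the updated docs could show, assuming the `distributed_executor_backend` engine argument (with values `"ray"` or `"mp"`) that ships alongside this feature; the model name and tensor-parallel size below are placeholders, not something prescribed by this issue:

```python
# Minimal sketch: single-node tensor-parallel inference without Ray.
# Assumes the `distributed_executor_backend` engine argument introduced
# with MultiprocessingGPUExecutor in v0.4.3; "mp" selects Python
# multiprocessing workers instead of a Ray cluster.
from vllm import LLM, SamplingParams

llm = LLM(
    model="facebook/opt-6.7b",          # placeholder model
    tensor_parallel_size=2,             # shard across 2 GPUs on one node
    distributed_executor_backend="mp",  # multiprocessing instead of Ray
)

outputs = llm.generate(
    ["The future of distributed inference is"],
    SamplingParams(temperature=0.8, max_tokens=64),
)
for out in outputs:
    print(out.outputs[0].text)
```
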

Suggest a potential alternative/fix

Update the documentation to reflect that MultiprocessingGPUExecutor can be used as an alternative to Ray for single-node inference, as sketched above.
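
The serving docs would presumably need the same note for the OpenAI-compatible server, where the backend should be selectable with the matching CLI flag, e.g. `--distributed-executor-backend mp` alongside `--tensor-parallel-size`; this mirrors the engine argument above and should be verified against the v0.4.3 release before documenting it.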
