[WIP][Feature] Impl the connector based on the llmdatadist for v1 #681
This PR implements the connector functionality for NPU based on LLMDataDist, building on the connector API merged in vLLM v1 (vllm-project/vllm#15960). We've successfully tested various scenarios in offline environments:
Key implementation aspects include:
Cross-machine PD: LLMDataDist requires the NPU device IP to establish a connection. Our approach uses a global rank table (JSON) on each machine containing:
nPmD: Given that the community's nPmD design, particularly the router component API, is still evolving, we've implemented a solution using a meta server component (to be provided separately) that:
We propose initially merging the 1P1D implementation, where the global rank table contains information for two nodes, allowing direct prefill node identification. The nPmD implementation can be refined and merged following community discussion.
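As a rough illustration of the 1P1D lookup described above, here is a minimal sketch. It assumes a hypothetical rank table schema: the keys `nodes`, `role`, and `device_ip`, the helper `find_prefill_node`, and the example addresses are all made up for this example and are not the fields used by this PR or by LLMDataDist.

```python
import json

# Hypothetical rank table: the real schema is not shown in this description.
# The addresses come from the documentation-only range 192.0.2.0/24.
RANK_TABLE_JSON = """
{
  "nodes": [
    {"role": "prefill", "device_ip": "192.0.2.10"},
    {"role": "decode", "device_ip": "192.0.2.11"}
  ]
}
"""


def find_prefill_node(rank_table: dict) -> dict:
    """Return the single prefill entry from a two-node (1P1D) rank table."""
    nodes = rank_table["nodes"]
    if len(nodes) != 2:
        raise ValueError(f"1P1D expects exactly 2 nodes, got {len(nodes)}")
    prefill = [node for node in nodes if node["role"] == "prefill"]
    if len(prefill) != 1:
        raise ValueError(f"expected exactly 1 prefill node, got {len(prefill)}")
    return prefill[0]


rank_table = json.loads(RANK_TABLE_JSON)
prefill_node = find_prefill_node(rank_table)
print(prefill_node["device_ip"])
```

With only two entries in the table, the decode side can identify its prefill peer without a router or meta server, which is why the 1P1D case can be handled independently of the nPmD design.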
Todo: