Can TensorRT-LLM separate the prefill stage and the decode stage onto different GPUs or different nodes with user-defined configuration, and if so, how? #2235

@GGBond8488

Description

Metadata

Assignees

Labels

Investigating; question (Further information is requested); triaged (Issue has been triaged by maintainers)
