Currently, due to compilation issues, we only enable sequence parallelism (and the dependent AsyncTP) for static compile sizes, and not by default. That's because sequence parallelism splits the residual tensor into smaller pieces, which breaks with piecewise compilation and dynamic shapes. #21031 addressed this but got stuck on an Inductor bug that caused extreme memory pressure; that Inductor bug has since been resolved in PyTorch 2.9.
The course of action:
- Pick up Enable sequence parallelism for full cuda graph without specifying compile sizes #21031 and verify that torch==2.9 resolves the memory issue (compare end-to-end activation memory with the pass disabled and enabled).
- Test Enable sequence parallelism for full cuda graph without specifying compile sizes #21031 together with the changes in [torch.compile] CUDAGraph Inductor partition integration #24281 and `-O.use_inductor_graph_partition=True`, to check that full compilation with Inductor graph partitioning works with sequence parallelism (see the sketch after this list).
- Check end-to-end performance of sequence parallelism alone as well as async TP on a dense unquantized and a dense quantized model (both Hopper and Blackwell). Make sure to use full cudagraphs where available.
- Merge the PR (guarded on torch 2.9) and enable sequence parallelism and async TP by default, so the performance lands on day 0 of the torch 2.9 release.
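For context, a minimal sketch of how these knobs could be exercised together from the offline Python API. The `pass_config` key names (`enable_sequence_parallelism`, `enable_async_tp`) and the model name are assumptions for illustration, not something prescribed by this issue:

```python
# Sketch only: enable sequence parallelism + async TP alongside Inductor
# graph partitioning. Config key names and the model are illustrative
# assumptions; adjust to the actual vLLM config at the time of testing.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder dense model
    tensor_parallel_size=2,
    compilation_config={
        # CLI equivalent: -O.use_inductor_graph_partition=True
        "use_inductor_graph_partition": True,
        "pass_config": {
            "enable_sequence_parallelism": True,
            "enable_async_tp": True,
        },
    },
)

outputs = llm.generate(["Hello, my name is"], SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)
```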
Additionally, the padding requirement should be re-evaluated: we should benchmark the performance cost of padding `num_tokens`, and compare it to simply padding by `-num_tokens % tp_size` around the sequence-parallel section, or to doing uneven work across TP ranks by manipulating the sizes returned by `reduce_scatter` (likely too complicated).
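A minimal sketch of the `-num_tokens % tp_size` padding arithmetic mentioned above; the function name and example values are purely illustrative:

```python
# Illustrative only: shows how -num_tokens % tp_size rounds the token count
# up to the next multiple of tp_size, so the residual tensor can be evenly
# reduce-scattered across TP ranks.
def padded_num_tokens(num_tokens: int, tp_size: int) -> int:
    # -num_tokens % tp_size is the smallest pad that makes the total
    # divisible by tp_size (0 when it already is).
    return num_tokens + (-num_tokens % tp_size)

if __name__ == "__main__":
    tp_size = 4
    for num_tokens in (13, 16, 17):
        padded = padded_num_tokens(num_tokens, tp_size)
        print(f"num_tokens={num_tokens} -> padded={padded}, "
              f"{padded // tp_size} tokens per rank")
```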