
Conversation

@zou3519 (Collaborator) commented Oct 15, 2025

In PyTorch 2.9, torch.compile has a bug where the graph partition is not taken into account during caching. Because vLLM's Mode.VLLM_COMPILE is the only mode that uses Inductor graph partition, and VLLM_COMPILE implies there is a PostGradPassManager, we put the list of operators to partition the graph on into the PostGradPassManager's uuid (which then gets incorporated into Inductor's FX graph cache key). Remove this hack once torch.compile fixes the underlying bug.
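
For readers skimming the diff, here is a minimal standalone sketch of the idea, not the actual vLLM implementation: the class name, the splitting_ops attribute, and the uuid() method mirror the description above, while everything else (the hashing scheme, the passes field) is illustrative.

import hashlib
import json

class PostGradPassManager:
    """Sketch only; the real class lives in vLLM's compilation code."""

    def __init__(self, splitting_ops=None):
        # The operators Inductor is asked to partition the graph on;
        # [] when graph partition is disabled or no ops were requested.
        self.splitting_ops = list(splitting_ops or [])
        self.passes = []

    def uuid(self):
        state = {"passes": [str(p) for p in self.passes]}
        # HACK: fold the partition ops into the uuid so they end up in
        # Inductor's FX graph cache key. Remove once torch.compile takes
        # graph partition into account in its own cache key.
        state["splitting_ops"] = self.splitting_ops
        return hashlib.sha256(
            json.dumps(state, sort_keys=True).encode()
        ).hexdigest()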

@gemini-code-assist (bot, Contributor) left a comment

Code Review

This pull request introduces a workaround for a torch.compile caching bug related to graph partitioning by including the list of splitting operators in the PostGradPassManager's UUID. However, I've found a critical issue where the list of splitting operators is not correctly assigned, which means the workaround is currently ineffective. My review includes a specific code suggestion to fix this.

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.


@ProExpertProg (Collaborator) left a comment

I'm confused; do we want to treat no-inductor-partition and inductor-partition-with-empty-splitting-ops differently or not?

# Remove this hack whenever torch.compile fixes it.
self.splitting_ops = None
if config.compilation_config.use_inductor_graph_partition:
    if config.compilation_config.splitting_ops is None:
Collaborator left a comment

Comment that we want empty splitting ops with inductor partition to behave differently than any splitting ops without inductor partition?

state["passes"].append(self.fix_functionalization.uuid())

# See [HACK: Bug with Inductor graph partition and torch.compile cache]
if self.splitting_ops is not None:
Collaborator left a comment

Nvm, in both cases we will end up with an empty list

@zou3519 force-pushed the fix_partition_cache branch from 19ca497 to e12bbdd on October 15, 2025, 23:58
@zou3519 (Collaborator, Author) commented Oct 16, 2025

@ProExpertProg I updated the code to be clearer, if that helps. We want to add "the operators that we ask Inductor to split on" as part of the cache key. If inductor_graph_partition is False, that is no operators; if inductor_graph_partition is True, it is whatever compilation_config.splitting_ops is.
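
A small self-contained sketch of that resolution (the function and argument names here are mine, not the PR's, and the op strings are placeholders):

def resolve_partition_ops(use_inductor_graph_partition, configured_splitting_ops):
    # "The operators we ask Inductor to split on": no partition means no
    # operators; partition enabled means whatever the compilation config says.
    if not use_inductor_graph_partition:
        return []
    return list(configured_splitting_ops or [])

assert resolve_partition_ops(False, ["placeholder.op"]) == []
assert resolve_partition_ops(True, None) == []
assert resolve_partition_ops(True, ["placeholder.op"]) == ["placeholder.op"]

This also answers the earlier question: "no partition" and "partition with empty splitting ops" both contribute an empty list to the cache key.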

@ProExpertProg (Collaborator) left a comment

@zou3519 could you update the test in test_toy_llama that currently disables FX cache to work around this? Then I can add this PR to the inductor partition CI PR
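
As context for why the test can leave caching on, here is a standalone illustration (it reuses the hashing idea from the sketch above; the op name is a placeholder and this is not the real test code): two different partition configurations now hash to different pass-manager uuids, so their Inductor FX graph cache entries cannot collide.

import hashlib
import json

def pass_manager_uuid(splitting_ops):
    # Same idea as the earlier sketch: the partition ops are part of the
    # hashed state, so they reach the FX graph cache key.
    state = {"passes": [], "splitting_ops": splitting_ops}
    return hashlib.sha256(json.dumps(state, sort_keys=True).encode()).hexdigest()

# Different partition configurations -> different uuids -> different cache
# keys, so the test no longer needs to disable the FX cache to stay correct.
assert pass_manager_uuid([]) != pass_manager_uuid(["placeholder.attention_op"])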

In PyTorch 2.9, torch.compile has a bug where the graph
partition is not taken into account during caching.
Because vLLM's Mode.VLLM_COMPILE is the only mode that uses
Inductor graph partition, and VLLM_COMPILE implies there
is a PostGradPassManager, we put the list of operators to graph
partition into the PostGradPassManager's uuid (which
then gets incorporated into Inductor's FX graph cache key).
Remove this hack whenever torch.compile fixes it.

Signed-off-by: Richard Zou <[email protected]>
@zou3519 force-pushed the fix_partition_cache branch from e12bbdd to ad717d4 on October 16, 2025, 00:25
@zou3519 (Collaborator, Author) commented Oct 16, 2025

> could you update the test in test_toy_llama that currently disables FX cache to work around this? Then I can add this PR to the inductor partition CI PR

Yup, updated

@mergify bot added the llama (Related to Llama models) label Oct 16, 2025
@ProExpertProg added the ready (ONLY add when PR is ready to merge/full CI is needed) label Oct 16, 2025
@ProExpertProg (Collaborator) left a comment

Thx

ProExpertProg added a commit to neuralmagic/vllm that referenced this pull request Oct 16, 2025
commit ad717d4
Author: Richard Zou <[email protected]>
Date:   Wed Oct 15 16:29:49 2025 -0700

    [BugFix] Work around graph partition x torch.compile cache issue

    Signed-off-by: Richard Zou <[email protected]>

Signed-off-by: ProExpertProg <[email protected]>
@vllm-bot merged commit 9b6504c into vllm-project:main Oct 16, 2025
45 of 48 checks passed
mandy-li pushed a commit to mandy-li/vllm that referenced this pull request Oct 16, 2025
albertoperdomo2 pushed a commit to albertoperdomo2/vllm that referenced this pull request Oct 16, 2025

Labels

llama (Related to Llama models), ready (ONLY add when PR is ready to merge/full CI is needed)
