Skip to content

Conversation

@amanasifkhalid
Copy link
Contributor

Part of #107749. This replaces optMarkLoopHeads and optFindAndScaleLoopBlocks with the graph-based loop visitor pattern we now use elsewhere in the JIT. I tried to preserve the heuristics in optScaleLoopBlocks, but a side effect of this change is we no longer scale unnatural loops. However, we no longer rely on lexical loop finding, so we gain some precision as well.

Once this and graph-based loop inversion have been merged, we will no longer have any fgRenumberBlocks calls.

Copilot AI review requested due to automatic review settings May 29, 2025 23:31
@github-actions github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label May 29, 2025
@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Replaces the legacy lexical loop-scaling functions with a graph-based visitor pattern for optScaleLoopBlocks, removes the BBF_LOOP_HEAD flag and its handling, and updates diagnostics and block splitting to align with the new approach.

  • Switches loop-weight scaling to use FlowGraphNaturalLoop and VisitLoopBlocks
  • Eliminates optMarkLoopHeads, optFindAndScaleGeneralLoopBlocks, and the BBF_LOOP_HEAD flag
  • Cleans up diagnostic output and block-splitting logic to drop loop-head considerations

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/coreclr/jit/optimizer.cpp Replaced legacy loop-scaling with graph visitor
src/coreclr/jit/block.h Removed BBF_LOOP_HEAD and renumbered flags
src/coreclr/jit/flowgraph.cpp Adjusted split‐flag assertion without loop‐head
src/coreclr/jit/fgdiagnostic.cpp Dropped loop-head from flow-graph dumps
src/coreclr/jit/fgbasic.cpp No longer clears loop-head in block splits
src/coreclr/jit/compiler.h Updated optScaleLoopBlocks signature
src/coreclr/jit/block.cpp Removed loop-head from flag display
Comments suppressed due to low confidence (1)

src/coreclr/jit/optimizer.cpp:66

  • Since unnatural loops are no longer scaled by this change, add unit tests for scenarios with irreducible control flow to verify that loop weights remain unscaled as intended.
for (FlowGraphNaturalLoop* const loop : m_loops->InReversePostOrder())

// 512 -- triple loop nesting
//
void Compiler::optScaleLoopBlocks(BasicBlock* begBlk, BasicBlock* endBlk)
void Compiler::optScaleLoopBlocks(FlowGraphNaturalLoop* loop)
Copy link

Copilot AI May 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The function header should include a brief note that this uses VisitLoopBlocks on FlowGraphNaturalLoop and scales blocks based on back-edge reachability and dominance.

Copilot uses AI. Check for mistakes.
bool dominates = false;

if (m_reachabilitySets->CanReach(curBlk, begBlk) && m_reachabilitySets->CanReach(begBlk, curBlk))
for (FlowEdge* const backEdge : loop->BackEdges())
Copy link

Copilot AI May 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Consider retrieving loop->BackEdges() into a local variable outside the inner loop or lambda to avoid repeated accessor calls and slightly reduce overhead.

Suggested change
for (FlowEdge* const backEdge : loop->BackEdges())
const auto& backEdges = loop->BackEdges();
for (FlowEdge* const backEdge : backEdges)

Copilot uses AI. Check for mistakes.
break;
}
}
reachable |= m_reachabilitySets->CanReach(curBlk, backEdgeSource);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious what the diffs are if you consider reachable here to be always true?

All loop blocks can reach all other loop blocks, so conceptually this would always be true... Except that reachability sets do not consider EH flow, so there is still going to be a small difference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I initially avoided this change to minimize the first round of diffs, but the diffs against this PR are small enough that we might as well include them here:

Diffs are based on 2,723,124 contexts (1,064,836 MinOpts, 1,658,288 FullOpts).

MISSED contexts: 166 (0.01%)

Overall (-310 bytes)
Collection Base size (bytes) Diff size (bytes) PerfScore in Diffs
benchmarks.run.windows.x64.checked.mch 12,261,026 -40 +13.36%
benchmarks.run_pgo_optrepeat.windows.x64.checked.mch 11,699,861 -25 +12.94%
coreclr_tests.run.windows.x64.checked.mch 418,321,055 -105 +90.25%
libraries.crossgen2.windows.x64.checked.mch 38,561,432 +26 +137.62%
libraries.pmi.windows.x64.checked.mch 58,333,813 -133 +180.81%
libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch 155,319,820 -46 +203.67%
realworld.run.windows.x64.checked.mch 11,730,662 +14 +15.73%
smoke_tests.nativeaot.windows.x64.checked.mch 5,067,801 -1 +25.14%
FullOpts (-310 bytes)
Collection Base size (bytes) Diff size (bytes) PerfScore in Diffs
benchmarks.run.windows.x64.checked.mch 12,260,324 -40 +13.36%
benchmarks.run_pgo_optrepeat.windows.x64.checked.mch 11,699,183 -25 +12.94%
coreclr_tests.run.windows.x64.checked.mch 129,087,953 -105 +90.25%
libraries.crossgen2.windows.x64.checked.mch 38,559,781 +26 +137.62%
libraries.pmi.windows.x64.checked.mch 58,221,028 -133 +180.81%
libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch 144,273,582 -46 +203.67%
realworld.run.windows.x64.checked.mch 11,505,783 +14 +15.73%
smoke_tests.nativeaot.windows.x64.checked.mch 5,066,658 -1 +25.14%

Copy link
Member

@jakobbotsch jakobbotsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Nice to see this removed.

@amanasifkhalid
Copy link
Contributor Author

Diffs are in both directions, as expected for a profile change. The number and magnitude of diffs, however, is smaller than I expected.

In benchmarks.run on win-x64, I'm seeing:

  • 112 more CSEs
  • 1 more loop strength-reduced
  • 1 more loop made downward-counted
  • 6 fewer loops IV-widened
  • 50 more loops aligned

So this is relatively low-impact.

@amanasifkhalid amanasifkhalid merged commit b0d68f7 into dotnet:main May 30, 2025
109 checks passed
@amanasifkhalid amanasifkhalid deleted the optSetBlockWeights-graph-based branch May 30, 2025 21:16
@github-actions github-actions bot locked and limited conversation to collaborators Jun 30, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants