-
Notifications
You must be signed in to change notification settings - Fork 5.2k
JIT: Use flowgraph annotations to scale loop blocks in optSetBlockWeights
#116120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JIT: Use flowgraph annotations to scale loop blocks in optSetBlockWeights
#116120
Conversation
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Replaces the legacy lexical loop-scaling functions with a graph-based visitor pattern for optScaleLoopBlocks, removes the BBF_LOOP_HEAD flag and its handling, and updates diagnostics and block splitting to align with the new approach.
- Switches loop-weight scaling to use
FlowGraphNaturalLoopandVisitLoopBlocks - Eliminates
optMarkLoopHeads,optFindAndScaleGeneralLoopBlocks, and theBBF_LOOP_HEADflag - Cleans up diagnostic output and block-splitting logic to drop loop-head considerations
Reviewed Changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| src/coreclr/jit/optimizer.cpp | Replaced legacy loop-scaling with graph visitor |
| src/coreclr/jit/block.h | Removed BBF_LOOP_HEAD and renumbered flags |
| src/coreclr/jit/flowgraph.cpp | Adjusted split‐flag assertion without loop‐head |
| src/coreclr/jit/fgdiagnostic.cpp | Dropped loop-head from flow-graph dumps |
| src/coreclr/jit/fgbasic.cpp | No longer clears loop-head in block splits |
| src/coreclr/jit/compiler.h | Updated optScaleLoopBlocks signature |
| src/coreclr/jit/block.cpp | Removed loop-head from flag display |
Comments suppressed due to low confidence (1)
src/coreclr/jit/optimizer.cpp:66
- Since unnatural loops are no longer scaled by this change, add unit tests for scenarios with irreducible control flow to verify that loop weights remain unscaled as intended.
for (FlowGraphNaturalLoop* const loop : m_loops->InReversePostOrder())
| // 512 -- triple loop nesting | ||
| // | ||
| void Compiler::optScaleLoopBlocks(BasicBlock* begBlk, BasicBlock* endBlk) | ||
| void Compiler::optScaleLoopBlocks(FlowGraphNaturalLoop* loop) |
Copilot
AI
May 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The function header should include a brief note that this uses VisitLoopBlocks on FlowGraphNaturalLoop and scales blocks based on back-edge reachability and dominance.
| bool dominates = false; | ||
|
|
||
| if (m_reachabilitySets->CanReach(curBlk, begBlk) && m_reachabilitySets->CanReach(begBlk, curBlk)) | ||
| for (FlowEdge* const backEdge : loop->BackEdges()) |
Copilot
AI
May 29, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Consider retrieving loop->BackEdges() into a local variable outside the inner loop or lambda to avoid repeated accessor calls and slightly reduce overhead.
| for (FlowEdge* const backEdge : loop->BackEdges()) | |
| const auto& backEdges = loop->BackEdges(); | |
| for (FlowEdge* const backEdge : backEdges) |
src/coreclr/jit/optimizer.cpp
Outdated
| break; | ||
| } | ||
| } | ||
| reachable |= m_reachabilitySets->CanReach(curBlk, backEdgeSource); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious what the diffs are if you consider reachable here to be always true?
All loop blocks can reach all other loop blocks, so conceptually this would always be true... Except that reachability sets do not consider EH flow, so there is still going to be a small difference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I initially avoided this change to minimize the first round of diffs, but the diffs against this PR are small enough that we might as well include them here:
Diffs are based on 2,723,124 contexts (1,064,836 MinOpts, 1,658,288 FullOpts).
MISSED contexts: 166 (0.01%)
Overall (-310 bytes)
| Collection | Base size (bytes) | Diff size (bytes) | PerfScore in Diffs |
|---|---|---|---|
| benchmarks.run.windows.x64.checked.mch | 12,261,026 | -40 | +13.36% |
| benchmarks.run_pgo_optrepeat.windows.x64.checked.mch | 11,699,861 | -25 | +12.94% |
| coreclr_tests.run.windows.x64.checked.mch | 418,321,055 | -105 | +90.25% |
| libraries.crossgen2.windows.x64.checked.mch | 38,561,432 | +26 | +137.62% |
| libraries.pmi.windows.x64.checked.mch | 58,333,813 | -133 | +180.81% |
| libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch | 155,319,820 | -46 | +203.67% |
| realworld.run.windows.x64.checked.mch | 11,730,662 | +14 | +15.73% |
| smoke_tests.nativeaot.windows.x64.checked.mch | 5,067,801 | -1 | +25.14% |
FullOpts (-310 bytes)
| Collection | Base size (bytes) | Diff size (bytes) | PerfScore in Diffs |
|---|---|---|---|
| benchmarks.run.windows.x64.checked.mch | 12,260,324 | -40 | +13.36% |
| benchmarks.run_pgo_optrepeat.windows.x64.checked.mch | 11,699,183 | -25 | +12.94% |
| coreclr_tests.run.windows.x64.checked.mch | 129,087,953 | -105 | +90.25% |
| libraries.crossgen2.windows.x64.checked.mch | 38,559,781 | +26 | +137.62% |
| libraries.pmi.windows.x64.checked.mch | 58,221,028 | -133 | +180.81% |
| libraries_tests_no_tiered_compilation.run.windows.x64.Release.mch | 144,273,582 | -46 | +203.67% |
| realworld.run.windows.x64.checked.mch | 11,505,783 | +14 | +15.73% |
| smoke_tests.nativeaot.windows.x64.checked.mch | 5,066,658 | -1 | +25.14% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Nice to see this removed.
|
Diffs are in both directions, as expected for a profile change. The number and magnitude of diffs, however, is smaller than I expected. In
So this is relatively low-impact. |
Part of #107749. This replaces
optMarkLoopHeadsandoptFindAndScaleLoopBlockswith the graph-based loop visitor pattern we now use elsewhere in the JIT. I tried to preserve the heuristics inoptScaleLoopBlocks, but a side effect of this change is we no longer scale unnatural loops. However, we no longer rely on lexical loop finding, so we gain some precision as well.Once this and graph-based loop inversion have been merged, we will no longer have any
fgRenumberBlockscalls.