Fix arm64 prolog generation for register masks with holes. #21395
As a bonus, I will improve our prolog/epilog generation by using It will give us a few nice asm diffs on arm64.
PTAL @BruceForstall @dotnet/jit-contrib
@dotnet-bot test Ubuntu16.04 arm64 Cross Checked jitstress2_jitstressregs0x10 Build and Test
arm64 pmi fails with
Is it a known issue?
Not known by me. Looks like an issue building jitutils on arm64.
@dotnet-bot test
@dotnet-bot test Windows_NT x64 Checked CoreFX Tests
@dotnet-bot Windows_NT arm64 Cross Checked Innerloop Build and Test
@dotnet-bot test Windows_NT arm64 Cross Checked Innerloop Build and Test
Ubuntu16.04 arm64 Cross Checked jitstress2_jitstressregs2 Build and Test fails due to #21500.
@BruceForstall Could you please take another look?
A few minor comments.
As we discussed in person, I'm mostly concerned about testing. So:
- There should be no asm diffs with this change (including unwind codes) which you indicated is true.
- You should attempt to verify that unwinding with the newly generated unwind info that includes "gaps" should work. My suggestion was to add a register stress mode that causes there to be many gaps, like removing all the odd callee-saved registers from the set of available registers. Then, you need to force unwinding to happen. One way is to run GCStress=4/8/C.
Overall, looks like a nice simplification and cleanup to the code.
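To make the suggested stress mode concrete, here is a minimal sketch of removing the odd callee-saved registers from the set of available registers so that the saved-register mask ends up with many gaps. It is not the JIT's code: the `RegMask` type, the `removeOddCalleeSaved` helper, and the "bit i stands for register x<i>" layout are assumptions made for illustration. The x19..x28 range is the arm64 callee-saved integer register range.

```cpp
#include <cstdint>

// Assumption for this sketch: bit i of the mask stands for integer register x<i>.
using RegMask = std::uint64_t;

constexpr int kFirstCalleeSaved = 19; // x19
constexpr int kLastCalleeSaved  = 28; // x28

// Hypothetical helper: clear every odd-numbered callee-saved register
// (x19, x21, ..., x27) from the available set. Registers outside the
// callee-saved range are left untouched.
RegMask removeOddCalleeSaved(RegMask available)
{
    for (int reg = kFirstCalleeSaved; reg <= kLastCalleeSaved; ++reg)
    {
        if ((reg & 1) != 0)
        {
            available &= ~(RegMask(1) << reg);
        }
    }
    return available;
}
```

With such a filter in place, any method that uses several callee-saved registers would save only x20, x22, x24, and so on, so every saved register is separated from the next by a hole.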
src/jit/codegenarm64.cpp
// spOffset - stack pointer offset value;
// slotSize - stack slot size in bytes.
//
void CheckSPOffset(bool isRegsCountOdd, int spOffset, int slotSize)
I'm not fond of this and the functions below (buildRegPairsStack, GetSlotSizeForRegsInMask) being unprototyped, file-level functions. I'd prefer they be members of CodeGen, perhaps static members.
We do not have a separate header file like `codegenarm64.h`, so I will have to place it into `codegen.h` under `#if defined(_TARGET_ARM64_) && defined(DEBUG)`. Do you think it is worth it?
Yes, please. (The others don't need DEBUG)
Fixed in b889dc2, does it look better?
Checked, got a clean run with gcStress=0xc and 50% holes.
@BruceForstall The PR was updated, please take another look.
LGTM if tests pass. Please trigger some arm64 jobs.
@dotnet-bot test
test Windows_NT arm Cross Debug Innerloop Build
…reclr#21395)

* Check that `genSaveCalleeSavedRegistersHelp` doesn't accept `REG_LR`. There are two callers and both save `LR` with `FP` before calling this helper. If somebody calls this function with `REG_LR`, it won't work because unwinding info doesn't expect holes in the reg pairs.
* Extract `genPushOrPopCalleeSavedRegisters`. It will be used for both float and int types.
* Extend `genSaveCaleeSavedRegisterGroup` to support float. The previous version was a copy-paste with an additional case for `REG_LR` that was unreachable. Fix that.
* Use `genSaveCaleeSavedRegisterGroup` for floats.
* Extract `genRestoreCaleeSavedRegisterGroup`. It will be used for both int and float groups.
* Prepare `genPopCalleeSavedRegisters` to work with float.
* Use `genRestoreCaleeSavedRegisterGroup` for float.
* Check that `genRestoreCalleeSavedRegistersHelp` doesn't restore `REG_LR`. Both callers do it later.
* Extract `CheckSPOffset` and move it out of loops because it works only before the first save instruction.
* Format `genRestoreCaleeSavedRegisterGroup` as `genSaveCalleeSavedRegistersHelp`.
* Extract `buildRegPairsStack` from `genSaveCaleeSavedRegisterGroup` and `genRestoreCaleeSavedRegisterGroup`. Build a stack of registers that we want to save/restore and then iterate over it doing the actual saving/restoring.
* Extract `GetSlotSizeForRegsInMask`.
* Tolerate holes in arm64 prolog/epilog register masks. Fixes dotnet/coreclr#21363
* Reenable the test.
* Add new method headers.
* Fix typos.
* Do not use non-const references.
* Describe `buildRegPairsStack` better.
* Change signature of `buildRegPairsStack` to avoid copying big structs.
* Return the logic for `RBM_LR`, for future use.
* Clean up some unused variables.
* Fix comments.
* Get rid of file-level functions.
* Make new methods static.

Commit migrated from dotnet/coreclr@a9ceb15
This PR cleans up `genSaveCalleeSavedRegistersHelp` and `genRestoreCalleeSavedRegistersHelp` to remove code duplication and then fixes issue #21363.

No diffs on arm64 checked (-c, -f) with crossgen and pmi.

644977e: Check that `genSaveCalleeSavedRegistersHelp` doesn't accept `REG_LR`. There are two callers and both save `LR` with `FP` before calling this helper. If somebody calls this function with `REG_LR`, it won't work because unwinding info doesn't expect holes in the reg pairs.
7a51619: Extract `genPushOrPopCalleeSavedRegisters`. It will be used for both float and int types.
11c4e0d: Extend `genSaveCaleeSavedRegisterGroup` to support float. The previous version was a copy-paste with an additional case for `REG_LR` that was unreachable. Fix that.
ac622c1: Use `genSaveCaleeSavedRegisterGroup` for floats.
fa0be31: Extract `genRestoreCaleeSavedRegisterGroup`. It will be used for both int and float groups.
f508521: Prepare `genPopCalleeSavedRegisters` to work with float.
3e92518: Use `genRestoreCaleeSavedRegisterGroup` for float.
c088641: Check that `genRestoreCalleeSavedRegistersHelp` doesn't restore `REG_LR`. Both callers do it later.
e36ab6d: Extract `CheckSPOffset` and move it out of loops because it works only before the first save instruction.
d9fd6ff: Format `genRestoreCaleeSavedRegisterGroup` as `genSaveCalleeSavedRegistersHelp`.
6870f6f: Extract `buildRegPairsStack` from `genSaveCaleeSavedRegisterGroup` and `genRestoreCaleeSavedRegisterGroup`. Build a stack of registers that we want to save/restore and then iterate over it doing the actual saving/restoring.
ec076a1: Extract `GetSlotSizeForRegsInMask`.
6f0fa31: Tolerate holes in arm64 prolog/epilog register masks. Fixes #21363
b91d19f: Reenable the test.
90c355b: Add new method headers.
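For readers following the `buildRegPairsStack` step, the idea of turning a register mask with holes into a list of save/restore entries can be sketched roughly as below. This is an illustration under stated assumptions, not the code from this PR: `RegMask`, `RegPair`, and `buildPairs` are hypothetical names, bit i of the mask is assumed to stand for register number i, and two registers are assumed to be paired only when they are adjacent, so that each pair describes a consecutive register pair and a register next to a hole is saved alone.

```cpp
#include <cstdint>
#include <vector>

// Assumption for this sketch: bit i of the mask stands for register number i.
using RegMask = std::uint64_t;

// One save/restore entry: either two adjacent registers or a single register.
struct RegPair
{
    int reg1;
    int reg2; // -1 when reg1 is saved on its own
};

// Hypothetical helper: walk the mask from the lowest register to the highest.
// Two registers form a pair only when they are adjacent; a register followed
// by a hole becomes a single-register entry.
std::vector<RegPair> buildPairs(RegMask mask)
{
    std::vector<RegPair> pairs;
    for (int reg = 0; reg < 64; ++reg)
    {
        if (((mask >> reg) & 1) == 0)
        {
            continue; // hole in the mask
        }
        const bool nextSet = (reg + 1 < 64) && (((mask >> (reg + 1)) & 1) != 0);
        if (nextSet)
        {
            pairs.push_back({reg, reg + 1});
            ++reg; // the neighbour was consumed by this pair
        }
        else
        {
            pairs.push_back({reg, -1});
        }
    }
    return pairs;
}
```

For a mask containing registers 19, 20, 22, 23 and 24, this yields the entries (19, 20), (22, 23) and a single 24. A save loop would then iterate over the entries in order and a restore loop in reverse, which matches the "build first, then iterate" structure described in 6870f6f.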