Use saveNext unwind opcode on arm64. #21683
PTAL @BruceForstall @dotnet/arm64-contrib
The beginnings of coreclr/src/jit/codegenarm64.cpp Lines 566 to 583 in 03f0b6d
coreclr/src/jit/codegenarm64.cpp Lines 704 to 720 in 03f0b6d
but the only way that I saw to fix that was to create another function like
@dotnet-bot test OSX10.12 x64 Checked Innerloop Build and Test
test Windows_NT arm Cross Checked Innerloop Build and Test
In your example diff, why doesn't
also get replaced by
Are there diff examples where Apparently,
because x27, x28 were not saved as pair for this frame.
So what's the advantage? Just curious, I'm not very familiar with unwind info.
I was rewriting this part to fix the issue (#21395) and did not want to leave any commented lines or
@mikedn the
We do not support this case, but we can easily add this. However, I do not see any examples of saving float registers in System.Private.CoreLib.dasm for arm64, so we won't see any diffs and the testing will be poor. Is it expected that we do not have any floats in
@sandreenko I'm not sure I understand this. I think that the x19/x20 case forms the "base", then the
That seems odd; there are 8 callee-saved FP regs on arm64. I tried and see 61 cases of
src/jit/codegenarm64.cpp (outdated)
        continue;
    }

    if (genCanUseSaveNextPair(prev, curr) && genCanUseSaveNextPair(curr, next))
Is `&& genCanUseSaveNextPair(curr, next)` really required?
I would expect the code to not depend on `next`, so it would simply be:
    for (int i = 1; i < regStack->Height(); ++i)
    {
        RegPair& curr = regStack->BottomRef(i);
        RegPair& prev = regStack->BottomRef(i - 1);
        if (prev.reg2 == REG_NA || curr.reg2 == REG_NA)
        {
            continue;
        }
        if (genCanUseSaveNextPair(prev, curr))
        {
            curr.useSaveNextPair = true;
        }
    }
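A minimal, self-contained sketch of the loop suggested above, for readers without the JIT sources at hand. The `RegPair` struct, the `REG_NA` value, and the `canUseSaveNextPair` predicate here are hypothetical stand-ins, not the coreclr definitions; in particular, the predicate assumes that a pair qualifies when its first register is numbered directly after the previous pair's second register.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the JIT's "no register" marker.
constexpr int REG_NA = -1;

// Hypothetical stand-in for the JIT's RegPair; registers are plain numbers.
struct RegPair
{
    int  reg1;
    int  reg2;
    bool useSaveNextPair;

    explicit RegPair(int r1, int r2 = REG_NA) : reg1(r1), reg2(r2), useSaveNextPair(false) {}
};

// Assumption: curr can be described by save_next when it starts at the
// register right after the last register of prev.
bool canUseSaveNextPair(const RegPair& prev, const RegPair& curr)
{
    return curr.reg1 == prev.reg2 + 1;
}

// Marks every pair that directly continues the previous pair.
// Entries holding a single register (reg2 == REG_NA) never take part.
void setUseSaveNextPairs(std::vector<RegPair>& regStack)
{
    for (std::size_t i = 1; i < regStack.size(); ++i)
    {
        RegPair&       curr = regStack[i];
        const RegPair& prev = regStack[i - 1];
        if (prev.reg2 == REG_NA || curr.reg2 == REG_NA)
        {
            continue;
        }
        if (canUseSaveNextPair(prev, curr))
        {
            curr.useSaveNextPair = true;
        }
    }
}
```

With the pairs (x19, x20), (x21, x22), (x25, x26) and a lone x27, only (x21, x22) would be marked under these assumptions: the first pair has no predecessor, (x25, x26) does not continue x22, and x27 is not a pair.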
I see. Let's imagine we have a mask of `r3, r4, r5, r6, r7, r8`; then we can do only `stp r3, r4; store_next; stp r7, r8` in the prolog, and the epilog will do it in the reversed order: `stp r7, r8; store_next; stp r3, r4`.
We can't start an epilog with `store_next`, which means we can't finish a prolog with `store_next` if the two are to keep matching.
Why can't we start an epilog with `save_next`?
FYI, you can see how `save_next` is handled (and how all the unwinding happens) here: https://github.com/dotnet/coreclr/blob/master/src/unwinder/arm64/unwinder_arm64.cpp
Thank you for explaining this, fixed.
Thanks, now I see, there was something strange with my search that did not see the file:
Now I see them. However, we need sequences of 3 or more consecutive pairs to be able to use I will push the change that supports using of
test Windows_NT arm Cross Checked Innerloop Build and Test
The PR was updated to support please take another look @BruceForstall.
Can you show some asm diffs here? Both for pure int
As discussed above, we have examples only for the first case:
or
instead of
a new infrastructure failure on arm64 windows:
@dotnet/dnceng PTAL
test Windows_NT arm Cross Checked Innerloop Build and Test
@sandreenko this problem has been around since the mid-2000s; CoreCLR tests just look like viruses, independent of the architecture used, and anti-virus must have exceptions for the build output folders for the builds to succeed. I'm unsure how you just started noticing this (I'd guess perhaps more stuff is building, the build agent changed, or tests were disabled before... dunno?). @meganaquinn is actively working to address this via https://github.com/dotnet/core-eng/issues/4555
test OSX10.12 x64 Checked Innerloop Build and Test
test Ubuntu16.04 arm64 Cross Checked jitstress2_jitstressregs2 Build and Test
Windows_NT arm Cross Debug Innerloop Build fails because of "file contains a virus or potentially unwanted software". I have checked the diffs once more and found that the change improved size for 9508 unwind sections (means
and each Code Word is 4 bytes long. So it means we should expect ~40 Kbytes improvement, but I see only ~10 in the crossgened System.Private.CoreLib.dll image size diff; where do we lose the other 30?
It's very likely we don't gain much due to alignment. I think the full alignment data is 4-byte aligned, and padded.
Unwind Info is currently 4-byte aligned, so we have cases where we replaced one
but even with that we should see 9508 code words improvement.
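The alignment argument in this exchange can be made concrete with a small sketch. It assumes only what is stated in the thread (unwind codes are padded to whole 4-byte code words) plus one assumption of mine: that a single replacement shortens a function's unwind code bytes by one byte. The helper names are hypothetical and not part of the JIT.

```cpp
#include <cassert>
#include <cstddef>

// Size of one unwind code word, as stated in the discussion.
constexpr std::size_t kCodeWordBytes = 4;

// Number of 4-byte code words needed to hold codeBytes bytes of unwind codes.
constexpr std::size_t unwindCodeWords(std::size_t codeBytes)
{
    return (codeBytes + kCodeWordBytes - 1) / kCodeWordBytes;
}

// Code words saved when the unwind codes shrink from bytesBefore to bytesAfter.
constexpr std::size_t wordsSaved(std::size_t bytesBefore, std::size_t bytesAfter)
{
    return unwindCodeWords(bytesBefore) - unwindCodeWords(bytesAfter);
}

// Upper bound quoted in the thread: 9508 sections, one 4-byte word each.
constexpr std::size_t kExpectedBytes = 9508 * kCodeWordBytes; // 38032 bytes
```

Under these assumptions, shrinking 9 bytes of codes to 8 drops a whole code word (`wordsSaved(9, 8) == 1`), while shrinking 8 bytes to 7 saves nothing (`wordsSaved(8, 7) == 0`), because the padding absorbs the difference. If every one of the 9508 sections dropped a word, the total would be 38032 bytes, which matches the "~40 Kbytes" estimate above.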
@BruceForstall I think it is ready for another round of review.
Looks good! Thanks for the cleanup, too.
@BruceForstall thank you for the review.
* Use `saveNext` opcode on arm64.
* Support using of `save_next` on int/float border.
* Delete the extra requirement that an epilog sequences can't start from `save_next`.
* response feedback
Commit migrated from dotnet/coreclr@62298e6
According to the Microsoft ARM64 exception handling doc, we can use `save_next` as an unwind code. We have had part of it implemented in `genPrologSaveRegPair`, but before the cleanup in #21395 it was tricky to support it in the epilog generation and keep the prolog/epilog unwind infos matched.
This PR adds `genSetUseSaveNextPairs`, which marks register pairs that we can save/restore with `save_next`, and teaches `genEpilogRestoreRegPair` to use `save_next` (`genPrologSaveRegPair` already knew how to do that).
Asm diffs for System.Private.CoreLib (arm64 checked, altjit):
that look like:
and there is a tiny improvement in the native image `System.Private.CoreLib.dll` size.
Tested with GC_Stress=0xc and forced holes on arm64.
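To illustrate where the size win comes from, here is a hedged sketch of how the two unwind codes could be encoded. It is based on my reading of the ARM64 exception handling doc, where `save_regp` takes two bytes (`110010xx'xxzzzzzz`, saving the pair starting at x(19+X) at `[sp + Z*8]`) and `save_next` takes one byte (`11100110`). The function names are hypothetical and this is not the JIT's unwind emitter.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Sketch of save_regp: 110010xx'xxzzzzzz.
// firstReg is the number of the first register of the pair (x19 or higher),
// spOffset is the byte offset from sp (a multiple of 8, at most 504).
std::vector<std::uint8_t> encodeSaveRegPair(int firstReg, int spOffset)
{
    assert(firstReg >= 19 && firstReg - 19 <= 15); // X is a 4-bit field
    assert(spOffset >= 0 && spOffset <= 504 && spOffset % 8 == 0);

    const unsigned x = static_cast<unsigned>(firstReg - 19);
    const unsigned z = static_cast<unsigned>(spOffset / 8);

    return {static_cast<std::uint8_t>(0xC8 | (x >> 2)),
            static_cast<std::uint8_t>(((x & 0x3) << 6) | z)};
}

// Sketch of save_next: 11100110. It carries no operands, because the pair
// and its location are implied by the preceding save code.
std::vector<std::uint8_t> encodeSaveNext()
{
    return {0xE6};
}
```

Under this reading, describing the pair x21, x22 at `[sp + 16]` with `save_regp` costs two bytes, while `save_next` costs one; that one-byte difference per marked pair is the source of the smaller unwind sections.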