Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Fix arm64 prolog generation for register masks with holes. #21395

Merged
merged 24 commits into from
Dec 21, 2018

Conversation

sandreenko
Copy link

@sandreenko sandreenko commented Dec 6, 2018

This PR cleans genSaveCalleeSavedRegistersHelp and genRestoreCalleeSavedRegistersHelp to delete code duplication and then fixes issue #21363.

No diffs on arm64 checked (-c, -f) with crossgen and pmi.

644977e: Check that genSaveCalleeSavedRegistersHelp doesn't accept REG_LR.

There are two callers and both save LR with FP before calling this helper.
In case if somebody calls this function wthREG_LR it won't work because unwinding info doesn't expect holes in the reg pairs.

7a51619: Extract genPushOrPopCalleeSavedRegisters.

It will be used for both float and int types.

11c4e0d: Extend genSaveCaleeSavedRegisterGroup to support float.

The previous version was a copy-paste with an additional case for REG_LR that was unreachable. Fix that.

ac622c1: Use genSaveCaleeSavedRegisterGroup for floats.

fa0be31: Extract genRestoreCaleeSavedRegisterGroup.

It will be used for both int and float groups.

f508521: Prepare genPopCalleeSavedRegisters to work with float.

3e92518: Use genRestoreCaleeSavedRegisterGroup for float.

c088641: Check that genRestoreCalleeSavedRegistersHelp doesn't restore REG_LR.

Both callers do it later.

e36ab6d: Extract CheckSPOffset.

and move it out of loops because it works only before the first save intruction.

d9fd6ff: Format genRestoreCaleeSavedRegisterGroup as genSaveCalleeSavedRegistersHelp.

6870f6f: Extract buildRegPairsStack from genSaveCaleeSavedRegisterGroup and genRestoreCaleeSavedRegisterGroup.

Build a stack of registers that we want to save/restore and then iterate over it doing the actual saving/restoring.

ec076a1: Extract GetSlotSizeForRegsInMask.

6f0fa31: Tolerate holes in arm64 prolog/epilog register masks.

Fixes #21363

b91d19f: Reenable the test.

90c355b: Add new methods headers.

@sandreenko
Copy link
Author

sandreenko commented Dec 6, 2018

As bonus I will improve our prolog/epilog generation by using save_next. It is clear how to do it after the cleaning.

It will give us few nice asm diffs on arm64.

@sandreenko
Copy link
Author

PTAL @BruceForstall @dotnet/jit-contrib

@sandreenko
Copy link
Author

@dotnet-bot test Ubuntu16.04 arm64 Cross Checked jitstress2_jitstressregs0x10 Build and Test
@dotnet-bot test Ubuntu16.04 arm64 Cross Checked jitstress2_jitstressregs2 Build and Test
@dotnet-bot test Windows_NT arm64 Cross Checked jitstress2_jitstressregs0x1000 Build and Test
@dotnet-bot test Ubuntu16.04 arm64 Cross Checked pmi_asm_diffs
@dotnet-bot test Windows_NT x64_arm64_altjit Checked pmi_asm_diffs

@sandreenko
Copy link
Author

arm64 pmi fails with

error NU1605: Detected package downgrade: System.Collections from 4.3.0 to 4.0.11. Reference the package directly from the project to select a different version. [/home/robox/j/workspace/dotnet_coreclr/master/arm64_cross_checked_ubuntu16.04_pmi_asm_diffs_tst_prtest/_/pmi/jitutils/jitutils.sln]

Is it a known issue?

@BruceForstall
Copy link

Not known by me.

Looks like an issue building jitutils on arm64.

@sandreenko
Copy link
Author

@dotnet-bot test
Ubuntu16.04 arm64 Cross Checked jitstress2_jitstressregs2 Build and Test
@dotnet-bot Windows_NT arm64 Cross Checked Innerloop Build and Test
@dotnet-bot test Windows_NT arm64 Cross Checked jitstress2_jitstressregs0x1000 Build and Test

@sandreenko
Copy link
Author

@dotnet-bot test Windows_NT x64 Checked CoreFX Tests

@sandreenko
Copy link
Author

@dotnet-bot Windows_NT arm64 Cross Checked Innerloop Build and Test

@sandreenko
Copy link
Author

@dotnet-bot test Windows_NT arm64 Cross Checked Innerloop Build and Test
@dotnet-bot test Windows_NT arm64 Cross Checked jitstress2_jitstressregs0x1000 Build and Test

@sandreenko
Copy link
Author

Ubuntu16.04 arm64 Cross Checked jitstress2_jitstressregs2 Build and Test fails due to #21500 .

@sandreenko
Copy link
Author

@BruceForstall Could you please take another look?

Copy link

@BruceForstall BruceForstall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor comments.

As we discussed in person, I'm mostly concerned about testing. So:

  1. There should be no asm diffs with this change (including unwind codes) which you indicated is true.
  2. You should attempt to verify that unwinding with the newly generated unwind info that includes "gaps" should work. My suggestion was to add a register stress mode that causes there to be many gaps, like removing all the odd callee-saved registers from the set of available registers. Then, you need to force unwinding to happen. One way is to run GCStress=4/8/C.

Overall, looks like a nice simplification and cleanup to the code.

// spOffset - stack pointer offset value;
// slotSize - stack slot size in bytes.
//
void CheckSPOffset(bool isRegsCountOdd, int spOffset, int slotSize)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not fond of this and the functions below (buildRegPairsStack, GetSlotSizeForRegsInMask) being unprototyped, file-level functions. I'd prefer they be members of CodeGen, perhaps static members.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not have a separate header file like codegenarm64.h, so I will have to place it into codegen.h under #if defined(_TARGET_ARM64_) && defined(DEBUG), do you think it is worth it?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please. (The others don't need DEBUG)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in b889dc2 , does it look better?

@sandreenko sandreenko force-pushed the FixArm64PrologGeneration branch from a677a64 to b889dc2 Compare December 20, 2018 22:05
@sandreenko
Copy link
Author

sandreenko commented Dec 20, 2018

You should attempt to verify that unwinding with the newly generated unwind info that includes "gaps" should work. My suggestion was to add a register stress mode that causes there to be many gaps, like removing all the odd callee-saved registers from the set of available registers. Then, you need to force unwinding to happen. One way is to run GCStress=4/8/C.

Checked, got clean run with gcStress=0xc and 50% holes.

Sergey Andreenko added 10 commits December 20, 2018 16:33
There are two callers and both save `LR` with `FP` before calling this helper.
In case if somebody calls this function wth`REG_LR` it won't work because unwinding info doesn't expect holes in the reg pairs.
It will be used for both float and int types.
The previous version was a copy-paste with an additional case for `REG_LR` that was unreachable. Fix that.
It will be used for both int and float groups.
and move it out of loops because it works only before the first save intruction.
@sandreenko sandreenko force-pushed the FixArm64PrologGeneration branch from b889dc2 to 46010a5 Compare December 21, 2018 00:34
@sandreenko
Copy link
Author

@BruceForstall The PR was updated,please take another look.

Copy link

@BruceForstall BruceForstall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM if tests pass. Please trigger some arm64 jobs.

@sandreenko
Copy link
Author

@dotnet-bot test
Ubuntu16.04 arm64 Cross Checked jitstress2_jitstressregs2 Build and Test
@dotnet-bot Windows_NT arm64 Cross Checked Innerloop Build and Test
@dotnet-bot test Windows_NT arm64 Cross Checked jitstress2_jitstressregs0x1000 Build and Test

@sandreenko
Copy link
Author

test Windows_NT arm Cross Debug Innerloop Build
test Windows_NT arm64 Cross Debug Innerloop Build
test Windows_NT arm Cross Checked Innerloop Build and Test

@sandreenko sandreenko merged commit a9ceb15 into dotnet:master Dec 21, 2018
@sandreenko sandreenko deleted the FixArm64PrologGeneration branch January 23, 2019 00:12
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
…reclr#21395)

* Check that `genSaveCalleeSavedRegistersHelp` doesn't accept `REG_LR`.

There are two callers and both save `LR` with `FP` before calling this helper.
In case if somebody calls this function wth`REG_LR` it won't work because unwinding info doesn't expect holes in the reg pairs.

* Extract `genPushOrPopCalleeSavedRegisters`.

It will be used for both float and int types.

* Extend `genSaveCaleeSavedRegisterGroup` to support float.

The previous version was a copy-paste with an additional case for `REG_LR` that was unreachable. Fix that.

* Use `genSaveCaleeSavedRegisterGroup` for floats.

* Extract `genRestoreCaleeSavedRegisterGroup`.

It will be used for both int and float groups.

* Prepare `genPopCalleeSavedRegisters` to work with float.

* Use `genRestoreCaleeSavedRegisterGroup` for float.

* Check that `genRestoreCalleeSavedRegistersHelp` doesn't restore `REG_LR`.

Both callers do it later.

* Extract `CheckSPOffset`.

and move it out of loops because it works only before the first save intruction.

* Format `genRestoreCaleeSavedRegisterGroup` as `genSaveCalleeSavedRegistersHelp`.

* Extract `buildRegPairsStack` from `genSaveCaleeSavedRegisterGroup` and `genRestoreCaleeSavedRegisterGroup`.

Build a stack of registers that we want to save/restore and then iterate over it doing the actual saving/restoring.

* Extract `GetSlotSizeForRegsInMask`.

* Tolerate holes in arm64 prolog/epilog register masks.

Fixes dotnet/coreclr#21363

* Reenable the test.

* Add new methods headers.

* Fix typos.

* Do not use non-const references.

* Describe `buildRegPairsStack` better.

* Change signature of `buildRegPairsStack` to avoid copyings of big structs.

* Return the logic for `RBM_LR`.

For future use.

* Clean-up some unused variables.

* Fix comments.

* Get rid of file-level functions.

* Make new methods static.


Commit migrated from dotnet/coreclr@a9ceb15
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[arm64] the prolog generation for arm64 doesn't support holes in regMasks.
2 participants