-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Conversation
This is an x86-only change. |
@dotnet/jit-contrib |
@dotnet-bot help |
Welcome to the dotnet/coreclr Repository The following is a list of valid commands on this PR. To invoke a command, comment the indicated phrase on the PR The following commands are valid for all PRs and repositories. Click to expand
The following jobs are launched by default for each PR against dotnet/coreclr:master. Click to expand
The following optional jobs are available in PRs against dotnet/coreclr:master. Click to expand
Have a nice day! |
Welcome to the dotnet/coreclr Perf help The following is a list of valid commands on this PR. To invoke a command, comment the indicated phrase on the PR The following commands are valid for all PRs and repositories. Click to expand
The following optional jobs are available in PRs against dotnet/coreclr:master. Click to expand
Have a nice day! |
Welcome to the dotnet/coreclr Repository The following is a list of valid commands on this PR. To invoke a command, comment the indicated phrase on the PR The following commands are valid for all PRs and repositories. Click to expand
The following jobs are launched by default for each PR against dotnet/coreclr:master. Click to expand
The following optional jobs are available in PRs against dotnet/coreclr:master. Click to expand
Have a nice day! |
@dotnet-bot test Windows_NT x86 Checked gcstress0xc_zapdisable_jitstress2 |
@dotnet-bot test Ubuntu x64 Formatting |
@dotnet-bot test Ubuntu arm64 Cross Debug Innerloop Build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
info - Incoming arg used to determine if there's a frame, and to save results | ||
table - The pointer table | ||
curOffsRegs - The current code offset that should be used for reporting registers | ||
curOffsArgs - The current code offset that should be used for reporting args |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, and/or below where you assert that one is less than the other, you should add a comment describing why these are (or may be) different (or refer to the comment where this is called from EnumGcRefs()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a comment before the assert that references the comment in EnumGCRefs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it would be better that instead of two offset arguments to have just one offset and a flag requiring the register offset to be adjusted (e.g. inactiveFrame
, callInProgress
etc.). That would avoid that need for asserts such as "one is less than the other".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I started with that version but decided this is better. This method is called from several places and some of them don't really care about the registers and don't have the active/inactive frame info. It seemed better to just pass the same curOffs from those callers.
When enumerating live gc registers, if we are not on the active stack frame, we need to report callee-save gc registers that are live before the call. The reason is that the liveness of gc registers may change across a call to a method that does not return. In this case the instruction after the call may be a jump target and a register that didn't have a live gc pointer before the call may have a live gc pointer after the jump. To make sure we report the registers that have live gc pointers before the call we subtract 1 from curOffs.
@dotnet-bot test Windows_NT x86 Checked gcstress0xc_zapdisable_jitstress2 |
@dotnet-bot test Windows_NT x64 Formatting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks Good
@dotnet-bot test CentOS7.1 x64 Checked Innerloop Build and Test |
@dotnet-bot test OSX10.12 x64 Checked Innerloop Build and Test |
Are we sure this is the right time to make such a fix? This surely is the right fix but the one that inserts FWIW I debugged a case that involved callee saved registers that were live across the call, just to ensure that they're still reported correctly. I also tried a case involving nested calls, at some point I thought that inactive frames could simply skip reporting arguments but that doesn't work for nested calls. So after spending hours in the debugger I'm relatively sure that's it all right. Though it still seems scary :) |
I have similar reservations. It seems odd to me that we'd just discover now that the x86 gc reporting has had a bug all these years. Perhaps it is the no return call blocks being transformed into Even if we move ahead with this change, I also like the idea of putting |
I believe the problem has existed only since August 2016 after #6103 that changed the basic blocks for calls that don't return to BBJ_THROW. |
3 tests are failing in Windows_NT x86 Checked gcstress0xc_zapdisable_jitstress2: All 3 failed with "Consistency check failed: hit privileged instruction". @AndyAyersMS Are these these failures still expected? |
I verified that the repro from @AndyAyersMS fails with 2.0 bits as well. |
Yes, those three failures are the type addressed by #17330, which I haven't had a chance to push further. |
My preference is to go with this fix now to break the unnecessary and fragile implicit codegen/gc contract that the liveness of gc pointers in callee-save registers doesn't change across a call in fully-interruptible methods. We can't be sure that this is the only place where codegen broke the contract. I'm ok with also putting int3 after all BBJ_THROW blocks; I can do that in a separate change. @RussKeldorph Do you have an opinion on which of the two fixes (or both) to check in before 2.1? |
Does anyone know how come GC stress tests failed to catch this bug for more than a year? It would be easier to trust that the change is correct if we know why the tests didn't do their job in the past. As far as I can tell these are not new tests, |
Looking at a successful run from Jul 16, 2017, waitallex3 wasn't run: waitallex3.csproj has |
We decided to merge this change as well as the insertion of int 3 after non returning calls. I'll follow up with another PR for the latter. |
This is a follow-up to dotnet#17501 that fixed #17398. #17398 was caused by a break in implicit contract between codegen and gc pointer reporting in fully-interruptible mode: the latter assumed that register gc pointer liveness doesn't change across calls while dotnet#6103 introduced codegen where it wasn't true. dotnet#17501 changed gc pointer reporting not to expect that register gc pointer liveness doesn't change across calls. This change inserts int3 after non-returning calls at the end of basic blocks so that gc pointer liveness doesn't change across calls. This is additional insurance in case any other place in runtime was dependent on that contract.
This is a follow-up to dotnet#17501 that fixed #17398. gc pointer reporting in fully-interruptible mode: the latter assumed that register gc pointer liveness doesn't change across calls while dotnet#6103 introduced codegen where it wasn't true. doesn't change across calls. This change inserts int3 after non-returning calls at the end of basic blocks so that gc pointer liveness doesn't change across calls. This is additional insurance in case any other place in the runtime is dependent on that contract.
…7535) This is a follow-up to #17501 that fixed #17398. gc pointer reporting in fully-interruptible mode: the latter assumed that register gc pointer liveness doesn't change across calls while #6103 introduced codegen where it wasn't true. doesn't change across calls. This change inserts int3 after non-returning calls at the end of basic blocks so that gc pointer liveness doesn't change across calls. This is additional insurance in case any other place in the runtime is dependent on that contract.
When enumerating live gc registers, if we are not on the active stack frame,
we need to report callee-save gc registers that are live before the call.
The reason is that the liveness of gc registers may change across a call
to a method that does not return. In this case the instruction after the call
may be a jump target and a register that didn't have a live gc pointer before
the call may have a live gc pointer after the jump. To make sure we report the
registers that have live gc pointers before the call we subtract 1 from curOffs.
Fixes #17398.
I included a repro provided by @AndyAyersMS that doesn't require GCStress or JitStress.