This repository was archived by the owner on Jan 23, 2023. It is now read-only.
The test will use Roslyn to compile a simple program. [tfs-changeset: 1408001]
The PR for this change failed, but only because the Jenkins system was unable to find the correct ref to build, which I think is a configuration issue on our end. I am going to merge this anyway. We'll get a good read on quality from TFS, and I've asked @mmitche to take a look at Jenkins when he has a chance.
sergiy-k
added a commit
that referenced
this pull request
Mar 2, 2015
Exception handling needs a full context here to restore properly
richlander
added a commit
to richlander/coreclr
that referenced
this pull request
Apr 8, 2015
Rework wiki structure
This was referenced Apr 18, 2015
Merged
mikem8361
pushed a commit
that referenced
this pull request
Aug 4, 2015
…p where the system ids (TIDs) are wrong.

First find the managed thread's OS id:

    (lldb) sos Threads
                                                                                                        Lock
           ID OSID ThreadOBJ        State GC Mode    GC Alloc Context                  Domain           Count Apt Exception
       1    1 3787 00000000006547F8 20220 Preemptive 00007FFFCC0145D0:00007FFFCC015FD0 00000000006357F8 0     Ukn
       6    2 3790 0000000000678FB8 21220 Preemptive 0000000000000000:0000000000000000 00000000006357F8 0     Ukn (Finalizer)

    (lldb) thread list
    Process 0 stopped
    * thread #1: tid = 0x0000, 0x00007f01fe64d267 libc.so.6`__GI_raise(sig=6) + 55 at raise.c:55, name = 'corerun', stop reason = signal SIGABRT
      thread #2: tid = 0x0001, 0x00007f01fe7138dd libc.so.6, stop reason = signal SIGABRT
      thread #3: tid = 0x0002, 0x00007f01fd27dda0 libpthread.so.0`__pthread_cond_wait + 192, stop reason = signal SIGABRT
      thread #4: tid = 0x0003, 0x00007f01fd27e149 libpthread.so.0`__pthread_cond_timedwait + 297, stop reason = signal SIGABRT
      thread #5: tid = 0x0004, 0x00007f01fe70f28d libc.so.6, stop reason = signal SIGABRT
      thread #6: tid = 0x0005, 0x00007f01fe70f49d libc.so.6, stop reason = signal SIGABRT

Then use the new command "setsostid" to set the current thread using the "OSID" from above:

    (lldb) setsostid 3790 6
    Set sos thread os id to 0x3790 which maps to lldb thread index 6

Now ClrStack should dump that managed thread:

    (lldb) sos ClrStack

To undo the effect of this command:

    (lldb) setsostid

Added setclrpath command that allows the path that sos/dac/dbi are loaded from to be changed instead of using the coreclr path. This may be needed if loading a core dump and the debugger binaries are in a different directory than what the dump has for coreclr's path.
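The override that `setsostid` installs can be pictured as a one-entry lookup from an OS thread id to an lldb thread index. The Python sketch below only illustrates that idea and is not SOS code: the class `SosThreadOverride` and its method names are hypothetical, and it assumes the OSID argument is hexadecimal, as the `0x3790` in the command's reply suggests.

```python
class SosThreadOverride:
    """Hypothetical model of the override that `setsostid` installs."""

    def __init__(self):
        # Either None or a pair (os_thread_id, lldb_thread_index).
        self._current = None

    def set(self, osid_hex, lldb_index):
        # Assumption: the OSID column is hexadecimal, so "3790" means 0x3790.
        os_id = int(osid_hex, 16)
        self._current = (os_id, lldb_index)
        return (f"Set sos thread os id to 0x{os_id:x} "
                f"which maps to lldb thread index {lldb_index}")

    def clear(self):
        # Corresponds to running `setsostid` with no arguments.
        self._current = None

    def lldb_index_for(self, os_id):
        # Return the overridden lldb index for this OS thread id, if any.
        if self._current is not None and self._current[0] == os_id:
            return self._current[1]
        return None


override = SosThreadOverride()
print(override.set("3790", 6))
```

With the override in place, a lookup for OS id 0x3790 yields lldb thread 6; after `clear()` the lookup returns `None` again.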
BruceForstall
referenced
this pull request
in BruceForstall/coreclr
Jun 23, 2016
Fixes #4181 "NYI_X86: Implement PInvoke frame init inlining for x86"

The main work here is to handle the custom calling convention for the x86 CORINFO_HELP_INIT_PINVOKE_FRAME helper call: it takes EDI as an argument, trashes only EAX, and returns the TCB in ESI.

The code changes are as follows:
1. Lowering::InsertPInvokeMethodProlog(): don't pass the "secret stub param" for x86. Also, don't store the InlinedCallFrame.m_pCallSiteSP in the prolog: for x86 this is done at the call site, due to the floating stack pointer.
2. LinearScan::getKillSetForNode(): for helper calls, call compHelperCallKillSet() to get the killMask, to account for non-standard kill sets.
3. Morph.cpp::fgMorphArgs(): set non-standard arguments for CORINFO_HELP_INIT_PINVOKE_FRAME.
4. compHelperCallKillSet(): set the correct kill set for CORINFO_HELP_INIT_PINVOKE_FRAME.
5. codegenxarch.cpp::genCallInstruction(): set the ABI return register for CORINFO_HELP_INIT_PINVOKE_FRAME.
6. lowerxarch.cpp::TreeNodeInfoInit(): set the GT_CALL dstCandidates for CORINFO_HELP_INIT_PINVOKE_FRAME.

5 & 6 are both needed to avoid a copy.

With this change, the #1 NYI with 18415 hits over the tests is gone. The total number of NYI is now 29516.
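The idea of a per-helper kill set can be sketched in a few lines. This is an illustrative Python model, not the JIT's C++ code: the table `HELPER_CONVENTIONS`, the function names, and the use of string register names instead of register masks are all assumptions. The default caller-trashed set EAX/ECX/EDX is also an assumption, based on the usual x86 convention.

```python
# Assumed default: registers a normal x86 call may trash.
DEFAULT_CALLEE_TRASH = frozenset({"EAX", "ECX", "EDX"})

# Helpers with a custom calling convention, as described in the commit message:
# the frame-init helper takes EDI, trashes only EAX, and returns in ESI.
HELPER_CONVENTIONS = {
    "CORINFO_HELP_INIT_PINVOKE_FRAME": {
        "args": ("EDI",),
        "kills": frozenset({"EAX"}),
        "returns": "ESI",
    },
}


def helper_call_kill_set(helper):
    """Return the registers a helper call kills, honoring custom conventions."""
    convention = HELPER_CONVENTIONS.get(helper)
    if convention is not None:
        return convention["kills"]
    return DEFAULT_CALLEE_TRASH


def helper_return_register(helper, default="EAX"):
    """Return the register holding the helper's result."""
    convention = HELPER_CONVENTIONS.get(helper)
    return convention["returns"] if convention is not None else default
```

Consulting the smaller kill set lets a register allocator keep ECX and EDX live across the helper call, and knowing that the result arrives in ESI is what allows the extra copy mentioned in items 5 and 6 to be avoided.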
BruceForstall
referenced
this pull request
in BruceForstall/coreclr
Jul 28, 2016
Remove GT_STMT usage from the code generator.
dotnet-bot
pushed a commit
to dotnet-bot/coreclr
that referenced
this pull request
Aug 19, 2016
OVERVIEW
========

This directory contains the SuperPMI tool used for testing the .NET just-in-time (JIT) compiler.

SuperPMI has two uses:
1. Verification that a JIT code change doesn't cause any asserts.
2. Finding test code where two JIT compilers generate different code, or verifying that the two compilers generate the same code.

Case #1 is useful for doing quick regression checking when making a source code change to the JIT compiler. The process is: (a) make a JIT source code change, (b) run that newly built JIT through a SuperPMI run to verify no asserts have been introduced.

Case #2 is useful for generating assembly language diffs, to help analyze the impact of a JIT code change.

SuperPMI works in two phases: collection and playback. In the collection phase, the system is configured to collect SuperPMI data. Then, run any set of .NET managed programs. When these managed programs invoke the JIT compiler, SuperPMI gathers and captures all information passed between the JIT and its .NET host. In the playback phase, SuperPMI loads the JIT directly, and causes it to compile all the functions that it previously compiled, but using the collected data to provide answers to various questions that the JIT needs to ask. The .NET execution engine (EE) is not invoked at all.

TOOLS
=====

There are two native executable tools: superpmi and mcs. There is a .NET Core C# program that is built as part of the coreclr repo tests build called superpmicollect.exe.

All will show a help screen if passed -?.

COLLECTION
==========

Set the following environment variables:

    SuperPMIShimLogPath=<full path to an empty temporary directory>
    SuperPMIShimPath=<full path to clrjit.dll, the "standalone" JIT>
    COMPlus_AltJit=*
    COMPlus_AltJitName=superpmi-shim-collector.dll

(On Linux, use libclrjit.so and libsuperpmi-shim-collector.so. On Mac, use libclrjit.dylib and libsuperpmi-shim-collector.dylib.)

Then, run some managed programs. When done running programs, un-set these variables.

Now, you will have a large number of .mc files. Merge these using the mcs tool:

    mcs -merge base.mch *.mc

One benefit of SuperPMI is the ability to remove duplicated compilations, so on replay only unique functions are compiled. Use the following to create a "unique" set of functions:

    mcs -removeDup -thin base.mch unique.mch

Note that -thin is not required. However, it will delete all the compilation results collected during the collection phase, which makes the resulting MCH file smaller. Those compilation results are not required for playback.

Use the superpmicollect.exe tool to automate and simplify this process.

PLAYBACK
========

Once you have a merged, de-duplicated MCH collection, you can play it back using:

    superpmi unique.mch clrjit.dll

You can do this much faster by utilizing all the processors on your machine, and replaying in parallel, using:

    superpmi -p unique.mch clrjit.dll

REMAINING WORK
==============

The basics of assembly diffing are there, using the "coredistools" package. The open source build needs to be altered to use this package to wire up the correct build steps.

[tfs-changeset: 1623347]
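The collection and playback steps above can be gathered into a small script. The following Python sketch only builds the environment and the command lines quoted in the README; it does not run anything. The function names and default file names are hypothetical, and the way the `*.mc` pattern is passed through unexpanded is an assumption about how `mcs` is invoked.

```python
import os


def collection_env(shim_log_dir, standalone_jit,
                   collector="superpmi-shim-collector.dll"):
    """Environment for the collection phase, on top of the current one."""
    env = dict(os.environ)
    env.update({
        "SuperPMIShimLogPath": shim_log_dir,   # empty temporary directory
        "SuperPMIShimPath": standalone_jit,    # the "standalone" JIT
        "COMPlus_AltJit": "*",
        "COMPlus_AltJitName": collector,
    })
    return env


def merge_cmd(out="base.mch", pattern="*.mc"):
    # Assumption: the pattern is handed to mcs as-is rather than expanded here.
    return ["mcs", "-merge", out, pattern]


def dedup_cmd(base="base.mch", unique="unique.mch", thin=True):
    # -thin is optional; it drops collected compilation results.
    return ["mcs", "-removeDup"] + (["-thin"] if thin else []) + [base, unique]


def replay_cmd(mch="unique.mch", jit="clrjit.dll", parallel=False):
    # -p replays in parallel across the machine's processors.
    return ["superpmi"] + (["-p"] if parallel else []) + [mch, jit]


if __name__ == "__main__":
    for cmd in (merge_cmd(), dedup_cmd(), replay_cmd(parallel=True)):
        print(" ".join(cmd))
```

Each list could be handed to `subprocess.run`, with `collection_env(...)` passed as `env=` when launching the managed programs during collection.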
litian2025
pushed a commit
to litian2025/coreclr
that referenced
this pull request
Dec 12, 2016
There are two kinds of transition penalties:
1. Transition from 256-bit AVX code to 128-bit legacy SSE code.
2. Transition from 128-bit legacy SSE code to either 128-bit or 256-bit AVX code. This only happens if there was a preceding AVX256->legacy SSE transition penalty.

The primary goal is to remove the #1 AVX to SSE transition penalty.

Added two emitter flags: contains256bitAVXInstruction indicates whether the JIT method contains 256-bit AVX code; containsAVXInstruction indicates whether the method contains 128-bit or 256-bit AVX code.

Issue VZEROUPPER in the prolog if the method contains 128-bit or 256-bit AVX code, to avoid the legacy SSE to AVX transition penalty; this could happen in the reverse PInvoke situation.
Issue VZEROUPPER in the epilog if the method contains 256-bit AVX code, to avoid the AVX to legacy SSE transition penalty.

To limit the code size increase, we only issue VZEROUPPER before a PInvoke call to a user-defined function if the JIT method contains 256-bit AVX code, assuming the user-defined function contains legacy SSE code. There is no need to issue VZEROUPPER after the PInvoke call, because the #2 SSE to AVX transition penalty won't happen once the #1 AVX to SSE transition has been taken care of before the PInvoke call.

We measured a ~1% to 3% performance gain on TechEmpower plaintext and verified that the VTune AVX/SSE events OTHER_ASSISTS.AVX_TO_SSE and OTHER_ASSISTS.SSE_TO_AVX have been reduced to 0.

Fix #7240
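The placement rules in this message reduce to three checks on the two flags. The Python sketch below models only that decision; it is not the emitter's code, and `MethodFlags`, `vzeroupper_sites`, and the site labels are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class MethodFlags:
    contains_avx: bool         # method has 128-bit or 256-bit AVX code
    contains_256bit_avx: bool  # method has 256-bit AVX code


def vzeroupper_sites(flags, calls_user_pinvoke):
    """List the places where VZEROUPPER would be issued for one method."""
    sites = []
    if flags.contains_avx:
        # Guards against the legacy SSE -> AVX penalty (reverse PInvoke case).
        sites.append("prolog")
    if flags.contains_256bit_avx and calls_user_pinvoke:
        # Assumes the user-defined native function contains legacy SSE code.
        sites.append("before_pinvoke")
    if flags.contains_256bit_avx:
        # Guards against the AVX -> legacy SSE penalty on return.
        sites.append("epilog")
    return sites


print(vzeroupper_sites(MethodFlags(True, True), calls_user_pinvoke=True))
```

A method with only 128-bit AVX code gets the prolog instruction alone, which is how the scheme keeps the code size increase small.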
litian2025
pushed a commit
to litian2025/coreclr
that referenced
this pull request
Jan 8, 2017
There are two kinds of transition penalties:
1. Transition from 256-bit AVX code to 128-bit legacy SSE code.
2. Transition from 128-bit legacy SSE code to either 128-bit or 256-bit AVX code. This only happens if there was a preceding AVX256->legacy SSE transition penalty.

The primary goal is to remove the #1 AVX to SSE transition penalty.

Added two emitter flags: contains256bitAVXInstruction indicates whether the JIT method contains 256-bit AVX code; containsAVXInstruction indicates whether the method contains 128-bit or 256-bit AVX code.

Issue VZEROUPPER in the prolog if the method contains 128-bit or 256-bit AVX code, to avoid the legacy SSE to AVX transition penalty; this could happen in the reverse PInvoke situation.
Issue VZEROUPPER in the epilog if the method contains 256-bit AVX code, to avoid the AVX to legacy SSE transition penalty.

To limit the code size increase, we only issue VZEROUPPER before a PInvoke call to a user-defined function if the JIT method contains 256-bit AVX code, assuming the user-defined function contains legacy SSE code. There is no need to issue VZEROUPPER after the PInvoke call, because the #2 SSE to AVX transition penalty won't happen once the #1 AVX to SSE transition has been taken care of before the PInvoke call.

We measured a ~1% to 3% performance gain on TechEmpower plaintext and verified that the VTune AVX/SSE events OTHER_ASSISTS.AVX_TO_SSE and OTHER_ASSISTS.SSE_TO_AVX have been reduced to 0.

Fix #7240

move setContainsAVX flags to lower, refactor to a smaller method
refactor, fix typo in comments
fix format error
BruceForstall
referenced
this pull request
in BruceForstall/coreclr
Mar 13, 2017
1. Use the LIR node dumper to display nodes to be generated by codegen, since we're in LIR form at that point. Add a new "prefix message" argument to allow "Generating: " to prefix all such lines.
2. Fix off-by-one error in LIR dump due to `#ifdef` versus `#if`.
3. Remove extra trailing line for each LIR node. This interfered with #1. But I always thought it was unnecessarily verbose; I don't believe there is any ambiguity without that extra space.
4. Add dTreeLIR()/cTreeLIR() functions for use in the debugger.
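Item 2 turns on the difference between the two preprocessor tests. The commit message does not say which direction the mix-up went, so the Python sketch below only models the general C preprocessor rule: `#ifdef` asks whether a macro is defined at all, while `#if` asks whether it evaluates to non-zero (an undefined name counts as 0). The function names and the macro `FEATURE_X` are hypothetical.

```python
def ifdef(name, defines):
    """Model of `#ifdef NAME`: true whenever NAME is defined, whatever its value."""
    return name in defines


def if_nonzero(name, defines):
    """Model of `#if NAME`: true only when NAME is non-zero; undefined names are 0."""
    return int(defines.get(name, 0)) != 0


# A macro that is defined, but defined as 0, is where the two tests disagree.
defines = {"FEATURE_X": 0}
print(ifdef("FEATURE_X", defines), if_nonzero("FEATURE_X", defines))
```

Code guarded by the wrong form is compiled in (or out) unexpectedly for such a macro, which is one way a dump routine can end up printing one line too many or too few.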
Merged
BruceForstall
added a commit
that referenced
this pull request
Mar 14, 2017
1. Use the LIR node dumper to display nodes to be generated by codegen, since we're in LIR form at that point. Add a new "prefix message" argument to allow "Generating: " to prefix all such lines.
2. Fix off-by-one error in LIR dump due to `#ifdef` versus `#if`.
3. Remove extra trailing line for each LIR node. This interfered with #1. But I always thought it was unnecessarily verbose; I don't believe there is any ambiguity without that extra space.
4. Add dTreeLIR()/cTreeLIR() functions for use in the debugger.
jorive
pushed a commit
to guhuro/coreclr
that referenced
this pull request
May 4, 2017
1. Use the LIR node dumper to display nodes to be generated by codegen, since we're in LIR form at that point. Add a new "prefix message" argument to allow "Generating: " to prefix all such lines.
2. Fix off-by-one error in LIR dump due to `#ifdef` versus `#if`.
3. Remove extra trailing line for each LIR node. This interfered with #1. But I always thought it was unnecessarily verbose; I don't believe there is any ambiguity without that extra space.
4. Add dTreeLIR()/cTreeLIR() functions for use in the debugger.
jkotas
pushed a commit
that referenced
this pull request
Apr 24, 2018
stephentoub
added a commit
to stephentoub/coreclr
that referenced
this pull request
May 31, 2018
1. Computing GC roots is a relatively slow operation, and doing it for every state machine object found in a large heap can be time consuming. Making it opt-in with the -roots command-line flag.

2. Added -waiting command-line flag. DumpAsync will now retrieve the <>1__state field from the StateMachine, and if -waiting is specified, it'll filter down to state machines that have a state value >= 0, meaning the state machines are waiting at an await point. For example, given this program:
```C#
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        await MethodA(0);
        await MethodA(int.MaxValue);
    }

    static async Task MethodA(int delay) => await MethodB(delay);

    static async Task MethodB(int delay)
    {
        await Task.Yield();
        await Task.Delay(delay);
    }
}
```
using `!DumpAsync` outputs:
```
         Address               MT     Size Name
#0
0000026848693438 00007ff88ea35e58      120 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<MethodB>d__2, test]]
StateMachine: Program+<MethodB>d__2 (struct)
              MT    Field   Offset                 Type VT     Attr            Value Name
00007ff8e9bc4bc0  4000008        0         System.Int32  1 instance               -2 <>1__state
00007ff8e9bd82f8  4000009        8 ...TaskMethodBuilder  1 instance 0000026848693490 <>t__builder
00007ff8e9bc4bc0  400000a        4         System.Int32  1 instance                0 delay
00007ff8e9bee4d0  400000b       10 ...able+YieldAwaiter  1 instance 0000026848693498 <>u__1
00007ff8e9bcead0  400000c       18 ...vices.TaskAwaiter  1 instance 00000268486934a0 <>u__2
Continuation: 00000268486934b0 (System.Object)

#1
0000026848693e68 00007ff88ea36cc8      112 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<MethodA>d__1, test]]
StateMachine: Program+<MethodA>d__1 (struct)
              MT    Field   Offset                 Type VT     Attr            Value Name
00007ff8e9bc4bc0  4000004        0         System.Int32  1 instance               -2 <>1__state
00007ff8e9bd82f8  4000005        8 ...TaskMethodBuilder  1 instance 0000026848693ec0 <>t__builder
00007ff8e9bc4bc0  4000006        4         System.Int32  1 instance                0 delay
00007ff8e9bcead0  4000007       10 ...vices.TaskAwaiter  1 instance 0000026848693ec8 <>u__1
Continuation: 00000268486934b0 (System.Object)

#2
0000026848693ed8 00007ff88ea37188      112 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<Main>d__0, test]]
StateMachine: Program+<Main>d__0 (struct)
              MT    Field   Offset                 Type VT     Attr            Value Name
00007ff8e9bc4bc0  4000001        0         System.Int32  1 instance                1 <>1__state
00007ff8e9bd82f8  4000002        8 ...TaskMethodBuilder  1 instance 0000026848693f30 <>t__builder
00007ff8e9bcead0  4000003       10 ...vices.TaskAwaiter  1 instance 0000026848693f38 <>u__1
Continuation: 0000026848693f48 (System.Threading.Tasks.Task+SetOnInvokeMres)

#3
0000026848695d30 00007ff88ea35e58      120 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<MethodB>d__2, test]]
StateMachine: Program+<MethodB>d__2 (struct)
              MT    Field   Offset                 Type VT     Attr            Value Name
00007ff8e9bc4bc0  4000008        0         System.Int32  1 instance                1 <>1__state
00007ff8e9bd82f8  4000009        8 ...TaskMethodBuilder  1 instance 0000026848695d88 <>t__builder
00007ff8e9bc4bc0  400000a        4         System.Int32  1 instance       2147483647 delay
00007ff8e9bee4d0  400000b       10 ...able+YieldAwaiter  1 instance 0000026848695d90 <>u__1
00007ff8e9bcead0  400000c       18 ...vices.TaskAwaiter  1 instance 0000026848695d98 <>u__2
Continuation: 0000026848695dd0 (System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<MethodA>d__1, test]])

#4
0000026848695dd0 00007ff88ea36cc8      112 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<MethodA>d__1, test]]
StateMachine: Program+<MethodA>d__1 (struct)
              MT    Field   Offset                 Type VT     Attr            Value Name
00007ff8e9bc4bc0  4000004        0         System.Int32  1 instance                0 <>1__state
00007ff8e9bd82f8  4000005        8 ...TaskMethodBuilder  1 instance 0000026848695e28 <>t__builder
00007ff8e9bc4bc0  4000006        4         System.Int32  1 instance       2147483647 delay
00007ff8e9bcead0  4000007       10 ...vices.TaskAwaiter  1 instance 0000026848695e30 <>u__1
Continuation: 0000026848693ed8 (System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<Main>d__0, test]])

Found 5 state machines.
```
while using `!DumpAsync -waiting` outputs only:
```
         Address               MT     Size Name
#0
0000026848693ed8 00007ff88ea37188      112 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<Main>d__0, test]]
StateMachine: Program+<Main>d__0 (struct)
              MT    Field   Offset                 Type VT     Attr            Value Name
00007ff8e9bc4bc0  4000001        0         System.Int32  1 instance                1 <>1__state
00007ff8e9bd82f8  4000002        8 ...TaskMethodBuilder  1 instance 0000026848693f30 <>t__builder
00007ff8e9bcead0  4000003       10 ...vices.TaskAwaiter  1 instance 0000026848693f38 <>u__1
Continuation: 0000026848693f48 (System.Threading.Tasks.Task+SetOnInvokeMres)

#1
0000026848695d30 00007ff88ea35e58      120 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<MethodB>d__2, test]]
StateMachine: Program+<MethodB>d__2 (struct)
              MT    Field   Offset                 Type VT     Attr            Value Name
00007ff8e9bc4bc0  4000008        0         System.Int32  1 instance                1 <>1__state
00007ff8e9bd82f8  4000009        8 ...TaskMethodBuilder  1 instance 0000026848695d88 <>t__builder
00007ff8e9bc4bc0  400000a        4         System.Int32  1 instance       2147483647 delay
00007ff8e9bee4d0  400000b       10 ...able+YieldAwaiter  1 instance 0000026848695d90 <>u__1
00007ff8e9bcead0  400000c       18 ...vices.TaskAwaiter  1 instance 0000026848695d98 <>u__2
Continuation: 0000026848695dd0 (System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<MethodA>d__1, test]])

#2
0000026848695dd0 00007ff88ea36cc8      112 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<MethodA>d__1, test]]
StateMachine: Program+<MethodA>d__1 (struct)
              MT    Field   Offset                 Type VT     Attr            Value Name
00007ff8e9bc4bc0  4000004        0         System.Int32  1 instance                0 <>1__state
00007ff8e9bd82f8  4000005        8 ...TaskMethodBuilder  1 instance 0000026848695e28 <>t__builder
00007ff8e9bc4bc0  4000006        4         System.Int32  1 instance       2147483647 delay
00007ff8e9bcead0  4000007       10 ...vices.TaskAwaiter  1 instance 0000026848695e30 <>u__1
Continuation: 0000026848693ed8 (System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<Main>d__0, test]])

Found 3 state machines.
```
skipping the two state machines that have a `<>1__state` field value of -2 (meaning they have completed). Note that this change has the somewhat unfortunate impact of taking a dependency on what's effectively an implementation detail of Roslyn, but the value the filtering provides is deemed to be worth it. This design is unlikely to change in the future, and as with other diagnostic/debugging features that rely on such details, it can be updated if Roslyn ever changes its scheme. In the meantime, the code will output a warning message if it can't find the state field.

3. If a state machine is found to have 0 roots but also to have a <>1__state value >= 0, that suggests it was dropped without having been completed, which is likely a sign of an application bug. The command now prints out an informational message to highlight that state. For example, this program:
```C#
using System;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        Task.Run(async () => await new TaskCompletionSource<bool>().Task);
        Console.ReadLine();
    }
}
```
when processed with `!DumpAsync -roots` results in:
```
         Address               MT     Size Name
#0
0000020787fb5b30 00007ff88ea1afe8      112 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Boolean, System.Private.CoreLib],[Program+<>c+<<Main>b__0_0>d, test]]
StateMachine: Program+<>c+<<Main>b__0_0>d (struct)
              MT    Field   Offset                 Type VT     Attr            Value Name
00007ff8e9bc4bc0  4000003        0         System.Int32  1 instance                0 <>1__state
00007ff8e9bd0b88  4000004        8 ...Private.CoreLib]]  1 instance 0000020787fb5b88 <>t__builder
00007ff8e9bffd58  4000005       10 ...Private.CoreLib]]  1 instance 0000020787fb5b90 <>u__1
Continuation: 0000020787fb3fc8 (System.Threading.Tasks.UnwrapPromise`1[[System.Boolean, System.Private.CoreLib]])
GC roots:
    Incomplete state machine (<>1__state == 0) with 0 roots.

Found 1 state machines.
```
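The filtering and the zero-roots check described in items 2 and 3 come down to comparisons on the `<>1__state` value. The Python sketch below models just those comparisons using the states from the example output; it is not the SOS implementation, and the function names and the dictionary layout are hypothetical.

```python
COMPLETED = -2  # the value the commit message describes as "completed"


def is_waiting(state):
    """A state >= 0 means the state machine is parked at an await point."""
    return state >= 0


def filter_waiting(machines):
    """Keep only the state machines that -waiting would show."""
    return [m for m in machines if is_waiting(m["state"])]


def roots_note(state, root_count):
    """Message for a machine that is still waiting but has no GC roots."""
    if root_count == 0 and is_waiting(state):
        return f"Incomplete state machine (<>1__state == {state}) with 0 roots."
    return None


# <>1__state values of the five state machines in the example output.
machines = [
    {"name": "MethodB", "state": -2},
    {"name": "MethodA", "state": -2},
    {"name": "Main", "state": 1},
    {"name": "MethodB", "state": 1},
    {"name": "MethodA", "state": 0},
]

print(f"Found {len(filter_waiting(machines))} state machines.")
```

The two machines with state -2 are dropped, which matches the count of 3 reported by `!DumpAsync -waiting` in the example.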
stephentoub
added a commit
to stephentoub/coreclr
that referenced
this pull request
May 31, 2018
1. Computing GC roots is a relatively slow operation, and doing it for every state machine object found in a large heap can be time consuming. Making it opt-in with -roots command-line flag. 2. Added -waiting command-line flag. DumpAsync will now retrieve the <>1__state field from the StateMachine, and if -waiting is specified, it'll filter down to state machines that have a state value >= 0, meaning the state machines are waiting at an await point. For example, given this program: ```C# using System.Threading.Tasks; class Program { static async Task Main() { await MethodA(0); await MethodA(int.MaxValue); } static async Task MethodA(int delay) => await MethodB(delay); static async Task MethodB(int delay) { await Task.Yield(); await Task.Delay(delay); } } ``` using `!DumpAsync` outputs: ``` Address MT Size Name #0 0000026848693438 00007ff88ea35e58 120 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<MethodB>d__2, test]] StateMachine: Program+<MethodB>d__2 (struct) MT Field Offset Type VT Attr Value Name 00007ff8e9bc4bc0 4000008 0 System.Int32 1 instance -2 <>1__state 00007ff8e9bd82f8 4000009 8 ...TaskMethodBuilder 1 instance 0000026848693490 <>t__builder 00007ff8e9bc4bc0 400000a 4 System.Int32 1 instance 0 delay 00007ff8e9bee4d0 400000b 10 ...able+YieldAwaiter 1 instance 0000026848693498 <>u__1 00007ff8e9bcead0 400000c 18 ...vices.TaskAwaiter 1 instance 00000268486934a0 <>u__2 Continuation: 00000268486934b0 (System.Object) dotnet#1 0000026848693e68 00007ff88ea36cc8 112 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<MethodA>d__1, test]] StateMachine: Program+<MethodA>d__1 (struct) MT Field Offset Type VT Attr Value Name 00007ff8e9bc4bc0 4000004 0 System.Int32 1 instance -2 <>1__state 00007ff8e9bd82f8 4000005 8 ...TaskMethodBuilder 1 instance 0000026848693ec0 
<>t__builder 00007ff8e9bc4bc0 4000006 4 System.Int32 1 instance 0 delay 00007ff8e9bcead0 4000007 10 ...vices.TaskAwaiter 1 instance 0000026848693ec8 <>u__1 Continuation: 00000268486934b0 (System.Object) dotnet#2 0000026848693ed8 00007ff88ea37188 112 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<Main>d__0, test]] StateMachine: Program+<Main>d__0 (struct) MT Field Offset Type VT Attr Value Name 00007ff8e9bc4bc0 4000001 0 System.Int32 1 instance 1 <>1__state 00007ff8e9bd82f8 4000002 8 ...TaskMethodBuilder 1 instance 0000026848693f30 <>t__builder 00007ff8e9bcead0 4000003 10 ...vices.TaskAwaiter 1 instance 0000026848693f38 <>u__1 Continuation: 0000026848693f48 (System.Threading.Tasks.Task+SetOnInvokeMres) dotnet#3 0000026848695d30 00007ff88ea35e58 120 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<MethodB>d__2, test]] StateMachine: Program+<MethodB>d__2 (struct) MT Field Offset Type VT Attr Value Name 00007ff8e9bc4bc0 4000008 0 System.Int32 1 instance 1 <>1__state 00007ff8e9bd82f8 4000009 8 ...TaskMethodBuilder 1 instance 0000026848695d88 <>t__builder 00007ff8e9bc4bc0 400000a 4 System.Int32 1 instance 2147483647 delay 00007ff8e9bee4d0 400000b 10 ...able+YieldAwaiter 1 instance 0000026848695d90 <>u__1 00007ff8e9bcead0 400000c 18 ...vices.TaskAwaiter 1 instance 0000026848695d98 <>u__2 Continuation: 0000026848695dd0 (System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<MethodA>d__1, test]]) dotnet#4 0000026848695dd0 00007ff88ea36cc8 112 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<MethodA>d__1, test]] StateMachine: Program+<MethodA>d__1 (struct) MT Field 
Offset Type VT Attr Value Name
00007ff8e9bc4bc0 4000004 0 System.Int32 1 instance 0 <>1__state
00007ff8e9bd82f8 4000005 8 ...TaskMethodBuilder 1 instance 0000026848695e28 <>t__builder
00007ff8e9bc4bc0 4000006 4 System.Int32 1 instance 2147483647 delay
00007ff8e9bcead0 4000007 10 ...vices.TaskAwaiter 1 instance 0000026848695e30 <>u__1
Continuation: 0000026848693ed8 (System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<Main>d__0, test]])

Found 5 state machines.
```

while using `!DumpAsync -waiting` outputs only:

```
Address MT Size Name
#0
0000026848693ed8 00007ff88ea37188 112 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<Main>d__0, test]]
StateMachine: Program+<Main>d__0 (struct)
MT Field Offset Type VT Attr Value Name
00007ff8e9bc4bc0 4000001 0 System.Int32 1 instance 1 <>1__state
00007ff8e9bd82f8 4000002 8 ...TaskMethodBuilder 1 instance 0000026848693f30 <>t__builder
00007ff8e9bcead0 4000003 10 ...vices.TaskAwaiter 1 instance 0000026848693f38 <>u__1
Continuation: 0000026848693f48 (System.Threading.Tasks.Task+SetOnInvokeMres)

#1
0000026848695d30 00007ff88ea35e58 120 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<MethodB>d__2, test]]
StateMachine: Program+<MethodB>d__2 (struct)
MT Field Offset Type VT Attr Value Name
00007ff8e9bc4bc0 4000008 0 System.Int32 1 instance 1 <>1__state
00007ff8e9bd82f8 4000009 8 ...TaskMethodBuilder 1 instance 0000026848695d88 <>t__builder
00007ff8e9bc4bc0 400000a 4 System.Int32 1 instance 2147483647 delay
00007ff8e9bee4d0 400000b 10 ...able+YieldAwaiter 1 instance 0000026848695d90 <>u__1
00007ff8e9bcead0 400000c 18 ...vices.TaskAwaiter 1 instance 0000026848695d98 <>u__2
Continuation: 0000026848695dd0 (System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<MethodA>d__1, test]])

#2
0000026848695dd0 00007ff88ea36cc8 112 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<MethodA>d__1, test]]
StateMachine: Program+<MethodA>d__1 (struct)
MT Field Offset Type VT Attr Value Name
00007ff8e9bc4bc0 4000004 0 System.Int32 1 instance 0 <>1__state
00007ff8e9bd82f8 4000005 8 ...TaskMethodBuilder 1 instance 0000026848695e28 <>t__builder
00007ff8e9bc4bc0 4000006 4 System.Int32 1 instance 2147483647 delay
00007ff8e9bcead0 4000007 10 ...vices.TaskAwaiter 1 instance 0000026848695e30 <>u__1
Continuation: 0000026848693ed8 (System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib],[Program+<Main>d__0, test]])

Found 3 state machines.
```

skipping the two state machines that have a `<>1__state` field value of -2 (meaning it's completed). Note that this change has the somewhat unfortunate impact of taking a dependency on what's effectively an implementation detail of Roslyn, but the value the filtering provides is deemed to be worth it. This design is unlikely to change in the future, and as with other diagnostic/debugging features that rely on such details, it can be updated if Roslyn ever changes its scheme. In the meantime, the code will output a warning message if it can't find the state field.

3. If a state machine is found to have 0 roots but also to have a `<>1__state` value >= 0, that suggests it was dropped without having been completed, which is likely a sign of an application bug. The command now prints out an information message to highlight that state. For example, this program:

```C#
using System;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        Task.Run(async () => await new TaskCompletionSource<bool>().Task);
        Console.ReadLine();
    }
}
```

when processed with `!DumpAsync -roots` results in:

```
Address MT Size Name
#0
0000020787fb5b30 00007ff88ea1afe8 112 System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Boolean, System.Private.CoreLib],[Program+<>c+<<Main>b__0_0>d, test]]
StateMachine: Program+<>c+<<Main>b__0_0>d (struct)
MT Field Offset Type VT Attr Value Name
00007ff8e9bc4bc0 4000003 0 System.Int32 1 instance 0 <>1__state
00007ff8e9bd0b88 4000004 8 ...Private.CoreLib]] 1 instance 0000020787fb5b88 <>t__builder
00007ff8e9bffd58 4000005 10 ...Private.CoreLib]] 1 instance 0000020787fb5b90 <>u__1
Continuation: 0000020787fb3fc8 (System.Threading.Tasks.UnwrapPromise`1[[System.Boolean, System.Private.CoreLib]])
GC roots:
Incomplete state machine (<>1__state == 0) with 0 roots.

Found 1 state machines.
```
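The state-based filtering and the dropped-state-machine check described in this commit message can be sketched roughly as follows. This is only an illustration of the rules as stated in the text (`<>1__state` >= 0 means waiting at an await, -2 means completed); the type and function names are hypothetical and are not taken from the SOS sources, and any other negative value is simply treated as "not waiting" here.

```cpp
#include <vector>

// Hypothetical classification of a Roslyn async state machine by its <>1__state value.
enum class AsyncState { Waiting, Completed, Other };

// Rules as described in the text: >= 0 is waiting at an await point, -2 is completed.
// Any other value is lumped into Other (an assumption of this sketch).
inline AsyncState classifyState(int state) {
    if (state >= 0) return AsyncState::Waiting;
    if (state == -2) return AsyncState::Completed;
    return AsyncState::Other;
}

// A state machine is listed under -waiting only if it is parked at an await.
inline bool shownWithWaiting(int state) {
    return classifyState(state) == AsyncState::Waiting;
}

// Zero GC roots plus a waiting state suggests the state machine was dropped
// before completing, which the command reports with a message.
inline bool looksDropped(int state, int rootCount) {
    return rootCount == 0 && classifyState(state) == AsyncState::Waiting;
}

// Counts how many of the given state values would survive the -waiting filter.
inline int countWaiting(const std::vector<int>& states) {
    int count = 0;
    for (int s : states) {
        if (shownWithWaiting(s)) ++count;
    }
    return count;
}
```

Applied to the five `<>1__state` values in the first dump (-2, -2, 1, 1, 0), `countWaiting` yields 3, matching the "Found 3 state machines" line in the `-waiting` output.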
stephentoub
added a commit
that referenced
this pull request
May 31, 2018
Merged
dotnet-maestro-bot
referenced
this pull request
in dotnet-maestro-bot/coreclr
Sep 5, 2018
* Fix ServiceController name population perf
* Split tests
* Remove dead field
* Remove new use of DangerousGetHandle
* SafeHandle all the things!
* VSB #1
* VSB #2
* Fix GLE
* Initialize machineName in ctor
* Test for empty name ex
* Null names
* Inadvertent edit
* Unix build
* Move interop into class
* Reverse SafeHandle for HAllocGlobal
* Fix tests
* Disable test for NETFX
* CR feedback
* Pattern matching on VSB
* Direct call
* typo

Signed-off-by: dotnet-bot <[email protected]>
jkotas
pushed a commit
that referenced
this pull request
Sep 5, 2018
jkotas
pushed a commit
to jkotas/coreclr
that referenced
this pull request
Aug 27, 2020
picenka21
pushed a commit
to picenka21/runtime
that referenced
this pull request
Feb 18, 2022
Fixes dotnet/coreclr#4181 "NYI_X86: Implement PInvoke frame init inlining for x86"

The main work here is to handle the custom calling convention for the x86 CORINFO_HELP_INIT_PINVOKE_FRAME helper call: it takes EDI as an argument, trashes only EAX, and returns the TCB in ESI.

The code changes are as follows:

1. Lowering::InsertPInvokeMethodProlog(): don't pass the "secret stub param" for x86. Also, don't store the InlinedCallFrame.m_pCallSiteSP in the prolog: for x86 this is done at the call site, due to the floating stack pointer.
2. LinearScan::getKillSetForNode(): for helper calls, call compHelperCallKillSet() to get the killMask, to account for non-standard kill sets.
3. Morph.cpp::fgMorphArgs(): set non-standard arguments for CORINFO_HELP_INIT_PINVOKE_FRAME.
4. compHelperCallKillSet(): set the correct kill set for CORINFO_HELP_INIT_PINVOKE_FRAME.
5. codegenxarch.cpp::genCallInstruction(): set the ABI return register for CORINFO_HELP_INIT_PINVOKE_FRAME.
6. lowerxarch.cpp::TreeNodeInfoInit(): set the GT_CALL dstCandidates for CORINFO_HELP_INIT_PINVOKE_FRAME.

5 & 6 are both needed to avoid a copy.

With this change, the #1 NYI with 18415 hits over the tests is gone. The total number of NYI is now 29516.

Commit migrated from dotnet/coreclr@3c7ecfe
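To make the idea of a non-standard helper convention concrete, here is a minimal sketch of a per-helper lookup that a register allocator could consult instead of assuming the default caller-trashed set. Everything here is hypothetical: the mask values, the `Helper` enum, and `helperConvention` are illustrative stand-ins and not the JIT's actual `regMaskTP` definitions or `compHelperCallKillSet()` implementation. The default convention shown (arguments in ECX/EDX, result in EAX) is likewise an assumption of the sketch.

```cpp
#include <cstdint>

// Hypothetical register bit masks for x86; not the JIT's real encoding.
using RegMask = std::uint32_t;
constexpr RegMask RBM_EAX = 1u << 0;
constexpr RegMask RBM_ECX = 1u << 1;
constexpr RegMask RBM_EDX = 1u << 2;
constexpr RegMask RBM_ESI = 1u << 6;
constexpr RegMask RBM_EDI = 1u << 7;

// Registers a call is normally assumed to trash (assumed default for this sketch).
constexpr RegMask RBM_CALLEE_TRASH = RBM_EAX | RBM_ECX | RBM_EDX;

// Illustrative helper identifiers.
enum class Helper { Generic, InitPInvokeFrame };

struct HelperConvention {
    RegMask argRegs;    // registers carrying arguments
    RegMask killSet;    // registers trashed by the call
    RegMask returnReg;  // register holding the result
};

// Per the commit text: the PInvoke frame init helper takes EDI,
// trashes only EAX, and returns the TCB in ESI.
inline HelperConvention helperConvention(Helper h) {
    if (h == Helper::InitPInvokeFrame) {
        return {RBM_EDI, RBM_EAX, RBM_ESI};
    }
    return {RBM_ECX | RBM_EDX, RBM_CALLEE_TRASH, RBM_EAX};
}

// Registers from `live` that still hold their values after the call.
inline RegMask liveAcrossCall(RegMask live, Helper h) {
    return live & ~helperConvention(h).killSet;
}
```

With a lookup like this, a value held in ECX survives the PInvoke frame init helper but not a generic helper call, which is the kind of distinction the kill-set change is meant to capture.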
picenka21
pushed a commit
to picenka21/runtime
that referenced
this pull request
Feb 18, 2022
OVERVIEW
========

This directory contains the SuperPMI tool used for testing the .NET just-in-time (JIT) compiler.

SuperPMI has two uses:
1. Verification that a JIT code change doesn't cause any asserts.
2. Finding test code where two JIT compilers generate different code, or verifying that the two compilers generate the same code.

Case #1 is useful for doing quick regression checking when making a source code change to the JIT compiler. The process is: (a) make a JIT source code change, (b) run that newly built JIT through a SuperPMI run to verify no asserts have been introduced.

Case #2 is useful for generating assembly language diffs, to help analyze the impact of a JIT code change.

SuperPMI works in two phases: collection and playback. In the collection phase, the system is configured to collect SuperPMI data. Then, run any set of .NET managed programs. When these managed programs invoke the JIT compiler, SuperPMI gathers and captures all information passed between the JIT and its .NET host. In the playback phase, SuperPMI loads the JIT directly, and causes it to compile all the functions that it previously compiled, but using the collected data to provide answers to various questions that the JIT needs to ask. The .NET execution engine (EE) is not invoked at all.

TOOLS
=====

There are two native executable tools: superpmi and mcs. There is a .NET Core C# program that is built as part of the coreclr repo tests build called superpmicollect.exe.

All will show a help screen if passed -?.

COLLECTION
==========

Set the following environment variables:

    SuperPMIShimLogPath=<full path to an empty temporary directory>
    SuperPMIShimPath=<full path to clrjit.dll, the "standalone" JIT>
    COMPlus_AltJit=*
    COMPlus_AltJitName=superpmi-shim-collector.dll

(On Linux, use libclrjit.so and libsuperpmi-shim-collector.so. On Mac, use libclrjit.dylib and libsuperpmi-shim-collector.dylib.)

Then, run some managed programs. When done running programs, un-set these variables.

Now, you will have a large number of .mc files. Merge these using the mcs tool:

    mcs -merge base.mch *.mc

One benefit of SuperPMI is the ability to remove duplicated compilations, so on replay only unique functions are compiled. Use the following to create a "unique" set of functions:

    mcs -removeDup -thin base.mch unique.mch

Note that -thin is not required. However, it will delete all the compilation results collected during the collection phase, which makes the resulting MCH file smaller. Those compilation results are not required for playback.

Use the superpmicollect.exe tool to automate and simplify this process.

PLAYBACK
========

Once you have a merged, de-duplicated MCH collection, you can play it back using:

    superpmi unique.mch clrjit.dll

You can do this much faster by utilizing all the processors on your machine, and replaying in parallel, using:

    superpmi -p unique.mch clrjit.dll

REMAINING WORK
==============

The basics of assembly diffing are there, using the "coredistools" package. The open source build needs to be altered to use this package to wire up the correct build steps.

[tfs-changeset: 1623347]

Commit migrated from dotnet/coreclr@d85eb92
picenka21
pushed a commit
to picenka21/runtime
that referenced
this pull request
Feb 18, 2022
There are two kinds of transition penalties:

1. Transition from 256-bit AVX code to 128-bit legacy SSE code.
2. Transition from 128-bit legacy SSE code to either 128-bit or 256-bit AVX code. This only happens if there was a preceding AVX256->legacy SSE transition penalty.

The primary goal is to remove the #1 AVX to SSE transition penalty.

Added two emitter flags: contains256bitAVXInstruction indicates that the JIT method contains 256-bit AVX code; containsAVXInstruction indicates that the method contains 128-bit or 256-bit AVX code.

Issue VZEROUPPER in the prolog if the method contains 128-bit or 256-bit AVX code, to avoid the legacy SSE to AVX transition penalty; this could happen in a reverse PInvoke situation.
Issue VZEROUPPER in the epilog if the method contains 256-bit AVX code, to avoid the AVX to legacy SSE transition penalty.

To limit the code size increase, we only issue VZEROUPPER before a PInvoke call to a user-defined function if the JIT method contains 256-bit AVX code, assuming the user-defined function contains legacy SSE code.
There is no need to issue VZEROUPPER after the PInvoke call, because the #2 SSE to AVX transition penalty won't happen once the #1 AVX to SSE transition has been taken care of before the PInvoke call.

We measured a ~1% to 3% performance gain on TechEmpower plaintext and verified that the VTune AVX/SSE events OTHER_ASSISTS.AVX_TO_SSE and OTHER_ASSISTS.SSE_TO_AVX have been reduced to 0.

Fix dotnet/coreclr#7240

move setContainsAVX flags to lower, refactor to a smaller method
refactor, fix typo in comments
fix format error

Commit migrated from dotnet/coreclr@cc169ea
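The placement rules above reduce to three small predicates over the two emitter flags. The sketch below is only a restatement of those rules as written in the commit message; the flag names come from the text, while the struct and function names are hypothetical and do not correspond to the JIT's actual emitter or codegen code.

```cpp
// Per-method AVX usage, mirroring the two emitter flags named in the commit text.
struct MethodAvxInfo {
    bool containsAVXInstruction;        // method contains 128-bit or 256-bit AVX code
    bool contains256bitAVXInstruction;  // method contains 256-bit AVX code
};

// Prolog: guards against the legacy SSE -> AVX penalty (e.g. on reverse PInvoke entry).
inline bool needsVzeroupperInProlog(const MethodAvxInfo& m) {
    return m.containsAVXInstruction;
}

// Epilog: guards against the AVX -> legacy SSE penalty in the caller.
inline bool needsVzeroupperInEpilog(const MethodAvxInfo& m) {
    return m.contains256bitAVXInstruction;
}

// Before a PInvoke: only for calls to user-defined functions, which are
// assumed to contain legacy SSE code, and only if 256-bit AVX was used.
inline bool needsVzeroupperBeforePInvoke(const MethodAvxInfo& m, bool targetIsUserDefined) {
    return targetIsUserDefined && m.contains256bitAVXInstruction;
}
```

A method that uses only 128-bit AVX would therefore get VZEROUPPER in its prolog but not in its epilog or before PInvoke calls, which keeps the code size increase limited to methods that can actually incur the AVX to SSE penalty.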