Set the initial spin count unit to 10 for DATAS. #105634
src/coreclr/gc/gc.cpp
Outdated
#ifdef MULTIPLE_HEAPS
#ifdef DYNAMIC_HEAP_COUNT
yp_spin_count_unit = 10;
Would it be better to set the low initial spin counts unconditionally for all flavors?
yeah would be ideal to set it unconditionally, unless there are other perf concerns.
Will do some more testing with SVR and a low initial yp spin count to see if there is any impact. Changed my mind and currently isolated it to just the DATAS case to minimize impact but will get more data.
Just took a look at more data and concluded that WKS and SVR don't exhibit the case that caused the regression, where the Yield Processor Factor isn't available before the first GC. We can possibly consider setting a default agnostic of flavor after doing more analysis on other benchmarks.
this is actually very different from what I tried. I just set the first few GC's spin count to 10 but kept the same values as soon changing the spin count should be done very carefully - I made some attempts before but didn't come up with something that worked well yet. the tiny benchmarks could show great improvement but real world scenarios are much more complicated. I would limit this PR to just fixing the startup problem.
Fixes #94076 - there will never be a full improvement vs. WKS but this gets us down from an 18.75% regression to a 6.25% one in terms of startup time.
The main issue here is that DATAS calls change_heap_count during the startup phase without having access to the yp_spin_count provided by SetYieldProcessorScalingFactor set by the YieldProcessor, and this delays the startup.
Initial Data
Ran multiple yp_spin_count configurations on a 28-core Windows machine (which also seemed to repro the issue):
Based on this, a YP Spin Count Unit of 10 seems to be a viable option.
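To illustrate the idea, here is a minimal, self-contained sketch of using a low initial spin count unit until the measured value arrives. It is not the actual gc.cpp code: `SpinConfig`, `on_measurement`, `spin_count_unit`, and `spin_iterations` are hypothetical names invented for this example, and the multiplier of 32 in the usage note is an arbitrary illustrative value.

```cpp
#include <cstdint>

// Value proposed in this PR for the initial spin count unit.
constexpr uint32_t kInitialSpinCountUnit = 10;

// Hypothetical stand-in for the GC's spin configuration (not the real gc.cpp types).
struct SpinConfig {
    bool has_measurement = false;   // false until the runtime reports a measured value
    uint32_t measured_unit = 0;     // only meaningful once has_measurement is true

    // Hypothetical analogue of the point where the runtime hands the
    // measured spin count unit to the GC.
    void on_measurement(uint32_t unit) {
        measured_unit = unit;
        has_measurement = true;
    }

    // Fall back to the low initial unit if a GC (or heap count change)
    // happens before any measurement has been received.
    uint32_t spin_count_unit() const {
        if (has_measurement) {
            return measured_unit;
        }
        return kInitialSpinCountUnit;
    }
};

// Number of spin iterations for a given multiplier (illustrative only).
inline uint64_t spin_iterations(const SpinConfig& cfg, uint32_t multiplier) {
    return static_cast<uint64_t>(cfg.spin_count_unit()) * multiplier;
}
```

With this sketch, a caller that spins before any measurement is received uses the unit of 10 (for example, `spin_iterations(cfg, 32)` yields 320), and switches to the measured unit as soon as `on_measurement` has been called.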
Performance Results
Across 4 iterations of the Stage1Aot scenario on a 28-core machine, we observed no regressions (all within a percentage diff of < 2%).
Of all the ASP.NET benchmarks, we found that Stage1Aot is the only one where we trigger a GC before the YieldProcessor measurement is received by the GC. We found this by observing the timestamps of:
The perfview args to get the appropriate events are:
/BufferSizeMB:256 /StackCompression /KernelEvents:Process /ClrEvents:GC,Threading /Providers:"Benchmarks" /NoGui /NoNGenRundown