-
Notifications
You must be signed in to change notification settings - Fork 10.3k
Proxy scenario perf regression #9404
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
The first drop corresponds to when I changed the benchmarks to actually flow the headers. |
@stephentoub this benchmark is mostly about HttpClient under load, no SSL. Do you see anything in the changes I linked to that could explain the regression? |
Nothing jumps out at me. |
I will investigate then. |
After investigating, it looks like a regression between 3.0.0-preview-27408-1 -> 3.0.0-preview-27408-3 Microsoft.NetCore.App / Core FX Microsoft.NetCore.App / Core CLR Then I fetched both runtimes and tried to find which changed assemblies might cause this, and I can recover the previous performance by using Looking at the diff, there is this file that has changed:
Something I need to mention also is that after this runtime version is taken into account only this benchmarks is changed. Everything seems stable. Here is the code that the benchmark is running: |
/cc @filipnavara @janvorli as it seems to come from this commit: https://github.com/dotnet/coreclr/pull/22390/files I'd say that it's not because of this PR dotnet/coreclr@34d50b0 though they are pretty close but the diff starts on Feb 04. To be sure I could run a benchmark if someone could give me the System.Gloablization.Native.so that only reverts this commit. |
Are you absolutely positive about those commit lists? I'd expect dotnet/coreclr#22378 to be the primary suspect if System.Globalization.Native is involved. It's just one commit prior to the lists you linked. The change in https://github.com/dotnet/coreclr/pull/22390/files should be harmless unless the I'll try to reproduce the results on the weekend, but if you find anything out please share it here. |
I'm going through possible regressions from dotnet/coreclr#22378. This line is missing |
I have applied a delta to the current master that undoes the change made in dotnet/coreclr#22390 and shared the System.Globalization.Native.so with @sebastienros. He has verified that it fixes the perf regression. |
@filipnavara if you are not sure whether a change is beneficial or not, just build many versions and I can run the numbers for each build. Put them on //scratch2. |
@janvorli Wow. That's very surprising. I'll look at it again. The change itself doesn't fix any bug and it should have generally improved performance which makes it surprising that quite the opposite happened. If I don't come up with a good explanation I'll submit a PR to revert it. |
I am quite confident with the way I ran it, but there is always space for a human error (me). To be very confident feel free to give me versions with obvious regressions, even based on the reverted commit, to be sure I get the expected results. That would also make me fell more confident. Something I want to insist on also is that I just patch "this" file, no other one, and I am not sure if one or the other versions could actually trigger some behavior coming from a separate module. I can also share the full standalone application if you want to check which bits are measured. But you will see that only this file will have changed in the end. |
I don't doubt your findings. It must be some weird thing with the way the |
I'm getting a suspicion that the problem is actually the changed structure layout in that commit that causes the pthread mutex to be aligned differently. However that's kinda hard to measure. @janvorli Would it be possible for you to make a build that swaps these two lines: https://github.com/dotnet/coreclr/blob/42b5509b1b8f3be12876579aa23a3cb3ed2ca7a4/src/corefx/System.Globalization.Native/pal_collation.c#L45-L46 ? My only Linux build machines are in the cloud and it would take me a while to set it up properly. |
Sure, I will do that. |
@filipnavara that change had no effect. |
Thanks for testing it. I will submit a PR to revert the change and continue investigating on my Linux machine to see if it's anything I can reproduce. |
I reduced it to the following test case in plain C: https://gist.github.com/filipnavara/31791b216078706e0b465f7a103ff01f On macOS using |
How do you run measure the differences? I ran the code but it simply output |
@sebastienros by running it under
|
on my physical machines I get this:
|
A side question; if the regression is in Globalization; how is HttpClient using globalization? Wouldn't have thought it would be involved; though I may be wrong? i.e isn't it all mostly ascii formats and fairly invariant? |
It's not just HttpClient involved here... isn't this an ASP.NET site that in turn makes HttpClient requests (or HttpClient requests to an ASP.NET site)? |
Yeah; but it doesn't do anything particularly interesting, except this line...
Checking |
Made a PR for that class #9537 |
Thanks for checking (and fixing the incorrect usage of CurrentCulture)! |
Fixed, thanks @filipnavara @benaadams @janvorli And as a side note, we are now measuring on windows with the same hardware, and it's more than 200K RPS where Linux is at 80K. For some comparison, on the same proxy scenario HAProxy is at 110K RPS on linux with the same conditions. And we keep a 99th latency of 6ms. @NickCraver maybe something interesting for you here? Providing my numbers are correct, there is always room for errors. |
Uh oh!
There was an error while loading. Please reload this page.
Actually in CoreFx, but opening in aspnet until I get more information
Updated
3.0.0-preview-27408-1 -> 3.0.0-preview-27409-1
Microsoft.NetCore.App / Core FX
dotnet/corefx@c2bbb6a...e1c0e1d
Microsoft.NetCore.App / Core CLR
dotnet/coreclr@a51af08...4e5df11
Command lines:
The text was updated successfully, but these errors were encountered: