Description
Is there an existing issue for this?
- I have searched the existing issues
Describe the bug
I want to revive issue #43651, as we now have a repro case that fails basically every time for several apps.
The issue seems to happen when the CPU is under heavy load and many sites start at the same time. Some of them start successfully, but others stay stuck initializing in a zombie-like state: the processes exist, but the service can never serve requests again, and IIS does nothing to stop or reset them.
The problem arises for us specifically on servers that host a lot of sites (about 200 per box) with preload enabled. Some of them have a heavy initialization process, so every time we restart the server, CPU utilization maxes out for the first 10 minutes or so. After initialization is done, the CPU comes down to normal levels.
But at this point, we are left with a bunch of services that never came to life and are stuck loading aspnetcorev2_inprocess.dll, as explained in the previous issue. We noticed a pattern in the memory consumption of the zombie processes, so they are easy to spot once you know what to look for. In the repro case that I created, there is a "SlowStartWebApi" app, which represents an app that needs to do some initialization when starting (maxing out the CPU). Once the apps are initialized, each process is supposed to use around 8 MB of memory, while the dead ones sit at around 4 MB.
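The memory pattern described above could be checked with a small console program. This is only a rough sketch under assumptions: the 6 MB cut-off is a hypothetical midpoint between the two figures, `LooksStuck` is a made-up helper name, and `WorkingSet64` may not be the same counter the original screenshots showed, so the threshold will likely need tuning on a real box.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

// Hypothetical cut-off halfway between the ~8 MB (healthy) and ~4 MB (stuck) figures.
const long ThresholdBytes = 6L * 1024 * 1024;

// Hypothetical helper: a process is suspicious when it uses less memory than the cut-off.
static bool LooksStuck(long memoryBytes, long thresholdBytes) => memoryBytes < thresholdBytes;

// List every IIS worker process and flag the ones below the cut-off.
foreach (var process in Process.GetProcessesByName("w3wp").OrderBy(p => p.Id))
{
    using (process)
    {
        // Assumption: working set is a good enough proxy for the memory column in the screenshots.
        long bytes = process.WorkingSet64;
        double megabytes = bytes / (1024.0 * 1024.0);
        string flag = LooksStuck(bytes, ThresholdBytes) ? "  <-- suspected zombie" : string.Empty;
        Console.WriteLine($"PID {process.Id}: {megabytes:F1} MB{flag}");
    }
}
```

Run it from an elevated prompt on the server after the CPU has settled; a low reading is only a hint, so confirm by sending a request to the site in question.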

Description From old issue #43651 as a reference
We've recently converted one of our web applications to ASP.NET Core on .NET 6 (from ASP.NET MVC/WebAPI on .NET Framework 4.8) and the new version is slowly rolling out to our customer base. This is hosted in IIS. We've had this running in production for several weeks now with no problems... until now.
This morning one of those sites appears to be hanging and was not responding to web requests. Upon further investigation by taking a process dump of w3wp.exe and examining it, it appears that the ASP.NET Core Module had hung during application initialization.
Extremely strangely, this occurred identically across two different servers, each serving the same site. This affected application is also the only ASP.NET Core application on each of those servers.
The stack trace of the only thread actually doing any kind of work is:
ntdll.dll!NtWaitForSingleObject()
ntdll.dll!LdrpDrainWorkQueue()
ntdll.dll!LdrpLoadDllInternal()
ntdll.dll!LdrpLoadDll()
ntdll.dll!LdrLoadDll()
KERNELBASE.dll!LoadLibraryExW()
aspnetcorev2.dll!HandlerResolver::LoadRequestHandlerAssembly(const IHttpApplication & pApplication, const std::filesystem::path & shadowCopyPath, const ShimOptions & pConfiguration, std::unique_ptr<ApplicationFactory,std::default_delete<ApplicationFactory>> & pApplicationFactory, ErrorContext & errorContext) Line 111
at D:\a\_work\1\s\src\Servers\IIS\AspNetCoreModuleV2\AspNetCore\HandlerResolver.cpp(111)
aspnetcorev2.dll!HandlerResolver::GetApplicationFactory(const IHttpApplication & pApplication, const std::filesystem::path & shadowCopyPath, std::unique_ptr<ApplicationFactory,std::default_delete<ApplicationFactory>> & pApplicationFactory, const ShimOptions & options, ErrorContext & errorContext) Line 172
at D:\a\_work\1\s\src\Servers\IIS\AspNetCoreModuleV2\AspNetCore\HandlerResolver.cpp(172)
aspnetcorev2.dll!APPLICATION_INFO::TryCreateApplication(IHttpContext & pHttpContext, const ShimOptions & options, ErrorContext & error) Line 195
at D:\a\_work\1\s\src\Servers\IIS\AspNetCoreModuleV2\AspNetCore\applicationinfo.cpp(195)
aspnetcorev2.dll!APPLICATION_INFO::CreateApplication(IHttpContext & pHttpContext) Line 106
at D:\a\_work\1\s\src\Servers\IIS\AspNetCoreModuleV2\AspNetCore\applicationinfo.cpp(106)
aspnetcorev2.dll!APPLICATION_INFO::CreateHandler(IHttpContext & pHttpContext, std::unique_ptr<IREQUEST_HANDLER,IREQUEST_HANDLER_DELETER> & pHandler) Line 63
at D:\a\_work\1\s\src\Servers\IIS\AspNetCoreModuleV2\AspNetCore\applicationinfo.cpp(63)
aspnetcorev2.dll!ASPNET_CORE_PROXY_MODULE::OnExecuteRequestHandler(IHttpContext * pHttpContext, IHttpEventProvider * __formal) Line 103
at D:\a\_work\1\s\src\Servers\IIS\AspNetCoreModuleV2\AspNetCore\proxymodule.cpp(103)
iiscore.dll!NOTIFICATION_CONTEXT::RequestDoWork()
iiscore.dll!NOTIFICATION_CONTEXT::CallModulesInternal()
iiscore.dll!NOTIFICATION_CONTEXT::CallModules(int,unsigned long,long,unsigned long,class W3_CONTEXT_BASE *,class IHttpEventProvider *)
iiscore.dll!NOTIFICATION_MAIN::DoWork()
iiscore.dll!W3_CONTEXT_BASE::StartNotificationLoop(class NOTIFICATION_CONTEXT *,int)
iiscore.dll!APPLICATION_PRELOAD_PROVIDER::ExecuteRequest(class IHttpContext *,class IHttpUser *)
warmup.dll!DoApplicationPreload(class IGlobalApplicationPreloadProvider *)
iiscore.dll!W3_SERVER::GlobalNotify()
iiscore.dll!W3_SERVER::NotifyApplicationPreload(int)
iiscore.dll!IISCORE_PROTOCOL_MANAGER::PreloadApplication(unsigned long,unsigned short const *,int)
w3wphost.dll!W3WP_HOST::ProcessHttpPreloadApplications(int)
w3wphost.dll!W3WP_HOST::ProcessPreloadApplications(unsigned long)
w3wphost.dll!WP_IPM::AcceptMessage()
iisutil.dll!IPM_MESSAGE_PIPE::MessagePipeCompletion(void *,unsigned char)
ntdll.dll!RtlpTpWaitCallback()
ntdll.dll!TppExecuteWaitCallback()
ntdll.dll!TppWorkerThread()
kernel32.dll!BaseThreadInitThunk()
ntdll.dll!RtlUserThreadStart()
The path parameter being passed by `aspnetcorev2.dll`'s `LoadRequestHandlerAssembly` to `LoadLibrary` is `C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App\6.0.7\aspnetcorev2_inprocess.dll`.
The stack trace also suggests to me that this was happening in application preloading (we do have preload enabled).
I have a process dump available on request, if you have somewhere secure that I can upload it.
Expected Behavior
After the slow start, all sites are initialized correctly and able to serve requests.
Steps To Reproduce
I can provide a VM with everything already installed and failing every time there is a reset. I can also provide memory dumps of the failing processes if required as well.
Introduction to the repro project
Uploaded the basic project to https://github.com/jdmerinor/aspdotnetcorehangingreprocase
This contains 3 pieces:
- SlowStartWebApi, which represents the slow-starting service that consumes a good chunk of CPU. It is nothing special, just the typical .NET 6 example weather web API with these lines added to burn CPU before calling `app.Run();`:

      // stopwatch and count are declared elsewhere in the app (not shown in this excerpt)
      using var pacho = SHA512.Create();
      var buffer = Encoding.ASCII.GetBytes("sadfhasdhfhasdfklsadjhfklsdahfojhdsaf");
      while (stopwatch.Elapsed < TimeSpan.FromMinutes(1.5))
      {
          count--;
          pacho.ComputeHash(buffer); // hash repeatedly for 1.5 minutes to keep the CPU busy
          count++;
      }

- AppInstaller, which just uses `Microsoft.Web.Administration` to install all the sites in a convenient way (100 of them) with the following settings:

      var appPool = serverManager.ApplicationPools.Add(appPoolName);
      appPool.ProcessModel.LoadUserProfile = false;
      appPool.Recycling.PeriodicRestart.Time = TimeSpan.Zero; // no periodic recycling
      appPool.ManagedPipelineMode = ManagedPipelineMode.Integrated;
      appPool.StartMode = StartMode.AlwaysRunning; // start the worker process without waiting for a request
      appPool.ProcessModel.IdleTimeout = TimeSpan.Zero; // never shut down when idle
      appPool.Enable32BitAppOnWin64 = false;
      appPool.ManagedRuntimeVersion = string.Empty; // "No Managed Code"
      application.ApplicationPoolName = appPool.Name;
      application["preloadEnabled"] = true; //IMPORTANT FOR THE REPRODUCTION

- UIServer, which is just a normal .NET 6 example weather web API. I added this one because I wasn't sure whether reproducing the bug needed more than one app per site. I have the feeling it will probably fail without it, but I left it in for completeness.
Reproduction steps
I followed these steps to reproduce the issue:
- Download a clean Windows Server VHD from https://www.microsoft.com/en-us/evalcenter/download-windows-server-2019
- Add the IIS role and make sure the Application Initialization feature (under Application Development) is also installed.
- Install the .NET 6 hosting bundle from https://dotnet.microsoft.com/en-us/download/dotnet/thank-you/runtime-aspnetcore-6.0.11-windows-hosting-bundle-installer
- Put the apps' publish folders on the desktop (built with `dotnet publish --configuration Release`).
- Make sure IIS_IUSRS is added to the folder permissions.
- Make sure the server has a self-signed or valid HTTPS certificate for the sites (otherwise you will get connection refused when trying to connect to the apps over HTTPS).
- Use the app installer to add a bunch of apps (you might need to change the code to suit your server paths).
- Add hosts file entries so you can hit the different sites from within the server.
- Restart the server. (This will initially take a while because SlowStartWebApi is doing a bunch of hashing to max out the CPU.)
- Once the initialization is done, try calling some of the endpoints. You will find that some are not responding at all: they load forever and never recover. The broken ones are easy to spot because their w3wp.exe processes have very little memory allocated.
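The last step can be automated with a probe that requests each site with a timeout, so hung sites show up as timeouts instead of loading forever. This is a minimal sketch, not part of the repro project: the host names, the `/weatherforecast` path, the 30-second timeout, and the `ProbeAsync` helper are all assumptions to adapt to your own hosts file entries.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical URLs; replace them with the sites added to the hosts file.
string[] urls =
{
    "https://slowstart1.test/weatherforecast",
    "https://slowstart2.test/weatherforecast",
};

var handler = new HttpClientHandler
{
    // Accept the self-signed certificate from the repro steps. Only do this on a test box.
    ServerCertificateCustomValidationCallback =
        HttpClientHandler.DangerousAcceptAnyServerCertificateValidator
};

// Assumed timeout: a healthy site should answer well within 30 seconds once startup is over.
using var client = new HttpClient(handler) { Timeout = TimeSpan.FromSeconds(30) };

foreach (var url in urls)
{
    Console.WriteLine($"{url}: {await ProbeAsync(client, url)}");
}

// Hypothetical helper: returns a short description of how the site reacted.
static async Task<string> ProbeAsync(HttpClient client, string url)
{
    try
    {
        using var response = await client.GetAsync(url);
        return $"responded {(int)response.StatusCode}";
    }
    catch (TaskCanceledException)
    {
        // HttpClient reports an elapsed Timeout as a cancelled task.
        return "timed out (possibly stuck)";
    }
    catch (HttpRequestException ex)
    {
        return $"failed: {ex.Message}";
    }
}
```

Sites reported as timed out are the candidates to cross-check against the memory usage of their w3wp.exe processes.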
Exceptions (if any)
No response
.NET Version
6.0.11
Anything else?
No response