.NET Debugging Demos Lab 1: Hang- Walkthrough
Earlier this week I published the first .NET Debugging Lab. If you haven’t tried it out yet, here are the instructions.
This is a walkthrough of the lab with answers to the questions - based on results from my machine
Reproduce the problem
-
Browse to Featured Products (https://localhost:44350/Products/Featured).
This should take about 5 seconds to show, you can see the start time and execution time the bottom of the page.
-
Open up 5 browsers, all browsing to this site and refresh them simultaneously
Note the execution time for each of them and make sure that the start time is pretty much the same on all (otherwise you probably didn’t run the reg file)
Questions:
- What are the execution times?
5.13s, 8.25s, 11.532s, 15.206s
- What is the CPU usage of the w3wp.exe process when reproducing the problem? High or low CPU?
Very low CPU usage, 0-5%
- What are some potential reasons for a hang with these symptoms?
Since request times are increasing and CPU usage is low, it looks like we are waiting for a common resource that can only be used by one request at a time, and we are likely blocked waiting for something external to the process as we don’t have high CPU.
Get a memory dump
-
Start a command window and browse to your debuggers directory. Type the following command to prepare taking the dump but don’t hit enter quite yet.
procdump64.exe -ma iisexpress.exe
or with dotnet-dump
dotnet-dump -n iisexpress.exe
-
Reproduce the problem either by browsing with 5 browsers as you did before or by stressing the site with tinyget with the following command line
.\ToolsAndScripts\tinyget.ps1 -url https://localhost:44350/Products/Featured -numTimes 100
-
Hit enter in the
procdump
/dotnet-dump
command window to take the memory dump while the requests are still executing.
Questions:
- What triggers the generation of the memory dump?
This is a bit of a trick question. We trigger the dump capture with the command. There are other switches to procdump and dotnet-dump that will capture dumps on crashes, exceptions, CPU usage or hanging windows.
- What permissions do you need to take a memory dump of a process?
You need to be the owner of the process, or an admin. You don’t need to have access to the code though, so with dump debugging you can debug a process, and read stacks of processes that you don’t have the code for.
- What is the difference between running with
-ma
or without?-ma creates a mini dump with all options. This gives you the full user mode memory space, so you can browse through all the .net objects in the process, all the stacks etc. Without -ma you will get a mini dump. They are much smaller in size and only give you access to the memory immediately used by the stacks.
Open the dump in Windbg
- Open windbg and open the memory dump (.dmp file) with File/Open Crash dump.
- Set up the symbol path (see Information and Setup Instructions for more info)
- Load sos (see Information and Setup Instructions for more info)
Examine the stacks
-
Examine the native call stacks
~* kb 2000
-
Examine the .net call stacks
~* e !clrstack
Questions:
- Do you see any patterns or recognize any of the call stacks that suggests a thread is waiting for a synchronization mechanism?
I see a lot of threads with this native stack
38 Id: 60f0.a2c0 Suspend: 0 Teb: 000000f6`e7630000 Unfrozen "" # Child-SP RetAddr Call Site 00 000000f6`edc4dac8 00007ffc`7e0b0d40 ntdll!NtWaitForMultipleObjects+0x14 01 000000f6`edc4dad0 00007ffb`eeba6fb6 KERNELBASE!WaitForMultipleObjectsEx+0xf0 02 (Inline Function) --------`-------- coreclr!Thread::DoAppropriateAptStateWait+0x44 03 000000f6`edc4ddc0 00007ffb`eeba6d35 coreclr!Thread::DoAppropriateWaitWorker+0x242 04 000000f6`edc4dec0 00007ffb`eeba6355 coreclr!Thread::DoAppropriateWait+0x89 05 (Inline Function) --------`-------- coreclr!CLREventBase::WaitEx+0x53 06 000000f6`edc4df40 00007ffb`eeba60e9 coreclr!CLREventBase::Wait+0x59 07 000000f6`edc4df90 00007ffb`eeba5efa coreclr!AwareLock::EnterEpilogHelper+0x1d5 08 000000f6`edc4e0f0 00007ffb`eeba5eaa coreclr!AwareLock::EnterEpilog+0x4a 09 000000f6`edc4e150 00007ffb`eeba8851 coreclr!AwareLock::Enter+0xb2 0a (Inline Function) --------`-------- coreclr!SyncBlock::EnterMonitor+0x8 0b (Inline Function) --------`-------- coreclr!ObjHeader::EnterObjMonitor+0xd 0c (Inline Function) --------`-------- coreclr!Object::EnterObjMonitor+0x16 0d 000000f6`edc4e180 00007ffb`eeba89a0 coreclr!JIT_MonEnter_Helper+0x145 0e 000000f6`edc4e330 00007ffb`8f868c50 coreclr!JIT_MonReliableEnter_Portable+0xe0 0f 000000f6`edc4e370 00007ffb`8f867f6b 0x00007ffb`8f868c50 10 000000f6`edc4e450 00007ffb`8f7f132d 0x00007ffb`8f867f6b 11 000000f6`edc4e550 00007ffb`8faf9cf1 0x00007ffb`8f7f132d 12 000000f6`edc4e580 00007ffb`8faefc1e 0x00007ffb`8faf9cf1
and this .net stack
OS Thread Id: 0x419c (70) Child SP IP Call Site 000000F68B36E008 00007ffc8036cb04 [GCFrame: 000000f68b36e008] 000000F68B36E178 00007ffc8036cb04 [GCFrame: 000000f68b36e178] 000000F68B36E1C8 00007ffc8036cb04 [HelperMethodFrame_1OBJ: 000000f68b36e1c8] System.Threading.Monitor.ReliableEnter(System.Object, Boolean ByRef) 000000F68B36E320 00007ffb8f868c50 BuggyBits.Models.DataLayer.GetFeaturedProducts() 000000F68B36E400 00007ffb8f867f6b BuggyBits.Controllers.ProductsController.Featured() 000000F68B36E500 00007ffb8f7f132d SOS Warning: Loading symbols for dynamic assemblies is not yet supported DynamicClass.lambda_method(System.Runtime.CompilerServices.Closure, System.Object, System.Object[]) 000000F68B36E530 00007ffb8faf9cf1 Microsoft.AspNetCore.Mvc.Infrastructure.ActionMethodExecutor+SyncActionResultExecutor.Execute(Microsoft.AspNetCore.Mvc.Infrastructure.IActionResultTypeMapper, Microsoft.Extensions.Internal.ObjectMethodExecutor, System.Object, System.Object[]) 000000F68B36E570 00007ffb8faefc1e Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.InvokeActionMethodAsync() [/_/src/Mvc/Mvc.Core/src/Infrastructure/ControllerActionInvoker.cs @ 400] 000000F68B36E5F0 00007ffb8fad3e65 Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Next(State ByRef, Scope ByRef, System.Object ByRef, Boolean ByRef) 000000F68B36E670 00007ffb8fae1ef3 Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.InvokeNextActionFilterAsync() [/_/src/Mvc/Mvc.Core/src/Infrastructure/ControllerActionInvoker.cs @ 338] 000000F68B36E6F0 00007ffb8fad3d92 Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Next(State ByRef, Scope ByRef, System.Object ByRef, Boolean ByRef) 000000F68B36E770 00007ffb8fae1ef3 Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.InvokeNextActionFilterAsync() [/_/src/Mvc/Mvc.Core/src/Infrastructure/ControllerActionInvoker.cs @ 338] 000000F68B36E7F0 00007ffb8faf9c5c Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.InvokeNextActionFilterAwaitedAsync() [/_/src/Mvc/Mvc.Core/src/Infrastructure/ControllerActionInvoker.cs @ 356] 000000F68B36E830 00007ffb8fc8118d Microsoft.AspNetCore.Mvc.Controller.OnActionExecutionAsync(Microsoft.AspNetCore.Mvc.Filters.ActionExecutingContext, Microsoft.AspNetCore.Mvc.Filters.ActionExecutionDelegate) [/_/src/Mvc/Mvc.ViewFeatures/src/Controller.cs @ 350] 000000F68B36E870 00007ffb8fad3b81 Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Next(State ByRef, Scope ByRef, System.Object ByRef, Boolean ByRef) 000000F68B36E8F0 00007ffb8faeb4da Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.InvokeInnerFilterAsync() [/_/src/Mvc/Mvc.Core/src/Infrastructure/ControllerActionInvoker.cs @ 490] 000000F68B36E970 00007ffb8fadb090 Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.Next(State ByRef, Scope ByRef, System.Object ByRef, Boolean ByRef) 000000F68B36EA10 00007ffb8fc80e13 Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.InvokeNextResourceFilter() [/_/src/Mvc/Mvc.Core/src/Infrastructure/ResourceInvoker.cs @ 976] 000000F68B36EA90 00007ffb8fadab9e Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.Next(State ByRef, Scope ByRef, System.Object ByRef, Boolean ByRef) 000000F68B36EB30 00007ffb8fafd2da Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.InvokeFilterPipelineAsync() [/_/src/Mvc/Mvc.Core/src/Infrastructure/ResourceInvoker.cs @ 240] 000000F68B36EBB0 00007ffb8fc85b66 Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.InvokeAsync() [/_/src/Mvc/Mvc.Core/src/Infrastructure/ResourceInvoker.cs @ 188] 000000F68B36EC20 00007ffb8faee772 Microsoft.AspNetCore.Mvc.Routing.ActionEndpointFactory+c__DisplayClass7_0.b__0(Microsoft.AspNetCore.Http.HttpContext) [/_/src/Mvc/Mvc.Core/src/Routing/ActionEndpointFactory.cs @ 429] 000000F68B36ECC0 00007ffb8fafd0e2 Microsoft.AspNetCore.Routing.EndpointMiddleware.Invoke(Microsoft.AspNetCore.Http.HttpContext) [/_/src/Http/Routing/src/EndpointMiddleware.cs @ 86] 000000F68B36ED30 00007ffb8faed738 Microsoft.AspNetCore.Authorization.AuthorizationMiddleware+d__5.MoveNext() 000000F68B36EDE0 00007ffb8faed034 System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[Microsoft.AspNetCore.Authorization.AuthorizationMiddleware+d__5, Microsoft.AspNetCore.Authorization.Policy]](d__5 ByRef) [/_/src/System.Private.CoreLib/shared/System/Runtime/CompilerServices/AsyncMethodBuilder.cs @ 1036] 000000F68B36EE40 00007ffb8faecf80 Microsoft.AspNetCore.Authorization.AuthorizationMiddleware.Invoke(Microsoft.AspNetCore.Http.HttpContext) 000000F68B36EEE0 00007ffb8fc8057e Microsoft.AspNetCore.StaticFiles.StaticFileMiddleware.Invoke(Microsoft.AspNetCore.Http.HttpContext) [/_/src/Middleware/StaticFiles/src/StaticFileMiddleware.cs @ 93] 000000F68B36EF40 00007ffb8faf7985 Microsoft.AspNetCore.HttpsPolicy.HttpsRedirectionMiddleware.Invoke(Microsoft.AspNetCore.Http.HttpContext) [/_/src/Middleware/HttpsPolicy/src/HttpsRedirectionMiddleware.cs @ 114] 000000F68B36EFB0 00007ffb8fae728a Microsoft.AspNetCore.Diagnostics.DeveloperExceptionPageMiddleware+d__9.MoveNext() 000000F68B36F030 00007ffb8faf7884 System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[Microsoft.AspNetCore.Diagnostics.DeveloperExceptionPageMiddleware+d__9, Microsoft.AspNetCore.Diagnostics]](d__9 ByRef) [/_/src/System.Private.CoreLib/shared/System/Runtime/CompilerServices/AsyncMethodBuilder.cs @ 1036] 000000F68B36F090 00007ffb8faf8bed Microsoft.AspNetCore.Diagnostics.DeveloperExceptionPageMiddleware.Invoke(Microsoft.AspNetCore.Http.HttpContext) 000000F68B36F100 00007ffb8fc8ad3d Microsoft.AspNetCore.HostFiltering.HostFilteringMiddleware.Invoke(Microsoft.AspNetCore.Http.HttpContext) [/_/src/Middleware/HostFiltering/src/HostFilteringMiddleware.cs @ 75] 000000F68B36F140 00007ffb8faea8e7 Microsoft.AspNetCore.Server.IIS.Core.IISHttpContextOfT`1+d__2[[System.__Canon, System.Private.CoreLib]].MoveNext() 000000F68B36F1B0 00007ffb8fc867fc System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[Microsoft.AspNetCore.Server.IIS.Core.IISHttpContextOfT`1+d__2[[System.__Canon, System.Private.CoreLib]], Microsoft.AspNetCore.Server.IIS]](d__2<System.__Canon> ByRef) [/_/src/System.Private.CoreLib/shared/System/Runtime/CompilerServices/AsyncMethodBuilder.cs @ 1036] 000000F68B36F220 00007ffb8fc866c7 Microsoft.AspNetCore.Server.IIS.Core.IISHttpContextOfT`1[[System.__Canon, System.Private.CoreLib]].ProcessRequestAsync() 000000F68B36F2A0 00007ffb8faebe66 Microsoft.AspNetCore.Server.IIS.Core.IISHttpContext+d__165.MoveNext() 000000F68B36F340 00007ffb8fc84784 System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[Microsoft.AspNetCore.Server.IIS.Core.IISHttpContext+d__165, Microsoft.AspNetCore.Server.IIS]](d__165 ByRef) [/_/src/System.Private.CoreLib/shared/System/Runtime/CompilerServices/AsyncMethodBuilder.cs @ 1036] 000000F68B36F3A0 00007ffb8fc88e38 Microsoft.AspNetCore.Server.IIS.Core.IISHttpContext.HandleRequest() 000000F68B36F3F0 00007ffb8fc88dd9 Microsoft.AspNetCore.Server.IIS.Core.IISHttpContext.Execute() [/_/src/Servers/IIS/IIS/src/Core/IISHttpContext.cs @ 605] 000000F68B36F420 00007ffb8fadcb08 System.Threading.ThreadPoolWorkQueue.Dispatch() [/_/src/System.Private.CoreLib/shared/System/Threading/ThreadPool.cs @ 699] 000000F68B36F8A0 00007ffbeec46ba3 [DebuggerU2MCatchHandlerFrame: 000000f68b36f8a0]
So we can see from this stack (reading from the bottom up, just like in Visual Studio) that the
ProductsController.Featured()
callsDataLayer.GetFeaturedProducts()
and then tries to get a lock - so something like thisvoid GetFeaturedProducts(){ ... lock(...){ // <-- This is where we are stuck // do something } }
Troubleshoot the hang
-
Determine the ID of the thread owning the lock
!syncblk
- What thread owns the lock?
My !syncblk reports this
0:000> !syncblk Index SyncBlock MonitorHeld Recursion Owning Thread Info SyncBlock Owner 38 000002021C41E2E8 59 1 000002021C7D1C50 3c94 51 000001ff908f1660 System.Object ----------------------------- Total 64 CCW 1 RCW 1 ComClassFactory 0 Free 16
The thread that owns the lock is Thread 51 - so this thread has entered the do something part of the code above
-
How many threads are waiting for the lock?
Hint: MonitorHeld = 1 for each owner and 2 for each waiter. For us this means (59-1)/2 = 29 - which also jives well with the number I get if I run
.shell -ci "~* e !clrstack" FIND /C "Monitor.ReliableEnter"
- What thread owns the lock?
-
Pick one of the waiters (Hint: waiters will sit in
AwareLock::Enter
) and take a look at what it is doing.~5s (move to thread 5, replace 5 with actual thread ID) kb 2000 (examine native stack) !clrstack (examine .net stack)
- In which .net function is it waiting for the lock?
DataLayer.GetFeaturedProduct()
- In which .net function is it waiting for the lock?
-
Determine what the owning thread is doing
~5s (move to thread 5, replace 5 with actual thread ID) kb 2000 (examine native stack) !clrstack (examine .net stack)
- Why is it blocking?
Thread 51 seems to be sitting in a
Thread.SleepInternal()
- blocking while holding a lock… this is just here to demonstrate the issue, normally you would block waiting for an API call or a database or similar.
- Why is it blocking?
-
Examine the code for .NET method owning the lock to verify your theory.
As expected GetFeaturedProducts() takes a lock and then sits in a sleep for 5 seconds, this is of course a place holder for a long running operation, but a sleep is easier to use to get a consistent repro in a demo. The resolution here will of course depend largely on the application but it would include looking into how long this operation is supposed to be and finding out if threads are blocking more often than expected, and after review finding ways to make sure we lock around as small a portion of code as possible to avoid having lots of requests waiting for a single resource.
Hope you found the lab interesting, I will try to get the next lab (a crash scenario) out by the end of the week.
Feel free to add your own comments if you think I missed something in my answers or if your answers differ from mine (other than thread ids, number of waiters etc.) or if you didn’t like the lab format at all and think it would be better to have more or for that matter less details in the instructions.
Have fun, Tess