ASP.NET Hang and OutOfMemory exceptions caused by STA components
I have talked about blocked finalizers before and this issue is a special case of blocked finalizers where we are not really finalizing a .NET object but rather cleaning up all Com Callable Wrappers (CCWs) and getting stuck doing that.
Problem description
The situation we are facing is that the ASP.NET process starts to go hey wire both in response times and in memory usage. All of a sudden memory just starts growing and growing and we need to figure out why.
Troubleshooting
We took a memory dump when memory usage was high (~837 MB private bytes which we could see from the dump size) with adplus -hang -pn w3wp.exe
, but really the dump could have just been taken anytime after the memory started rapidly increasing.
Around ~469 MB of the memory are pure .net objects (as we can see from GC Heap Size in !eeheap -gc
) and the large object segments are minimal so there isn’t a whole lot of large objects…
0:015> !eeheap -gc
Number of GC Heaps: 2
------------------------------
Heap 0 (000db128)
generation 0 starts at 0x33bc0c98
generation 1 starts at 0x338bc070
generation 2 starts at 0x02410038
ephemeral segment allocation context: none
segment begin allocated size reserved
00176d48 7b451688 7b467f9c 0x00016914(92,436) 00003000
000e96e0 7a721784 7a74248c 0x00020d08(134,408) 00003000
000e1ee0 790d6358 790f5800 0x0001f4a8(128,168) 00003000
02410000 02410038 0640f130 0x03fff0f8(67,105,016) 01fdf000
17d80000 17d80038 1bd69da8 0x03fe9d70(67,018,096) 0000e000
2a070000 2a070038 2e063b38 0x03ff3b00(67,058,432) 00009000
321d0000 321d0038 33caad50 0x01adad18(28,159,256) 01cb6000
Large object heap starts at 0x0a410038
segment begin allocated size reserved
0a410000 0a410038 0a4234f0 0x000134b8(79,032) 01fdf000
Heap Size 0xdb215fc(229,774,844)
------------------------------
Heap 1 (000dc138)
generation 0 starts at 0x38146e28
generation 1 starts at 0x37e5033c
generation 2 starts at 0x06410038
ephemeral segment allocation context: none
segment begin allocated size reserved
06410000 06410038 0a40ee00 0x03ffedc8(67,104,200) 01fdf000
1bd80000 1bd80038 1fce8e6c 0x03f68e34(66,489,908) 00077000
26070000 26070038 2a05b85c 0x03feb824(67,024,932) 03ff7000
36510000 36510038 38a8cc68 0x0257cc30(39,308,336) 01a7c000
Large object heap starts at 0x0c410038
segment begin allocated size reserved
0c410000 0c410038 0c410048 0x00000010(16) 01fff000
Heap Size 0xe4d0060(239,927,392)
------------------------------
GC Heap Size 0x1bff165c(469,702,236)
If we dump out the managed heaps with !dumpheap -stat
the most memory consuming objects are shown below.
...
0ec99d1c 38,575 617,200 Oracle.DataAccess.Client.OracleParameterCollection
68a35670 11,135 623,560 System.Web.UI.WebControls.HiddenField
0ec693f4 848 630,912 ASP.aspx_mypage_aspx
7a742f30 37,751 755,020 System.ComponentModel.EventHandlerList+ListEntry
68a652e8 73,493 881,916 System.Web.UI.WebControls.BorderStyle
7a753288 55,789 892,624 System.Collections.Specialized.NameObjectCollectionBase+NameObjectEntry
653c5b14 15,019 961,216 System.Data.DataRow
0ec96d64 34,834 975,352 Oracle.DataAccess.Client.MetaData
79124670 3,936 985,932 System.Char[]
68a2c690 14,989 1,019,252 System.Web.UI.HtmlControls.HtmlTableRow
653c3e94 7,703 1,140,044 System.Data.DataColumn
653d61c0 1,137 1,177,932 System.Data.RBTree`1+Node[[System.Data.DataRow, System.Data]][]
0ec9bd4c 36,872 1,179,904 Oracle.DataAccess.Client.OracleRowUpdatingEventHandler
10fa17dc 13,293 1,329,300 MyControls.MyCheckBox
791052bc 117,882 1,414,584 System.Int16
68a37884 17,827 1,426,160 System.Web.UI.WebControls.Label
79104f64 151,695 1,820,340 System.Boolean
68a3e7a0 155,042 1,860,504 System.Web.UI.WebControls.FontInfo
0ec634fc 25,512 2,040,960 MyControls.MyTextBox
68a7d910 26,604 2,128,320 System.Web.UI.WebControls.Button
68a2f704 105,846 2,540,304 System.Web.UI.CssStyleCollection
0ec96c84 98,136 2,747,808 Oracle.DataAccess.Client.ColMetaRef
68a2f694 202,158 3,234,528 System.Web.UI.AttributeCollection
79103b6c 75,803 3,335,332 System.Threading.ReaderWriterLock
68a2f7fc 94,486 3,401,496 System.Web.UI.ControlCollection
0ec9ae14 36,872 3,687,200 Oracle.DataAccess.Client.OracleDataAdapter
68a6423c 150,531 4,816,992 System.Web.UI.WebControls.FontUnit
68a2c918 85,276 5,457,664 System.Web.UI.HtmlControls.HtmlTableCell
0ec6322c 69,949 5,595,920 MyControls.MyLabel
0ec9b404 36,872 5,604,544 Oracle.DataAccess.Client.OracleCommandBuilder
68a31bdc 476,216 5,714,592 System.Web.UI.IndexedString
68a884a0 96,876 5,812,560 System.Web.UI.LiteralControl
68a7ca28 365,718 5,851,488 System.Web.UI.Pair
7ae75d7c 252,629 6,063,096 System.Drawing.Color
68a85ea0 392,440 6,279,040 System.Web.UI.StateBag
0ec9975c 40,043 6,406,880 Oracle.DataAccess.Client.OracleCommand
68a7745c 159,338 7,010,872 System.Web.UI.WebControls.Style
790fea70 139,382 7,805,392 System.Collections.Hashtable
68a92e2c 177,707 7,819,108 System.Web.UI.Control+OccasionalFields
68a86004 331,329 7,951,896 System.Web.UI.WebControls.Unit
790fed1c 698,227 8,378,724 System.Int32
791036b0 401,277 9,630,648 System.Collections.ArrayList
000daa58 296 10,030,596 Free
7a747bac 378,430 10,596,040 System.Collections.Specialized.ListDictionary
7a747ad4 608,586 12,171,720 System.Collections.Specialized.HybridDictionary
7a747c78 837,066 16,741,320 System.Collections.Specialized.ListDictionary+DictionaryNode
79124418 4,555 17,726,104 System.Byte[]
0ec9a664 165,910 17,918,280 Oracle.DataAccess.Client.OracleParameter
68a131b4 1,758,380 28,134,080 System.Web.UI.StateItem
79124228 705,459 31,266,188 System.Object[]
791242ec 142,053 43,835,496 System.Collections.Hashtable+bucket[]
790fa3e0 1,032,738 116,787,020 System.String
Total 11,377,657 objects, Total size: 469,685,264
So here, most of the memory is going to either data related items, UI related items or generic objects. In short, let’s say we got rid of all the ASP.aspx_mypage_aspx
objects for example, then it is very likely that we would get rid of most of the UI related items. And if we got rid of all the DataSets in the dump we would probably get rid of all or most of the data related items.
If I focus on the aspx pages and dump them out, and then !gcroot
a few of those aspx pages to see where they are rooted we get this.
0:000> !dumpheap -mt 0ec693f4
Using our cache to search the heap.
Address MT Size Gen
0251f0ec 0ec693f4 744 2 ASP.aspx_mypage_aspx
02555c90 0ec693f4 744 2 ASP.aspx_mypage_aspx
0259cfc0 0ec693f4 744 2 ASP.aspx_mypage_aspx
025d3ca0 0ec693f4 744 2 ASP.aspx_mypage_aspx
0260a85c 0ec693f4 744 2 ASP.aspx_mypage_aspx
02641294 0ec693f4 744 2 ASP.aspx_mypage_aspx
02677f44 0ec693f4 744 2 ASP.aspx_mypage_aspx
026aeaf4 0ec693f4 744 2 ASP.aspx_mypage_aspx
0270622c 0ec693f4 744 2 ASP.aspx_mypage_aspx
...
0:015> !gcroot 0251f0ec
Note: Roots found on stacks may be false positives. Run "!help gcroot" for
more info.
Scan Thread 15 OSTHread a28
Scan Thread 19 OSTHread 43c
Scan Thread 20 OSTHread b94
Scan Thread 21 OSTHread 15c
Scan Thread 13 OSTHread ce8
Scan Thread 22 OSTHread 750
Scan Thread 5 OSTHread d3c
Scan Thread 24 OSTHread 234
Scan Thread 7 OSTHread dc0
Scan Thread 25 OSTHread c28
Scan Thread 26 OSTHread 528
Scan Thread 8 OSTHread aec
Scan Thread 6 OSTHread 754
Scan Thread 27 OSTHread bc8
Scan Thread 28 OSTHread fb4
Scan Thread 29 OSTHread 20c
Scan Thread 31 OSTHread 88c
Finalizer queue:Root: 0251f0ec(ASP.aspx_mypage_aspx)
Finalizer queue:Root: 02523b0c(MyControls.MyTextBox)->
02523aa8(MyControls.MyWebTextBox)->
0251f0ec(ASP.aspx_mypage_aspx)
Finalizer queue:Root: 025243e0(MyControls.MyClientEvents)->
02523aa8(MyControls.MyWebTextBox)->
0251f0ec(ASP.aspx_mypage_aspx)
Finalizer queue:Root: 02524d88(MyControls.MyTextBox)->
02524d24(MyControls.MyWebTextBox)->
0251f0ec(ASP.aspx_mypage_aspx)
Finalizer queue:Root: 0252565c(MyControls.MyClientEvents)->
02524d24(MyControls.MyWebTextBox)->
0251f0ec(ASP.aspx_mypage_aspx)
The page is rooted in the finalizer queue, both by itself and as a member variable to a MyWebTextBox object (probably the _parent member variable).
This means two things
- The finalizer is probably blocked, otherwise items usually won’t be rooted in the Finalizer queue long enough for us to catch it in a dump
- The
ASP.aspx_mypage_aspx
, theMyTextBox
and theMyClientEvents
classes all have finalizers or derive from classes that have finalizers which seems very odd and is probably unnecessary. This article is a good resource for understanding how the finalizer/dispose pattern should be implemented, and this blog post explains why unnecessary finalizers are bad (under the topic “What can make us spend a lot of our time in GC?”)
If the finalizer is blocked no objects can be finalized which will lead to rapid memory growth, so a blocked finalizer will most definitely explain our problem. The reason these objects (UI and Data objects) don’t go away then is of course because they would have to be finalized to be garbage collected.
So, let’s have a look at the finalizer thread
0:015> !threads
ThreadCount: 27
UnstartedThread: 0
BackgroundThread: 17
PendingThread: 0
DeadThread: 10
Hosted Runtime: no
PreEmptive GC Alloc Lock
ID OSID ThreadOBJ State GC Context Domain Count APT Exception
15 1 a28 000d75f0 1808220 Enabled 3823b58c:3823bb08 000d3fe8 0 STA (Threadpool Worker)
19 2 43c 000dd5f0 b220 Enabled 00000000:00000000 000d3fe8 0 MTA (Finalizer)
20 3 b94 000f20b0 80a220 Enabled 00000000:00000000 000d3fe8 0 MTA (Threadpool Completion Port)
21 4 15c 000f5318 1220 Enabled 00000000:00000000 000d3fe8 0 Ukn
13 5 ce8 0014c3f0 880a220 Enabled 00000000:00000000 000d3fe8 0 MTA (Threadpool Completion Port)
22 6 750 00122f70 880b220 Enabled 00000000:00000000 000d3fe8 0 MTA (Threadpool Completion Port)
5 7 d3c 00160890 220 Enabled 00000000:00000000 000d3fe8 0 Ukn
24 8 234 0f8a90d8 180b220 Enabled 3816cea0:3816ee28 000d3fe8 0 MTA (Threadpool Worker)
7 a dc0 0f8ae008 220 Enabled 00000000:00000000 000d3fe8 0 Ukn
XXXX b 0 0015d0c8 1801820 Enabled 00000000:00000000 000d3fe8 0 Ukn (Threadpool Worker)
XXXX c 0 115a0f68 1801820 Enabled 00000000:00000000 000d3fe8 0 Ukn (Threadpool Worker)
XXXX d 0 11dadf88 1801820 Enabled 00000000:00000000 000d3fe8 0 Ukn (Threadpool Worker)
XXXX e 0 13168018 1801820 Enabled 00000000:00000000 000d3fe8 0 Ukn (Threadpool Worker)
25 f c28 1441b018 180b220 Enabled 3823bde4:3823db08 000d3fe8 0 MTA (Threadpool Worker)
XXXX 10 0 1352b3b8 1801820 Enabled 00000000:00000000 000d3fe8 0 Ukn (Threadpool Worker)
26 11 528 1433d008 220 Enabled 00000000:00000000 000d3fe8 0 Ukn
8 12 aec 14416bc0 220 Enabled 00000000:00000000 000d3fe8 0 Ukn
6 13 754 1444b828 220 Enabled 00000000:00000000 000d3fe8 0 Ukn
27 14 bc8 14526248 220 Enabled 00000000:00000000 000d3fe8 0 Ukn
28 15 fb4 145cb220 220 Enabled 00000000:00000000 000d3fe8 0 Ukn
XXXX 16 0 1461a008 1801820 Enabled 00000000:00000000 000d3fe8 0 Ukn (Threadpool Worker)
29 17 20c 147abac8 220 Enabled 00000000:00000000 000d3fe8 0 Ukn
XXXX 18 0 164a02e8 1801820 Enabled 00000000:00000000 000d3fe8 0 Ukn (Threadpool Worker)
XXXX 19 0 230be290 1801820 Enabled 00000000:00000000 000d3fe8 0 Ukn (Threadpool Worker)
XXXX 9 0 0f8a99a8 1801820 Enabled 00000000:00000000 000d3fe8 0 Ukn (Threadpool Worker)
XXXX 1a 0 227c2f18 1801820 Enabled 00000000:00000000 000d3fe8 0 Ukn (Threadpool Worker)
31 1b 88c 227d1c98 220 Enabled 00000000:00000000 000d3fe8 0 Ukn
0:015> ~19s
eax=01eff518 ebx=00099cd0 ecx=01eff518 edx=7743345e esi=00000740 edi=00000000
eip=7c97ed54 esp=01eff254 ebp=01eff2c4 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
ntdll!KiFastSystemCallRet:
7c97ed54 c3 ret
0:019> kb 2000
ChildEBP RetAddr
01eff250 7c972124 ntdll!KiFastSystemCallRet
01eff254 7c82baa8 ntdll!NtWaitForSingleObject+0xc
01eff2c4 7c82ba12 kernel32!WaitForSingleObjectEx+0xac
01eff2d8 774854ef kernel32!WaitForSingleObject+0x12
01eff2f4 77549905 ole32!GetToSTA+0x6f
01eff314 77547ed7 ole32!CRpcChannelBuffer::SwitchAptAndDispatchCall+0xcb
01eff3f4 77455349 ole32!CRpcChannelBuffer::SendReceive2+0xc1
01eff460 77484eee ole32!CAptRpcChnl::SendReceive+0xab
01eff4b4 77c8127e ole32!CCtxComChnl::SendReceive+0x91
01eff4d0 77c813ca rpcrt4!NdrProxySendReceive+0x43
01eff8b8 77c811bd rpcrt4!NdrClientCall2+0x206
01eff8d8 77bf3a12 rpcrt4!ObjectStublessClient+0x8b
01eff8e8 77484c23 rpcrt4!ObjectStubless+0xf
01eff970 774d28f3 ole32!CObjectContext::InternalContextCallback+0x126
01eff9c0 79f2340f ole32!CObjectContext::ContextCallback+0x85
01effa0c 79f23362 mscorwks!CtxEntry::EnterContextOle32BugAware+0x2b
01effb2c 79f231ef mscorwks!CtxEntry::EnterContext+0x2db
01effb60 7a0426dc mscorwks!RCWCleanupList::ReleaseRCWListInCorrectCtx+0xc4
01effbb0 79f23058 mscorwks!RCWCleanupList::CleanupAllWrappers+0x77
01effbf4 79f75f2d mscorwks!SyncBlockCache::CleanupSyncBlocks+0xec
01effdb8 79f34ff2 mscorwks!Thread::DoExtraWorkForFinalizer+0x40
01effdc8 79ecb4a4 mscorwks!SVR::GCHeap::FinalizerThreadWorker+0xc6
01effdd8 79ecb442 mscorwks!ManagedThreadBase_DispatchInner+0x4d
01effe6c 79ecb364 mscorwks!ManagedThreadBase_DispatchMiddle+0xb0
01effea8 79ed5e8b mscorwks!ManagedThreadBase_DispatchOuter+0x6d
01effed0 79ed5e56 mscorwks!ManagedThreadBase_NoADTransition+0x32
01effedc 79f6fd87 mscorwks!ManagedThreadBase::FinalizerBase+0xb
01efff14 79ecb00b mscorwks!SVR::GCHeap::FinalizerThreadStart+0xbb
01efffb8 7c826063 mscorwks!Thread::intermediateThreadProc+0x49
01efffec 00000000 kernel32!BaseThreadStart+0x34
The finalizer thread (which we found from the !threads
output) is definitely doing something and based on what we know about objects being rooted in the Finalizer queue it has probably been stuck here for a while.
The finalizer is not necessarily finalizing a managed object, instead it is cleaning up all com wrappers, and to release them it has to get on an STA thread.
If we look at the !threads
output again, specifically at the APT (Apartment) column we can see that our only STA thread is thread 15 so that is likely where it is trying to go.
0:019> ~15s
eax=2d220b44 ebx=000d75f0 ecx=2d220ca4 edx=000b4883 esi=0000022c edi=00000000
eip=7c97ed54 esp=01b1fc70 ebp=01b1fce0 iopl=0 nv up ei ng nz ac pe cy
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000297
ntdll!KiFastSystemCallRet:
7c97ed54 c3 ret
0:015> kbL 2000
ChildEBP RetAddr Args to Child
01b1fc6c 7c972124 7c82baa8 0000022c 00000000 ntdll!KiFastSystemCallRet
01b1fc70 7c82baa8 0000022c 00000000 01b1fcb4 ntdll!NtWaitForSingleObject+0xc
01b1fce0 79e77fd1 0000022c 00009c40 00000000 kernel32!WaitForSingleObjectEx+0xac
01b1fd24 79e77f9a 0000022c 00009c40 00000000 mscorwks!CLREventWaitHelper+0x2f
01b1fd74 79e77f50 00009c40 00000000 00000000 mscorwks!CLREvent::WaitEx+0x117
01b1fd84 79f5b69c 00009c40 00000000 00000000 mscorwks!CLREvent::Wait+0x17
01b1fe04 7a112052 000bf560 00009c40 00000000 mscorwks!ThreadpoolMgr::SafeWait+0x73
01b1fe2c 7a112206 eeb88f9d 00000000 7a1120b5 mscorwks!ThreadpoolMgr::EnterRetirement+0x8e
01b1fe94 79f71123 00000000 00000000 00000000 mscorwks!ThreadpoolMgr::WorkerThreadStart+0x360
01b1ffb8 7c826063 000bcbf0 00000000 00000000 mscorwks!ThreadpoolMgr::intermediateThreadProc+0x49
01b1ffec 00000000 79f710dd 000bcbf0 00000000 kernel32!BaseThreadStart+0x34
Thread 15 however is just a regular .net worker thread waiting for work, and as such it is not pumping messages so thread 19 (finalizer) will be indefinitely blocked in GetToSTA
.
The symptom is very very similar to that explained in this KB Article and in this kb the reason for the issue is that the main thread is configured with the attribute [STAThread] and the suggested solution is to change this to [MTAThread] or to manually pump messages using Thread.CurrentThread.Join(100)
, neither of which really apply in an ASP.NET application.
The problem in ASP.NET is that the first CLR threadpool worker thread is never initialized for COM+ so if someone instantiates a component there that calls CoInitialize, the thread can become an STA thread and cause this type of blocking.
Solution
A hotfix was recently released which changes this behavior, so if you run into this issue you can install that. If you also have a lot of unnecessary finalizers like in this particular case, I would recommend that you take care of them since it causes a lot of unnecessary memory usage and higher CPU in GC because of the amount of collections that have to be done to release them and the amount of data the GC has to work through.
Laters all,
Tess