Debugging .NET Core memory issues (on Linux) with dotnet dump
Hi all,
In my last few projects I’ve been spending a substantial portion of my time on Linux (more than I really care for, TBH). This, together with a newfound interest in debugging and .NET Core - after a long detour to Python land - led me to look into some of the new debugging tools for .NET Core.
I have a very simple console app that keeps accumulating memory in a static list, to create a high memory usage issue. I almost ventured to call it a leak, but technically it’s not a leak since we can still reclaim the memory by removing the Products from the list. Nevertheless, it does cause a high memory issue and could eventually lead to an OutOfMemoryException.
using System;
using System.Collections.Generic;

namespace simple_memory_leak
{
    public class Product
    {
        string name;
        int id;
        char[] details = new char[10000];

        public Product(int id, string name)
        {
            this.id = id;
            this.name = name;
        }
    }

    class Program
    {
        static List<Product> products = new List<Product>();

        static void Main(string[] args)
        {
            Console.WriteLine("NOTE! KEEP WATCHING THE GC HEAP SIZE IN COUNTERS");
            string answer = "";
            do
            {
                for (int i = 0; i < 10000; i++)
                {
                    products.Add(new Product(i, "product" + i));
                }
                Console.WriteLine("Leak some more? Y/N");
                answer = Console.ReadLine().ToUpper();
            } while (answer == "Y");
        }
    }
}
On Windows, we could have used tools like the task manager, or ProcDump, to gather memory dumps and then reviewed them in WinDbg, as we have done in numerous case studies in the past like: .NET Memory Leak Case Study: The Event Handlers That Made The Memory balloon or ASP.NET Memory Investigation.
Edit: Paulo Morgado informed me that ProcDump is available for Linux as well - You learn something new every day :) thanks Paulo
So how do we do this on Linux?
The dotnet debugging tools
Luckily for us, the .NET team has released a number of dotnet tools for cross-platform debugging:
- dotnet dump: Collects and analyzes memory dumps from .NET Core applications
- dotnet counters: Collects or monitors .NET performance counters
- dotnet gcdump: Collects a snapshot of the .NET GC heaps - you can open this snapshot in Visual Studio to analyze the “leak”
- dotnet trace: Collects profiling traces for a .NET Core process
- dotnet symbol: Downloads .NET Core symbols - useful when you debug a memory dump from another machine (in WinDbg or other native debuggers)
- dotnet sos: Installs the latest version of sos.dll for .NET Core debugging
Many of these are useful when troubleshooting memory leaks (in particular dotnet gcdump would have been a good choice here, but I will leave that for another post, and focus today on dotnet counters and dotnet dump).
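If you don’t have them on your machine already, each of these ships as a dotnet global tool, so (assuming you have the .NET SDK installed) installing the two we will use today is a one-liner each:

dotnet tool install --global dotnet-counters
dotnet tool install --global dotnet-dump

and dotnet tool update --global dotnet-dump later on will pick up newer versions.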
dotnet counters
dotnet counters helps you either monitor or collect .NET Core and ASP.NET Core performance counters, like the size of the GC heap (where all your .NET objects are stored), the number of garbage collections, assemblies loaded, ASP.NET requests etc.
Some useful commands here are:
- dotnet counters ps - lists all the .NET Core processes we can monitor, so we can get the process ID
- dotnet counters monitor -p [Process ID] --refresh-interval 1 - displays all the counters and updates every second
- dotnet counters monitor --counters Microsoft.AspNetCore.Hosting - monitors the ASP.NET Core counters
Note: Most of the commands have the ps switch that lets you list the processes that can be monitored
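For a memory investigation like this one, you can also narrow the output down to just the counters you care about. The provider[counter] filter syntax and the counter names below are from the System.Runtime provider as I remember them - double check with dotnet counters list on your machine:

dotnet counters monitor -p [PID] --counters System.Runtime[gc-heap-size,gen-0-gc-count,gen-1-gc-count,gen-2-gc-count]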
dotnet dump
dotnet dump collects a memory dump, similar to the dumps you collect with ProcDump or DebugDiag or any other debugging tool. If you use it on Windows to collect memory dumps, you can review the dumps in WinDbg, DebugDiag or any other dump debugging tool.
You can also review these dumps (both from Windows and Linux) in Visual Studio - I will write a more modern post on how this works nowadays, but you can see a walkthrough of this in one of my old posts Debugging .NET 4.0 dumps in Visual Studio 2010 as the concepts are still very similar.
However, the really neat thing is that you can also debug these dumps with dotnet dump analyze [dumpfile], both on Linux and Windows.
Once you are in the tool, it can feel a bit daunting if you’ve never used it before, but typing the command help will list all the commands you can use. The commands are extremely similar to the sos commands you use in WinDbg (except you don’t have to start the commands with ! since it is not an extension here). So if you look through any of the case studies on this blog you can most likely replicate it in dotnet dump.
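One small tip before we move on: dotnet dump collect also takes a --type switch (Full, Heap, Mini, and on newer versions Triage), so if a full dump of a big process is unwieldy, a heap-only dump is usually enough for a memory investigation like this one - check dotnet dump collect --help on your version for the exact set of options:

dotnet dump collect -p [PID] --type Heap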
Debugging in action
Here you can see the tools in action on Ubuntu 20.04 running in WSL2 on Windows.
Note: This recording is not a video but an asciinema capture, so you can stop the recording and select text at any point. I noticed that the recording is unfortunately a little too wide - so if you can’t see the counters, know that they grow by around 200 MB each round - you can also see the recording in full view here
- Reproduce the problem by running the application (every loop adds some more products to the static list)
- Run dotnet counters monitor -p [PID] to start looking at the .NET Core counters - we can see that the GC Heap Size increases by around 200 MB each iteration
- Run dotnet dump collect -p [PID] to collect a memory dump - watch the process informing us that we are capturing the dump. The dump is a snapshot of all the memory used by the process at a given point in time.
- Run dotnet dump analyze [filename] to start analyzing the memory dump
At this point we are looking to answer two questions
- Where is our memory going?
- Why is it not reclaimed?
The following are the commands we run inside the dotnet dump debugger:
- We run eeversion just to check the version of .NET Core we are running (5.0) and also the version of the GC. In this case we see that it is Workstation mode - this informs us that we will see one GC heap and we won’t have dedicated GC threads. If you want to dig deeper into what all this means - check my old post on how the GC works here if you want the cliff notes, or Maoni’s much more detailed and up to date information about how the GC really works.
- eeheap -gc lists the GC heaps where we can see how much GC memory we are using - and how much we are using for each generation. Of special interest here is the Large Object Heap (LOH) where we store anything that is larger than 85,000 bytes. Why is this interesting? Well, the LOH is collected far more seldom than the other heaps (only on full collections) - and if we use the LOH a lot, we end up “over triggering” the GC. Read more about that in my post about High CPU in GC or try it out for yourself with the debugging labs. In our case though we can see that we use about 404 MB of .NET GC memory, but most of it is on the small object heaps (more on that threshold in the sketch right after this list).
- dumpheap -stat is very useful as it lists statistics of all the objects on the heap, in ascending order, with the objects that collectively use the most memory last. You need to be a bit careful when reading this though: we can see that we have 20,000 Product objects totaling 800,000 bytes. This only includes the memory used by the “raw” object - i.e. in this case each Product has a link to a String (name), an int (id) and a link to a System.Char[] (details), so the Product object will always be 40 bytes (on 64-bit: an object header, a method table pointer, two 8-byte references and a 4-byte int padded for alignment), independent of how big the details array is, or the name string. Nonetheless, a lot of our memory seems to be devoted to these Product objects (SURPRISE SURPRISE :)). So now we know the what - next is finding out the why.
- dumpheap -type simple_memory_leak.Product dumps a list of the addresses of every individual Product object. Thanks to the GC and its layout, we know exactly where all the objects are, which lets us traverse the heaps and dump the objects like this. Doing this in a C++ app or some language that doesn’t have a managed heap would be impossible since we wouldn’t have a common list of all the allocated memory, but this is very neat for troubleshooting memory issues.
- Next we grab an object at random… in reality we would do this a couple of times to validate our findings, and run dumpobj [address] - this way we can see the actual contents of the object, and we see the links to the member variables (name, id, details). And then we dumpobj [address of the name string] to get even more context - so this is product7844 - but why is this hanging around on the heap, why isn’t it garbage collected?
- gcroot [address], either on the string or the Product, gives us a list of all the roots, or chains leading to the root, telling us why it is not collected yet. In this case we see that the String belongs to a Product (as a member variable), and the Product belongs to a List of Products which in turn sits on the stack (rbp) on Thread 4559. So as long as that List is on the stack, and the Product is part of the list etc., we won’t garbage collect it, because it is still in use.
- threads lists all the threads in the process, and we can see that 0x4559 is also known as thread 0. The * tells us that this is the active thread, but if we had been on another thread we could have run setthread 0 to switch to thread 0.
- Finally we run clrstack to see what the thread is doing, and find that we are sitting in Program.Main, so now we can go back to the code and check out what it’s doing.
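As a small side note on the LOH discussion above: the details array in each Product is a char[10000], which is roughly 20,000 bytes of data - comfortably below the 85,000 byte threshold - and that is why almost all of our 404 MB ends up on the small object heaps. Here is a minimal sketch you can run to see the threshold in action (the array sizes are just illustrative, and I’m using the fact that the GC reports LOH objects as generation 2):

using System;

class LohThresholdDemo
{
    static void Main()
    {
        // char is 2 bytes, so this is roughly 20,000 bytes of data - it is allocated
        // on the small object heap, just like the details array in our Product
        var smallArray = new char[10000];

        // roughly 100,000 bytes of data - over the 85,000 byte threshold,
        // so it is allocated directly on the Large Object Heap
        var largeArray = new char[50000];

        // LOH objects are logically part of generation 2, so GC.GetGeneration
        // returns 2 for them, while the small array starts out in generation 0
        Console.WriteLine($"small array: gen {GC.GetGeneration(smallArray)}");
        Console.WriteLine($"large array: gen {GC.GetGeneration(largeArray)}");
    }
}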
Summary
This was an extremely simple case, and memory leaks can get a lot more hairy, especially if you have a lot of different things on the heap that make it harder to find the needle in the haystack. But the technique stays the same whether the case is simple or hard. If you are working with .NET Core applications, try this out on your own applications and get to know your memory usage and your memory patterns a bit better, both to find potential issues to solve, but also to understand what good looks like, for when things go bad.
Hope you have enjoyed this… if you have comments, or things you want me to blog about, reach out on Twitter @tessferrandez and we’ll continue the conversation there (until I find a good comment solution for the blog).
Have a good one, Tess