C# application memory analysis
One of the most sought-after developer skills is the ability to diagnose and solve problems. Like many other skills, it is very difficult to measure. Conducting tests and scoring candidates is not a viable approach for a skill like this, because setting a question that matches the difficulty of real problems is itself a daunting task. Many developers, especially budding ones, thus get away with phrases like "excellent debugging skills", "excellent problem solver" or "excellent critical thinking skills". When a problem strikes in production, these young developers end up seeking support from seasoned mentors or seniors on the project who may have seen such a problem earlier and solved it, or at least offered a stop-gap solution. This style of diagnosing and solving problems often comes back to bite a few years later, in the shape of large redesign programs. These are costly and dent the customer's appetite to engage the technology partner further. Technology partners must therefore inculcate a systematic approach to such problems. We will not get into consulting mode here but anchor ourselves to the technology pillars.
The complexity underpinning most software problems is the vast labyrinth of memory and its management; the P95 (95th percentile) of performance problems will have it at their heart. The skill of diagnosing and solving problems must therefore first befriend this labyrinth. Depending on the programming language an application uses, the approach to diagnosis can differ significantly. For this article we will use C# and .NET Core, which also determines the tools available to us.
Signs of trouble
The first sign of trouble usually appears in the form of mounting private bytes of memory consumed by the application. Needless to say, the most common first sign in practice is the customer complaining that the application is slow and they cannot get their work done. Whichever sign you observe first, the next thing you need is the right tool. With .NET (.NET Core) that tool is
dotnet-dump collect -p 3298
The dump
The dump is a snapshot of the memory that the .NET process uses while the application is running. In case you do not have the tool mentioned earlier, install it with
dotnet tool install -g dotnet-dump
In our experience, you should take a few dumps; never work with just one. Across those dumps, you must make sure the deployed code has not changed, otherwise the entire exercise is futile. Dump files are typically large. The challenges in running these commands multiply depending on your access to the customer's production environment, so keep these operational aspects (access, available disk space and so on) in mind when you start working with this tool. There is also a lightweight tool that you can use if the size of data on disk is a concern.
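One such lightweight option is dotnet-gcdump, which captures only the GC heap rather than the full process memory and therefore produces far smaller files. It is installed the same way (dotnet tool install -g dotnet-gcdump) and collected with
dotnet-gcdump collect -p 3298
Once you do have a full dump, it is opened with the analyze mode of the same dotnet-dump tool. A minimal sketch of such a session (the dump file name below is illustrative; the commands are standard SOS commands)
dotnet-dump analyze core_20250101_103000
> dumpheap -stat
> eeheap -gc
> gcroot <address>
Here dumpheap -stat lists object counts and sizes per type, eeheap -gc shows per-generation and LOH sizes, and gcroot tells you what is keeping a suspicious object alive.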
The performance counters
These counters are a summary view rather than a snapshot: instead of the detailed look a dump gives you, you watch totals and aggregations over time. You will use a command like this to get a handle on the situation
dotnet-counters monitor --refresh-interval 1 -p 3298
This will keep refreshing data on the console for you based on the process you are monitoring. Some of the
interesting counters you will notice are
1. # of assemblies loaded
2. % of time spent in GC
3. LOH (Large Object Heap) size
4. Gen X Size (in bytes), where X corresponds to a GC generation
5. Threadpool queue length
6. Threadpool threads count
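If the full table is too noisy, the same tool can be asked for just the counters you care about. A minimal sketch, using the System.Runtime event counter names as we know them (double-check the exact names against your runtime version)
dotnet-counters monitor -p 3298 --counters System.Runtime[loh-size,gen-2-size,time-in-gc,threadpool-queue-length,threadpool-thread-count]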
The numbers on the screen do not hand you a signpost to the problem; it is your intimate knowledge of the process (i.e., the application) and of the way .NET manages memory that helps you determine whether there is a performance problem. For example:
1. If the LOH size is close to the memory available to the process, you know it is a deep-red kind of problem, because you are running out of space to store things.
a. An additional inference you can draw is that many objects have survived multiple generations of garbage collection.
2. The threadpool queue length is the number of work items waiting for a thread to pick them up. A persistently large number usually indicates blocking operations tying threads up; making that work asynchronous or parallelising it frees up the queue.
3. A high % of time spent in GC points to allocation patterns, often objects that are otherwise fine but keep the collector busy freeing up space (a short sketch of one such pattern follows this list).
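As a small illustration of points 1 and 3, objects of roughly 85,000 bytes or more are allocated directly on the LOH, so allocating a fresh large buffer for every operation inflates both the LOH size and the time spent in GC. A minimal sketch of the pattern and one common remedy (the buffer size and usage here are purely illustrative)

using System;
using System.Buffers;

// A buffer this large (>= ~85,000 bytes) goes straight to the Large Object Heap.
byte[] perRequestBuffer = new byte[100_000];
// LOH objects are treated as part of generation 2, so this typically prints 2.
Console.WriteLine(GC.GetGeneration(perRequestBuffer));

// Renting from the shared pool reuses large buffers instead of growing the LOH.
byte[] pooled = ArrayPool<byte>.Shared.Rent(100_000);
try
{
    // ... use the buffer ...
}
finally
{
    ArrayPool<byte>.Shared.Return(pooled);
}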
If, based on this, you decide to dig deeper into the CLR (the runtime for .NET processes), you will have to invest time in tracing.
The tracer
You can run a trace using
dotnet-trace collect --format NetTrace -p 3298
This tool gathers deeper insight in the form of a diagnostic session, with detail on CLR events that can help you determine what has gone wrong and at what stage of your application's run. Like the other tools, it must be run while the application is running and while the problematic operation is being performed; a trace taken at any other time will still give you data, but not data that helps you solve the problem.
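A couple of variations we find useful (the profile and converter shown here are the ones we know dotnet-trace ships with; check dotnet-trace collect --help on your version, and note that the trace.nettrace file name below is illustrative)
dotnet-trace collect -p 3298 --profile gc-verbose
dotnet-trace convert trace.nettrace --format Speedscope
The first biases the trace towards GC events, which suits the memory-centred analysis in this article; the second converts the resulting .nettrace file into a format you can open at https://www.speedscope.app to see where the time goes.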
These tools are the cornerstones of diagnosing any kind of performance problem in the .NET world. But knowledge of the tools alone is insufficient: true diagnostic and problem-solving skill comes from interpreting the data they emit. AI can be used to help there, but that can be an article for another dispatch.
Before you put any of these tools to work, set yourself a perf goal. It is typically captured as an NFR (non-functional requirement) in the requirements document. The absence of a perf goal will take you down the rabbit hole in wonderland, so knowing the perf goal is the first thing you need. In many practical instances you might not be offered one. In such cases, it is prudent to use something like this as a baseline
1. P95 (95th percentile) of GC memory over the past X hours of operation
2. P99 (99th percentile) of threadpool queue length over the past X hours of operation
where X is agreed with the customer or bounded by your ability to monitor (remember, this is a production environment).
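If you have raw samples (for example, the GC heap size captured every few seconds over the past X hours by your monitoring pipeline), turning them into such a baseline is a small computation. A minimal sketch, with entirely illustrative numbers

using System;
using System.Linq;

// Nearest-rank percentile over a set of samples.
static double Percentile(double[] samples, double p)
{
    double[] sorted = samples.OrderBy(v => v).ToArray();
    int rank = (int)Math.Ceiling(p / 100.0 * sorted.Length) - 1;
    return sorted[Math.Clamp(rank, 0, sorted.Length - 1)];
}

// Illustrative GC heap samples in MB over the monitoring window.
double[] gcHeapMb = { 410, 415, 418, 430, 460, 520, 415, 420, 900, 425 };
Console.WriteLine($"P95 GC heap baseline: {Percentile(gcHeapMb, 95)} MB");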
Some other, simpler perf goals that can be monitored far more easily are
1. Average request count
2. Concurrent # of requests
3. Average request latency
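If the application is an ASP.NET Core service, the request-oriented numbers above are already published as event counters by the hosting layer, so the same dotnet-counters tool covers them too. A sketch, assuming the default Microsoft.AspNetCore.Hosting counters (which include current and total request counts) are available in your version
dotnet-counters monitor -p 3298 --counters Microsoft.AspNetCore.Hosting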
In the world of .NET applications, you must understand the GC even more than threads. Once you understand your perf goals, looking at the GC gives you the information that directs your analysis. One of the toughest challenges for developers is interpreting the GC heap size: a large number is not, by itself, an indicator of a problem. It is a start, but understanding the GC heap size as a proportion of the overall process size is what matters. For a developer, consuming all this data in context and then taking sound action to solve the problem is what constitutes "excellent diagnostic skill" and what pushes away the scare of performance analysis.
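As a minimal sketch of reading that heap-to-process proportion from inside the process (the threshold you act on is yours to decide; nothing here is a prescription)

using System;
using System.Diagnostics;

// Managed heap size as reported by the GC at its last collection.
GCMemoryInfo gcInfo = GC.GetGCMemoryInfo();
long gcHeapBytes = gcInfo.HeapSizeBytes;

// Total private memory committed by the process (managed + native).
long privateBytes = Process.GetCurrentProcess().PrivateMemorySize64;

Console.WriteLine($"GC heap:       {gcHeapBytes / (1024 * 1024)} MB");
Console.WriteLine($"Private bytes: {privateBytes / (1024 * 1024)} MB");
Console.WriteLine($"GC share:      {(double)gcHeapBytes / privateBytes:P1}");

// A small GC share alongside large private bytes points towards native/unmanaged
// memory; a large GC share points back at managed allocations and GC behaviour.

A similar comparison can of course be made from the outside with dotnet-counters, by watching gc-heap-size against working-set.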
In case we have served you just enough to kindle your appetite and you are hungry for more insights, give this wiki a read.