Overview

The goal of this project is to make understanding where your program is using memory, and why, a simple and painless task. It does this by taking snapshots of the program heap during execution and then directly visualizing the results with a specialized graphical representation. This enables you to (1) see where all the memory in your program is being used, (2) determine what role any given structure plays in the program, and (3) see how a structure of interest is related to the rest of the objects in memory. This contextualization allows you to quickly isolate the problem areas and determine what needs to be done to solve your memory bloat problem.

General Description

Let's take a quick look at a simple example to see how this works in practice and how it compares to existing memory profiling tools.

Suppose we have a program that computes the gravitational interaction of planets in space (like the BH program in our sample code set). We may decide that we want to reduce the memory footprint of the code, so we run it under a standard memory profiler. The report will look something like this:
Profile Run Complete: 
Max Live Heap: 5KB
double[]:   2.92KB
MathVector: 0.97KB
Body:       0.80KB
...

It isn't very surprising that the program uses a lot of double[] objects, since these are used to represent points in 3-space and they appear all over the program. So this report really doesn't tell us much, and it doesn't help us figure out whether these arrays are being used appropriately. Even if we know the memory consumption could be reduced, the arrays are used throughout the code, so it isn't clear where to look to fix the problem.

So, we could attempt to visualize the heap directly to see if we can better understand what data structures are being built, what they are used for, and where these double[] objects fit into the picture. This results in a graph that looks like the following:
objectgraph.png

And this is for a program with fewer than 200 objects. Maybe we could dig through this to find what we are interested in, but it isn't exactly a pleasant task and it isn't going to scale well if we have 10K or 100K objects.

Here is where HeapDbg comes in. It works by abstracting the concrete heap into high-level summary structures: many objects (and pointers) are grouped into logically related summaries, reducing the heap to a small number of conceptually related parts. The grouping is done by identifying recursive structures and structures that are stored in similar (congruent) predecessor heap locations. The result is a model of the heap that captures the main high-level data structures, the relations between them, and important properties of the structures, all in a compact and easy-to-understand way. So with HeapDbg we get the following graph:
hdbgbasic.png

This is much nicer; now we can see that the heap is really quite simple. The this reference points to an object of type BTree, which has two List<Body> lists stored in its bodyTab and bodyTabRev fields. The chevron indicates that the Body node actually represents a set of composite structures, but that they are all encapsulated inside the Body objects.
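To give a rough feel for what this grouping involves, here is a minimal sketch of one way objects stored in congruent predecessor locations could be folded into summary nodes. This is illustrative only, not HeapDbg's actual implementation: the HeapObject and SummaryNode types are hypothetical, and the identification of recursive structures is omitted.

using System.Collections.Generic;
using System.Linq;

// Hypothetical view of a concrete heap object: its type, shallow size, and
// the outgoing references it holds (field name plus target object).
class HeapObject
{
    public string Type;
    public int SizeInBytes;
    public List<(string Field, HeapObject Target)> Refs = new List<(string Field, HeapObject Target)>();
}

// One abstract node in the summary graph: all objects reached through the
// same field of the same kind of predecessor are merged into one of these.
class SummaryNode
{
    public string Key;
    public int ObjectCount;
    public int TotalBytes;
}

static class HeapAbstraction
{
    public static ICollection<SummaryNode> Summarize(IEnumerable<HeapObject> roots)
    {
        var summaries = new Dictionary<string, SummaryNode>();
        var seen = new HashSet<HeapObject>();
        var work = new Queue<(HeapObject Obj, string Key)>(
            roots.Select(r => (r, "root -> " + r.Type)));

        while (work.Count > 0)
        {
            var (obj, key) = work.Dequeue();
            if (!seen.Add(obj)) continue;           // each concrete object counted once

            if (!summaries.TryGetValue(key, out var node))
                summaries[key] = node = new SummaryNode { Key = key };
            node.ObjectCount += 1;
            node.TotalBytes += obj.SizeInBytes;

            // Objects reached through the same (predecessor type, field) pair
            // share a key and therefore land in the same summary node.
            foreach (var (field, target) in obj.Refs)
                work.Enqueue((target, obj.Type + "." + field + " -> " + target.Type));
        }
        return summaries.Values;
    }
}

The point is simply that every object reached through the same field of the same kind of predecessor ends up in the same summary node, which is why a heap with thousands of concrete objects can collapse into a handful of nodes like those above.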

Now we can get down to figuring out where our memory is being used. To do this we just turn on the node memory-use heat-map in the legend, which will highlight any nodes using more than 5% or 15% of the total heap. The result is shown below:
hdbgheatmap.png
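For reference, the highlighting amounts to a simple threshold test on each summary node's share of the live heap. A minimal sketch of that logic (the class and method names here are hypothetical):

class HeatMap
{
    // A summary node is highlighted if it accounts for more than 5% (or 15%)
    // of the total in-use memory at the sampled point.
    public static string HeatLevel(long nodeBytes, long totalLiveBytes)
    {
        double share = (double)nodeBytes / totalLiveBytes;
        if (share > 0.15) return "hot";    // strongest highlight
        if (share > 0.05) return "warm";   // lighter highlight
        return "none";
    }
}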

Ok, it is pretty clear that the problem is inside the set of Body object composite structures. To explore this we can just click on the chevron on the node to expand it and see the details of the internal structures.
hdbgexpand.png

This shows a simple composite structure where each Body has an acceleration acc, a velocity vel, a position pos, and a new acceleration value (for the current timestep) newacc. Each of these is represented with a MathVector, and each MathVector uses a double[] object to store the appropriate 3-dimensional vector values. The memory use heat-map also shows that each of the sets of double[] is responsible for a non-trivial amount of the total in-use memory. By hovering over one of the nodes we can see that each of them represents 20 objects and is responsible for ~14% of the total in-use memory (so all 4 together contain 80 objects and account for 56% of the in-use memory). If we look at the class definition for MathVector we will see the following:
class MathVector
{
    private const int NDIM = 3;
    private double[] data = new double[NDIM];
...
}

This makes for a nice, clean object-oriented design and is clearly very flexible for adjusting the number of dimensions this code will work with. But since we don't plan on changing this parameter, and the memory overhead it causes is substantial, we can easily rewrite the MathVector class as:
class MathVector
{
    private double x;
    private double y;
    private double z;
...
}

This simple refactoring removes the array overhead and the reference stored in each MathVector object (about 16 bytes per array). The change reduces the memory used by around 1.28KB, a ~25% reduction in the memory used by the program. As a bonus, the improved locality and reduced GC turnover improve the runtime by ~12%. Not bad for 15 minutes of work.
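As a quick sanity check on those numbers, using the counts from the heat-map above, the savings work out as follows:

class SavingsCheck
{
    static void Main()
    {
        // The heat-map showed 4 sets of 20 double[] objects, and each array costs
        // roughly 16 bytes of overhead (array bookkeeping plus the reference field
        // in MathVector) that the refactoring eliminates.
        int arrays = 4 * 20;                         // 80 double[] instances removed
        int savedBytes = arrays * 16;                // 1280 bytes, i.e. ~1.28KB
        double share = savedBytes / (5.0 * 1024);    // ~25% of the 5KB max live heap
        System.Console.WriteLine($"{savedBytes} bytes saved ({share:P0} of the heap)");
    }
}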

More Info

The profiler and abstraction tools are simple standalone executables that rewrite an assembly into an instrumented version. When you run this version it is profiled: the heap is sampled and abstracted, and the resulting graphs are written out in DGML format. You will need Visual Studio 2010 Ultimate or Premium to visualize them. The Readme included in the source distribution contains more information on running the tool and interpreting the results.
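If you are curious what the output looks like, DGML is just an XML dialect of nodes and links that the Visual Studio graph viewer opens directly. The following is a rough, hand-written illustration of the shape of such a file, built with System.Xml.Linq; it is not the tool's exact output:

using System.Xml.Linq;

class DgmlSketch
{
    static void Main()
    {
        // DGML documents consist of a DirectedGraph element containing Nodes and
        // Links collections, all in the DGML namespace below.
        XNamespace ns = "http://schemas.microsoft.com/vs/2009/dgml";
        var graph = new XElement(ns + "DirectedGraph",
            new XElement(ns + "Nodes",
                new XElement(ns + "Node", new XAttribute("Id", "this"), new XAttribute("Label", "BTree")),
                new XElement(ns + "Node", new XAttribute("Id", "bodyTab"), new XAttribute("Label", "List<Body>"))),
            new XElement(ns + "Links",
                new XElement(ns + "Link", new XAttribute("Source", "this"), new XAttribute("Target", "bodyTab"))));
        new XDocument(graph).Save("heap.dgml");   // open this file with the Visual Studio graph viewer
    }
}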

Alternatively, you can try out a simplified version online at rise4fun or watch a brief video. More detail on how the abstraction works, other information it provides, and more extensive case studies can be found in our technical paper on the tool.

