cakelab

Memory Management Analysis

License: LGPL v3

Version: 0.2.2

Last Update: 13.12.2013

sources

Analysing and evaluating the memory load of concurrent applications raises several problems. Polling the operating system periodically for memory information can miss short-lived peaks. The information functions provided by the runtime library (malloc_info) do not support multi-threading. Massif, the memory analysis tool from valgrind, scales so badly that it becomes hard to observe the actual application behaviour. For these reasons we have developed a very simple and light-weight memory analysis tool which causes far less interference. It automatically hooks calls to memory management functions and either logs them or determines statistics such as maximum memory usage, total memory usage over time and average memory usage.

INTRODUCTION

This is a thin layer on top of the memory management functions of the malloc family which allows memory management to be analysed without any manual instrumentation. This is achieved by small preloaded libraries which intercept calls to malloc, free and similar functions.

The package provides two tools for memory analysis:

  1. mmstats: Gathers statistics on memory management in the observed application at runtime.
  2. mmprof: Generates a function call trace with time stamps, thread id, function id, parameters and return values.

It also includes an offline analysis tool:

  • mmoa: Reads an mma log file produced by mmprof and calculates the same statistics as mmstats does, but post mortem.

Valgrind provides a tool called 'massif' which does similar things, but it has the disadvantage that it causes so much contention in parallel programs that they scale badly. Its results therefore do not reflect the actual memory usage of the application.

PREREQUISITES

  1. GCC
  2. GNU MP library (v. 10) and header files.

BUILD

> make depend
> make

The result is two libraries called libmmprof.so and libmmstats.so and a small test application called test_mmprof.

INSTALLATION

Install it system-wide only if you need it regularly. Otherwise, I would suggest building some kind of analysis environment in your home directory.
  1. Copy the generated libraries to a suitable location in your filesystem (e.g. cp *.so /usr/local/lib).
  2. Place the wrapper scripts 'mma.sh', 'mmprof.sh' and 'mmstats.sh' in a suitable location (e.g. cp *.sh /usr/local/bin).
  3. Customize the path to the library in the installed main script 'mma.sh' (e.g. open it with vi and set the variable LIBDIR=/usr/local/lib)

USAGE

Once the sources have been compiled you get a library for each tool called libmmprof.so and libmmstats.so (more to come).

Both libraries have to be loaded before the libc, e.g. by using the LD_PRELOAD environment variable. The script mma.sh does exactly this. It expects the first command line argument to be the name of the tool (mmprof or mmstats); the remaining command line is interpreted as the command to be executed with its arguments.

SYNOPSIS:

mma.sh <tool> <command>

Example to use the tool 'mmprof' on the command 'ls *.c':

> mma.sh mmprof ls *.c
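
What the wrapper does can be approximated in a few lines. The following Python sketch is not the content of mma.sh; it only illustrates the idea of putting lib<tool>.so in front of any existing LD_PRELOAD entries before the command is started. The function names preload_env and run_with_tool and the default library directory are assumptions made for this illustration.

```python
import os
import subprocess

# Tool names as documented above.
TOOLS = ("mmprof", "mmstats")


def preload_env(tool, libdir, base_env=None):
    """Return a copy of the environment with lib<tool>.so first in LD_PRELOAD."""
    if tool not in TOOLS:
        raise ValueError(f"unknown tool: {tool!r}")
    env = dict(os.environ if base_env is None else base_env)
    lib = os.path.join(libdir, f"lib{tool}.so")
    previous = env.get("LD_PRELOAD", "")
    # Keep libraries that were already preloaded, but load ours first.
    env["LD_PRELOAD"] = f"{lib}:{previous}" if previous else lib
    return env


def run_with_tool(tool, command, libdir="/usr/local/lib"):
    """Run command (a list of program and arguments) under the given tool."""
    return subprocess.call(command, env=preload_env(tool, libdir))
```

With these definitions, run_with_tool("mmprof", ["ls", "-l"]) would start ls with libmmprof.so preloaded, provided the library exists in the assumed directory.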

The scripts called 'mmprof.sh' and 'mmstats.sh' are just shortcuts to mma.sh which preselect either mmprof or mmstats as the tool.

SYNOPSIS:

mmprof.sh <command>
mmstats.sh <command>

Example:

> mmprof.sh ls *.c

The selected tool automatically initialises itself and profiles function calls for:

  • malloc
  • calloc
  • realloc
  • free

Statistics:

When the last process of a group of related processes (i.e. parent and child processes) exits, mmstats writes its statistics to a log file (default: ./mma.log):

  • duration: Total duration of the measurement in nanoseconds.
  • load_max: Maximum amount of allocated memory detected over the whole application run.
  • load_current: Currently allocated size of the heap.
  • load_total: Total memory load in byte nanoseconds, calculated as the sum of the products of allocated memory [bytes] and allocation duration [nanoseconds], i.e. the integral of the total amount of allocated memory over the process execution time.
  • load_avg: Average memory load in bytes, calculated as load_total divided by the execution time.
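
The definitions above can be made concrete with a small calculation. The following Python sketch is an illustration only, not the code used by mmstats. It assumes a hypothetical list of (time in nanoseconds, change of allocated bytes) pairs sorted by time; the function name memory_stats is likewise made up for this example.

```python
def memory_stats(events, start_ns, end_ns):
    """Compute the statistics from (time_ns, delta_bytes) pairs sorted by time."""
    current = 0   # bytes allocated right now
    peak = 0      # highest value of current seen so far
    total = 0     # byte nanoseconds: area under the allocation curve
    last = start_ns
    for time_ns, delta in events:
        # The load stayed constant since the previous event.
        total += current * (time_ns - last)
        current += delta
        peak = max(peak, current)
        last = time_ns
    # Account for the span between the last event and the end of the run.
    total += current * (end_ns - last)
    duration = end_ns - start_ns
    return {
        "duration": duration,
        "load_max": peak,
        "load_current": current,
        "load_total": total,
        "load_avg": total / duration if duration else 0.0,
    }
```

For a run from 0 ns to 100 ns that allocates 100 bytes at 10 ns, allocates 50 bytes at 30 ns and frees 100 bytes at 60 ns, memory_stats([(10, 100), (30, 50), (60, -100)], 0, 100) yields load_max = 150, load_current = 50, load_total = 8500 byte nanoseconds and load_avg = 85.0 bytes. Python integers have arbitrary precision, so the byte nanosecond product cannot overflow in this sketch.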

Profiling:

The mmprof tool writes the relevant data for each event into the log file. Each line contains one log entry, and each log entry describes one event. An event is either a call to one of the profiled functions or a simple time stamp placed in the log file:

  • free
  • malloc
  • calloc
  • realloc
  • tmstmp

Each log entry consists of a header and the parameters of the event which is the corresponding function that has been called.

The header contains:

  • time in nanoseconds
  • thread id
  • and the event type.

The parameters section contains the parameters and return values of the function that has been called or the empty set (nil) in case of the tmstmp event.

mmprof writes either just the value of each element or creates humanized output, which adds a name for each element in the log file (e.g. <name>'='<value>).

The detailed format of a log file is:

logfile: (<entry> '\n')*
entry: <header> <parameters>
header: <time> ' ' <tid> ' ' <event_type> 
parameters: <free_params> | <malloc_params> | <calloc_params> | <realloc_params> | <tmstmp_params>
free_params: <hex> | 'ptr='<hex>
malloc_params: <hex> ' ' <int> | 'ptr=' <hex> ' size=' <int>
calloc_params: <hex> ' ' <int> ' ' <int> | 'ptr=' <hex> ' num=' <int> ' elem=' <int>
realloc_params: 'newptr=' <hex> ' ptr=' <hex> ' size=' <int>
tmstmp_params: {}
time: <int> | 'secs='<int> ' nsecs='<int>
tid: <int> | 'tid='<int>
event_type: <type_id> | <type_string>
type_id: {0..4}
type_string: 'free' | 'malloc' | 'calloc' | 'realloc' | 'tmstmp'
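
To show how the grammar above maps to code, here is a minimal Python sketch of a parser for humanized entries. It is not part of the package. It assumes that the elements of an entry are separated by whitespace, that the event type appears as its type string and that pointer values are hexadecimal numbers with or without a 0x prefix. The function name parse_humanized_entry and the sample lines in the usage note are constructed from the grammar, not taken from a real log file.

```python
EVENT_TYPES = ("free", "malloc", "calloc", "realloc", "tmstmp")
POINTER_FIELDS = ("ptr", "newptr")   # written as <hex> in the grammar


def parse_humanized_entry(line):
    """Parse one humanized log entry into a dictionary."""
    event = None
    fields = {}
    for token in line.split():
        name, sep, value = token.partition("=")
        if not sep:
            # The only element without a name is the event type.
            if token not in EVENT_TYPES:
                raise ValueError(f"unknown event type: {token!r}")
            event = token
        elif name in POINTER_FIELDS:
            fields[name] = int(value, 16)
        else:
            fields[name] = int(value)
    if event is None:
        raise ValueError("entry has no event type")
    # Header elements: time (secs and nsecs) and thread id.
    time_ns = fields.pop("secs") * 1_000_000_000 + fields.pop("nsecs")
    tid = fields.pop("tid")
    # Whatever remains are the parameters of the event.
    return {"time_ns": time_ns, "tid": tid, "event": event, "params": fields}
```

Under these assumptions, the line "secs=1 nsecs=500 tid=42 malloc ptr=0x1a2b size=64" is parsed into time_ns = 1000000500, tid = 42, event = "malloc" and the parameters ptr = 0x1a2b and size = 64. A tmstmp entry yields an empty parameter dictionary.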

Customize Behaviour

Behaviour of the mma tools is customized via environment variables. The following environment variables are considered:

MMA_OUTPUT_FILE:

Default: ./mma.log

Use this environment variable to set another file name for the output.

MMA_OUTPUT_HUMANIZED:

Default: 1

Use this environment variable to customize the output format.

mma generates detailed 'humanized' output by default. You can disable it by setting this environment variable to zero (0).

Humanized output means:

  1. Some progress information is printed (e.g. "mmprof: activated")
  2. Statistics are printed in the following format: "name [unit]: value"

Non-Humanized output means:

  1. No progress information is printed
  2. Statistics are printed in a single line each with spaces as separators for values (i.e. "value value value value ..\n"). The order of values is the same as in humanized output.

MMA_LIBC_NAME:

Default: "libc.so.6"

Use this environment variable to customize the name of the libc library to be loaded and wrapped dynamically.

Note that you can customize the function names to be instrumented in the makefile!
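
When the tools are started from a script, the three variables can be collected in one place. The following Python sketch only assembles the documented variables with their documented defaults. The helper names mma_settings and environment_for_run are assumptions made for this illustration and are not part of the package.

```python
import os


def mma_settings(output_file="./mma.log", humanized=True, libc_name="libc.so.6"):
    """Return the mma environment variables; the defaults are the documented ones."""
    return {
        "MMA_OUTPUT_FILE": output_file,
        "MMA_OUTPUT_HUMANIZED": "1" if humanized else "0",
        "MMA_LIBC_NAME": libc_name,
    }


def environment_for_run(base_env=None, **overrides):
    """Copy an environment and add the mma variables to the copy."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(mma_settings(**overrides))
    return env
```

The result of environment_for_run(humanized=False, output_file="/tmp/run1.log") could then be passed as the env argument of subprocess.call(["mmprof.sh", "./test_mmprof"], ...) to obtain a non-humanized log in /tmp/run1.log.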

TEST / EXAMPLE

To run a simple test you can profile the test application using the following command:

> mmprof.sh ./test_mmprof 

Note that the output is written to ./mma.log by default. See the section 'Customize Behaviour' above to change the log file or format. If the log file is not in humanized form (i.e. MMA_OUTPUT_HUMANIZED=0), you can calculate statistics from the log file using mmoa:

> mmoa mma.log

Alternatively you can directly calculate statistics using mmstats:

> mmstats.sh ./test_mmprof
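
To give an idea of what a post mortem analysis of a profile involves, the following Python sketch replays already parsed log entries and derives the change in allocated bytes for each event. It is not the implementation of mmoa. The tuple layout (time in nanoseconds, event name, parameter dictionary) and the function name allocation_deltas are assumptions for this illustration. The sketch also assumes that every logged allocation succeeded and treats pointers it has never seen as blocks of size 0.

```python
def allocation_deltas(entries):
    """Yield (time_ns, delta_bytes) for entries given in log order."""
    live = {}  # pointer -> size of the block currently allocated there
    for time_ns, event, params in entries:
        if event == "malloc":
            live[params["ptr"]] = params["size"]
            delta = params["size"]
        elif event == "calloc":
            size = params["num"] * params["elem"]
            live[params["ptr"]] = size
            delta = size
        elif event == "realloc":
            # The old block disappears, the new one takes its place.
            old_size = live.pop(params["ptr"], 0)
            live[params["newptr"]] = params["size"]
            delta = params["size"] - old_size
        elif event == "free":
            delta = -live.pop(params["ptr"], 0)
        else:
            # tmstmp events do not change the amount of allocated memory.
            delta = 0
        yield time_ns, delta
```

The resulting pairs describe the amount of allocated memory over time, which is the input needed to derive statistics such as load_max and load_total as defined in the Statistics section.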

LICENSING

This software is provided under the GNU LESSER GENERAL PUBLIC LICENSE Version 3, 29 June 2007. A copy of the license statement can be found in the file LICENSE.

KNOWN ISSUES

  • Profiling does not work with a statically linked runtime or allocator library. This case is very rare, however.
  • Suspending or stopping profiling (e.g. to increase performance) is not supported.
  • Due to shared data and a single global lock, profiling affects memory allocation performance significantly (even though it performs much better than valgrind).