Memory Management Reference
Frequently Asked Questions

 Contents | News | Glossary | FAQ | Articles | Bibliography | Links | Feedback

This is a list of questions that represent the problems people often have with memory management. Some answers appear below, with links to helpful supporting material, such as the glossary, the bibliography, and external sites. For a full explanation of any terms used, see the glossary.

C-specific questions
C++-specific questions
Common objections to garbage collection
Miscellaneous

C-specific questions

Can I use garbage collection in C? 

Yes. Various conservative garbage collectors for C exist as add-on libraries.

Related terms: C; conservative garbage collection

Useful websites:

Why do I need to test the return value from malloc? Surely it always succeeds? 

For small programs, and during light testing, it is true that malloc usually succeeds. Unfortunately, there are all sorts of unpredictable reasons why malloc might fail one day; for example:

In this case, malloc will return NULL, and your program will attempt to store data by resolving the null pointer. This might cause your program to exit immediately with a helpful message, but it is more likely to provoke mysterious problems later on.

If you want your code to be robust, and to stand the test of time, you must check all error or status codes that may be returned by functions you call, especially those in other libraries, such as the C run-time library.

If you really don't want to check the return value from malloc, and you don't want your program to behave mysteriously when out of memory, wrap malloc up in something like this:


#include <stdio.h>
#include <stdlib.h>

void *my_malloc(size_t size)
{
  void *p = malloc(size);

  if(p == NULL) {
    fputs("Out of memory.\n", stderr);
    exit(EXIT_FAILURE);
  }

  return p;
}

Undefined behavior is worth eliminating even in small programs.

Related terms: malloc

What's the point of having a garbage collector? Why not use malloc and free? 

Manual memory management, such as malloc and free, forces the programmer to keep track of which memory is still required, and who is responsible for freeing it. This works for small programs without internal interfaces, but becomes a rich source of bugs in larger programs, and is a serious problem for interface abstraction.

Automatic memory management frees the programmer from these concerns, making it easier for him to code in the language of his problem, rather than the tedious details of the implementation.

Related terms: garbage collection

What's wrong with ANSI malloc in the C library? 

malloc provides a very basic manual memory management service. However, it does not provide the following things, which may be desirable in your memory manager:

Many of these can be added on top of malloc, but not with full performance.

Related terms: C; malloc

C++-specific questions

Can I use garbage collection in C++? 

Yes. The C++ specification has always permitted garbage collection. Bjarne Stroustrup (C++'s designer) has proposed that this be made explicit in the standard. There exist various conservative and semi-conservative garbage collectors for C++.

Related terms: C++; conservative garbage collection; semi-conservative garbage collection

Useful websites:

Why is delete so slow? 

Often delete must perform a more complex task than simply freeing the memory associated with an object; this is known as finalization. Finalization typically involves releasing any resources indirectly associated with the object, such as files that must be closed or ancillary objects that must be finalized themselves. This may involve traversing memory that has been unused for some time and hence is paged out.

With a manual memory manager (such as new/delete), it is perfectly possible for the deallocation operation to vary in complexity. Some systems do quite a lot of processing on freed blocks to coalesce adjacent blocks, sort free blocks by size (in a buddy system, say), or sort the free block chain by address. In the last case, deallocating blocks in address order (or sometimes reverse address order) can result in poor performance.

Related terms: deallocation; manual memory management

What happens if you use class libraries that leak memory? 

In C++, it may be that class libraries expect you to call delete on objects they create, to invoke the destructor(2). Check the interface documentation.

Failing this, if there is a genuine memory leak in a class library for which you don't have the source, then the only thing you can try is to add a garbage collector. The Boehm-Weiser collector will work with C++.

Useful websites:

Can't I get all the benefits of garbage collection using C++ constructors and destructors? 

Carefully designed C++ constructors(2) and destructors(2) can go a long way towards easing the pain of manual memory management. Objects can know how to deallocate all their associated resources, including dependent objects (by recursive destruction). This means that clients of a class library do not need to worry about how to free resources allocated on their behalf.

Unfortunately, they still need to worry about when to free such resources. Unless all objects are allocated for precisely one purpose, and referred to from just one place (or from within one compound data structure that will be destroyed atomically), then a piece of code that has finished with an object cannot determine that it is safe to call the destructor; it cannot be certain (especially when working with other people's code) that there is not another piece of code that will try to use the object subsequently.

This is where garbage collection has the advantage, because it can determine when a given object is no longer of interest to anyone (or at least when there are no more references to it). This neatly avoids the problems of having multiple copies of the same data or complex conditional destruction. The program can construct objects and store references to them anywhere it finds convenient; the garbage collector will deal with all the problems of data sharing.

Common objections to garbage collection

What languages use garbage collection? 

Java, Lisp, Smalltalk, Prolog, ML,... the list goes on. It surprises many to learn that many implementations of BASIC use GC to manage character strings efficiently.

C++ is sometimes characterized as the last holdout against GC, but this is not accurate. See FAQ: Can I use garbage collection in C++? for details.

The notion of automatic memory management has stood the test of time and is becoming a standard part of modern programming environments. Some will say "the right tool for the right job", rejecting automatic memory management in some cases; few today are bold enough to suggest that there is never a place for GC among tools of the modern programmer -- either as part of a language or as an add-on component.

Related terms: garbage collection

What's the advantage of garbage collection? 

Garbage collection frees you from having to keep track of which part of your program is responsible for the deallocation of which memory. This freedom from tedious and error-prone bookkeeping allows you to concentrate on the problem you are trying to solve, without introducing additional problems of implementation.

This is particularly important in large-scale or highly modular programs, especially libraries, because the problems of manual memory management often dominate interface complexity. Additionally, garbage collection can reduce the amount of memory used because the interface problems of manual memory management are often solved by creating extra copies of data.

In terms of performance, garbage collection is often faster than manual memory management. It can also improve performance indirectly, by increasing locality of reference and hence reducing the size of the working set, and decreasing paging.

Related terms: garbage collection

Relevant publications:

Programs with GC are huge and bloated; GC isn't suitable for small programs or systems. 

While it is true that the major advantages of garbage collection are only seen in complex systems, there is no reason for garbage collection to introduce any significant overhead at any scale. The data structures associated with garbage collection compare favorably in size with those required for manual memory management.

Some older systems give garbage collection a bad name in terms of space or time overhead, but many modern techniques exist that make such overheads a thing of the past. Additionally, some garbage collectors are designed to work best in certain problem domains, such as large programs; these may perform poorly outside their target environment.

Relevant publications:

I can't use GC because I can't afford to have my program pause 

While early garbage collectors had to complete without interruption and hence would pause observably, many techniques are now available to ensure that modern collectors can be unobtrusive.

Related terms: incremental garbage collection; concurrent

Isn't it much cheaper to use reference counts rather than garbage collection? 

No, updating reference counts is quite expensive, and they have a couple of problems:

There are many systems that use reference counts, and avoid the problems described above by using a conventional garbage collector to complement it. This is usually done for real-time benefits. Unfortunately, experience shows that this is generally less efficient than implementing a proper real-time garbage collector, except in the case where most reference counts are one.

Related terms: reference counting

Relevant publications:

Isn't garbage collection unreliable? I've heard that GCs often kill the program. 

Garbage collectors usually have to manipulate vulnerable data structures and must often use poorly-documented, low-level interfaces. Additionally, any GC problems may not be detected until some time later. These factors combine to make most GC bugs severe in effect, hard to reproduce, and difficult to work around.

On the other hand, commercial GC code will generally be heavily tested and widely used, which implies it must be reliable. It will be hard to match that reliability in a manual memory manager written for one program, especially given that manual memory management doesn't scale as well as the automatic variety.

In addition, bugs in the compiler or run-time (or application if the language is as low-level as C) can corrupt the heap in ways that only the GC will detect later. The GC is blamed because the GC found the corruption. This is a classic case of shooting the messenger.

I've heard that GC uses twice as much memory. 

This may be true of primitive GCs (like the two-space collector), but this is not generally true of garbage collection. The data structures used for GC need be no larger than those for manual memory management.

Doesn't garbage collection make programs slow? 

No. In The Measured Cost of Conservative Garbage Collection, Zorn finds that:

the CPU overhead of conservative garbage collection is comparable to that of explicit storage management techniques. [...] Conservative garbage collection performs faster than some explicit algorithms and slower than others, the relative performance being largely dependent on the program.

Note also that the version of the conservative collector used in this paper is now rather old and the collector has been much improved since then.

Relevant publications:

Manual memory management gives me control -- it doesn't pause. 

It is possible for manual memory management to pause for considerable periods, either on allocation or deallocation. It certainly gives no guarantees about performance, in general.

With automatic memory management, such as garbage collection, modern techniques can give guarantees about interactive pause times, and so on.

Related terms: incremental garbage collection; concurrent

Miscellaneous

Why does my disk rattle so much? 

When you are using a virtual memory(1) system, the computer may have to fetch pages of memory from disk before they can be accessed. If the total working set of your active programs exceeds the physical memory(1) available, paging will happen continually, your disk will rattle, and performance will degrade significantly. The only solutions are to install more physical memory, run fewer programs at the same time, or tune the memory requirements of your programs.

The problem is aggravated because virtual memory systems approximate the theoretical working set with the set of pages on which the working set lies. If the actual working set is spread out onto a large number of pages, then the working page-set is large.

When objects that refer to each other are distant in memory, this is known as poor locality of reference. This happens either because the program's designer did not worry about this, or the memory manager used in the program doesn't permit the designer to do anything about it.

Note that a copying garbage collector can dynamically organize your data according to the program's reference patterns and thus mitigate this problem.

Related terms: thrash

Relevant publications:

Where can I find out more about garbage collection? 

Many modern languages have garbage collection built in, and the language documentation should give details. For some other languages, garbage collection can be added, see the Boehm-Weiser collector for an example of a C/C++ addition. See also The Garbage Collection FAQ.

Related terms: garbage collection

Relevant publications:

Where can I get a garbage collector? 

The Boehm-Weiser collector is suitable for C or C++. The best way to get a garbage collector, however, is to program in a language that provides garbage collection.

Related terms: garbage collection

Useful websites:

Why does my program use so much memory? 

If you are using manual memory management (for example, malloc and free in C), it is likely that your program is failing to free memory blocks after it stops using them. When your code allocates memory on the heap, there is an implied responsibility to free that memory. If a function uses heap memory for returning data, you must decide who takes on that responsibility. Pay special attention to the interfaces between functions and modules. Remember to check what happens to allocated memory in the event of an error or an exception.

If you are using automatic memory management (almost certainly garbage collection), it is probable that your code is remembering some blocks that it will never use in future. This is known as the difference between liveness and reachability. Consider clearing variables that refer to large blocks or networks of blocks, when the data structure is no longer required.

I use a library, and my program grows every time I call it. Why? 

If you are using manual memory management, it is likely that the library is allocating data structures on the heap every time it is used, but that they are not being freed. Check the interface documentation for the library; it may expect you to take some action when you have finished with returned data. It may be necessary to close down the library and re-initialize it to recover allocated memory.

Unfortunately, it is all too possible that the library has a memory management bug. In this case, unless you have the source code, there is little you can do except report the problem to the supplier. It may be possible to add a garbage collector to your language, and this might solve your problems.

With a garbage collector, sometimes objects are retained because there is a reference to them from some global data structure. Although the library might not make any further use of the objects, the collector must retain the objects because they are still reachable.

If you know that a particular reference will never be used in future, it can be worthwhile to overwrite it. This means that the collector will not retain the referred object because of that reference. Other references to the same object will keep it alive, so your program doesn't need to determine whether the object itself will ever be accessed in future. This should be done judiciously, using the garbage collector's tools to find what objects are being retained and why.

If your garbage collector is generational, it is possible that you are suffering from premature tenuring, which can often be solved by tuning the collector or using a separate memory area for the library.

Related terms: memory leak; premature tenuring

Should I write my own memory allocator to make my program fast? 

If you are sure that your program is spending a large proportion of its time in memory management, and you know what you're doing, then it is certainly possible to improve performance by writing a suballocator. On the other hand, advances in memory management technology make it hard to keep up with software written by experts. In general, improvements to memory management don't make as much difference to performance as improvements to the program algorithms.

In The Measured Cost of Conservative Garbage Collection, Zorn finds:

In four of the programs investigated, the programmer felt compelled to avoid using the general-purpose storage allocator by writing type-specific allocation routines for the most common object types in the program. [...] The general conclusion [...] is that programmer optimizations in these programs were mostly unnecessary. [...] simply using a different algorithm appears to improve the performance even more.

and concludes:

programmers, instead of spending time writing domain-specific storage allocators, should consider using other publicly-available implementations of storage management algorithms if the one they are using performs poorly.

Relevant publications:

Why can't I just use local data on the stack or in global variables? 

Global, or static, data is fixed size; it cannot grow in response to the size or complexity of the data set received by a program. Stack-allocated data doesn't exist once you leave the function (or program block) in which it was declared.

If your program's memory requirements are entirely predictable and fixed at compile-time, or you can structure your program to rely on stack data only while it exists, then you can entirely avoid using heap allocation. Note that, with some compilers, use of large global memory blocks can bloat the object file size.

It may often seem simpler to allocate a global block that seems "probably large enough" for any plausible data set, but this simplification will almost certainly cause trouble sooner or later.

Related terms: stack allocation; heap allocation; static allocation

Why should I worry about virtual memory? Can't I just use as much memory as I want? 

While virtual memory can greatly increase your capacity to store data, there are three problems typically experienced with it:

Related terms: virtual memory(1); thrash

Why do I need to reset my X server every week? 

Some X servers are notorious for leaking memory. This is probably because the sheer complexity of the X library interface, together with those for toolkits such as Motif, makes manual memory management a nightmare for the programmer.

There have been reports of successful use of the Boehm-Weiser collector with X servers.

Related terms: memory leak