The .NET Framework manages memory differently from the COM-based world that preceded it. COM managed memory through reference counting, whereas .NET provides automatic memory management based on reference tracing. In this article, we'll take a look at the garbage collection technique used by the Common Language Runtime (CLR).
Automatic memory management: How different is it?
In reference counting, an object's reference count is incremented each time a new reference to the object is taken and decremented each time a reference is released, for example when the variable holding it goes out of scope. The memory is reclaimed when the count reaches zero. In a language such as C++, the developer manages this count manually. If the developer fails to decrement the count when a reference is no longer needed, the count never reaches zero and the memory leaks. Conversely, the developer might decrement the count once too often, in which case the memory is reclaimed while the object is still in use.
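To make the bookkeeping concrete, here is a minimal sketch of manual reference counting, written in Python for brevity. The class `RefCounted` and its methods `add_ref` and `release` are hypothetical names chosen for this illustration (loosely modelled on COM's AddRef and Release). It is a toy model, not how COM is implemented.

```python
class RefCounted:
    """Toy model of a manually reference-counted object (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.ref_count = 1      # the creator holds the first reference
        self.freed = False

    def add_ref(self):
        # Called each time another holder takes a reference.
        self.ref_count += 1
        return self.ref_count

    def release(self):
        # Called each time a holder is done with its reference.
        if self.freed:
            raise RuntimeError(f"{self.name}: release() called after free")
        self.ref_count -= 1
        if self.ref_count == 0:
            self.freed = True   # the memory would be reclaimed here
        return self.ref_count


# Correct usage: every reference taken is eventually released.
ok = RefCounted("ok")
ok.add_ref()
ok.release()
ok.release()

# Leak: one release is forgotten, so the count never reaches zero.
leaked = RefCounted("leaked")
leaked.add_ref()
leaked.release()
```

In this sketch `ok` ends up freed, while `leaked` is stuck with a count of 1 and is never reclaimed. One release too many would have the opposite effect: the object would be freed while another holder still expected it to be alive.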
In comparison, in the managed world of .NET, memory is handled by the CLR. Garbage collection runs automatically, and the developer doesn't have to free objects explicitly or hunt for leaks caused by mismatched reference counts.
The garbage collection algorithm
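The garbage collector runs when the managed heap doesn't have enough free space to satisfy an allocation request. It starts from the application roots, which are tracked by the JIT compiler and the CLR, and follows each reference in turn, adding every object it reaches to a graph. The roots include global and static object pointers, as well as local variables and parameters on the thread stacks and object pointers held in CPU registers.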
When the garbage collector encounters an object that is already present in the graph, it stops following that path. This keeps it from walking the same objects twice and from looping forever on circular references. In this way, the garbage collector recursively traverses all objects that are linked from the application roots. Once the traversal is complete, the graph contains every object that is reachable, directly or indirectly. Any object that isn't part of this graph is unreachable and is therefore considered garbage, and its memory is reclaimed.
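The traversal described above can be sketched as a plain graph walk. The following Python snippet is an illustrative model only, not the CLR's implementation: the function name `reachable_from`, the object names, and the dictionary-based heap are all assumptions made for the example, and it uses an explicit stack rather than recursion.

```python
def reachable_from(roots, references):
    """Return the set of objects reachable from the roots.

    `references` maps each object name to the names it refers to.
    """
    visited = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in visited:
            continue    # already in the graph: don't follow this path again
        visited.add(obj)
        stack.extend(references.get(obj, []))
    return visited


# Hypothetical heap: five objects, one application root.
heap = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],     # refers back to A, forming a cycle
    "D": ["E"],     # nothing reachable from the root refers to D
    "E": [],
}
roots = ["A"]

live = reachable_from(roots, heap)
garbage = set(heap) - live
```

Here `live` contains A, B, and C, while D and E end up in `garbage`. The `visited` check is what lets the walk terminate even though C points back to A.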
In Figure A, blocks 1, 3, and 5 are reachable from the application roots. Blocks 2 and 4 are not reachable and hence can be collected. Once the collection is complete, the memory space is compacted. That is, the surviving objects are moved so that they occupy a contiguous block of memory, and the references to them are updated to point to their new locations.
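As a rough illustration of compaction, the sketch below slides the surviving blocks from the Figure A scenario toward the start of the heap. The block sizes, the function name `compact`, and the list-based heap layout are assumptions invented for this example. The real collector works on raw memory and also rewrites every reference to a moved object.

```python
def compact(heap, live):
    """Slide live blocks toward the start of the heap.

    `heap` is a list of (block_id, size) pairs in address order.
    Returns (new_addresses, next_free): the new start address of each
    surviving block and the address where the next allocation can begin.
    """
    new_addresses = {}
    next_free = 0
    for block_id, size in heap:
        if block_id in live:
            new_addresses[block_id] = next_free
            next_free += size
    return new_addresses, next_free


# Blocks 1-5 in address order; the sizes (in bytes) are made up.
heap = [(1, 16), (2, 32), (3, 24), (4, 8), (5, 16)]
live = {1, 3, 5}    # blocks 2 and 4 were found unreachable

new_addresses, next_free = compact(heap, live)
```

After compaction the three surviving blocks sit back to back, and all free space forms a single region starting at `next_free`, so the next allocation is just a pointer bump.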
However, there's one caveat to automatic memory management. The garbage collection algorithm is complex and runs only periodically, which means that memory is not freed immediately after a variable goes out of scope. This behavior is referred to as nondeterministic finalization. Only when memory usage reaches a threshold is the garbage collector triggered to trace through the object references and reclaim the memory. The main drawback of this method is that it doesn't give the programmer precise control over when objects are destroyed.
A circular reference occurs when two objects refer to each other. Let's say you have Class A that refers to Class B. If Class B also refers to Class A, then we have a circular reference. This happens in many situations. A typical example is a parent-child relationship between objects, where the child interacts with the parent object and also holds a reference to it. Under reference counting, each object keeps the other's count above zero, so neither would be cleaned up until the application was shut down. The .NET way of garbage collection solves the problem of circular references because the garbage collector reclaims any object that is not reachable from the roots, regardless of how many other unreachable objects still refer to it.
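The difference between the two schemes shows up in a small model of a parent-child cycle. In this hedged Python sketch, the helper names `incoming_counts` and `reachable` and the dictionary-based object graph are assumptions for illustration. Neither function reflects how COM or the CLR is actually implemented.

```python
def incoming_counts(references, roots):
    """Reference count per object: one per root plus one per referring object."""
    counts = {name: 0 for name in references}
    for name in roots:
        counts[name] += 1
    for targets in references.values():
        for target in targets:
            counts[target] += 1
    return counts


def reachable(references, roots):
    """Objects reachable from the roots, found by tracing references."""
    seen, stack = set(), list(roots)
    while stack:
        name = stack.pop()
        if name not in seen:
            seen.add(name)
            stack.extend(references[name])
    return seen


# Parent and child refer to each other.
references = {"parent": ["child"], "child": ["parent"]}

# The program has dropped its last reference, so there are no roots left.
roots = set()

counts = incoming_counts(references, roots)
live = reachable(references, roots)
```

Both counts are still 1 because each object holds a reference to the other, so a reference-counting scheme would never free them. The tracing walk, by contrast, finds nothing reachable from the (empty) root set, so both objects are eligible for collection.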
The garbage collection algorithm is highly optimized. Rather than treating the managed heap as a single unit, the garbage collector divides it into generations. The CLR uses three: generations 0, 1, and 2. Newly created objects are placed in generation 0. Objects that are still referenced when a collection runs survive it and are promoted to generation 1, and those that survive again are promoted to generation 2. This classification increases performance because the garbage collector can perform collections on a specific generation instead of the whole heap, as shown in Figure B.
Usually, the short-lived objects that are frequently created and destroyed never leave generation 0. The garbage collector performs a collection on generation 0 only when generation 0 is full. This happens when a request is made for the creation of a new object and there's insufficient memory to allocate for that object. If the memory reclaimed from generation 0 is sufficient to create the object, the garbage collector does not perform the collection on other generations.
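A toy simulation may help show why collecting only generation 0 is usually enough. Everything in the following Python sketch is an assumption made for illustration: the class `ToyGenerationalHeap`, the capacity measured in object counts, and the `is_live` callback that stands in for the reachability test. It models only generation 0 collections and promotion to generation 1, and the CLR's actual budgets and promotion rules are more involved.

```python
class ToyGenerationalHeap:
    """Toy model: a fixed-size generation 0 whose survivors are promoted."""

    def __init__(self, gen0_capacity):
        self.gen0_capacity = gen0_capacity
        self.generations = [[], [], []]     # generation 0, 1, and 2
        self.gen0_collections = 0

    def allocate(self, name, is_live):
        # A collection is triggered only when generation 0 is full.
        if len(self.generations[0]) >= self.gen0_capacity:
            self.collect_gen0(is_live)
        self.generations[0].append(name)

    def collect_gen0(self, is_live):
        # Only generation 0 is examined; older generations are left alone.
        survivors = [obj for obj in self.generations[0] if is_live(obj)]
        self.generations[1].extend(survivors)   # promote survivors
        self.generations[0] = []                # unreachable objects are reclaimed
        self.gen0_collections += 1


long_lived = {"config", "cache"}    # objects the program keeps referencing
heap = ToyGenerationalHeap(gen0_capacity=3)
for name in ["config", "tmp1", "tmp2", "cache", "tmp3", "tmp4", "tmp5"]:
    heap.allocate(name, is_live=lambda obj: obj in long_lived)
```

After the loop, the two long-lived objects have been promoted to generation 1, the temporaries have been reclaimed without the collector ever looking at the older generations, and only the most recent allocation remains in generation 0.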
Good coding practices
Knowing the intricacies of how the CLR manages garbage collection should aid developers in writing managed code. Automatic garbage collection is designed to complement good coding practices, not to replace them. In the next installment in this two-part series, we'll look at the aspects of garbage collection that are under the developer's control.