Witscale Test Center

 6.4 How does the garbage collector work? > 6.4.1 Garbage collection algorithms


6.4.1 Garbage collection algorithms

Any garbage collection algorithm should be able to do two things, detect the garbage objects and then recycle the heap space used by them. The detection is usually done by first identifying a set of root references and then determining reachability of the given object from these roots. You can think of roots as references, which are immediately accessible to the program. In the simplest form, the object references on the stack are root references. An object is reachable if there is a path of references from the root reference to the object. Some algorithms use this logic for implementing garbage collection. Following are two basic approaches used by these algorithms for distinguishing the reachable objects from garbage objects:

Reference counting

In reference counting, the garbage collectors distinguish the reachable objects by keeping a count for each object on the heap. The count keeps track of the number of references to that particular object. The object with reference count 0 is no longer reachable and hence considered as garbage. This technique has the advantage that it can reclaim objects promptly as the object is marked as garbage the moment reference-count is 0. This makes it particularly useful for real-time environments where the program can't be interrupted for garbage collection cycles for very long time. A disadvantage of reference counting is that it does not detect cyclic references (two or more objects that refer to each other) These objects will never have a reference count of zero even though they may be unreachable from the root references. We will see example cyclic references later in the chapter. Another disadvantage is the overhead caused while storing the reference count for each object and keeping it up-to-date.

Tracing

Some garbage collectors actually trace out the references starting from the root references to determine the reachability. Objects that are encountered during the trace are marked as reachable. After the trace is complete, unmarked objects are known to be unreachable and can be garbage collected. The basic tracing algorithm is called mark and sweep. As the name suggests, it follows a two-phase approach. In the mark phase, the garbage collector traverses the tree of references and marks each object it encounters. In the sweep phase, memory allocated to the unmarked objects is freed and is made available to the executing program.