A Brief History Of Garbage Collection
By Brian Goetz
2004-01-22
Reader Rating:

How does garbage collection work?
There are several basic strategies for garbage collection: reference counting, mark-sweep, mark-compact, and copying. In addition, some algorithms can do their job incrementally (the entire heap need not be collected at once, resulting in shorter collection pauses), and some can run while the user program runs (concurrent collectors). Others must perform an entire collection at once while the user program is suspended (so-called stop-the-world collectors). Finally, there are hybrid collectors, such as the generational collector employed by the 1.2 and later JDKs, which use different collection algorithms on different areas of the heap.
When evaluating a garbage collection algorithm, we might consider any or all of the following criteria:
Pause time.
Does the collector stop the world to perform collection? For how long? Can pauses be bounded in time?
Pause predictability.
Can garbage collection pauses be scheduled at times that are convenient for the user program, rather than for the garbage collector?
CPU usage.
What percentage of the total available CPU time is spent in garbage collection?
Memory footprint.
Many garbage collection algorithms require dividing the heap into separate memory spaces, some of which may be inaccessible to the user program at certain times. This means that the actual size of the heap may be several times bigger than the maximum heap residency of the user program.
Virtual memory interaction.
On systems with limited physical memory, a full garbage collection may fault nonresident pages into memory to examine them during the collection process. Because the cost of a page fault is high, it is desirable that a garbage collector properly manage locality of reference.
Cache interaction.
Even on systems where the entire heap can fit into main memory, which is true of virtually all Java applications, garbage collection will often have the effect of flushing data used by the user program out of the cache, imposing a performance cost on the user program.
Effects on program locality.
While some believe that the job of the garbage collector is simply to reclaim unreachable memory, others believe that the garbage collector should also attempt to improve the reference locality of the user program. Compacting and copying collectors relocate objects during collection, which has the potential to improve locality.
Compiler and runtime impact.
Some garbage collection algorithms require significant cooperation from the compiler or runtime environment, such as updating reference counts whenever a pointer assignment is performed. This creates both work for the compiler, which must generate these bookkeeping instructions, and overhead for the runtime environment, which must execute these additional instructions. What is the performance impact of these requirements? Does it interfere with compile-time optimizations?
Regardless of the algorithm chosen, trends in hardware and software have made garbage collection far more practical. Empirical studies from the 1970s and 1980s show garbage collection consuming between 25 percent and 40 percent of the runtime in large Lisp programs. While garbage collection may not yet be totally invisible, it sure has come a long way.
First published by IBM developerWorks
If you found this article interesting, you may want to read these as well:
» Build and Implement A Single Sign-On Solution
» Scheduling Recurring Tasks In Java Applications
» Eye On Performance: A Load Of Stress
|