In the previous section we learned that Java uses a garbage collector for memory management. But how does a garbage collector actually work? We will take a closer look at that in this section.
Within the HotSpot JVM, the garbage collector isn't a single unified component; several implementations are available. Which implementation to use depends on the hardware resources available and the performance requirements of your application.
Heap memory is an allocation of memory that is controlled by the JVM. The size of the heap available to the JVM is primarily controlled with the -Xms<value> and -Xmx<value> JVM arguments, which set the initial heap size and the maximum heap size respectively.
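To see the effect of these two arguments, a program can ask the JVM how much heap it has and may use. The following is a minimal sketch; the class and method names are made up for illustration, and only the Runtime calls are standard JDK API.

```java
// Hypothetical class name; only the java.lang.Runtime calls are standard API.
class HeapSizeReport {

    /** Upper bound the JVM will attempt to use for the heap (roughly the -Xmx setting). */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Heap currently reserved by the JVM; starts near -Xms and may grow over time. */
    static long committedHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    /** Portion of the reserved heap that is not currently occupied by objects. */
    static long freeHeapBytes() {
        return Runtime.getRuntime().freeMemory();
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        System.out.println("max heap (MiB):       " + maxHeapBytes() / mib);
        System.out.println("committed heap (MiB): " + committedHeapBytes() / mib);
        System.out.println("free heap (MiB):      " + freeHeapBytes() / mib);
    }
}
```

Running it as, for example, java -Xms64m -Xmx256m HeapSizeReport.java should show values close to the requested sizes. The reported maximum can be slightly lower than -Xmx, depending on the garbage collector in use.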
When any thread in the JVM creates an object, the object is stored in the heap, which is shared by all threads. For this reason, objects stored in the heap are not inherently thread safe. This is in contrast to local variables, which are allocated in stack memory. Each thread has its own stack, so local variables are thread safe, and they are cleared automatically when the method that declared them returns.
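The difference can be illustrated with a small sketch, assuming two threads that each keep a private tally in a local variable and then publish it to one shared heap object. The class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; illustrates shared heap state versus per-thread stack state.
class SharedHeapDemo {

    /** Runs two threads against one shared heap object and returns the final total. */
    static int runSharedCounter(int incrementsPerThread) {
        // The AtomicInteger object lives on the heap and is visible to both threads,
        // so updates to it must be coordinated (here, through atomic operations).
        AtomicInteger shared = new AtomicInteger();

        Runnable task = () -> {
            // 'local' is a primitive local variable in this thread's own stack frame;
            // no other thread can see or modify it, so no coordination is needed.
            int local = 0;
            for (int i = 0; i < incrementsPerThread; i++) {
                local++;
            }
            shared.addAndGet(local); // publish the private tally to the shared object
        };

        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return shared.get();
    }

    public static void main(String[] args) {
        System.out.println(runSharedCounter(1_000)); // prints 2000
    }
}
```

Replacing the AtomicInteger with an unsynchronized shared counter that both threads increment directly would make the result unpredictable, which is the sense in which heap objects are not thread safe on their own.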
If heap memory becomes full and a garbage collection cannot free enough space, the JVM will throw a java.lang.OutOfMemoryError when it attempts to allocate space for new objects. For most implementations of garbage collectors in Java, heap memory is divided into multiple regions based on the "age" of an object. The number and types of regions will vary depending on the specific implementation of the garbage collector.
Most garbage collectors in Java are implemented as generational garbage collectors (ZGC, for example, was not generational before JDK 21). The idea behind a generational garbage collector is that most objects are short lived and can be removed soon after creation. Conversely, the longer an object survives, the less likely it is to become a candidate for removal. Generational garbage collectors divide the heap into multiple regions, with new objects in a more frequently checked young region and long-lived objects in a less frequently checked old region.
Dividing the heap into multiple regions reduces the pause time associated with garbage collections, because most collections only need to examine the young region. This improves the throughput and responsiveness of applications running on the JVM. Garbage collectors can be further tuned to favor specific characteristics, such as throughput, responsiveness, or resource usage, depending upon the needs of the application.
As mentioned earlier, the memory heap in generational garbage collectors is divided into multiple regions. Let's look at these regions in more detail.
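Young Region - The Young Region, as the name suggests, is the heap region that contains recently created objects. The Young Region is itself subdivided into more regions: typically an eden region, where new objects are allocated, and survivor regions, which hold objects that have survived at least one garbage collection.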
Old Region - If an object gains enough "age", by surviving garbage collections, it will be promoted to the old region.
Permanent/Metaspace Region - The final region is the permanent or metaspace region (the permanent generation was replaced in Java 8 by Metaspace, which is allocated from native memory rather than from the Java heap). Data stored here is typically JVM metadata, core system classes, and other data that typically exists for nearly the entire life of the JVM. This region is checked by the garbage collector infrequently, often only when consumed memory has reached a critical threshold.
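The regions a particular JVM actually uses can be listed at runtime through the standard java.lang.management API. The sketch below is illustrative only: the class and method names are hypothetical, and the pool names it prints depend on the garbage collector in use (with G1, for instance, names such as "G1 Eden Space", "G1 Survivor Space", "G1 Old Gen", and "Metaspace" are typical).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the MXBean calls are standard JDK API.
class MemoryPoolReport {

    /** Returns one line per memory pool, labeled as heap or non-heap. */
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // Young and old regions are reported as HEAP; Metaspace is NON_HEAP.
            String kind = pool.getType() == MemoryType.HEAP ? "heap" : "non-heap";
            lines.add(kind + ": " + pool.getName());
        }
        return lines;
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
    }
}
```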
At a high level, a garbage collection has three phases: mark, sweep, and compaction. Each of these phases has distinct responsibilities. Note, though, that depending on the garbage collector implementation, there might be additional sub-phases within each phase that are not covered here.
On object creation, the VM gives every object a 1-bit marking value, initially set to false (0). This value is used by the garbage collector to mark whether an object is reachable. At the start of a garbage collection, the garbage collector traverses the object graph and marks every object it can reach as true (1).
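As a rough illustration of marking, the sketch below models heap objects as nodes in a graph and marks everything reachable from a set of starting points (the "root" objects described in the next paragraph). It is a toy model under simplifying assumptions, not how HotSpot represents objects or mark bits; the Node class and mark method are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model of the mark phase; all names here are hypothetical.
class MarkPhaseSketch {

    /** Stand-in for a heap object that may reference other objects. */
    static final class Node {
        final String name;
        final List<Node> references = new ArrayList<>();
        boolean marked = false; // the "mark bit", initially false (0)

        Node(String name) {
            this.name = name;
        }
    }

    /** Marks every node reachable from the given roots. */
    static void mark(List<Node> roots) {
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node current = pending.pop();
            if (current.marked) {
                continue; // already visited; this also prevents looping on cycles
            }
            current.marked = true;
            pending.addAll(current.references);
        }
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        Node orphan = new Node("orphan");
        a.references.add(b);
        b.references.add(c);
        c.references.add(a);      // a cycle among reachable objects
        orphan.references.add(a); // orphan points at a, but nothing points at orphan

        mark(List.of(a));
        for (Node n : List.of(a, b, c, orphan)) {
            System.out.println(n.name + " marked=" + n.marked);
        }
    }
}
```

Note that the orphan node stays unmarked even though it holds a reference to a live object: what matters is whether an object can be reached, not what it points to.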
The garbage collector doesn't scan each object individually, but instead starts from "root" objects and follows their references. Examples of root objects are local variables, static class fields, active Java threads, and JNI references. The animation below visualizes what the object mark phase looks like:
During the sweep phase, all objects that are unreachable, that is, those whose marking bit is still false (0), are removed.
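The sweep step can be sketched in the same toy style: walk over every object in a simulated heap and drop the ones whose mark bit is false. The names below are hypothetical, and resetting the mark bit on survivors is an assumption made here so that the model is ready for another cycle; real collectors handle this in implementation-specific ways.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Toy model of the sweep phase; all names here are hypothetical.
class SweepPhaseSketch {

    /** Stand-in for a heap object whose mark bit was set (or not) by a mark phase. */
    static final class HeapObject {
        final String name;
        boolean marked;

        HeapObject(String name, boolean marked) {
            this.name = name;
            this.marked = marked;
        }
    }

    /** Removes unmarked objects from the simulated heap and returns how many were reclaimed. */
    static int sweep(List<HeapObject> heap) {
        int reclaimed = 0;
        Iterator<HeapObject> it = heap.iterator();
        while (it.hasNext()) {
            HeapObject obj = it.next();
            if (obj.marked) {
                obj.marked = false; // assumption: reset so the next cycle starts clean
            } else {
                it.remove();        // unreachable, so its space can be reclaimed
                reclaimed++;
            }
        }
        return reclaimed;
    }

    public static void main(String[] args) {
        List<HeapObject> heap = new ArrayList<>();
        heap.add(new HeapObject("live-1", true));
        heap.add(new HeapObject("garbage", false));
        heap.add(new HeapObject("live-2", true));

        System.out.println("reclaimed: " + sweep(heap)); // prints reclaimed: 1
        System.out.println("remaining: " + heap.size()); // prints remaining: 2
    }
}
```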
The final phase of a garbage collection is the compaction phase. Live objects in the eden region or an occupied survivor region are moved and/or copied to an empty survivor region. If an object in a survivor region has gained enough tenure, it is moved or copied to the old region.
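The movement of live objects between eden, survivor, and old regions can be sketched as follows. This is a simplified model with hypothetical names: regions are plain lists, every object passed in is assumed to be live, and the promotion threshold is a parameter chosen for illustration (HotSpot exposes a related setting, -XX:MaxTenuringThreshold, but its exact behavior depends on the collector).

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of evacuating live young objects; all names here are hypothetical.
class YoungCollectionSketch {

    /** Stand-in for a live object that tracks how many collections it has survived. */
    static final class LiveObject {
        final String name;
        int age;

        LiveObject(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    /**
     * Ages every live object in eden and the occupied survivor region by one, then
     * moves it to the empty survivor region or, if old enough, to the old region.
     */
    static void evacuate(List<LiveObject> eden, List<LiveObject> fromSurvivor,
                         List<LiveObject> toSurvivor, List<LiveObject> old,
                         int tenuringThreshold) {
        List<LiveObject> live = new ArrayList<>(eden);
        live.addAll(fromSurvivor);
        for (LiveObject obj : live) {
            obj.age++;
            if (obj.age >= tenuringThreshold) {
                old.add(obj);        // promoted to the old region
            } else {
                toSurvivor.add(obj); // stays young, in the previously empty survivor region
            }
        }
        // Everything live has been moved out, so both source regions are now empty.
        eden.clear();
        fromSurvivor.clear();
    }

    public static void main(String[] args) {
        List<LiveObject> eden = new ArrayList<>(List.of(new LiveObject("new", 0)));
        List<LiveObject> from = new ArrayList<>(List.of(new LiveObject("seasoned", 1)));
        List<LiveObject> to = new ArrayList<>();
        List<LiveObject> old = new ArrayList<>();

        evacuate(eden, from, to, old, 2);
        System.out.println("survivor: " + to.get(0).name);  // prints survivor: new
        System.out.println("old:      " + old.get(0).name); // prints old:      seasoned
    }
}
```

Because all live objects end up packed together in the destination regions, the emptied eden and survivor regions can be reused as contiguous free space, which is the purpose of compaction.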
During a garbage collection there might be periods where some, or even all, processing within the JVM is paused. These are called stop-the-world events. As mentioned in the introduction of the Heap Memory section, objects stored in heap memory are not thread safe. This in turn means that during a garbage collection part, or all, of the JVM must be paused for a period while the garbage collector works, to prevent errors from occurring as objects are checked for usage, deleted, and moved or copied.
Tools like JDK Flight Recorder (JFR) and VisualVM can be used to monitor the frequency and duration of pauses caused by garbage collection. How to tune a garbage collector is outside the scope of this tutorial, but monitoring garbage collector behavior, and subsequently tuning it through JVM arguments, can be a key way to improve the performance of an application.
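For a quick look at garbage collection activity from inside an application, without attaching an external tool, the standard java.lang.management API exposes per-collector counters. The sketch below uses hypothetical class and method names; the collector names it prints depend on which garbage collector the JVM is running, and the reported time is an approximate accumulated collection time, which is not necessarily the same as stop-the-world pause time for collectors that do part of their work concurrently.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the MXBean calls are standard JDK API.
class GcActivityReport {

    /** Returns one line per collector with its collection count and accumulated time. */
    static List<String> snapshot() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Either value is -1 when the collector does not report it.
            lines.add(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", approx. total time (ms)=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        snapshot().forEach(System.out::println);
    }
}
```

Calling snapshot() periodically and comparing successive values gives a rough picture of how often collections happen and how much time they take, which can then be investigated in more detail with JFR or VisualVM.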
Just like there are different regions of heap memory, there are also different types of garbage collections.
The below animation visualizes what a garbage collection looks like: