Imagine you have a legacy Java application running in IBM WebSphere that you have upgraded finally to newer version. Yet, customer is reporting serious performance regression. Why would that be? Well, one reason maybe a change in default JVM behavior between WebSphere versions, something that one of my customers discovered the “hard way”
Garbage collection (GC) is an integral part of the Java Virtual Machine (JVM) as it collects unused Java heap memory so that the application can continue allocating new objects. The effectiveness and performance of the GC play an important role in application performance and determinism. The IBM JVM provided with IBM WebSphere Application Server provides four different GC policy algorithms:
Each of these algorithms provides different performance and deterministic qualities. In addition, the default policy in WebSphere Application Server V8 has changed from -Xgcpolicy:optthruput to the policy -Xgcpolicy:gencon. So lets dive in a bit what this really means.
The garbage collector
Different applications naturally have different memory usage patterns. A computationally intensive number crunching workload will not use the Java heap in the same way as a highly transactional customer-facing interface. To optimally handle these different sorts of workloads, different garbage collection strategies are required. The IBM JVM supports several garbage collection policies to enable you to choose the strategy that best fits your application
The parallel mark-sweep-compact collector: optthruput, formerly default
The simplest possible garbage collection technique is to continue allocating until free memory has been exhausted, then stop the application and process the entire heap. While this results in a very efficient garbage collector, it means that the user program must be able to tolerate the pauses introduced by the collector. Workloads that are only concerned about overall throughput might benefit from this strategy.
The optthruput policy (-Xgcpolicy:optthruput) implements this strategy. This collector uses a parallel mark-sweep algorithm. In a nutshell, this means that the collector first walks through the set of reachable objects, marking them as live data. A second pass then sweeps away the unmarked objects, leaving behind free memory than can be used for new allocations. The majority of this work can be done in parallel, so the collector uses additional threads (up to the number of CPUs by default) to get the job done faster, reducing the time the application remains paused.
The problem with a mark-sweep algorithm is that it can lead to fragmentation . There might be lots of free memory, but if it is in small slices interspersed with live objects then no individual piece might be large enough to satisfy a particular allocation.
The solution to this is compaction. In theory, the compactor slides all the live objects together to one end of the heap, leaving a single contiguous block of free space. This is an expensive operation because every live object might be moved, and every pointer to a moved object must be updated to the new location. As a result, compaction is generally only done when it appears to be necessary. Compaction can also be done in parallel, but it results in a less efficient packing of the live objects — instead of a single block of free space, several smaller ones might be created.
The concurrent collector: optavgpause
For applications that are willing to trade some overall throughput for shorter pauses, a different policy is available. The optavgpause policy (-Xgcpolicy:optavgpause) attempts to do as much GC work as possible before stopping the application, leading to shorter pauses . The same mark-sweep-compact collector is used, but much of the mark and sweep phases can be done as the application runs. Based on the program’s allocation rate, the system attempts to predict when the next garbage collection will be required. When this threshold approaches, a concurrent GC begins. As application threads allocate objects, they will occasionally be asked to do a small amount of GC work before their allocation is fulfilled. The more allocations a thread does, the more it will be asked to help out. Meanwhile, one or more background GC threads will use idle cycles to get additional work done. Once all the concurrent work is done, or if free memory is exhausted ahead of schedule, the application is halted and the collection is completed. This pause is generally short, unless a compaction is required. Because compaction requires moving and updating live objects, it cannot be done concurrently.
The generational collection: gencon
has long been observed that the majority of objects created are only used for a short period of time. This is the result of both programming techniques and the type of application. Many common Java idioms create helper objects that are quickly discarded; for example StringBuffer/StringBuilder objects, or Iterator objects. These are allocated to accomplish a specific task, and are rarely needed afterwards. On a larger scale, applications that are transactional in nature also tend to create groups of objects that are used and discarded together. Once a reply to a database query has been returned, then the reply, the intermediate state, and the query itself are no longer needed.
This observation lead to the development of generational garbage collectors. The idea is to divide the heap up into different areas, and collect these areas at different rates. New objects are allocated out of one such area, called the nursery (or newspace). Since most objects in this area will become garbage quickly, collecting it offers the best chance to recover memory. Once an object has survived for a while, it is moved into a different area, called tenure (or oldspace). These objects are less likely to become garbage, so the collector examines them much less frequently. For the right sort of workload the result is collections that are faster and more efficient since less memory is examined, and a higher percentage of examined objects are reclaimed. Faster collections mean shorter pauses, and thus better application responsiveness.
IBM’s gencon policy (-Xgcpolicy:gencon) offers a generational GC (“gen-“) on top of the concurrent one described above (“-con”). The tenure space is collected as described above, while the nursery space uses a copying collector. This algorithm works by further subdividing the nursery area into allocate and survivor spaces . New objects are placed in allocate space until its free space has been exhausted. The application is then halted, and any live objects in allocate are copied into survivor. The two spaces then swap roles; that is, survivor becomes allocate, and the application is resumed. If an object has survived for a number of these copies, it is moved into the tenure area instead.
The region-based collector: balanced
A new garbage collection policy has been added in WebSphere Application Server V8. This policy, called balanced (-Xgcpolicy:balanced), expands on the notion of having different areas of the heap. It divides the heap into a large number of regions, which can be dealt with individually. Frankly I haven’t seen it used by any customer I worked with yet.
For more on WebSphere IBM JVM GC see – http://www.perfdaddy.com/2015/10/ibm-jvm-tuning-gencon-gc-policy.html, http://www.ibmsystemsmag.com/ibmi/administrator/websphere/Tuning-Garbage-Collection-With-IBM-Technology-for/, http://javaeesupportpatterns.blogspot.com/2012/03/ibm-jvm-tuning-gencon-gc-policy.html
Hope this helps