Purgamentum Init Venari – Analyzing Java GC Using IBM Pattern Modeling Tool

Recently again looking at some GC issues on IBM Websphere platform for a buddy of mine I got to learn new tool – Pattern Modeling and Analysis Tool for IBM Java Garbage Collector (PMAT). This is second post-mortem IBM analysis tool I had priviledge to work with from IBM , as I previously profiled JCA – Javacore Dump Analysis Tool here https://gennadny.wordpress.com/2016/03/28/javacore-dump-analysis-using-jca-ibm-thread-and-monitor-dump-analyzer-for-java/ .

The PMAT tool parses verbose GC trace, analyzes Java heap usage, and recommends key configurations based on pattern modeling of Java heap usage.

Why do we need it?
When the JVM (Java virtual machine) cannot allocate an object from the current heap because of lack of space, a memory allocation fault occurs, and the Garbage Collector is invoked. The first task of the Garbage Collector is to collect all the garbage that is in the heap. This process starts when any thread calls the Garbage Collector either indirectly as a result of allocation failure or directly by a specific call to System.gc(). The first step is to get all the locks needed by the garbage collection process. This step ensures that other threads are not suspended while they are holding critical locks. All other threads are then suspended. Garbage collection can then begin. It occurs in three phases: Mark, Sweep, and Compaction (optional).

Sometimes you run into issues, most common are either performance issues in applications due to especially long running , aka “violent” GCs or rooted objects on the heap not getting cleaned up and application crashing with dreaded OutOfMemory error and causing you have to analyze Garbage Collection with Verbose GC on.

JRE Java.Lang.OutOfMemory
Verbose GC is a command-line option that one can supply to the JVM at start-up time. The format is: -verbose:gc or -verbosegc. This option switches on a substantial trace of every garbage collection cycle. The format for the generated information is not designed and therefore varies among various platforms and releases.
This trace should allow one to see the gross heap usage in every garbage collection cycle. For example, one could monitor the output to see the changes in the free heap space and the total heap space. This information can be used to determine whether garbage collections are taking too long to run; whether too many garbage collections are occurring; and whether the JVM crashed during garbage collection.

How does it work?
PMAT analyzes verbose GC traces by parsing the traces and building pattern models. PMAT recommends key configurations by executing a diagnosis engine and pattern modeling algorithm. If there are any errors related with Java heap exhaustion or fragmentation in the verbose GC trace, PMAT can diagnose the root cause of failures. PMAT provides rich chart features that graphically display Java heap usage.
The following features are included:

Where do I get it?

You can download this tool from – ftp://public.dhe.ibm.com/software/websphere/appserv/support/tools/pmat/ga456.jar

Running the tool
Run gaNNN.jar with the Java Run-time Environment. (NNN is the version number).
You will see following initial screen

pmat1

Select and open verbosegc log

pmat2

Process Log and View Summary\Reports

pmat3

More detailed information on the tool can be found in presentation here – http://www-01.ibm.com/support/docview.wss?uid=swg27007240&aid=1

Jinwoo Hwang was technical leader at IBM WebSphere Application Server Technical Support that created this tool, as well as JCA. I recommend that everyone reads his articles on JVM internals – http://websphere.sys-con.com/node/1229281, http://www-01.ibm.com/support/docview.wss?uid=swg27011855 , http://www-01.ibm.com/support/docview.wss?uid=swg27018423

Hope this short note helps

Ocultos Exitus–JDBC Driver Unicode Settings and SQL Server Performance

While troubleshooting JDBC client apps that connect to SQL SErver I ran into this issue few times, latest very recently.

As you may already know well, SQL Server differentiates its data types that support Unicode from the ones that just support ASCII. For example, the character data types that support Unicode are nchar, nvarchar, longnvarchar where as their ASCII counter parts are char, varchar and longvarchar respectively. By default, all Microsoft’s JDBC drivers send the strings in Unicode format to the SQL Server, irrespective of whether the datatype of the corresponding column defined in the SQL Server supports Unicode or not.

In the case where the data types of the columns support Unicode, everything is smooth. But, in cases where the data types of the columns do not support Unicode, serious performance issues arise especially during data fetches. SQL Server tries to convert non-unicode datatypes implicitly in the table to unicode datatypes before doing the comparison. Moreover, if an index exists on the non-unicode column, it will be ignored. This would ultimately lead to a whole table scan during data fetch, thereby slowing down the search queries drastically.

This can be corrected by resetting one of the default parameter in the Java driver. The parameter name and value to be set might vary from driver to driver, depending on the vendor.

Vendor Parameter
JSQLConnect asciiStringParameters
JTDS sendStringParametersAsUnicode
DataDirectConnect sendStringParametersAsUnicode
Microsoft JDBC sendStringParametersAsUnicode
WebLogic Type 4 JDBC sendStringParametersAsUnicode

Reading – https://msdn.microsoft.com/en-us/library/ms378857(v=sql.110).aspx you will see- “ For optimal performance with CHAR, VARCHAR or LONGVARCHAR type of non-Unicode parameters, set the sendStringParametersAsUnicode connection string property to “false” and use the non-national character methods.” , issue was also reported via http://www.sqlconsulting.com/jdbc.shtml, http://www.jochenhebbrecht.be/site/2014-05-01/java/fixing-slow-queries-running-sql-server-using-jpa-hibernate-and-jtds and http://www.codeproject.com/Articles/281364/Solving-Performance-issues-in-data-migration-to-SQ

Hope this helps.

Et Obscure Ex Alto–TDS Compatibility for SQL Server Upgrades

As majority of folks may know time is counting down on SQL Server extended support for SQL Server 2005 , it officially ends April 12, 2016. Working with folks on upgrading to fairly old systems from SQL Server 2000\2005 to newer versions like SQL Server 2012\2014 we ran into fairly rare issue of TDS backward compatibility, mainly with very old clients running older versions of JDBC\ODBC drivers.

Tabular Data Stream (TDS) is an application layer protocol, used to transfer data between a database server and a client. It was initially designed and developed by Sybase Inc. for their Sybase SQL Server relational database engine in 1984, and later by Microsoft in Microsoft SQL Server

tds

Microsoft published TDS specification in 2008 – https://msdn.microsoft.com/en-us/library/cc448435.aspx , Still TDS and especially TDS compatibility on SQL Server upgrades are pretty obscure topic. Microsoft published TDS version list matched to SQL Server and .NET Framework here – https://msdn.microsoft.com/en-us/library/dd339982.aspx

TDS version SQL Server version .NET Framework version
7.0 SQL Server 7.0 .NET Framework 1.1
7.1 SQL Server 2000 .NET Framework 1.1
7.1 Revision 1 SQL Server 2000 SP1 .NET Framework 1.1
7.2 SQL Server 2005 .NET Framework 2.0
7.3.A SQL Server 2008 .NET Framework 4.0
7.3.B SQL Server 2008 R2 N/A
7.4 SQL Server 2012 .NET Framework 4.5
7.4 SQL Server 2014 .NET Framework 4.5
7.4 SQL Server 2016 CTP2 .NET Framework 4.6

Although TDS compatibility isn’t well documented by Microsoft , from what I have experienced it appears Microsoft keeps at least two versions of TDS backwardly compatible.

So how do I find out what versions of TDS are connecting to your server? On newer versions like SQL Server 2005 and above where DMVs are present following query can be used:

select protocol_type, protocol_version from sys.dm_exec_connections

Result will be something like:

TSQL   1930035203 

Well, the big number still doesn’t say which version of data access driver, does it? Actually it does. The integer value is the representation of TDS version (for TSQL). It needs to be translated as follows, converting protocol version to Hex:

SELECT CONVERT(BINARY(4), 1930035203) 

Result will be as follows:

0x730A0003

Take the first two hex digits (0x73) and map it to correct SQL Server version based on the table above. So now we know that connection was made via TDS 7.3x , possibly via SQL Server 2008 Client stack.

In one of its older posts SQLCAT actually combined al;l of these steps into one query:

SELECT session_id, protocol_type, driver_version = 
CASE SUBSTRING(CAST(protocol_version AS BINARY(4)), 1,1)
WHEN 0x70 THEN ‘SQL Server 7.0′
WHEN 0x71 THEN ‘SQL Server 2000′
WHEN 0x72 THEN ‘SQL Server 2005′
WHEN 0x73 THEN ‘SQL Server 2008′
ELSE ‘Unknown driver’
END
FROM sys.dm_exec_connections

But what if I am on SQL Server 2000, well before DMVs were available and need to check my TDS versions connecting before upgrade?

Here things will become a lot more difficult and labor intensive. The only way I think is via checking version in TDS pre-login packet via network capture on SQL Server side.

As per – https://msdn.microsoft.com/en-us/library/dd357559.aspx pre-login is message sent by the client to set up context for login. The server responds to a client PRELOGIN message with a message of packet header type 0x04 and the packet data containing a PRELOGIN structure. As part there is optional version value:

PL_OPTION_TOKEN Value Description
VERSION 0x00 PL_OPTION_DATA =   UL_VERSION

                   US_SUBBUILD

UL_VERSION is represented in network byte order (big-endian).

The server SHOULD use the VERSION sent by the client to the server. The client SHOULD use the version returned from the server to determine which features are enabled or disabled. The client SHOULD do this only if it is known that this feature is supported by that version of the database.<27>

Theoretically upon capture of such data via network trace it can be filtered in wireshark , from – https://www.wireshark.org/docs/dfref/t/tds.html

tds.prelogin.option.version	Version	Unsigned integer, 4 bytes	2.0.0

According to spec, version is an option – client should provide version by docs, however possibly some clients don’t. Assume that would be pretty rare, at least all mainstream clients I have seen do.

So dealing with older, obscure SQL Server clients is hard, but doesn’t have to impossible.

For more see – https://blogs.msdn.microsoft.com/jenss/2009/03/02/tds-protocol-versions-meet-client-stacks/, https://blogs.msdn.microsoft.com/sql_protocols/2008/07/15/connection-failure-because-of-mismatched-tds-version/, https://msdn.microsoft.com/en-us/library/dd339982.aspx

Mission Control To Major Tom – Exploring Java Mission Control (JMC) for Nearly Zero Overhead Troubleshooting

In this post I will attempt to take a look at Java Mission Control, a tool born out of Oracle’s merger with Sun Microsystems and therefore a convergence between Oracle JRockit VM and Hotspot VM. Included in the latest Java 7 JDK update (‘7u40′) is a new powerful monitor tool: Java Mission Control (JMC). JMC is a production time tool that has its roots in the JRockit JVM tooling. It is located in the bin folder of your  JDK. Oracle actually has done a good job advertising this tool via JavaOne conference and on their site – http://www.oracle.com/technetwork/java/javaseproducts/mission-control/index.html

Mission Control provides largely the same functionality as Java Visual VM. Both tools allow connecting to local or remote Java processes to collect JMX data. Mission Control supports automatic discovery of running remote JVMs via the Java Discovery Protocol. To use it the JVM needs to be started with

-Dcom.sun.management.jmxremote.autodiscovery=true -Dcom.sun.management.jdp.name=JVM_Name

Similarly to Java Visual VM, Mission Control has a plugin mechanism, allowing for customization. But unlike Java VisualVM, Mission Control can also be used to create new views on the data collected. Two experimental plugins available today are JOverflow Heap Analyzer for finding inefficient use of Collections and DTrace Recorder for correlating DTrace profiles. Mission Control has a JMX browser as part of its core features and offers slightly more powerful functionality across the board. For example, the thread monitoring can provide per thread allocation information and on the fly stack traces. Because Mission Control is based on the Eclipse Platform, it is not only available as standalone tool within the JDK, but also as Eclipse plugin which can be obtained on the Oracle Mission Control Update Site.

Java Mission Control uses JMX to communicate with remote Java processes. The JMX Console is a tool for monitoring and managing a running JVM instance. The tool presents live data about memory and CPU usage, garbage collections, thread activity, and more. It also includes a fully featured JMX MBean browser that you can use to monitor and manage MBeans in the JVM and in your Java application.

So after installing JDK 7u40 and above you should note tool right there in your JDK bin folder:

image

After starting JMC you will note it has few parts that can be helpful, just like Java Visual VM again.  Note pretty nice JMS Console that allows you to see\monitor general parameters on the machine like JVM CPU, Heap Memory, etc..

 

image

But more interesting to me is feature called Flight Recorder. For reason of illustrating of how it works I created a rather very simple application that does a tight loop and should create some CPU usage and contention on my laptop.  The application is pretty basic and somewhat embarrassing, but since its not the point here, here it is:

package highcpu;

/**
 *
 * @author gennadyk
 */
public class HighCPU {

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) {
        LoopMeToHighCPU(100000000);
    }
    
    public static void LoopMeToHighCPU(int iterations){
       
       int counter;
       for (counter=0;counter< } counter); + ? is: System.out.println(?Count {>

Looking at JMS Console I can see that its working.

image

But that’s not overly interesting to me, what I am interesting in is taking capture with Flight Control and finding out what is using my CPU from that capture.

To Take Flight Recorder Capture:

  • Start the application you want to profile with the following arguments to enable the flight recorder:
    -XX:+UnlockCommercialFeatures -XX:+FlightRecorder 

If you don’t set that up you will see an error like:

jmc

  • Next start Mission Control. You can just double click on jmc in the bin folder of your 7u40 JDK. (Close the Welcome screen if this is your first time starting JMC.) .Right click on the JVM you wish to start a flight recording on in the JVM browser and select Start Flight Recording.

Capture

  • Leave all the default settings and select the ‘Profiling – on server’ template for your Event Settings. You can usually just hit finish at this point, but I’d like to talk a bit on how you can control the method sampler.
  • Click Next to go to the higher level event settings. These are groupings of named settings in the template. Here you can select how often you want JFR to sample methods by changing the Method Sampling setting.

MethodSamplingSettings

  • Hit Finish and we are in business – application is now recorded:

Capture2

Now lets open Flight Recorder capture in JMC. You can do that via File->Open or Ctrl->O

image

I am obviously interested in Threads View here and looking at Hot Threads Call Tree can easily see my self created issue (see highlighted):

image

 

Disclaimer: A Word On Licensing
The tooling is part of the Oracle JDK downloads. In particular the JMC 5.4 is part of JDK 8u20 and JDK 7u71 and is distributed under the Oracle Binary Code License Agreement for Java SE Platform products and commercially available features for Java SE Advanced and Java SE Suite. IANAL, but as far as I know this allows for using it for your personal education and potentially also as part of your developer tests. Make sure to check back with whomever you know that could answer this question, most likely at Oracle. This blog uses tool for educational purposes only and as a how to for developer testing and production code troubleshooting.

Fore more on JMC see – http://hirt.se/blog/?p=343, http://hirt.se/blog/?p=364, and http://www.oracle.com/technetwork/java/javaseproducts/mission-control/index.html

Hope this helps.

Classic Heap Dump Analysis in JVM Using JMap and JHat

In last couple of posts I have shown how to use various tools that are shipped with JDK to troubleshoot various issues , such as hangs, by analyzing stacks. However, obviously there are times where you may have a suspected GC issue or memory leak in production and you will have to capture a heap dump using tools that ship with JDK as well. I already noted previously that Visual VM can be used for that task.

image

Just right-click on your pid of the process you wish to dump on left pane, pick “Heap Dump” and next you will see lots of information on right pane that may be of interest to you:

Below for example footprint of my class objects by instance count and size:

 

image

 

You can drill down further and see those roots holding these objects live, etc. To me interface and actions are very similar to SciTek .NET Heap Profiler I have used extensively in the past, so it makes it easy.

JMap is a tool that comes with JDK installation as well. Its perhaps less graphical than Visual VM above, but also smaller and has even less overhead for production troubleshooting use. To use it, just like with JStack we will need first to get pid that we are planning to dump:

C:\Program Files\Java\jdk1.7.0_25\bin>jps -l –m

12020 hangme2.HangMe2

30184 sun.tools.jps.

So in my case pid 12020 will need to be dumped.

C:\Program Files\Java\jdk1.7.0_25\bin>jmap -dump:file=C:\Users\gennadyk\Document s\NetBeansProjects\HangMe2.map 12020

Dumping heap to C:\Users\gennadyk\Documents\NetBeansProjects\HangMe2.map ...

Heap dump file created

Now, how do we analyze this dump? Heap dump files can be browsed by the heap dump browser – JHat. Again, as with JStack and JMap, need to do a bit of command line here:

C:\Program Files\Java\jdk1.7.0_25\bin>jhat C:\Users\gennadyk\Documents\NetBeansP
rojects\hangme2.map
Reading from C:\Users\gennadyk\Documents\NetBeansProjects\hangme2.map...
Dump file created Sun Aug 31 16:48:04 EDT 2014
Snapshot read, resolving...
Resolving 132221 objects...
Chasing references, expect 26 dots..........................
Eliminating duplicate references..........................
Snapshot resolved.
Started HTTP server on port 7000
Server is ready.

Now lets point my browser to http://localhost:7000 and here is a report one can see:

image

And here is all my live roots\references as I click on hyperlink to drill down on my class:

image

So with a bit of command line we can now troubleshoot our memory utilization\leaks in JVM with minimal production overhead as needed. Hope above was useful and happy leak hunting.

More information on the tools –

JMap – http://docs.oracle.com/javase/7/docs/technotes/tools/share/jmap.html

JHat – http://docs.oracle.com/javase/7/docs/technotes/tools/share/jhat.html

Troubleshooting guide for Java 6 with HotSpot VM – http://www.oracle.com/technetwork/java/javase/memleaks-137499.html#gdyrr

Ambulans Mortuus – Troubleshooting JVM Thread Deadlock with VisualVM

In my previous post I shown how you can troubleshoot various Java hangs and performance conditions by using JStack utility, which is a part of JDK. Today I will show a lot cooler and more advanced tool called VisualVM that allows you to take and analyze thread and heap dumps in Java.

First, what is threading deadlock? A deadlock is a situation where an application locks up because two or more activities are waiting for each other to finish. This occurs in multithreading software where a shared resource is locked by one thread and another thread is waiting to access it and something occurs so that the thread holding the locked item is waiting for the other thread to execute.  In the past I have had multiple opportunities to troubleshoot threading deadlocks in .NET and Windows unmanaged code with WinDBG , but not in Java. However, concepts are the same.

However, first things first. Lets create a quick deadlock condition in code. Easiest way that I am aware of is via lock nesting. Following little application creates deadlock and terminates due to that condition:

package threaddeadlock
 
public class ThreadDeadlock {
 
    public static void main(String[] args) throws InterruptedException {
        Object obj1 = new Object();
        Object obj2 = new Object();
        Object obj3 = new Object();
     
        Thread t1 = new Thread(new SyncThread(obj1, obj2), "t1");
        Thread t2 = new Thread(new SyncThread(obj2, obj3), "t2");
        Thread t3 = new Thread(new SyncThread(obj3, obj1), "t3");
         
        t1.start();
        Thread.sleep(5000);
        t2.start();
        Thread.sleep(5000);
        t3.start();
         
    }
 
}
 
class SyncThread implements Runnable{
    private Object obj1;
    private Object obj2;
 
    public SyncThread(Object o1, Object o2){
        this.obj1=o1;
        this.obj2=o2;
    }
    @Override
    public void run() {
        String name = Thread.currentThread().getName();
        System.out.println(name + " acquiring lock on "+obj1);
        synchronized (obj1) {
         System.out.println(name + " acquired lock on "+obj1);
         work();
         System.out.println(name + " acquiring lock on "+obj2);
         synchronized (obj2) {
            System.out.println(name + " acquired lock on "+obj2);
            work();
        }
         System.out.println(name + " released lock on "+obj2);
        }
        System.out.println(name + " released lock on "+obj1);
        System.out.println(name + " finished execution.");
    }
    private void work() {
        try {
            Thread.sleep(30000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}

And here is the output once we run in NetBeans:

image

Now we need to download VisualVM, The tool actually, just like JStack ships in JDK, or it’s a quick download from Project Kenai site – http://visualvm.java.net/.  I downloaded latest stable version

 

image

Once you downloaded and launched executable ( on Windows visualvm.exe) from /bin folder you see this:

image

On left pane you will see my two instances of code running in NetBeans – latest is pid 1919. Right click on the icon and you can create Thread Dump:

Once its created you will see thread dump right under the pid you clicked:

image

And here in the thread dump output in the right pain you can see easily all of the details on our deadlock –

2014-08-16 17:47:06
Full thread dump Java HotSpot(TM) 64-Bit Server VM (23.25-b01 mixed mode):

"DestroyJavaVM" prio=6 tid=0x000000000213d800 nid=0x4164 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
	- None

"t3" prio=6 tid=0x000000001b779000 nid=0x37c4 waiting for monitor entry [0x000000001e02f000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at threaddeadlock.SyncThread.run(SyncThread.java:28)
	- waiting to lock <0x0000000756456b20> (a java.lang.Object)
	- locked <0x0000000756456b40> (a java.lang.Object)
	at java.lang.Thread.run(Thread.java:724)

   Locked ownable synchronizers:
	- None

"t2" prio=6 tid=0x000000001b778800 nid=0x1ff0 waiting for monitor entry [0x000000001df2f000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at threaddeadlock.SyncThread.run(SyncThread.java:28)
	- waiting to lock <0x0000000756456b40> (a java.lang.Object)
	- locked <0x0000000756456b30> (a java.lang.Object)
	at java.lang.Thread.run(Thread.java:724)

   Locked ownable synchronizers:
	- None

"t1" prio=6 tid=0x000000001b777800 nid=0x1824 waiting for monitor entry [0x000000001de2f000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at threaddeadlock.SyncThread.run(SyncThread.java:28)
	- waiting to lock <0x0000000756456b30> (a java.lang.Object)
	- locked <0x0000000756456b20> (a java.lang.Object)
	at java.lang.Thread.run(Thread.java:724)

   Locked ownable synchronizers:
	- None

"Service Thread" daemon prio=6 tid=0x000000001b753000 nid=0x62b4 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
	- None

"C2 CompilerThread1" daemon prio=10 tid=0x000000001b750800 nid=0x653c waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
	- None

"C2 CompilerThread0" daemon prio=10 tid=0x000000001b74e800 nid=0x61ec waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
	- None

"Attach Listener" daemon prio=10 tid=0x000000001b74b800 nid=0x2f80 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
	- None

"Signal Dispatcher" daemon prio=10 tid=0x000000001b746800 nid=0x4718 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
	- None

"Finalizer" daemon prio=8 tid=0x000000001b6e1000 nid=0x314c in Object.wait() [0x000000001d72f000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x0000000756405798> (a java.lang.ref.ReferenceQueue$Lock)
	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
	- locked <0x0000000756405798> (a java.lang.ref.ReferenceQueue$Lock)
	at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
	at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:189)

   Locked ownable synchronizers:
	- None

"Reference Handler" daemon prio=10 tid=0x000000001b6d9800 nid=0x55ec in Object.wait() [0x000000001d62f000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x0000000756405320> (a java.lang.ref.Reference$Lock)
	at java.lang.Object.wait(Object.java:503)
	at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
	- locked <0x0000000756405320> (a java.lang.ref.Reference$Lock)

   Locked ownable synchronizers:
	- None

"VM Thread" prio=10 tid=0x000000001b6d4000 nid=0x2c74 runnable 

"GC task thread#0 (ParallelGC)" prio=6 tid=0x0000000001f4e800 nid=0x50c8 runnable 

"GC task thread#1 (ParallelGC)" prio=6 tid=0x0000000001f50800 nid=0x4a58 runnable 

"GC task thread#2 (ParallelGC)" prio=6 tid=0x0000000001f52000 nid=0x5264 runnable 

"GC task thread#3 (ParallelGC)" prio=6 tid=0x0000000001f53800 nid=0x860 runnable 

"GC task thread#4 (ParallelGC)" prio=6 tid=0x0000000001f57000 nid=0x6290 runnable 

"GC task thread#5 (ParallelGC)" prio=6 tid=0x0000000001f59000 nid=0x5de4 runnable 

"GC task thread#6 (ParallelGC)" prio=6 tid=0x0000000001f5c000 nid=0x4968 runnable 

"GC task thread#7 (ParallelGC)" prio=6 tid=0x0000000001f5d800 nid=0x6180 runnable 

"VM Periodic Task Thread" prio=10 tid=0x000000001b76b000 nid=0x5fc4 waiting on condition 

JNI global references: 108


Found one Java-level deadlock:
=============================
"t3":
  waiting to lock monitor 0x000000001e031588 (object 0x0000000756456b20, a java.lang.Object),
  which is held by "t1"
"t1":
  waiting to lock monitor 0x000000001b6df378 (object 0x0000000756456b30, a java.lang.Object),
  which is held by "t2"
"t2":
  waiting to lock monitor 0x000000001b6e07d0 (object 0x0000000756456b40, a java.lang.Object),
  which is held by "t3"

Java stack information for the threads listed above:
===================================================
"t3":
	at threaddeadlock.SyncThread.run(SyncThread.java:28)
	- waiting to lock <0x0000000756456b20> (a java.lang.Object)
	- locked <0x0000000756456b40> (a java.lang.Object)
	at java.lang.Thread.run(Thread.java:724)
"t1":
	at threaddeadlock.SyncThread.run(SyncThread.java:28)
	- waiting to lock <0x0000000756456b30> (a java.lang.Object)
	- locked <0x0000000756456b20> (a java.lang.Object)
	at java.lang.Thread.run(Thread.java:724)
"t2":
	at threaddeadlock.SyncThread.run(SyncThread.java:28)
	- waiting to lock <0x0000000756456b40> (a java.lang.Object)
	- locked <0x0000000756456b30> (a java.lang.Object)
	at java.lang.Thread.run(Thread.java:724)

Found 1 deadlock.

But VisualVM is more than JStack on steroids. As I stated you can troubleshoot memory issues with heap dumps, and even use it as profiler to see where application is spending its time via snapshots.  Below for example JVM resource utilization in remote Tomcat via VisualVM:

image

Profiling, however, isn’t something you will do in production, but periodic thread stack dumps are quite an affective weapon to troubleshoot all kinds of hangs and performance issues, including such complex and dreaded as threading deadlock as you can see above.

For more see – http://visualvm.java.net/, http://docs.oracle.com/javase/7/docs/technotes/guides/visualvm/ and http://docs.oracle.com/javase/8/docs/technotes/guides/visualvm/intro.html

Hey what happenned to my Isolation Level

As with one of my customers they do a lot of J2EE on IBM Websphere AS with SQL Server I observed an interesting phenomenon while troubleshooting excessive blocking.

First lets remind everyone on SQL Server Isolation Levels. SQL Server Supports six isolation levels. These are: READ COMMITTED (default), READ UNCOMMITTED, REPEATABLE READ, SNAPSHOT, SERIALIZABLE and READ COMMITTED SNAPSHOT. As you may understand using correct isolation level has a big impact on performance and concurrency, because:
1. It controls whether locks are taken, what kind of locks and how long these are held.
2. Whether transaction allows for “dirty reads” , uncommitted modifications by another transaction , reads only committed changes or even places exclusive lock and blocks until row is freed.
More on this is in SQL BOL –
The default isolation level in SQL Server is READ COMMITTED, which means that query retrieves the committed version of the row that existed at the time the statement or transaction started.

As customer reported database blocking problems I fired up old trusty blocking DMV output and noted that for some odd reason despite default ISOLATION LEVEL in database being READ COMMITTED my head blocker was running under REPEATABLE READ.

iso2

As per SQL Server BOL for READ COMMITTED– “Specifies that statements cannot read data that has been modified but not committed by other transactions. This prevents dirty reads. Data can be changed by other transactions between individual statements within the current transaction, resulting in nonrepeatable reads or phantom data.” Vs. REPEATABLE READ – “Specifies that statements cannot read data that has been modified but not yet committed by other transactions and that no other transactions can modify data that has been read by the current transaction until the current transaction completes.Shared locks are placed on all data read by each statement in the transaction and are held until the transaction completes. This prevents other transactions from modifying any rows that have been read by the current transaction. Other transactions can insert new rows that match the search conditions of statements issued by the current transaction. If the current transaction then retries the statement it will retrieve the new rows, which results in phantom reads. Because shared locks are held to the end of a transaction instead of being released at the end of each statement, concurrency is lower than the default READ COMMITTED isolation level. Use this option only when necessary.”
Obviously REPEATABLE READ is more strict isolation level, therefore prone to more blocking.
So why? To do get to the bottom of this I had to do some research on Websphere front, mainly into something called Websphere access intent policies. As per IBM – “WebSphere Application Server access intent policies provide a consistent way of defining the isolation level for CMP bean data across the different relational databases in your environment. Within a deployed application, the combination of an access intent policy concurrency definition and access type signifies the isolation level value that Application Server sets on a database connection”
And looks also that default with SQL Server is Repeatable Read as referenced in IBM docs.

As soon as customer changed policy to wsOptimisticRead which translated to READ COMMITTED in SQL Server world vs. wsPessimisticUpdateWeakestLockAtLoad WebSphere default blocking in the system subsided dramatically. So note to all SQL DBAs troubleshooting blocking and concurrency performance issues from J2EE\IBM WebSphere AS – check isolation level, don’t just assume that if database is set to READ COMMITTED by default that’s what your client connections are running under. If you see unusually restrictive isolation level (REPEATABLE READ, SERIALIZABLE) that you don’t expect, check with your development team on Access Intent Policy in WebSphere.