

Java Agent Troubleshooting

Important: Investigate the target environment ahead of time and ensure that it has been tested, or test it yourself. Most Java agent issues depend on the operating system or the JVM rather than on the application. Be sure to follow the instructions in this documentation. If that does not help, this topic reviews the most common issues.

Most serious issues involving the DevTest Java Agent occur at start-up (for example, the agent is not found, or the process crashes or hangs). Generally, after you get past issues at start-up, you are in good shape. From there, it is a matter of tweaking the configuration or writing extensions.

Notes:

 

Error at startup: Error occurred during initialization of VM. Could not find agent library in absolute path...

Verify that the library is in that path and has the same architecture as the JVM (both 32-bit or both 64-bit).

Also verify that the library is not missing any dependencies. For example, use depends.exe on Win32, otool -L on Mac OS X, or ldd -d on Linux and UNIX. If libraries are missing, ensure LD_LIBRARY_PATH is set to include the directories where they reside. Containers can override LD_LIBRARY_PATH in their start-up scripts. Do not assume that it is correctly set if you set it in a shell instead of from inside the script or a container-specific administration tool.

If ldd does not return cleanly, the agent does not run properly. Therefore, a clean ldd result is the first thing to verify. If you cannot find an agent version for the operating system, consider using the pure Java agent.
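The dependency check can be scripted. The following is a minimal sketch, not part of the product: missing_deps is a hypothetical helper name, and the library path in the usage comment is a placeholder. It assumes only that ldd marks unresolved libraries with the text "not found".

```shell
# Reads ldd output on stdin and prints the name of every dependency
# that ldd reports as "not found".
# Returns 0 when nothing is missing, 1 otherwise.
missing_deps() {
    missing=$(grep 'not found' | awk '{print $1}')
    if [ -n "$missing" ]; then
        printf '%s\n' "$missing"
        return 1
    fi
    return 0
}

# Usage (placeholder path; substitute the path from the error message):
#   ldd -d /path/to/agent/library | missing_deps
```

If the function prints any names, add the directories that contain those libraries to LD_LIBRARY_PATH in the container start-up script, then run the check again.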

 

The agent exits immediately (or shortly after starting the process).

If you see the message LISA AGENT: VM terminated, it is likely that the process ended normally. Several containers have launcher processes, and it is normal for them to exit quickly.

If you do not see this message or a crash dump happens (after ldd returns cleanly), you could have a legitimate agent bug. Notify Support and try the pure Java agent. If it still crashes or produces a dump, you could have a JVM bug. One such occurrence is for the IBM JVM 1.5 on some operating systems, caused by trying to instrument threads. In that case, try supplying the command-line JVM argument -Dlisa.debug=true.

 

The agent exits with the message "GetEnv on jvmdi returned -3 (JNI_EVERSION)".

Try specifying the command-line option: -Xsov to instruct the JVM to use its debug-enabled libraries. If that does not work (for example, you get an invalid option error message), then this operating system is not supported.

 

The agent exits with the message "UTF ERROR ["../../../src/solaris/instrument/EncodingSupport_md.c":66]: ..."

The message can vary depending on the exact version of the operating system, the JVM, or both. The message typically has one of the following elements: UTF ERROR ["../../../src/solaris/instrument/EncodingSupport_md.c":66]: Failed to complete iconv_open() setup.

This issue is due to a bug in some Solaris JVMs when some language packs are not installed. To fix this issue, install the en-US language pack by running pkg install SUNWlang-enUS and then export LANG=en_US.UTF-8. Alternatively, you can try using the native agent.

 

Agent hangs or throws numerous exceptions at startup (LinkageErrors, CircularityErrors, and so on).

A side effect of instrumenting Java bytecode using Java is that it can subtly change some of the class-loading order for early classes (java.* and such), resulting in a deadlock or bytecode verification errors.

We have eliminated all known occurrences of these issues for all tested combinations of JVMs and operating systems. However, it is possible that you have encountered an untested combination. Notify Support of this issue. If the issue is a hang, include a thread dump with your support issue. Produce a thread dump by pressing CTRL+Break on Windows, or by pressing CTRL+\ or running kill -3 <pid> on UNIX/Linux. You can also try the Java agent, as the class-loading order is slightly different. If the same class or package appears in the hung thread every time, try adding an exclude directive for it in the rules.xml file.
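For a hang, several thread dumps taken a few seconds apart make it easier to tell a true deadlock from a slow start-up. The following sketch repeats the kill -3 request; request_thread_dumps is a hypothetical helper name, and the default count and interval are arbitrary choices, not product recommendations. The JVM writes each dump to its own standard output or log, not to this shell.

```shell
# Sends SIGQUIT (kill -3) to a JVM several times so that successive
# thread dumps can be compared.
# Arguments: pid, number of dumps (default 3), seconds between dumps (default 5).
request_thread_dumps() {
    pid=$1
    count=${2:-3}
    interval=${3:-5}
    case "$pid" in
        ''|*[!0-9]*)
            echo "usage: request_thread_dumps <pid> [count] [seconds]" >&2
            return 2
            ;;
    esac
    i=1
    while [ "$i" -le "$count" ]; do
        kill -3 "$pid" || return 1
        # Pause between dumps, but not after the last one.
        [ "$i" -lt "$count" ] && sleep "$interval"
        i=$((i + 1))
    done
}

# Usage: request_thread_dumps 12345 3 5
```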

 

Agent throws java.lang.VerifyErrors or hot swapping has no effect.

Some older 1.4 JVMs have bugs in their support for hot swapping (instrumenting classes after they are loaded).

In that case, turn off hot swapping by disabling the Enable hot instrumentation property.

You can configure this property from the Agents window of the DevTest Portal. The property appears in the Settings tab.

You must determine in advance the classes, methods, or both that you want to intercept or virtualize, add the rules for these classes or methods in the rules.xml file, and restart the server. This process is more tedious than doing it on a live server, but it is the only known way to work around this issue.

Sometimes the java.lang.VerifyError is not even seen in the logs, but the agent behaves as though the designated class has not been instrumented or yields random results, including crashes.

 

Agent starts but the consoles or broker cannot see the agent.

Typically, the cause is a firewall or port issue between the agent and the broker.

The top of the agent log can include the following warning: Can't connect to broker at tcp://ip:port. Verify that the IP address and port that the agent is using are correct. Then ensure that the broker is listening on the specified port on the specified IP address. On the broker computer, netstat -ano | grep port should show the port in a listening state on the supplied IP address or on 0.0.0.0. Finally, look for firewall issues by running telnet ip port from the agent computer to ensure that it can reach the broker. If it can reach the broker, the broker is likely in a bad state. Review the registry and broker logs (and restart the broker if necessary).
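The address in the warning can be split into a host and a port for the netstat and telnet checks. The following is a minimal sketch; broker_host_port is a hypothetical helper name, and the address in the usage comment is a placeholder. The nc probe in the comment assumes that a netcat variant with the -z and -w options is installed, which is not the case on every system.

```shell
# Splits a broker address of the form tcp://host:port and prints "host port".
# Returns 1 when the address has no numeric port.
broker_host_port() {
    url=${1#tcp://}
    host=${url%:*}
    port=${url##*:}
    case "$port" in
        ''|*[!0-9]*)
            echo "no numeric port in: $1" >&2
            return 1
            ;;
    esac
    printf '%s %s\n' "$host" "$port"
}

# Usage from the agent computer (placeholder address):
#   set -- $(broker_host_port tcp://192.0.2.10:2009)
#   nc -z -w 5 "$1" "$2" && echo "broker reachable" || echo "broker not reachable"
```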

 

Agent causes some operations to time out.

Some JVMs (IBM JVMs in particular) do not behave properly after some of their networking classes are instrumented. As a result, networking calls can fail or time out without apparent reason.

To prevent those classes from being instrumented, try supplying the command-line JVM argument -Dlisa.debug=true.

If the issue is not resolved, edit the rules.xml file to exclude the following networking packages:

<exclude class="java.net.**"/>
<exclude class="java.nio.**"/>
<exclude class="sun.nio.**"/>

java.lang.NoClassDefFoundError on com/itko/lisa/remote/transactions/TransactionDispatcher.class

Verify that LisaAgent.jar is available and has read permissions and that it is not corrupted. The easiest way to verify is to run java -jar LisaAgent.jar -v.

Verify whether the application uses OSGi. If it does (as in JBoss 7), add com.itko to the system or bootstrap packages. The method for adding com.itko is container-dependent, which makes it difficult to provide specific instructions. However, it is typically a configuration file with a property that specifies a list of packages or a similar JVM argument.
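As an illustration for JBoss AS 7, the following sketch shows lines that could be added to the server start-up configuration script (for example, standalone.conf). It assumes that the script passes the JBOSS_MODULES_SYSTEM_PKGS variable to the JVM as the jboss.modules.system.pkgs property; verify this against your installation. The append_pkg helper is hypothetical. For other OSGi containers, the equivalent setting is usually the standard org.osgi.framework.bootdelegation framework property.

```shell
# Appends a package to a comma-separated list unless it is already present.
append_pkg() {
    list=$1
    pkg=$2
    case ",$list," in
        *",$pkg,"*) printf '%s\n' "$list" ;;       # already listed
        ",,")       printf '%s\n' "$pkg" ;;        # list was empty
        *)          printf '%s\n' "$list,$pkg" ;;
    esac
}

# Keep any packages that are already configured and add com.itko.
JBOSS_MODULES_SYSTEM_PKGS=$(append_pkg "${JBOSS_MODULES_SYSTEM_PKGS:-}" com.itko)

# Assumption: the start-up script already contains a line similar to this one.
#   JAVA_OPTS="$JAVA_OPTS -Djboss.modules.system.pkgs=$JBOSS_MODULES_SYSTEM_PKGS"
```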

 

Security-related exceptions are thrown when the agent is enabled.

You may see SecurityExceptions or PermissionExceptions thrown by the application only when the agent is turned on with security enabled. Security is now enabled by default.

The reason and workaround are explained in Java Agent Security.

 

Abnormal resource consumption (CPU, memory, file handles, ...).

If the CPU usage is abnormally high or spikes periodically, note the period of the spikes because it will help determine the faulty thread or threads. Also try turning off CAI and VSE.

If you get OutOfMemory errors, monitor the Java heap usage. If it exceeds the -Xmx limit, then increase that limit. However, avoid increasing the limit if it is already high and well in excess of normal application usage without the agent. In that case, generate a heap dump. You can use the WAS HeapDump utility for WebSphere or the free Eclipse MAT tool for older versions. If the memory usage is below the -Xmx limit at the time of the error, it is likely a leak in native code. In this case, notify Support.

If you get unexplained IOExceptions (such as too many file handles), especially on UNIX or Linux, check the open-file limits on the computer (ulimit -n -H and ulimit -n -S). If the limit is low, consider asking the administrator of the computer to raise it (a value under 4096 is considered low for modern J2EE applications). Do not forget to restart afterward.
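The limit check can be wrapped in a small function. The following is a minimal sketch; check_nofile_limit is a hypothetical helper name, and the default threshold of 4096 is taken from the guidance above. Run it as the same user, and from the same start-up script, as the Java process, because limits can differ between shells.

```shell
# Reports whether an open-file limit is below a threshold (default 4096).
# Arguments: the limit value (a number or "unlimited"), optional threshold.
# Returns 1 when the limit is low, 0 otherwise.
check_nofile_limit() {
    limit=$1
    threshold=${2:-4096}
    if [ "$limit" = "unlimited" ]; then
        echo "ok: unlimited"
        return 0
    fi
    if [ "$limit" -lt "$threshold" ]; then
        echo "low: $limit (< $threshold)"
        return 1
    fi
    echo "ok: $limit"
}

# Usage:
#   check_nofile_limit "$(ulimit -n -S)"    # soft limit
#   check_nofile_limit "$(ulimit -n -H)"    # hard limit
```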

The agent tries to keep the number of file handles under that limit by triggering garbage collection when the limit is approached. If the agent cannot read the limit at startup, the agent uses a default value of 1024. Using this default value can result in excessive garbage collection and semi-random, frequent CPU spikes. The following messages in the logs indicate this situation:

Max (or preferred) handles limit approaching - triggering GC... 

In that case, increase the value of the Maximum number of handles property.

You can configure this property from the Agents window of the DevTest Portal. The property appears in the Settings tab.

 

I can see the agent started, but CAI data is missing or incomplete.

The data lifecycle process includes the following stages:

Missing or incomplete data can be the result of an issue in any of these stages.

Review the agent log for capture exceptions. Review the broker and console logs for transfer or persistence exceptions.

If there are no exceptions, and you are still unable to locate the missing data, turn on debug or dev logging in the agents. If you do not see statements such as "Sent partial transaction," then CAI is probably not turned on.

If none of these steps provide conclusive results, contact Support.

 

I turned on Java VSE recording or playback, but VSE does not receive any requests from the agent.

First, verify that the agent is in VSE record or playback mode. Open the agent log and search for "Starting VSE record/playback..."

Then look for exceptions in the agent logs and the VSE logs. If the logs are clean, the class that you expect to be virtualized might not be virtualized. Look in the agent log for a statement such as "Virtualized com.xxx...." If the statement is missing, it is possible that the application has not yet loaded the class. Another possibility is that you are using an early version of Java 1.4; some flavors do not support hot swapping, which Java VSE uses by default. In that case, specify the virtualized classes in the rules.xml file of the agent and restart it.
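Both log checks can be run in one pass. The following is a minimal sketch; vse_log_check is a hypothetical helper name, the log path in the usage comment is a placeholder, and the two search strings are the fragments quoted above, so adjust them if your log wording differs.

```shell
# Reads an agent log on stdin and reports whether each expected marker appears.
# Returns 0 when both markers are found, 1 otherwise.
vse_log_check() {
    log=$(cat)
    status=0
    for marker in "Starting VSE" "Virtualized "; do
        # -F treats the marker as a fixed string, not a regular expression.
        if printf '%s\n' "$log" | grep -F -q -- "$marker"; then
            echo "found: [$marker]"
        else
            echo "missing: [$marker]"
            status=1
        fi
    done
    return $status
}

# Usage (placeholder path):
#   vse_log_check < /path/to/agent.log
```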

 

Other issues with Java VSE.

If you encounter any issues (functional or abnormal resource usage) while doing Java VSE, turn off CAI on the agent side.

You can turn off CAI by disabling the Auto-start property. Then restart the JVM.

You can configure this property from the Agents window of the DevTest Portal. The property appears in the Settings tab.

Do not turn off CAI on the broker side; otherwise, Java VSE stops working altogether.

 

Case functionality is not working.

First, ensure that the agent is connected to a broker.

If the agent is connected to a broker, review the HTML source of the page that has the issue. Verify that the bottom of the page has a block of JavaScript that is easily identifiable by the variable names it uses (for example, com_itko_pathfinder_defectcapture_xxx). If the block is missing, review the agent log for possible exceptions. Other reasons for the absence of the block include:

If the block is present, verify whether a native web server (such as Apache) or a load balancer is present in front of the Java container. If so, configure it to forward requests for the CAI JavaScript and resource files to the Java container. The URLs of these files contain the word defectcapture. IT administrators generally know how to perform this task.

 

DevTest is configured to use an IBM DB2 database. In the DevTest Portal, I selected a JDBC node in a path graph. The Statements and Connection Url columns are blank.

Disable progressive streaming by adding the string progressiveStreaming=2 to the JDBC connection URL. For example:

lisadb.pool.common.url=jdbc:db2://myhostname:50000/dbname:progressiveStreaming=2;
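The separator matters in a DB2 URL: a colon introduces the property section after the database name, and each property ends with a semicolon. The following sketch appends the setting with the correct separator; add_progressive_streaming_off is a hypothetical helper name, and currentSchema in the usage comment is only an example of an existing property.

```shell
# Appends progressiveStreaming=2 to a DB2 JDBC URL of the form
# jdbc:db2://host:port/dbname[:prop=value;...]
# and leaves the URL unchanged when the property is already set.
add_progressive_streaming_off() {
    url=$1
    case "$url" in
        *progressiveStreaming=*) printf '%s\n' "$url"; return 0 ;;
    esac
    rest=${url#jdbc:db2://}
    case "$rest" in
        */*) ;;
        *) echo "unexpected URL format: $url" >&2; return 1 ;;
    esac
    db_part=${rest#*/}    # dbname plus any existing properties
    case "$db_part" in
        *:*';') printf '%s%s\n' "$url" 'progressiveStreaming=2;' ;;
        *:*)    printf '%s%s\n' "$url" ';progressiveStreaming=2;' ;;
        *)      printf '%s%s\n' "$url" ':progressiveStreaming=2;' ;;
    esac
}

# Usage:
#   add_progressive_streaming_off 'jdbc:db2://myhostname:50000/dbname'
#   add_progressive_streaming_off 'jdbc:db2://myhostname:50000/dbname:currentSchema=APP;'
```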