How analyse an OutOfMemory GWT Error and customize heap?

307 views Asked by At

I work on a old Java 1.6 application which use GWT 2.4.0 Since few weeks, it started to show a lot of OutOfMemoryError, and I can't find why... I can't do a lot of modification on this old app, I can't migrate to JDK 1.8, or migrate to a newer GWT version, too much risky.

Example of stacktrace :

2021-01-27 09:40:39,937 ERROR [org.apache.catalina.core.ContainerBase.[jboss.web].[localhost].[/borneIntranet]] Exception while dispatching incoming RPC call
java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:2734)
        at java.util.ArrayList.ensureCapacity(ArrayList.java:167)
        at java.util.ArrayList.add(ArrayList.java:351)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.append(ServerSerializationStreamWriter.java:583)
        at com.google.gwt.user.client.rpc.impl.AbstractSerializationStreamWriter.writeInt(AbstractSerializationStreamWriter.java:100)
        at com.google.gwt.user.client.rpc.impl.AbstractSerializationStreamWriter.writeString(AbstractSerializationStreamWriter.java:134)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter$ValueWriter$10.write(ServerSerializationStreamWriter.java:165)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serializeValue(ServerSerializationStreamWriter.java:536)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serializeClass(ServerSerializationStreamWriter.java:709)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serializeImpl(ServerSerializationStreamWriter.java:748)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serialize(ServerSerializationStreamWriter.java:621)
        at com.google.gwt.user.client.rpc.impl.AbstractSerializationStreamWriter.writeObject(AbstractSerializationStreamWriter.java:126)
        at com.google.gwt.user.client.rpc.core.java.util.Collection_CustomFieldSerializerBase.serialize(Collection_CustomFieldSerializerBase.java:44)
        at com.google.gwt.user.client.rpc.core.java.util.ArrayList_CustomFieldSerializer.serialize(ArrayList_CustomFieldSerializer.java:39)
        at com.google.gwt.user.client.rpc.core.java.util.ArrayList_CustomFieldSerializer.serializeInstance(ArrayList_CustomFieldSerializer.java:51)
        at com.google.gwt.user.client.rpc.core.java.util.ArrayList_CustomFieldSerializer.serializeInstance(ArrayList_CustomFieldSerializer.java:28)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serializeImpl(ServerSerializationStreamWriter.java:740)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serialize(ServerSerializationStreamWriter.java:621)
        at com.google.gwt.user.client.rpc.impl.AbstractSerializationStreamWriter.writeObject(AbstractSerializationStreamWriter.java:126)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter$ValueWriter$8.write(ServerSerializationStreamWriter.java:153)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serializeValue(ServerSerializationStreamWriter.java:539)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serializeClass(ServerSerializationStreamWriter.java:709)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serializeImpl(ServerSerializationStreamWriter.java:748)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serialize(ServerSerializationStreamWriter.java:621)
        at com.google.gwt.user.client.rpc.impl.AbstractSerializationStreamWriter.writeObject(AbstractSerializationStreamWriter.java:126)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter$ValueWriter$8.write(ServerSerializationStreamWriter.java:153)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serializeValue(ServerSerializationStreamWriter.java:539)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serializeClass(ServerSerializationStreamWriter.java:709)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serializeImpl(ServerSerializationStreamWriter.java:748)
        at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serialize(ServerSerializationStreamWriter.java:621)
        at com.google.gwt.user.client.rpc.impl.AbstractSerializationStreamWriter.writeObject(AbstractSerializationStreamWriter.java:126)
        at com.google.gwt.user.client.rpc.core.java.util.Collection_CustomFieldSerializerBase.serialize(Collection_CustomFieldSerializerBase.java:44)
2021-01-27 09:40:41,594 ERROR [STDERR] 8312399 [scheduler_QuartzSchedulerThread] ERROR org.quartz.core.ErrorLogger - An error occurred while scanning for the next triggers to fire.
2021-01-27 09:40:42,206 ERROR [STDERR] org.quartz.JobPersistenceException: Failed to obtain DB connection from data source 'springNonTxDataSource.scheduler': java.sql.SQLException: Exception d'E/S: Connection reset [See nested exception:$
2021-01-27 09:40:42,208 ERROR [STDERR]  at org.quartz.impl.jdbcjobstore.JobStoreCMT.getNonManagedTXConnection(JobStoreCMT.java:168)
2021-01-27 09:40:42,208 ERROR [STDERR]  at org.quartz.impl.jdbcjobstore.JobStoreSupport.executeInNonManagedTXLock(JobStoreSupport.java:3784)
2021-01-27 09:40:42,208 ERROR [STDERR]  at org.quartz.impl.jdbcjobstore.JobStoreSupport.acquireNextTriggers(JobStoreSupport.java:2756)
2021-01-27 09:40:42,208 ERROR [STDERR]  at org.quartz.core.QuartzSchedulerThread.run(QuartzSchedulerThread.java:272)
2021-01-27 09:40:42,208 ERROR [STDERR] Caused by: java.sql.SQLException: Exception d'E/S: Connection reset
2021-01-27 09:40:42,208 ERROR [STDERR]  at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:111)
2021-01-27 09:40:42,208 ERROR [STDERR]  at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:145)
2021-01-27 09:40:42,209 ERROR [STDERR]  at oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:254)
2021-01-27 09:40:42,209 ERROR [STDERR]  at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:386)
2021-01-27 09:40:42,209 ERROR [STDERR]  at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:413)
2021-01-27 09:40:42,209 ERROR [STDERR]  at oracle.jdbc.driver.T4CConnection.<init>(T4CConnection.java:164)
2021-01-27 09:40:42,209 ERROR [STDERR]  at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:34)
2021-01-27 09:40:42,209 ERROR [STDERR]  at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:752)
2021-01-27 09:40:42,209 ERROR [STDERR]  at org.apache.commons.dbcp.DriverConnectionFactory.createConnection(DriverConnectionFactory.java:38)
2021-01-27 09:40:42,209 ERROR [STDERR]  at org.apache.commons.dbcp.PoolableConnectionFactory.makeObject(PoolableConnectionFactory.java:582)
2021-01-27 09:40:42,209 ERROR [STDERR]  at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1188)
2021-01-27 09:40:42,209 ERROR [STDERR]  at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106)
2021-01-27 09:40:42,209 ERROR [STDERR]  at org.apache.commons.dbcp.BasicDataSource.getConnection(BasicDataSource.java:1044)
2021-01-27 09:40:42,209 ERROR [STDERR]  at org.springframework.scheduling.quartz.LocalDataSourceJobStore$2.getConnection(LocalDataSourceJobStore.java:134)
2021-01-27 09:40:42,209 ERROR [STDERR]  at org.quartz.utils.DBConnectionManager.getConnection(DBConnectionManager.java:108)
2021-01-27 09:40:42,209 ERROR [STDERR]  at org.quartz.impl.jdbcjobstore.JobStoreCMT.getNonManagedTXConnection(JobStoreCMT.java:165)
2021-01-27 09:40:42,209 ERROR [STDERR]  ... 3 more

I try this :

What is weird, is when I watch JMX stats, the heapspace is 25% used, and permspace is 40% used. But in server.log, I see the outOfMemoryError, so how it's possible ?

The server have 5120M of RAM.

The actual launch parameters are :

/opt/jdk/bin/java -Dprogram.name=run.sh -server -Xms2560m -Xmx2560m -XX:MaxPermSize=768M 
-Djava.awt.headless=true -Xdebug -Xrunjdwp:transport=dt_socket,address=8787,server=y,suspend=n 
-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=9999 
-Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false 
-XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/
-Djava.net.preferIPv4Stack=true ...

I don't know how to process. I don't have much application log either, so I can't know what users really do, like display a GWT screen which use a lot of data, or something else.

EDIT :

Here is some gc log files, each generated in a day, until the app crashed with an OOM.

https://file.io/e1140Wc80yhd

https://file.io/8jRxM1KIgPPB

https://file.io/WvUgIX3ftQ1o

2

There are 2 answers

1
user2178964 On BEST ANSWER

I finally solved my OOM problem with the following steps :

1\ Create thread dump every 10 seconds, and keep the last 4 hours of dumps. I can't store full heap dump because I'm limited in space disk, so with this method, I only store a lot of "small" files. And I can follow the activity minutes before the OOM and after.

I modify my crontab to add :

#Check threads for OOM
* * * * * ( pidof java | xargs /opt/jdk/bin/jstack > /tmp/jstack`date "+\%d-\%m-\%Y:\%H:\%M:\%S"`.log )  
* * * * * ( sleep 10 ; pidof java | xargs /opt/jdk/bin/jstack > /tmp/jstack`date "+\%d-\%m-\%Y:\%H:\%M:\%S"`.log )  
* * * * * ( sleep 20 ; pidof java | xargs /opt/jdk/bin/jstack > /tmp/jstack`date "+\%d-\%m-\%Y:\%H:\%M:\%S"`.log )  
* * * * * ( sleep 30 ; pidof java | xargs /opt/jdk/bin/jstack > /tmp/jstack`date "+\%d-\%m-\%Y:\%H:\%M:\%S"`.log )  
* * * * * ( sleep 40 ; pidof java | xargs /opt/jdk/bin/jstack > /tmp/jstack`date "+\%d-\%m-\%Y:\%H:\%M:\%S"`.log )  
* * * * * ( sleep 50 ; pidof java | xargs /opt/jdk/bin/jstack > /tmp/jstack`date "+\%d-\%m-\%Y:\%H:\%M:\%S"`.log )  
0 * * * * find /tmp -name "jstack*" -type f -mmin +240 -delete

2\

When an OOM happens. I download the thread dump files around 20 minutes before and 10 seconds after the time of OOM.

3\

I open all the files, and search for a class linked to my own app : "com.al6.xxxxx"

I see that there is a thread wich begins like 10 minutes before OOM, and base on his ajp-xxx number, I found it in every file until the OOM :

"ajp-0.0.0.0-8009-23" #7088 daemon prio-5 os-prio-0 tid-0x00007f7ffle1b000 nid-0x673 runnable [0x000000004cfa8000]
java. lang. Thread. State: RUNNABLE
at java.util.Arrays.copy0f(Arrays. java:3181)
at java.util.ArrayList.grow(ArrayList. java: 265)
at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:239)
at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:231)
at java.util.ArrayList.add(ArrayList. java:462)
at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.append(ServerSerializationStreamWriter.java:583)
at com.google.gwt.user.client.rpc.impl.AbstractSerializationStreamWriter.writeInt(AbstractSerializationStreamWriter.java:100)
at com.google.gwt.user.client.rpc.impl.AbstractSerializationStreamWriter.writeString(AbstractSerializationStreamWriter.java:134)
at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter$ValueWriter$10.write(ServerSerializationStreamWriter.java:165)
at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serializeValue(ServerSerializationStreamWriter.java:536) at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serializeClass(ServerSerializationStreamWriter.java:709)
at com.google.gwt.user.server.rpc.impl.ServerSerializationStreamWriter.serializeImpl(ServerSerializationStreamWriter.java:748)
at com.google.gwt.user .server. rpc. impl.ServerSerializationStreanWriter.serialize(ServerSerializationStreamWriter.java:621)
at com.google.gwt.user.client.rpc.impl.AbstractSerializationStreamWriter.writeObjectAbstractSerializationStreamWriter.java:126)

So, it's the guilty thread, clearly ! But I only have always the same stack tracelinked to GWT, without the launcher/parent of this running code.

So searching with the name "ajp-0.0.0.0-8009-23", I finally trace back the thread through his creation, and found a portion of "server code" !

com.al6.serviceBaseEntry.getXXXXX 

This code return an ArrayList with a lot of elements in it, sometimes 100, sometimes 1500. It's not a lot, but for each element of the list, there is like 30 attributes. This list of element is sent to GWT to generate a drop-down list with autocompletion.

In theory, the user had to type a sufficiently restrictive filter to return 10-20 elements. But sometimes, the user type a very generic filter, and so the method server side return 1500 elements. GWT tries to serialize the 1500 elements with all their attributes, and boum, OOM.

I just modify the IHM side code to force the user to enter an enough restrictive filter, and made a check in server side, limiting the result number to 100.

Since, no more OOM, it's solved !

In complement, for those who can, think about update GWT, even a little, there is some changes which add details on stack trace. And in my case of OOM, I think with a newer version of GWT, I would have more details on the cause of the OOM in the final stack trace, without doing all this search process.

10
JohannesB On

Nice puzzle you got there :-)

I've got some suggestions to try for you as I suppose you are the only one that can reproduce this (if you are the only one that has the code and the data that the code is operating on):

  1. Instead of the JMX bean, I would use GC logging because the information is realtime, directly from your Garbage Collector, see: https://gceasy.io for setup and analysis. E.g.: -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<GC-log-file-path>
  2. Try to increase -Xmx (instead of MaxPermSize) your way out of this one (at some point that is only going to succeed on a 64-bit instance which might cost extra memory, use compressed oops to avoid that: https://www.baeldung.com/jvm-compressed-oops), but stay clear from java having to use swap, and don't forget Java also needs memory for every Threads stacks and native code, on top of the Java heap.
  3. You could attach a Java debugger (e.g. Eclipse) to port 8787 and create an Exception breakpoint that catches java.lang.OutOfMemoryError and inspect the stack for the relevant values, to find that large dataset or circular reference?). WARNING: running Java with debugger breakpoints might slow down your application so it might not be a great idea for production environments, but on the other hand, making heap dumps isn't good for your applications response times either.
  4. You could load the heapdump from -XX:+HeapDumpOnOutOfMemoryError in (j)visualvm for analysis, should be easier than jhat but that might work too). Only dumps made with HeapDumpOnOutOfMemoryError contain all the information up to the point of the OOMError. Do not share the heapdump on the internet, every byte of private information (personal data, passwords, maybe bytecode from classes), becomes public information 1, Finally tracing (JDBC, network, statspack) of queries might help as well but may not be worth the hassle as it might be too much info to easily analyse

If you need more help try adding the GC logging and the output at startup time of: -XX:+PrintFlagsFinal to this post.