Infinispan with hanging threads in level 1 clean up

251 views Asked by At

We are trying to upgrade from Wildfly 18 to 22 and are experiencing some trouble with threads left dangling in the jgroups thread group. When the number of threads reaches the default max number (200) everything stops up (naturally). This takes two to three days in our test environment.

In the thread dump we are left with 200 dangling threads that look like this:

at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
at java.util.concurrent.locks.LockSupport.parkNanos([email protected]/LockSupport.java:234)
at java.util.concurrent.CompletableFuture$Signaller.block([email protected]/CompletableFuture.java:1798)
at java.util.concurrent.ForkJoinPool.managedBlock([email protected]/ForkJoinPool.java:3128)
at java.util.concurrent.CompletableFuture.timedGet([email protected]/CompletableFuture.java:1868)
at java.util.concurrent.CompletableFuture.get([email protected]/CompletableFuture.java:2021)
at org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:125)
at org.infinispan.util.concurrent.CompletionStages.join(CompletionStages.java:80)
at org.infinispan.interceptors.distribution.L1NonTxInterceptor.lambda$handleDataWriteCommand$10(L1NonTxInterceptor.java:399)
at org.infinispan.interceptors.distribution.L1NonTxInterceptor$$Lambda$1806/0x0000000841ace040.apply(Unknown Source)
at org.infinispan.interceptors.impl.QueueAsyncInvocationStage.invokeQueuedHandlers(QueueAsyncInvocationStage.java:125)
at org.infinispan.interceptors.impl.QueueAsyncInvocationStage.accept(QueueAsyncInvocationStage.java:88)
at org.infinispan.interceptors.impl.QueueAsyncInvocationStage.accept(QueueAsyncInvocationStage.java:33)
at java.util.concurrent.CompletableFuture.uniWhenComplete([email protected]/CompletableFuture.java:859)
at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire([email protected]/CompletableFuture.java:837)
at java.util.concurrent.CompletableFuture.postComplete([email protected]/CompletableFuture.java:506)
at java.util.concurrent.CompletableFuture.complete([email protected]/CompletableFuture.java:2073)
at org.infinispan.interceptors.distribution.PrimaryOwnerOnlyCollector.primaryResult(PrimaryOwnerOnlyCollector.java:30)
at org.infinispan.interceptors.distribution.TriangleDistributionInterceptor.lambda$forwardToPrimary$6(TriangleDistributionInterceptor.java:526)
at org.infinispan.interceptors.distribution.TriangleDistributionInterceptor$$Lambda$1953/0x0000000841d51040.apply(Unknown Source)
at java.util.concurrent.CompletableFuture.uniHandle([email protected]/CompletableFuture.java:930)
at java.util.concurrent.CompletableFuture$UniHandle.tryFire([email protected]/CompletableFuture.java:907)
at java.util.concurrent.CompletableFuture.postComplete([email protected]/CompletableFuture.java:506)
at java.util.concurrent.CompletableFuture.complete([email protected]/CompletableFuture.java:2073)
at org.infinispan.remoting.transport.AbstractRequest.complete(AbstractRequest.java:67)
at org.infinispan.remoting.transport.impl.SingleTargetRequest.onResponse(SingleTargetRequest.java:45)
at org.infinispan.remoting.transport.impl.RequestRepository.addResponse(RequestRepository.java:52)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processResponse(JGroupsTransport.java:1402)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processMessage(JGroupsTransport.java:1305)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.access$300(JGroupsTransport.java:131)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport$ChannelCallbacks.up(JGroupsTransport.java:1445)
at org.jgroups.JChannel.up(JChannel.java:784)
at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:913)
at org.jgroups.protocols.FRAG3.up(FRAG3.java:165)
at org.jgroups.protocols.FlowControl.up(FlowControl.java:343)
at org.jgroups.protocols.FlowControl.up(FlowControl.java:343)
at org.jgroups.protocols.pbcast.GMS.up(GMS.java:876)
at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:243)
at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1049)
at org.jgroups.protocols.UNICAST3.addMessage(UNICAST3.java:772)
at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:753)
at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:405)
at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:592)
at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:132)
at org.jgroups.protocols.FailureDetection.up(FailureDetection.java:186)
at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:254)
at org.jgroups.protocols.MERGE3.up(MERGE3.java:281)
at org.jgroups.protocols.Discovery.up(Discovery.java:300)
at org.jgroups.protocols.TP.passMessageUp(TP.java:1396)
at org.jgroups.util.SubmitToThreadPool$SingleMessageHandler.run(SubmitToThreadPool.java:87)
at java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1128)
at java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:628)
at java.lang.Thread.run([email protected]/Thread.java:834)

These threads are hanging on a CompletableFuture.get(...) with a infinispan-hardcoded timeout of 24 hours. This is probably not something that should happen, which is confirmed by the exception we get when the timeout is finally reached:

2021-03-03 09:00:46,638 ERROR [org.infinispan.interceptors.impl.InvocationContextInterceptor] (jgroups-263,xxxxxxxxxxxx) ISPN000136: Error executing command ReadWriteKeyCommand on Cache 'sessionmanager.session', writing keys [xxxx#xxxxxx]: java.lang.IllegalStateException: This should never happen!
        at [email protected]//org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:127)
        at [email protected]//org.infinispan.util.concurrent.CompletionStages.join(CompletionStages.java:80)
        at [email protected]//org.infinispan.interceptors.distribution.L1NonTxInterceptor.lambda$handleDataWriteCommand$10(L1NonTxInterceptor.java:399)
        at [email protected]//org.infinispan.interceptors.impl.QueueAsyncInvocationStage.invokeQueuedHandlers(QueueAsyncInvocationStage.java:125)
        at [email protected]//org.infinispan.interceptors.impl.QueueAsyncInvocationStage.accept(QueueAsyncInvocationStage.java:88)
        at [email protected]//org.infinispan.interceptors.impl.QueueAsyncInvocationStage.accept(QueueAsyncInvocationStage.java:33)
        at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:859)
        at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:837)
        at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
        at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073)
        at [email protected]//org.infinispan.interceptors.distribution.PrimaryOwnerOnlyCollector.primaryResult(PrimaryOwnerOnlyCollector.java:30)
        at [email protected]//org.infinispan.interceptors.distribution.TriangleDistributionInterceptor.lambda$forwardToPrimary$6(TriangleDistributionInterceptor.java:526)
        at java.base/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:930)
        at java.base/java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:907)
        at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
        at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2073)
        at [email protected]//org.infinispan.remoting.transport.AbstractRequest.complete(AbstractRequest.java:67)
        at [email protected]//org.infinispan.remoting.transport.impl.SingleTargetRequest.onResponse(SingleTargetRequest.java:45)
        at [email protected]//org.infinispan.remoting.transport.impl.RequestRepository.addResponse(RequestRepository.java:52)
        at [email protected]//org.infinispan.remoting.transport.jgroups.JGroupsTransport.processResponse(JGroupsTransport.java:1402)
        at [email protected]//org.infinispan.remoting.transport.jgroups.JGroupsTransport.processMessage(JGroupsTransport.java:1305)
        at [email protected]//org.infinispan.remoting.transport.jgroups.JGroupsTransport.access$300(JGroupsTransport.java:131)
        at [email protected]//org.infinispan.remoting.transport.jgroups.JGroupsTransport$ChannelCallbacks.up(JGroupsTransport.java:1445)
        at [email protected]//org.jgroups.JChannel.up(JChannel.java:784)
        at [email protected]//org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:913)
        at [email protected]//org.jgroups.protocols.FRAG3.up(FRAG3.java:165)
        at [email protected]//org.jgroups.protocols.FlowControl.up(FlowControl.java:343)
        at [email protected]//org.jgroups.protocols.FlowControl.up(FlowControl.java:343)
        at [email protected]//org.jgroups.protocols.pbcast.GMS.up(GMS.java:876)
        at [email protected]//org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:243)
        at [email protected]//org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1049)
        at [email protected]//org.jgroups.protocols.UNICAST3.addMessage(UNICAST3.java:772)
        at [email protected]//org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:753)
        at [email protected]//org.jgroups.protocols.UNICAST3.up(UNICAST3.java:405)
        at [email protected]//org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:592)
        at [email protected]//org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:132)
        at [email protected]//org.jgroups.protocols.FailureDetection.up(FailureDetection.java:186)
        at [email protected]//org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:254)
        at [email protected]//org.jgroups.protocols.MERGE3.up(MERGE3.java:281)
        at [email protected]//org.jgroups.protocols.Discovery.up(Discovery.java:300)
        at [email protected]//org.jgroups.protocols.TP.passMessageUp(TP.java:1396)
        at [email protected]//org.jgroups.util.SubmitToThreadPool$SingleMessageHandler.run(SubmitToThreadPool.java:87)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.util.concurrent.TimeoutException
        at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
        at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
        at [email protected]//org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:125)
        ... 44 more

Have anyone encountered a similar problem? What can possibly lead to this situation?

0

There are 0 answers