- 
          
- 
                Notifications
    You must be signed in to change notification settings 
- Fork 1.7k
Description
Description
Some low probability race condition causes lmax to loop indefinitely on Thread.yield
Configuration
-DLog4jContextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector
-Dlog4j2.enable.threadlocals=true
-Dlog4j2.enable.direct.encoders=true
-XX:CompileCommand=dontinline,org.apache.logging.log4j.core.async.perftest.NoOpIdleStrategy::idle
Version: 2.17.1
Operating system: Centos 8
JDK: OpenJDK 64-Bit Server VM GraalVM LIBGRAAL 20.3.3 (build 11.0.12+5-jvmci-20.3-b06)
Logs
It would seem that the queue datastructure becomes corrupted and cas=uses the polling thread to loop infinitely with the following stacktrace:
"Log4j2-TF-1-AsyncLogger[AsyncContext@46f5f779]-1" #18 daemon prio=5 os_prio=0 cpu=4404755.82ms elapsed=4733.22s tid=0x00007fed7ade9800 nid=0xcd runnable  [0x00007fed7bffd000]
   java.lang.Thread.State: RUNNABLE
        at java.lang.Thread.yield([email protected]/Native Method)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.transferAfterCancelledWait([email protected]/AbstractQueuedSynchronizer.java:1752)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos([email protected]/AbstractQueuedSynchronizer.java:2119)
        at com.lmax.disruptor.TimeoutBlockingWaitStrategy.waitFor(TimeoutBlockingWaitStrategy.java:38)
        at com.lmax.disruptor.ProcessingSequenceBarrier.waitFor(ProcessingSequenceBarrier.java:56)
        at com.lmax.disruptor.BatchEventProcessor.processEvents(BatchEventProcessor.java:159)
        at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:125)
        at java.lang.Thread.run([email protected]/Thread.java:829)
Reproduction
This was found on a large production system with very low probability. Using log4j with the above settings. We don't know the exact case where the polling thread reaches this state