Skip to content

Conversation

@justinjc
Copy link
Collaborator

@justinjc justinjc commented Oct 1, 2020

This test sets up:

  1. A reporter that reports every second.
  2. A UDP listener that the reporter sends its metrics to.
  3. An emitter that constantly emits a unique timer. A timer is used because timers are directly reported, whereas counters and gauges have their own buffering.

The scenario:

  1. The emitter is allowed to emit for five seconds.
  2. The socket backing the UDP listener is closed, simulating the reporter's communicating endpoint being unreachable for some reason.
  3. The emitter is allowed to proceed for ten more seconds.
  4. Output a thread dump to see the state of the world.

Observations when run on git ref:

  • 978287b: timers are recorded, then after the socket is closed, timers continue to be recorded.

    • Travis run (search SocketCloseTest)
    • Thread dump of emitter thread:
      "EMITTER THREAD" Id=44 TIMED_WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@5181cf78
          at sun.misc.Unsafe.park(Native Method)
          -  waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@5181cf78
          at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
          at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
          at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
          at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
          ...
      
  • 3280651: timers are recorded, then after the socket is closed, timers are blocked from being recorded.

    • Travis run (search SocketCloseTest)
    • Thread dump of emitter thread:
      "EMITTER THREAD" Id=44 WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3dac736c
          at sun.misc.Unsafe.park(Native Method)
          -  waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3dac736c
          at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
          at java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:350)
          at com.uber.m3.tally.m3.M3Reporter.queueSizedMetric(M3Reporter.java:445)
          at com.uber.m3.tally.m3.M3Reporter.reportTimer(M3Reporter.java:317)
          at com.uber.m3.tally.TimerImpl.record(TimerImpl.java:55)
          at com.uber.m3.tally.m3.SocketCloseTest$MetricsEmitter.run(SocketCloseTest.java:116)
          ...
      
  • 0589dcc (the latest 0.9.0 release): timers are recorded, then after the socket is closed, timers continue to be recorded.

    • Travis run (search SocketCloseTest)
    • Thread dump of emitter thread:
      "EMITTER THREAD" Id=55 TIMED_WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@56882ba4
          at sun.misc.Unsafe.park(Native Method)
          -  waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@56882ba4
          at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
          at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
          at java.util.concurrent.LinkedBlockingQueue.offer(LinkedBlockingQueue.java:385)
          at com.uber.m3.tally.m3.M3Reporter.enqueue(M3Reporter.java:465)
          at com.uber.m3.tally.m3.M3Reporter.reportTimer(M3Reporter.java:354)
          at com.uber.m3.tally.TimerImpl.record(TimerImpl.java:55)
          at com.uber.m3.tally.m3.SocketCloseTest$MetricsEmitter.run(SocketCloseTest.java:115)
          ...
      

@justinjc justinjc requested review from prateek and vdarulis October 1, 2020 04:48
@justinjc justinjc changed the title [WIP] Add SocketCloseTest [WIP] Add test simulating unreachable socket for reporter Oct 1, 2020
@alexeykudinkin
Copy link
Contributor

alexeykudinkin commented Oct 1, 2020

@justinjc your test isn't simulating conditions properly: Processor thread is continues to run in the thread dump of your first test-run:

   "pool-9-thread-1" Id=28 TIMED_WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@29b2c15a
    	at sun.misc.Unsafe.park(Native Method)
    	-  waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@29b2c15a
    	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    	at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
    	at com.uber.m3.tally.m3.M3Reporter$Processor.run(M3Reporter.java:485)
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    	at java.lang.Thread.run(Thread.java:748)

@prateek
Copy link
Collaborator

prateek commented Oct 1, 2020

The same thread you're pointing out continues to run in the second test as well:

"pool-9-thread-1" Id=28 TIMED_WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@29b2c15a
    	at sun.misc.Unsafe.park(Native Method)
    	-  waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@29b2c15a
    	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
    	at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
    	at com.uber.m3.tally.m3.M3Reporter$Processor.run(M3Reporter.java:485)
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    	at java.lang.Thread.run(Thread.java:748)
    	Number of locked synchronizers = 1
    	- java.util.concurrent.ThreadPoolExecutor$Worker@a4ca1d1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants