There was a long-standing bug recently discovered in the ndt_e2e.sh test which was causing tests to run at more random intervals and likely more frequently than every 10 minutes.
Yesterday, @stephen-soltesz discovered this:

After the event (around 08:00 UTC) there is a very clear 10m period. After investigation, discovered that for some reason the tests had become more synchronized that they were before. In the event of a test failure (perhaps due to a network outage) all tests will begin running every single minute until a success is registered. When the network finally comes back the tests will start succeeding, but by this time they are all running close to within one minute of each other. From there forward, they will be less spaced out. It turns out that the bug referenced above was probably protecting us from this sort of synchronization.
We need to figure out a more robust way of keeping tests spaced as uniformly across 10m as possible.
There was a long-standing bug recently discovered in the
ndt_e2e.shtest which was causing tests to run at more random intervals and likely more frequently than every 10 minutes.Yesterday, @stephen-soltesz discovered this:
After the event (around 08:00 UTC) there is a very clear 10m period. After investigation, discovered that for some reason the tests had become more synchronized that they were before. In the event of a test failure (perhaps due to a network outage) all tests will begin running every single minute until a success is registered. When the network finally comes back the tests will start succeeding, but by this time they are all running close to within one minute of each other. From there forward, they will be less spaced out. It turns out that the bug referenced above was probably protecting us from this sort of synchronization.
We need to figure out a more robust way of keeping tests spaced as uniformly across 10m as possible.