In the event of a network outage, e2e tests can be come synchronized

[There was a long-standing bug recently discovered](https://github.com/m-lab/script-exporter-support/pull/28) in the `ndt_e2e.sh` test which was causing tests to run at more random intervals and likely more frequently than every 10 minutes.

Yesterday, @stephen-soltesz discovered this:

![script_exporter_e2e_synchronization](https://user-images.githubusercontent.com/1392825/81434807-0069b800-9124-11ea-8957-49c606bb27f9.png)

After the event (around 08:00 UTC) there is a very clear 10m period. After investigation, discovered that for some reason the tests had become more synchronized that they were before. In the event of a test failure (perhaps due to a network outage) all tests will begin running every single minute until a success is registered. When the network finally comes back the tests will start succeeding, but by this time they are all running close to within one minute of each other. From there forward, they will be less spaced out. It turns out that the bug referenced above was probably protecting us from this sort of synchronization.

We need to figure out a more robust way of keeping tests spaced as uniformly across 10m as possible.




Provide feedback

Saved searches

Use saved searches to filter your results more quickly

In the event of a network outage, e2e tests can be come synchronized #31

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

In the event of a network outage, e2e tests can be come synchronized #31

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions