perf(flagd): speed up e2e test execution via container pool and parallel scenarios #1752
Conversation
Code Review
This pull request enables parallel end-to-end test execution for the flagd provider by implementing a ContainerPool to manage multiple Docker Compose environments. It introduces ContainerEntry and ContainerPool classes, refactors test steps to use pooled containers, and updates Maven and Cucumber configurations for parallel forking. A review comment identifies a potential resource leak in ContainerPool.initialize() and suggests using a try-finally block to ensure the ExecutorService is always shut down and to prevent container leaks if an exception occurs.
providers/flagd/src/test/java/dev/openfeature/contrib/providers/flagd/e2e/ContainerPool.java
Replace the single shared Docker Compose stack with a pre-warmed ContainerPool. Each Cucumber scenario borrows its own isolated ContainerEntry (flagd + envoy + temp dir), eliminating the process-level contention that prevented parallel execution.

Key changes:
- ContainerEntry: encapsulates a single Docker Compose stack + temp dir
- ContainerPool: manages a fixed-size pool with acquire/release semantics and reference counting, so multiple suite runners sharing a JVM only start/stop containers once
- ProviderSteps: borrows a container per scenario; replaces the global API.shutdown() with a per-provider NoOpProvider swap through the SDK lifecycle (properly detaches event emitters)
- State: carries the borrowed ContainerEntry and the provider domain name

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Simon Schrottner <simon.schrottner@dynatrace.com>
Enable cucumber.execution.parallel.enabled=true with fixed parallelism matching the container pool size (2).

Correctness safeguards:
- @env-var scenarios serialised behind an ENV_VARS exclusive resource lock (requires the @env-var tag in test-harness, see companion PR)
- @grace scenarios serialised behind a CONTAINER_RESTART lock to avoid reconnection timeouts under parallel container restarts
- ConfigCucumberTest disables parallelism entirely (env-var mutations in a <0.4s suite: no benefit, avoids races)
- EventSteps: drain-based event matching replaces clear() to prevent stale events from satisfying later assertions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Simon Schrottner <simon.schrottner@dynatrace.com>
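The flags and locks described in this commit message map onto JUnit Platform configuration parameters. A sketch of what the corresponding junit-platform.properties entries might look like, assuming the tag names and resource keys listed above (the exclusive-resources key format follows Cucumber-JVM's documented convention):

```properties
cucumber.execution.parallel.enabled=true
cucumber.execution.parallel.config.strategy=fixed
cucumber.execution.parallel.config.fixed.parallelism=2
# scenarios tagged @env-var or @grace take an exclusive write lock,
# so they run serialised relative to each other
cucumber.execution.exclusive-resources.env-var.read-write=ENV_VARS
cucumber.execution.exclusive-resources.grace.read-write=CONTAINER_RESTART
```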
Temporary: CI needs the @env-var tag from flagd-testbed#359. Revert to the released branch once that PR is merged and tagged.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Simon Schrottner <simon.schrottner@dynatrace.com>
Switch the Cucumber plugin from 'pretty' (prints every step) to 'summary' (only prints failures and a final count). Keeps CI logs readable.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Simon Schrottner <simon.schrottner@dynatrace.com>
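A sketch of the one-line change, assuming the plugin is configured via junit-platform.properties (it could equally be set on the CLI or a @Suite annotation):

```properties
# 'summary' prints only failures and a final count; 'pretty' logs every step
cucumber.plugin=summary
```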
Switch the Cucumber strategy from 'fixed' to 'dynamic' (factor=1.0, i.e. one thread per available processor). The ContainerPool default pool size also scales with availableProcessors(), so pool slots match the thread count.

Both are still overridable:
- -Dflagd.e2e.pool.size=N
- -Dcucumber.execution.parallel.config.dynamic.factor=N

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Simon Schrottner <simon.schrottner@dynatrace.com>
The default pool size was Runtime.availableProcessors(), which on large machines (22 CPUs) spawned too many simultaneous Docker Compose stacks and caused ContainerLaunchException. Cap it at min(availableProcessors, 4). Cucumber threads still scale with CPUs (dynamic factor=1); extra threads simply block waiting for a free container, which is safe.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Simon Schrottner <simon.schrottner@dynatrace.com>
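The capping and override behaviour can be sketched as a tiny self-contained helper (class and method names here are illustrative, not the actual ContainerPool code):

```java
public class PoolSizing {
    // Default pool size: one container slot per CPU, capped at 4 so large
    // machines do not overwhelm the Docker daemon with simultaneous stacks.
    static int defaultPoolSize() {
        return Math.min(Runtime.getRuntime().availableProcessors(), 4);
    }

    // -Dflagd.e2e.pool.size=N still overrides the capped default.
    static int poolSize() {
        return Integer.getInteger("flagd.e2e.pool.size", defaultPoolSize());
    }
}
```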
```java
// later assertion that expects a *new* event of the same type, while still
// preserving events that arrived *after* the match for subsequent steps.
Event matched = null;
while (!state.events.isEmpty()) {
```
Are we sure that no new events can be emitted while or immediately after this runs? Otherwise this loop might not be sufficient.
What do you mean? We specifically want to keep events that are generated while this loop runs, because a READY event can happen shortly after a disconnect, and while we wait for the disconnect (including cleanup) we might otherwise remove the new READY.
If we are worried about events arriving shortly after a disconnect, we should wait for some time and check events afterwards or in the meantime. This loop might finish after 0 or 1 iterations, before we even receive the event we would want to wait for.
This removes all events that happened up to and including our matched event, so the list is only drained up to that point. If events arrive in the meantime, they stay in the list, and the next check can match against all of them.
```java
        break;
    }
}
state.lastEvent = java.util.Optional.ofNullable(matched);
```
The naming is not ideal: this is not the last event, it's the last event that matched the eventType.
The "last event" refers to the current test state, i.e. the last tracked event. We do not care about the type; this is only used to verify some event information.
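The drain-until-match behaviour under discussion can be sketched in isolation. This is a simplified stand-in, not the actual EventSteps code: String replaces the real Event type, and drainUntilMatch is a hypothetical name. It shows the key property debated above: events before and including the match are consumed, while events that arrived after the match stay queued for later steps.

```java
import java.util.Deque;
import java.util.Optional;

public class EventDrain {
    // Remove events up to and including the first one matching `type`.
    // Anything that arrived *after* the match stays queued, so a later
    // step can still match it; a stale earlier event can never satisfy
    // a later assertion because it has already been drained.
    public static Optional<String> drainUntilMatch(Deque<String> events, String type) {
        while (!events.isEmpty()) {
            String head = events.poll();
            if (type.equals(head)) {
                return Optional.of(head);
            }
        }
        return Optional.empty();
    }
}
```

For example, draining ["READY", "ERROR", "READY"] for "ERROR" consumes the stale READY and the ERROR, and leaves the trailing READY queued.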
```java
// properly calls detachEventProvider (nulls onEmit) and shuts down the emitter
// executor — neither of which happens when calling provider.shutdown() directly.
```
Is this something that should happen when we call shutdown?
The tests are now running in parallel, so shutting down the API is a no-go, as it would also interfere with other tests.
No, but I mean in general, not specifically this test case
Maybe, but currently the goal is speeding up tests ;) when we set a new provider, we are actually cleaning up.
```java
}

public static void shutdown() {
    int remaining = refCount.decrementAndGet();
```
Could it be possible that all current users call shutdown even though there are still outstanding users who have not called initialize yet? Then we would shut down the pool even though there are still tests lined up.
This happens in beforeAll, once per test suite. Test suites are not run in parallel anyway; there is another improvement planned for this.
If the test suites do not run in parallel, then we don't need this sync mechanism. If they do run in parallel, the scenario in my comment could (even though it is unlikely) occur.
I briefly ran it in parallel, and the next iteration will add more flexibility, where all the tests are actually parallel. That is in the next follow-up PR. I briefly ran all 3 tests in parallel with some hacks, but it was not worth the effort.
```java
        "flagd.e2e.pool.size", Math.min(Runtime.getRuntime().availableProcessors(), 4));

private static final BlockingQueue<ContainerEntry> pool = new LinkedBlockingQueue<>();
private static final List<ContainerEntry> all = new ArrayList<>();
```
I think this needs to be a concurrent data structure too, so that we guarantee that all changes to the list are also visible to another thread calling shutdown
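A minimal self-contained sketch of the acquire/release and reference-counting scheme under review. This is not the actual ContainerPool: String stands in for ContainerEntry, and starting/stopping Docker Compose stacks is reduced to comments. It uses a CopyOnWriteArrayList for `all`, one way to address the visibility concern raised in the review, so entries added during initialize() are safely visible to whichever thread later calls shutdown():

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolSketch {
    private static final BlockingQueue<String> pool = new LinkedBlockingQueue<>();
    // Concurrent list: writes during initialize() are visible to the
    // (possibly different) thread that eventually calls shutdown().
    private static final List<String> all = new CopyOnWriteArrayList<>();
    private static final AtomicInteger refCount = new AtomicInteger();

    // First caller warms the pool; later suite runners sharing the JVM
    // only bump the reference count.
    public static synchronized void initialize(int size) {
        if (refCount.getAndIncrement() > 0) {
            return;
        }
        for (int i = 0; i < size; i++) {
            String entry = "container-" + i; // real code: start a compose stack
            all.add(entry);
            pool.add(entry);
        }
    }

    // Blocks until a container is free; extra Cucumber threads simply wait here.
    public static String acquire() {
        try {
            return pool.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a container", e);
        }
    }

    public static void release(String entry) {
        pool.add(entry);
    }

    // Last caller tears everything down; earlier callers only decrement.
    public static synchronized void shutdown() {
        if (refCount.decrementAndGet() > 0) {
            return;
        }
        all.clear(); // real code: stop each compose stack
        pool.clear();
    }
}
```

This also illustrates the thread layout described in the commits: more Cucumber threads than pool slots is safe because surplus threads block in acquire() until a scenario returns its container.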
Summary
Reduces flagd provider e2e test wall-clock time by ~75% through a pre-warmed container pool and Cucumber-level parallel scenario execution — no Surefire fork changes needed.
Changes
1. Container pool (ContainerPool + ContainerEntry)

The previous setup used a single shared Docker Compose stack for all scenarios within a runner. Since the flagd-testbed launchpad controls a single flagd process via the /start, /stop, /restart, and /change HTTP endpoints, scenarios sharing one container would race on these operations and could not run concurrently.

The fix is a pre-warmed container pool: @BeforeAll starts N containers in parallel (~45s, once per JVM), and each Cucumber scenario borrows one ContainerEntry for its duration, giving it a fully isolated flagd process. After teardown the entry is returned to the pool.

Pool size is tunable via -Dflagd.e2e.pool.size=N (default: 2).

A reference counter ensures that when multiple suite runners share the same JVM (reuseForks=true), containers are only started on the first initialize() call and stopped on the last shutdown() call.

2. Parallel Cucumber scenarios
With cucumber.execution.parallel.enabled=true and fixed.parallelism=2 (matching the default pool size), scenarios within each runner execute concurrently.

Correctness safeguards via exclusive resource locks:
- @env-var scenarios serialised behind the ENV_VARS lock (requires companion PR flagd-testbed#359)
- @grace scenarios (container restart + reconnection timing) serialised behind the CONTAINER_RESTART lock
- ConfigCucumberTest disables parallelism entirely (env-var mutations in a <0.4s suite: no benefit)

3. Per-provider teardown
Replaced OpenFeatureAPI.getInstance().shutdown() (global: tears down all providers) with a per-provider NoOpProvider swap through the SDK lifecycle. This properly detaches event emitters and is safe for parallel execution.

4. Event drain fix
EventSteps now drains events up to and including the first match instead of clear()-ing the entire queue. This prevents stale events (e.g. a READY from before a disconnect) from satisfying later assertions.

Architecture
Dependencies
- @env-var tag to config scenarios (submodule temporarily pointed at the PR branch)

Draft — watching CI
Opening as draft to observe CI behaviour before merging.