
perf(flagd): speed up e2e test execution via container pool and parallel scenarios#1752

Open
aepfli wants to merge 6 commits into main from feat/speed-up-flagd-e2e-tests

Conversation

@aepfli
Member

@aepfli aepfli commented Mar 30, 2026

Summary

Reduces flagd provider e2e test wall-clock time by ~75% through a pre-warmed container pool and Cucumber-level parallel scenario execution — no Surefire fork changes needed.

Changes

1. Container pool (ContainerPool + ContainerEntry)

The previous setup used a single shared Docker Compose stack for all scenarios within a runner. Since the flagd-testbed launchpad controls a single flagd process via /start, /stop, /restart, and /change HTTP endpoints, scenarios sharing one container would race on these operations and could not run concurrently.

The fix is a pre-warmed container pool: @BeforeAll starts N containers in parallel (~45s, once per JVM), and each Cucumber scenario borrows one ContainerEntry for its duration, giving it a fully isolated flagd process. After teardown the entry is returned to the pool.

Pool size is tunable via -Dflagd.e2e.pool.size=N (default: 2).

A reference counter ensures that when multiple suite runners share the same JVM (reuseForks=true), containers are only started on the first initialize() call and stopped on the last shutdown() call.
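
The pool mechanics described above can be sketched with plain `java.util.concurrent` primitives. This is a simplified illustration of the shape, not the PR's exact code (for instance, the real `initialize()` starts the containers in parallel via an `ExecutorService`):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Stand-in for the real ContainerEntry, which wraps a Docker Compose
// stack (flagd + envoy) plus a temp directory.
class ContainerEntry {
    void start() { /* docker compose up */ }
    void stop()  { /* docker compose down */ }
}

final class ContainerPool {
    private static final int POOL_SIZE = Integer.getInteger(
            "flagd.e2e.pool.size",
            Math.min(Runtime.getRuntime().availableProcessors(), 4));

    private static final BlockingQueue<ContainerEntry> pool = new LinkedBlockingQueue<>();
    private static final List<ContainerEntry> all = new ArrayList<>();
    private static final AtomicInteger refCount = new AtomicInteger();

    // @BeforeAll: only the first suite runner in the JVM warms the pool.
    static synchronized void initialize() {
        if (refCount.getAndIncrement() > 0) {
            return; // pool already warmed by another runner
        }
        for (int i = 0; i < POOL_SIZE; i++) {
            ContainerEntry entry = new ContainerEntry();
            entry.start(); // the real pool starts these in parallel
            all.add(entry);
            pool.add(entry);
        }
    }

    // Each scenario borrows an isolated container for its duration...
    static ContainerEntry acquire() {
        try {
            return pool.take(); // blocks until a container is free
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted waiting for a container", e);
        }
    }

    // ...and hands it back on scenario teardown.
    static void release(ContainerEntry entry) {
        pool.add(entry);
    }

    // @AfterAll: only the last runner to shut down stops the containers.
    static synchronized void shutdown() {
        if (refCount.decrementAndGet() > 0) {
            return;
        }
        all.forEach(ContainerEntry::stop);
        all.clear();
        pool.clear();
    }
}
```

Extra Cucumber worker threads beyond the pool size simply block in `acquire()` until a container frees up, which is safe.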

2. Parallel Cucumber scenarios

With cucumber.execution.parallel.enabled=true and fixed.parallelism=2 (matching the default pool size), scenarios within each runner execute concurrently.

Correctness safeguards via exclusive resource locks:

  • @env-var scenarios serialised behind ENV_VARS lock (requires companion PR flagd-testbed#359)
  • @grace scenarios (container restart + reconnection timing) serialised behind CONTAINER_RESTART lock
  • ConfigCucumberTest disables parallelism entirely (env-var mutations in <0.4s suite — no benefit)
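
On the Cucumber side, the relevant JUnit Platform properties look roughly like the following. The keys are standard Cucumber-JUnit-Platform keys; the exact values and file contents here are a sketch, not the PR's verbatim configuration:

```properties
# Run scenarios within a runner concurrently.
cucumber.execution.parallel.enabled=true
cucumber.execution.parallel.config.strategy=fixed
cucumber.execution.parallel.config.fixed.parallelism=2

# Serialise tagged scenarios behind exclusive read-write resource locks.
cucumber.execution.exclusive-resources.env-var.read-write=ENV_VARS
cucumber.execution.exclusive-resources.grace.read-write=CONTAINER_RESTART
```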

3. Per-provider teardown

Replaced OpenFeatureAPI.getInstance().shutdown() (global — tears down all providers) with a per-provider NoOpProvider swap through the SDK lifecycle. This properly detaches event emitters and is safe for parallel execution.
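
The swap pattern can be illustrated with minimal stubs. The real code goes through the SDK's `OpenFeatureAPI` (setting the SDK's `NoOpProvider` for the scenario's domain); the types below are hypothetical stand-ins that only show the per-domain idea:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative stubs standing in for the OpenFeature SDK's provider types.
interface FeatureProvider {
    default void shutdown() {}
}
final class NoOpProvider implements FeatureProvider {}

// Stub provider that records whether it was shut down, for observation.
final class FlagdProviderStub implements FeatureProvider {
    boolean shutDown;
    @Override public void shutdown() { shutDown = true; }
}

// Minimal stand-in for the per-domain provider registry inside the API.
final class Api {
    private final Map<String, FeatureProvider> providers = new ConcurrentHashMap<>();

    // Swapping in a new provider shuts down the old one for that domain
    // only; other domains (other parallel scenarios) are untouched. In the
    // real SDK this path also detaches event emitters.
    void setProviderAndWait(String domain, FeatureProvider next) {
        FeatureProvider old = providers.put(domain, next);
        if (old != null) {
            old.shutdown();
        }
    }
}
```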

4. Event drain fix

EventSteps now drains events up to and including the first match instead of clear()-ing the entire queue. This prevents stale events (e.g. a READY from before a disconnect) from satisfying later assertions.
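
The drain logic can be sketched as a stand-alone method (a simplified version; the real `EventSteps` operates on the shared scenario `State` and the SDK's event details):

```java
import java.util.Optional;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Illustrative event holder.
final class Event {
    final String type;
    Event(String type) { this.type = type; }
}

final class EventDrain {
    // Remove events up to and including the first one matching eventType,
    // leaving later arrivals queued for subsequent assertions.
    static Optional<Event> drainUntilMatch(Queue<Event> events, String eventType) {
        Event matched = null;
        while (!events.isEmpty()) {
            Event head = events.poll();
            if (eventType.equals(head.type)) {
                matched = head;
                break; // events after the match (e.g. a fresh READY) stay queued
            }
        }
        return Optional.ofNullable(matched);
    }
}
```

Unlike `clear()`, this cannot discard an event that arrives between the match and the next assertion.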

Architecture

Before:
  Runner 1 (sequential) → start 1 container → scenario 1,2,...N (sequential) → stop
  Runner 2 (sequential) → start 1 container → scenario 1,2,...N (sequential) → stop
  Runner 3 (sequential) → start 1 container → scenario 1,2,...N (sequential) → stop

After (pool + parallel):
  Runner 1: start pool(2) → 2 parallel scenarios → ... → defer shutdown
  Runner 2: reuse pool    → 2 parallel scenarios → ... → defer shutdown
  Runner 3: reuse pool    → 2 parallel scenarios → ... → stop pool

Dependencies

  • flagd-testbed#359 — adds @env-var tag to config scenarios (submodule temporarily pointed at PR branch)

Draft — watching CI

Opening as draft to observe CI behaviour before merging.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request enables parallel end-to-end test execution for the flagd provider by implementing a ContainerPool to manage multiple Docker Compose environments. It introduces ContainerEntry and ContainerPool classes, refactors test steps to use pooled containers, and updates Maven and Cucumber configurations for parallel forking. A review comment identifies a potential resource leak in ContainerPool.initialize() and suggests using a try-finally block to ensure the ExecutorService is always shut down and to prevent container leaks if an exception occurs.

@aepfli aepfli force-pushed the feat/speed-up-flagd-e2e-tests branch from e985cfb to 8057aa4 Compare March 31, 2026 08:50
@aepfli aepfli changed the title perf(flagd): speed up e2e test execution via parallel runners and container pool perf(flagd): speed up e2e test execution via container pool and parallel scenarios Mar 31, 2026
@aepfli aepfli force-pushed the feat/speed-up-flagd-e2e-tests branch 2 times, most recently from a985b64 to 209cce0 Compare March 31, 2026 08:57
aepfli and others added 4 commits March 31, 2026 11:01
Replace the single shared Docker Compose stack with a pre-warmed
ContainerPool. Each Cucumber scenario borrows its own isolated
ContainerEntry (flagd + envoy + temp dir), eliminating the process-level
contention that prevented parallel execution.

Key changes:
- ContainerEntry: encapsulates a single Docker Compose stack + temp dir
- ContainerPool: manages a fixed-size pool with acquire/release semantics
  and reference counting so multiple suite runners sharing a JVM only
  start/stop containers once
- ProviderSteps: borrows a container per scenario, replaces global
  API.shutdown() with per-provider NoOpProvider swap through the SDK
  lifecycle (properly detaches event emitters)
- State: carries the borrowed ContainerEntry and provider domain name

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Simon Schrottner <simon.schrottner@dynatrace.com>
Enable cucumber.execution.parallel.enabled=true with fixed parallelism
matching the container pool size (2).

Correctness safeguards:
- @env-var scenarios serialised behind an ENV_VARS exclusive resource
  lock (requires @env-var tag in test-harness, see companion PR)
- @grace scenarios serialised behind a CONTAINER_RESTART lock to avoid
  reconnection timeouts under parallel container restarts
- ConfigCucumberTest disables parallelism entirely (env-var mutations
  in <0.4s suite — no benefit, avoids races)
- EventSteps: drain-based event matching replaces clear() to prevent
  stale events from satisfying later assertions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Simon Schrottner <simon.schrottner@dynatrace.com>
Temporary: CI needs the @env-var tag from flagd-testbed#359.
Revert to released branch once that PR is merged and tagged.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Simon Schrottner <simon.schrottner@dynatrace.com>
Switch Cucumber plugin from 'pretty' (prints every step) to 'summary'
(only prints failures and a final count). Keeps CI logs readable.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Simon Schrottner <simon.schrottner@dynatrace.com>
@aepfli aepfli force-pushed the feat/speed-up-flagd-e2e-tests branch from 209cce0 to f9e647c Compare March 31, 2026 09:01
aepfli and others added 2 commits March 31, 2026 11:08
Switch Cucumber strategy from 'fixed' to 'dynamic' (factor=1.0, i.e.
one thread per available processor). ContainerPool default pool size
also scales with availableProcessors() so pool slots match thread count.

Both are still overridable:
  -Dflagd.e2e.pool.size=N
  -Dcucumber.execution.parallel.config.dynamic.factor=N

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Simon Schrottner <simon.schrottner@dynatrace.com>
Default pool size was Runtime.availableProcessors() which on large machines
(22 CPUs) spawned too many simultaneous Docker Compose stacks and caused
ContainerLaunchException. Cap at min(availableProcessors, 4).

Cucumber threads still scale with CPUs (dynamic factor=1) — extra threads
simply block waiting for a free container, which is safe.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Simon Schrottner <simon.schrottner@dynatrace.com>
@aepfli aepfli marked this pull request as ready for review April 1, 2026 08:05
@aepfli aepfli requested a review from a team as a code owner April 1, 2026 08:05
// later assertion that expects a *new* event of the same type, while still
// preserving events that arrived *after* the match for subsequent steps.
Event matched = null;
while (!state.events.isEmpty()) {
Contributor

Are we sure that no new events can be emitted while or immediately after this runs? Otherwise this loop might not be sufficient.

Member Author

What do you mean? We specifically want to keep events that are generated while this loop runs, because a READY event can happen shortly after a disconnect, and while we wait for the disconnect (including cleanup) we might otherwise remove the new READY.

Contributor

If we are worried about events shortly after a disconnect, we should wait for some time and check events afterwards or in the meantime. This loop might be done after 0 or 1 iterations, and might be done before we receive such an event that we would want to wait for

Member Author

This drains all the events that happened up to and including our matched event. Events that arrive in the meantime stay in the list, so the next check can match against all of them.

break;
}
}
state.lastEvent = java.util.Optional.ofNullable(matched);
Contributor

The naming is not ideal: this is not the last event, it's the last event that matches the eventType.

Member Author

lastEvent belongs to the current test state: it is the last tracked event. We do not care about the type; this is only used to verify some event information.

Comment on lines +59 to +60
// properly calls detachEventProvider (nulls onEmit) and shuts down the emitter
// executor — neither of which happens when calling provider.shutdown() directly.
Contributor

Is this something that should happen when we call shutdown?

Member Author

The tests now run in parallel, so shutting down the API is a no-go, as it would also interfere with other tests.

Contributor

No, but I mean in general, not specifically this test case

Member Author

Maybe, but currently the goal is speeding up tests ;) When we set a new provider, we are actually cleaning up.

}

public static void shutdown() {
int remaining = refCount.decrementAndGet();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could it be possible that all current users call shutdown even though there are still outstanding users who have not called initialize yet? Then we would shut down the pool even though there are still tests lined up.

Member Author

This happens in @BeforeAll, once per test suite. Test suites are not run in parallel anyway; there is another improvement planned for this.

Contributor

If the test suites do not run in parallel, then we don't need this sync mechanism. If they do run in parallel, the scenario in my comment could (even though it is unlikely) occur.

Member Author

I briefly ran it in parallel, and the next iteration will add more flexibility so that all the tests actually run in parallel, but that is for the next follow-up PR. I did run all 3 test suites in parallel with some hacks, but it was not worth the effort.

"flagd.e2e.pool.size", Math.min(Runtime.getRuntime().availableProcessors(), 4));

private static final BlockingQueue<ContainerEntry> pool = new LinkedBlockingQueue<>();
private static final List<ContainerEntry> all = new ArrayList<>();
Contributor

I think this needs to be a concurrent data structure too, so that we guarantee that all changes to the list are also visible to another thread calling shutdown
