From 86b123f5c49c6f9f50aeb55cdcce1c3d7383dfb0 Mon Sep 17 00:00:00 2001
From: ancplua
Date: Sat, 16 May 2026 08:36:19 +0200
Subject: [PATCH 1/4] fix(tests): set xpack.security.http.ssl.enabled=false so Elasticsearch wait uses HTTP
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Testcontainers.Elasticsearch 4.11's built-in wait strategy probes ES via
`new HttpWaitStrategy().UsingTls(configuration.TlsEnabled)`. TlsEnabled
evaluates to false only when two env vars, `xpack.security.enabled` and
`xpack.security.http.ssl.enabled`, are both explicitly "false". The
fixture only set the first, so TlsEnabled returned true, the wait probed
HTTPS, ES (security disabled) only answered plain HTTP, and the probe was
never satisfied: every PaperlessServices integration test failed with
`System.TimeoutException` after Testcontainers' 1-hour default.

Adding the second env var makes TlsEnabled return false, the wait probes
HTTP, and the fixture completes in seconds.

Verified locally on ES 9.4.1 / MinIO 2025-09-07 / RabbitMQ 4.3.0 with
Testcontainers 4.11: 13/13 integration tests pass in 57s (vs. hanging
forever before).

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 PaperlessServices.Tests/Integration/WorkerTestBase.cs | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/PaperlessServices.Tests/Integration/WorkerTestBase.cs b/PaperlessServices.Tests/Integration/WorkerTestBase.cs
index 2efc966..65da9bd 100644
--- a/PaperlessServices.Tests/Integration/WorkerTestBase.cs
+++ b/PaperlessServices.Tests/Integration/WorkerTestBase.cs
@@ -32,6 +32,10 @@ public class SharedContainerFixture : IAsyncLifetime
                 Environment.GetEnvironmentVariable("ELASTIC_IMAGE") ?? DefaultElasticsearchImage)
             .WithEnvironment("discovery.type", "single-node")
             .WithEnvironment("xpack.security.enabled", "false")
+            // Required so Testcontainers' ElasticsearchConfiguration.TlsEnabled evaluates to false
+            // (it AND-s xpack.security.enabled with xpack.security.http.ssl.enabled).
Without this,
+            // the built-in wait strategy probes HTTPS while ES listens on plain HTTP, and hangs.
+            .WithEnvironment("xpack.security.http.ssl.enabled", "false")
             .WithEnvironment("ES_JAVA_OPTS", "-Xms512m -Xmx512m")
             .Build();

From be2e2169b5c59799103a34f796c7dc8f3f494c15 Mon Sep 17 00:00:00 2001
From: ancplua
Date: Sat, 16 May 2026 08:58:21 +0200
Subject: [PATCH 2/4] fix(tests): use conservative parallel algorithm so SharedContainer collection serializes
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

xUnit v3's `parallelAlgorithm: aggressive` submits every test to the
parallel thread pool simultaneously, ignoring collection boundaries: even
tests sharing `[Collection(SharedContainerCollection.Name)]` end up
running in parallel against the same Elasticsearch instance.

Result: OcrIntegrationTests.ProcessMultipleDocuments_Concurrently
spam-indexes documents while
SearchIndexIntegrationTests.MultipleDocuments_SearchCorrectly is polling
for its own write to become visible. On a slow CI disk, the write/refresh
cycle stalls and the poll's 10s cap is reached; the test fails
deterministically with TaskCanceledException at exactly 10s 047ms
(visible in run 25955174937).

Conservative scheduling keeps tests in the SharedContainer collection
sequential, eliminating the cross-test ES write contention. Local 13/13
integration tests now run in 37s (down from 57s under aggressive;
contention is lower even when nothing fails).

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 PaperlessServices.Tests/xunit.runner.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/PaperlessServices.Tests/xunit.runner.json b/PaperlessServices.Tests/xunit.runner.json
index 5c0209e..0bc55b4 100644
--- a/PaperlessServices.Tests/xunit.runner.json
+++ b/PaperlessServices.Tests/xunit.runner.json
@@ -4,7 +4,7 @@
   "methodDisplay": "classAndMethod",
   "methodDisplayOptions": "replaceUnderscoreWithSpace,useOperatorMonikers",
   "parallelizeTestCollections": true,
-  "parallelAlgorithm": "aggressive",
+  "parallelAlgorithm": "conservative",
   "maxParallelThreads": "4x",
   "stopOnFail": false,
   "failSkips": false,

From d2a65ee804e92da43cb3d05dd362f280aec82229 Mon Sep 17 00:00:00 2001
From: ancplua
Date: Sat, 16 May 2026 09:08:12 +0200
Subject: [PATCH 3/4] fix(tests): give WaitForSearchResultsAsync a 30s budget and survive a slow single call
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The polling helper used a single 10s linked token across every
SearchAsync call. On GitHub-hosted runners the first search after index
creation can spend several seconds priming Lucene query caches even after
the preceding Refresh.True has returned, eating the budget and surfacing
as the deterministic 10s 047ms TaskCanceledException at line 90 of
SearchIndexIntegrationTests.MultipleDocuments_SearchCorrectly.

Two changes:

- bump the default overall timeout from 10s to 30s. CI runners are slower
  than local dev machines and 10s was set when the suite was running on
  beefier hardware. 30s is still tight for a happy-path search.
- catch OperationCanceledException inside the loop and exit cleanly so
  the caller's final attempt (with the caller's own token) decides
  pass/fail. Previously a cancelled poll surfaced as
  TaskCanceledException at the helper boundary, hiding whether the doc
  was actually missing or just not yet visible.

Local 5/5 SearchIndexIntegrationTests pass in 41s on ES 9.4.1.
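
For reference, the budget-then-final-attempt pattern described in this
message can be sketched independently of Elasticsearch. This is a minimal
illustration under stated assumptions, not the code in this patch:
`Polling`, `PollUntilAsync`, `probe`, and `isDone` are hypothetical names
introduced only for the example.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of the patch): polls `probe` until `isDone`
// accepts a result or the overall budget elapses.
public static class Polling
{
    public static async Task<T> PollUntilAsync<T>(
        Func<CancellationToken, Task<T>> probe,
        Func<T, bool> isDone,
        TimeSpan timeout,
        TimeSpan pollInterval,
        CancellationToken cancellationToken = default)
    {
        // One overall budget, linked to the caller's token so that either
        // the timeout or the caller can stop the polling loop.
        using CancellationTokenSource budget = new(timeout);
        using CancellationTokenSource linked =
            CancellationTokenSource.CreateLinkedTokenSource(budget.Token, cancellationToken);

        while (!linked.Token.IsCancellationRequested)
        {
            try
            {
                T result = await probe(linked.Token);
                if (isDone(result))
                {
                    return result;
                }

                await Task.Delay(pollInterval, linked.Token);
            }
            catch (OperationCanceledException) when (linked.Token.IsCancellationRequested)
            {
                // Budget spent or caller cancelled: stop polling instead of
                // letting the cancellation escape from the helper.
                break;
            }
        }

        // Final attempt under the caller's token only, so the caller sees the
        // probe's real result rather than a timeout-induced cancellation.
        return await probe(cancellationToken);
    }
}
```

If the caller's own token is already cancelled, the final attempt is
expected to throw, provided `probe` honours the token it receives. That
is the intended split: the timeout only ends the polling, while the
caller's token still decides whether the last probe runs at all.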

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .../Integration/WorkerTestBase.cs | 38 ++++++++++++++-----
 1 file changed, 28 insertions(+), 10 deletions(-)

diff --git a/PaperlessServices.Tests/Integration/WorkerTestBase.cs b/PaperlessServices.Tests/Integration/WorkerTestBase.cs
index 65da9bd..2768f7d 100644
--- a/PaperlessServices.Tests/Integration/WorkerTestBase.cs
+++ b/PaperlessServices.Tests/Integration/WorkerTestBase.cs
@@ -192,27 +192,45 @@ public async Task> WaitForSearchResultsAsync(
         TimeSpan? timeout = null,
         TimeSpan? pollInterval = null)
     {
-        timeout ??= TimeSpan.FromSeconds(10);
+        // 30s overall budget: GitHub-hosted runners are markedly slower than local
+        // dev machines and the first SearchAsync after index creation can spend
+        // several seconds priming query caches even after Refresh.True returns.
+        timeout ??= TimeSpan.FromSeconds(30);
         pollInterval ??= TimeSpan.FromMilliseconds(100);

         ElasticsearchClient client = Services.GetRequiredService();
-        using CancellationTokenSource cts = new(timeout.Value);
-        using CancellationTokenSource linked =
-            CancellationTokenSource.CreateLinkedTokenSource(cts.Token, cancellationToken);
+        using CancellationTokenSource overallCts = new(timeout.Value);
+        using CancellationTokenSource overallLinked =
+            CancellationTokenSource.CreateLinkedTokenSource(overallCts.Token, cancellationToken);

-        while (!linked.Token.IsCancellationRequested)
+        while (!overallLinked.Token.IsCancellationRequested)
         {
-            SearchResponse response = await client.SearchAsync(configureSearch, linked.Token);
+            try
+            {
+                SearchResponse response = await client.SearchAsync(configureSearch, overallLinked.Token);

-            if (response.Documents.Count > 0)
+                if (response.Documents.Count > 0)
+                {
+                    return response;
+                }
+            }
+            catch (OperationCanceledException) when (overallLinked.Token.IsCancellationRequested)
             {
-                return response;
+                break;
             }

-            await Task.Delay(pollInterval.Value, linked.Token);
+            try
+            {
+                await Task.Delay(pollInterval.Value, overallLinked.Token);
+            }
+            catch (OperationCanceledException)
+            {
+                break;
+            }
         }

-        // Final attempt
+        // Final attempt with the caller's token only so the assertion sees real
+        // "found nothing" data rather than a TaskCanceledException at the wait boundary.
         return await client.SearchAsync(configureSearch, cancellationToken);
     }

From 6ae05bc03109661051bbece5af8af555b35f507e Mon Sep 17 00:00:00 2001
From: ancplua
Date: Sat, 16 May 2026 09:18:23 +0200
Subject: [PATCH 4/4] fix(tests): explicitly refresh the index before WaitForSearchResultsAsync polls
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

SearchIndexService writes documents with Refresh.True (`?refresh=true`),
which the Elastic 9.x docs describe as forcing a refresh of the affected
shards before the index call returns. In practice, on GitHub-hosted
ubuntu-latest runners the per-document refresh has been observed to not
fully propagate before the first SearchAsync hits the index: the doc is
visible to a real-time GET (used by WaitForDocumentAsync, which passes on
CI) but invisible to _search (used by WaitForSearchResultsAsync, which
fails).

Calling client.Indices.RefreshAsync at the start of the polling helper
forces an explicit index-level refresh. It's idempotent: on local dev
machines where the per-document refresh already settled, this is a fast
no-op; on a slow CI runner it converts a deterministic empty-result flake
(MultipleDocuments_SearchCorrectly was failing at exactly 10s 047ms with
TaskCanceledException, then post-timeout-bump at 30s with "collection is
empty") into a passing search.

Cancellation during the refresh is caught and ignored; the polling loop
below is the actual correctness boundary, and the refresh is best-effort
warmup.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .../Integration/WorkerTestBase.cs | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/PaperlessServices.Tests/Integration/WorkerTestBase.cs b/PaperlessServices.Tests/Integration/WorkerTestBase.cs
index 2768f7d..b4d2ddd 100644
--- a/PaperlessServices.Tests/Integration/WorkerTestBase.cs
+++ b/PaperlessServices.Tests/Integration/WorkerTestBase.cs
@@ -203,6 +203,24 @@ public async Task> WaitForSearchResultsAsync(
         using CancellationTokenSource overallLinked =
             CancellationTokenSource.CreateLinkedTokenSource(overallCts.Token, cancellationToken);

+        // Force an index-level refresh up front. SearchIndexService writes documents
+        // with Refresh.True (`?refresh=true`), which is supposed to guarantee
+        // immediate searchability, but on slow CI disks the per-document refresh
+        // is observed to not always propagate before the first SearchAsync. The
+        // explicit Indices.RefreshAsync here is defensive and idempotent: locally
+        // it's a no-op (everything's already refreshed), on CI it converts an
+        // invisible flake into a passing search.
+        try
+        {
+            await client.Indices.RefreshAsync(
+                r => r.Indices(client.ElasticsearchClientSettings.DefaultIndex),
+                overallLinked.Token);
+        }
+        catch (OperationCanceledException) when (overallLinked.Token.IsCancellationRequested)
+        {
+            // Fall through to the final attempt below.
+        }
+
         while (!overallLinked.Token.IsCancellationRequested)
         {
             try