fix(tracking): S3 tracking server fails on first start with 'no such table: project' (#574) #828

Open
vaquarkhan wants to merge 2 commits into apache:main from vaquarkhan:issue-574-s3-tracking-bug
@vaquarkhan vaquarkhan commented Jun 30, 2026

Root Cause
Bug 1 — no such table: project:

On a fresh container/EKS deploy with no pre-existing snapshot in S3, the tracking server crashes immediately when the indexer runs:

sqlite3.OperationalError: no such table: project
RegisterTortoise(app, config=..., add_exception_handlers=True) does not generate schemas by default (generate_schemas defaults to False). On first start there is no snapshot DB to download, so the SQLite file is created empty (no tables). When sync_index() calls backend.update() → _update_projects() → Project.all(), it hits a nonexistent table.

Bug 2 — indexer silently drops old logs:

The break that enforces the max_paths batch cap in _gather_paths_to_update exited only the inner for loop over a single S3 page, not the outer paginator loop. The paginator therefore kept fetching additional pages and collected far more than max_paths files. The watermark then advanced to the last file in this oversized batch, so files that fell between position max_paths and the actual end of the batch were permanently skipped on subsequent cycles.

Fix
Bug 1: Call Tortoise.generate_schemas(safe=True) inside the lifespan after RegisterTortoise enters. safe=True uses CREATE TABLE IF NOT EXISTS, so it:

Creates tables on first start (no snapshot)
Is a no-op when tables already exist from a downloaded snapshot
Never clobbers existing data
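The first-start failure and the effect of CREATE TABLE IF NOT EXISTS can be reproduced with the standard-library sqlite3 module alone. The sketch below is not the tracking server's code: it bypasses Tortoise entirely, and the one-table schema and the helper names (count_projects, first_start_demo) are assumptions made for illustration.

```python
import sqlite3

# Hypothetical one-table schema standing in for the real Tortoise models.
SCHEMA = "CREATE TABLE IF NOT EXISTS project (id INTEGER PRIMARY KEY, name TEXT)"


def count_projects(conn: sqlite3.Connection) -> int:
    """Rough analogue of Project.all(): raises if the table is missing."""
    return conn.execute("SELECT COUNT(*) FROM project").fetchone()[0]


def first_start_demo() -> tuple[str, int]:
    # An empty database, as on a first start with no snapshot to download.
    conn = sqlite3.connect(":memory:")

    # Querying before any schema exists fails the way the indexer did.
    try:
        count_projects(conn)
        error = ""
    except sqlite3.OperationalError as exc:
        error = str(exc)

    # First call creates the table; pretend a snapshot then supplied a row.
    conn.execute(SCHEMA)
    conn.execute("INSERT INTO project (name) VALUES ('demo')")

    # Second call is a no-op because the table already exists.
    conn.execute(SCHEMA)
    rows = count_projects(conn)
    conn.close()
    return error, rows


first_error, rows_after = first_start_demo()
print(first_error)
print(rows_after)
```

The second execution of the schema statement leaves the inserted row in place, which is the property the fix relies on when a snapshot has already been downloaded.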
Bug 2: Add a cap_reached flag that breaks the outer paginator loop when the inner break fires. The batch is now truly capped at max_paths, and the watermark only advances to the last file in a correctly-sized batch.
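A minimal sketch of the nested-loop problem and the flag-based fix, using plain lists in place of S3 paginator pages. The function names, the page contents, and the exact placement of the cap check are assumptions; the real _gather_paths_to_update may be shaped differently.

```python
from typing import Iterable, List


def gather_inner_break_only(pages: Iterable[List[str]], max_paths: int) -> List[str]:
    """One plausible shape of the bug: break leaves only the inner loop."""
    paths: List[str] = []
    for page in pages:  # stands in for the S3 paginator
        for key in page:
            paths.append(key)
            if len(paths) >= max_paths:
                break  # exits this page only; the next page is still fetched
    return paths


def gather_with_cap_flag(pages: Iterable[List[str]], max_paths: int) -> List[str]:
    """Same loop with a cap_reached flag that also stops the outer loop."""
    paths: List[str] = []
    cap_reached = False
    for page in pages:
        for key in page:
            paths.append(key)
            if len(paths) >= max_paths:
                cap_reached = True
                break
        if cap_reached:
            break  # stop paginating once the batch is full
    return paths


# Hypothetical keys spread over three pages.
PAGES = [["a", "b", "c"], ["d", "e", "f"], ["g"]]

oversized = gather_inner_break_only(PAGES, max_paths=2)
capped = gather_with_cap_flag(PAGES, max_paths=2)
print(oversized)
print(capped)
```

In this sketch the first version returns more than max_paths keys and ends on a key from the last page, so a watermark taken from the final element would move past keys that were never collected. The flagged version stops at exactly max_paths keys, leaving the rest for the next cycle.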

Files Changed
backend.py: schema generation in lifespan (Bug 1); paginator break fix (Bug 2)
test_s3_backend_bug574.py: regression tests (moto-based)
Tests
test_generate_schemas_safe_true_creates_tables — verifies tables are created on empty DB
test_generate_schemas_safe_true_does_not_clobber_existing — verifies safe=True preserves snapshot data
test_gather_paths_respects_max_paths_cap — verifies batch cap works (requires moto)
test_watermark_advances_only_to_last_indexed_file — verifies watermark correctness (requires moto)
Bug 1 tests pass locally. Bug 2 tests require moto (marked skip if not installed; will run in CI).

Fixes #574

…napshot (apache#574)

Bug 1: On a fresh container/EKS deploy with no pre-existing snapshot,
the S3 tracking server crashed with 'no such table: project' because
RegisterTortoise does not generate schemas by default.

Fix: Call Tortoise.generate_schemas(safe=True) after RegisterTortoise
enters the context. safe=True uses CREATE TABLE IF NOT EXISTS, so it
is a no-op when tables already exist from a downloaded snapshot.

Bug 2: The max_paths batch cap in _gather_paths_to_update had a broken
break that only exited the inner for-loop, not the outer paginator
loop. This caused unbounded file collection, advancing the watermark
past files that should have been indexed in subsequent cycles.

Fix: Add a cap_reached flag that breaks the outer paginator loop when
the inner break fires.

Tests: moto-based regression tests for both bugs (schema creation
without snapshot, schema coexistence with snapshot, batch cap
enforcement, watermark boundary correctness).
github-actions bot added the area/storage (Persisters, state storage) and area/tracking (Telemetry, tracing, OpenTelemetry) labels Jun 30, 2026
…tibility

Use moto test credentials in the mock_s3 fixture so aiobotocore can connect through the mocked S3 client during regression tests.

Development

Successfully merging this pull request may close these issues.

Burr S3 Tracking - Deployment and Docker Image