Add retention/pruning for the observations table (unbounded growth) #494

@ddeboer

Description

Summary

@lde/distribution-monitor appends to the observations table on every probe (every monitor × every poll cycle) and never deletes anything, so the table grows without bound.

Context

This surfaced while debugging the Network of Terms status DB: observations was already at ~2M rows (a few hundred MB) on a slow (HDD) volume. At the time it was compounding a separate problem — the latest_observations materialized view refresh scanned the whole table every cycle and timed out.

That refresh is now gone (latest is maintained by an upsert as of #493), so table size no longer affects status freshness or correctness. This issue is therefore pure hygiene, not a live incident:

  • Unbounded disk growth on a fixed-size PVC (the status DB volume is 5Gi).
  • Ever-larger backups / slower full-history queries over time.

Proposal

Add configurable retention to the monitor: periodically delete observations older than a cutoff, keeping observations a bounded time-series.

Sketch:

  • A retentionDays (or retentionInterval) config option; unset = keep forever (current behaviour, backwards compatible).
  • A periodic prune, e.g. DELETE FROM observations WHERE observed_at < now() - $retention, run on a cron alongside polling (or piggy-backed on the poll cycle).
  • latest_observations is unaffected — it holds one row per monitor independently of how much history is pruned.
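  • Consider batched deletes (a bounded number of rows per statement, repeated in a loop) so that a first prune of a large backlog doesn't hold locks in one long transaction or leave a large burst of dead rows, and so that it plays nicely with autovacuum. If the database is PostgreSQL, note that DELETE has no LIMIT clause, so each batch has to be selected in a subquery.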
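
The sketch above could look roughly like the following. This is a minimal illustration, not the package's actual API: `RetentionConfig`, `retentionCutoff`, `pruneInBatches` and the `Execute` callback are hypothetical names, the `batchSize` option is an addition to the proposal, and the SQL assumes PostgreSQL with each statement committed on its own.

```typescript
// Hypothetical names throughout; none of this is part of @lde/distribution-monitor.

interface RetentionConfig {
  /** Days of history to keep; undefined means keep forever (current behaviour). */
  retentionDays?: number;
  /** Maximum number of rows deleted per statement (assumed option). */
  batchSize: number;
}

/** Runs one parameterised statement and resolves to the number of affected rows. */
type Execute = (sql: string, params: unknown[]) => Promise<number>;

// Assumes PostgreSQL: DELETE has no LIMIT clause, so the batch is picked
// in a subquery on the system column ctid.
const PRUNE_BATCH_SQL = `
  DELETE FROM observations
  WHERE ctid IN (
    SELECT ctid FROM observations
    WHERE observed_at < $1
    LIMIT $2
  )`;

const MS_PER_DAY = 24 * 60 * 60 * 1000;

/** Returns the timestamp before which observations may be pruned. */
function retentionCutoff(now: Date, retentionDays: number): Date {
  return new Date(now.getTime() - retentionDays * MS_PER_DAY);
}

/** Deletes expired observations batch by batch; resolves to the total deleted. */
async function pruneInBatches(
  execute: Execute,
  config: RetentionConfig,
  now: Date = new Date(),
): Promise<number> {
  if (config.retentionDays === undefined) {
    return 0; // Retention unset: keep forever.
  }
  if (!Number.isInteger(config.batchSize) || config.batchSize < 1) {
    throw new Error("batchSize must be a positive integer");
  }
  const cutoff = retentionCutoff(now, config.retentionDays);
  let total = 0;
  for (;;) {
    const deleted = await execute(PRUNE_BATCH_SQL, [cutoff, config.batchSize]);
    total += deleted;
    // A short batch means no expired rows are left.
    if (deleted < config.batchSize) {
      return total;
    }
  }
}
```

Passing the database call in as a callback keeps the loop independent of any particular client library; a real implementation would wrap whatever query function the monitor already uses. The cutoff is computed once per prune run, so every batch in that run deletes against the same boundary.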

Open questions

  • Default retention when enabled (30 / 90 days?), and whether it should be opt-in or have a sane default.
  • Whether to expose it via the existing config schema in network-of-terms status as well.

Notes

Not urgent — flagged here so it isn't lost. Related work: #492 (boot index lock), #493 (latest-via-upsert).
