File-backed repositories via .store files #258

@KrisSimon

Problem

Repositories in ARO are in-memory only — all data is lost when the application stops. This creates three gaps:

  1. No declarative seeding — Populating a repository requires imperative Store statements in Application-Start, which mixes infrastructure concerns with startup logic.
  2. No persistence — Long-running apps (file watchers, servers) lose accumulated state on restart.
  3. No structured secrets/config — Environment variables are accessed individually via <env: "KEY"> with no way to group related config that maps naturally to a repository.

Proposal

Introduce .store files — YAML files that automatically back a repository of the same name, with developer-controlled read-only or writable mode.

Convention

MyApp/
├── main.aro
├── users.store          →  backs "users-repository"
├── settings.store       →  backs "settings-repository"
└── secrets.store        →  backs "secrets-repository"  (gitignored)

Naming rule: <name>.store backs <name>-repository. The filename IS the repository name (no pluralization magic).
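
The naming rule is mechanical enough to sketch in a few lines. The following is illustrative Python, not the ARO runtime's own code; `repository_name` and `discover_stores` are hypothetical helper names, and scanning only the top level of the app directory is an assumption.

```python
from pathlib import Path


def repository_name(store_path: str) -> str:
    """Map '<name>.store' to '<name>-repository' with no pluralization."""
    path = Path(store_path)
    if path.suffix != ".store":
        raise ValueError(f"not a .store file: {store_path}")
    return f"{path.stem}-repository"


def discover_stores(app_dir: str) -> dict[str, Path]:
    """Return {repository name: file path} for each *.store file in app_dir.

    Assumption: only the top level of the directory is scanned.
    """
    return {
        repository_name(p.name): p
        for p in sorted(Path(app_dir).glob("*.store"))
    }
```

With the layout above, `discover_stores("MyApp")` would map `users-repository`, `settings-repository`, and `secrets-repository` to their files and ignore `main.aro`.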

File format

.store files use YAML (consistent with aro.yaml and plugin.yaml).

Simple format — a plain list is read-only seed data (the common case stays simple):

# users.store — read-only (default)
- id: admin
  name: Admin User
  role: admin
- id: guest
  name: Guest User
  role: viewer

Extended format — an object with a mode header enables write-back:

# sessions.store — writable, persisted on shutdown
mode: writable
flush: on-shutdown
entries:
  - id: sess-001
    user: admin
    created: 2026-03-15T10:00:00Z

Mode and flush options

| Field | Values | Default | Description |
| --- | --- | --- | --- |
| mode | readonly, writable | readonly | Whether runtime mutations persist back to the file |
| flush | on-shutdown, on-change | on-shutdown | When writable changes are written to disk |

readonly (default) — The file is seed data. Runtime Store/Delete modify the in-memory repository but never touch the file. This is the right choice for reference data, config, and secrets.

writable + on-shutdown — Changes accumulate in memory and are flushed to the .store file once during graceful shutdown (Application-End / SIGINT / SIGTERM). If the process crashes, changes since last startup are lost. This is the right choice for most writable stores — simple, no I/O during request handling.

writable + on-change — Changes are flushed to disk after each mutation, debounced to 1 second (multiple rapid changes within 1s collapse into a single write). Higher durability at the cost of disk I/O. Use for data you cannot afford to lose on crash.
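
To make the two formats and their defaults concrete, here is a minimal Python sketch of how an already-parsed YAML document could be interpreted. `StoreFile` and `interpret_store` are hypothetical names; treating an empty file as an empty read-only store and rejecting unknown values with an error are assumptions the proposal does not spell out.

```python
from dataclasses import dataclass, field

MODES = {"readonly", "writable"}
FLUSHES = {"on-shutdown", "on-change"}


@dataclass
class StoreFile:
    mode: str = "readonly"        # default from the table above
    flush: str = "on-shutdown"    # default from the table above
    entries: list = field(default_factory=list)


def interpret_store(document) -> StoreFile:
    """Interpret the value a YAML parser returned for a .store file."""
    if document is None:
        # Assumption: an empty file is an empty read-only store.
        return StoreFile()
    if isinstance(document, list):
        # Simple format: a plain list is read-only seed data.
        return StoreFile(entries=list(document))
    if isinstance(document, dict):
        # Extended format: header fields plus an entries list.
        mode = document.get("mode", "readonly")
        flush = document.get("flush", "on-shutdown")
        if mode not in MODES:
            raise ValueError(f"unknown mode: {mode!r}")
        if flush not in FLUSHES:
            raise ValueError(f"unknown flush: {flush!r}")
        entries = document.get("entries") or []
        if not isinstance(entries, list):
            raise ValueError("entries must be a list")
        return StoreFile(mode, flush, list(entries))
    raise ValueError("a .store file must contain a list or an object")
```

Keeping the parser out of the sketch means any YAML library can feed it; only the list-versus-object distinction matters here.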

Write-back mechanics

Atomic writes — All flushes write to <name>.store.tmp first, then rename() over the original. On POSIX this is atomic — no partial writes, no corruption.
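
A minimal Python sketch of the write-then-rename pattern described above. `atomic_write` is a hypothetical helper; the `fsync` call is an extra durability step assumed here and is not part of the proposal.

```python
import os


def atomic_write(path: str, text: str) -> None:
    """Write text to '<path>.tmp', then rename it over path."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as handle:
        handle.write(text)
        handle.flush()
        # Assumption: force the bytes to disk before the rename.
        os.fsync(handle.fileno())
    # os.replace maps to rename() on POSIX, so readers see either the
    # old file or the new one, never a partially written file.
    os.replace(tmp_path, path)
```

Because the temporary file lives in the same directory as the target, the rename stays on one filesystem, which is what makes it atomic on POSIX.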

Serialization — Writable stores always use the extended format on write-back (the mode/flush/entries structure). A plain-list file upgraded to writable will be rewritten in extended format on first flush. YAML comments in the original file are not preserved after write-back (documented trade-off — writable stores are data, not hand-edited config).

Debounce (on-change) — After a Store or Delete, a 1-second timer starts. If another mutation arrives within that window, the timer resets. When the timer fires, the full repository state is serialized and written atomically. Only one write can be in-flight at a time.
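
The reset-on-mutation behavior can be sketched without real timers. In this illustrative Python, `Debouncer` is a hypothetical class and time is passed in explicitly so the logic is deterministic; a real runtime would presumably use a timer, and the one-write-in-flight rule is not modeled.

```python
class Debouncer:
    """Tracks when a debounced flush becomes due."""

    def __init__(self, window: float = 1.0):
        self.window = window    # debounce window in seconds
        self.deadline = None    # None means no flush is pending

    def mutation(self, now: float) -> None:
        """Record a Store or Delete; this starts or resets the timer."""
        self.deadline = now + self.window

    def due(self, now: float) -> bool:
        """Return True once when the window passed with no new mutation."""
        if self.deadline is not None and now >= self.deadline:
            self.deadline = None
            return True
        return False
```

Three mutations at t = 0.0, 0.4, and 0.8 produce a single flush roughly one second after the last of them, not after the first. A steady stream of mutations spaced under one second apart therefore keeps postponing the flush.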

Shutdown flush — On graceful shutdown, all writable stores flush regardless of their flush setting. The flush happens before Application-End: Success executes, so the end handler can rely on stores being persisted.
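
The ordering guarantee can be sketched as follows. This is illustrative Python with hypothetical names (`graceful_shutdown`, `flush`, `end_handler`); it only shows that every writable store is flushed, whatever its flush setting, before the end handler runs.

```python
def graceful_shutdown(stores, flush, end_handler):
    """Flush writable stores, then run the end handler.

    stores maps repository name -> mode ("readonly" or "writable").
    flush and end_handler are callables supplied by the runtime
    (hypothetical interfaces for this sketch).
    """
    flushed = []
    for name, mode in stores.items():
        if mode == "writable":
            flush(name)          # on-shutdown and on-change stores alike
            flushed.append(name)
    end_handler()                # runs only after all flushes completed
    return flushed
```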

Crash behavior:

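| Flush mode | Crash behavior |
| --- | --- |
| on-shutdown | All changes since startup are lost |
| on-change | Changes since the last completed flush are lost: about 1 second's worth when mutations are sparse, potentially more during a sustained burst, because each mutation resets the debounce timer |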

Behavior summary

| Aspect | readonly | writable |
| --- | --- | --- |
| Load timing | Before Application-Start | Before Application-Start |
| Read access | Normal Retrieve | Normal Retrieve |
| Store/Delete | In-memory only | In-memory + persisted to file |
| File on disk | Never modified | Updated on flush |
| Observers | Fire normally | Fire normally |
| Missing file | Repository starts empty | Repository starts empty; file created on first flush |
| aro build | Embedded as read-only resource | Error at build time: writable stores cannot be embedded in binaries |

Usage in ARO

(Application-Start: My App) {
    (* users-repository is pre-populated from users.store (read-only) *)
    Retrieve the <admins> from the <users-repository> where role = "admin".
    Log <admins: count> to the <console>.

    (* sessions-repository is loaded + will persist back (writable) *)
    Retrieve the <sessions> from the <sessions-repository>.
    Log <sessions: count> to the <console>.

    Start the <http-server> with <contract>.
    Keepalive the <application> for the <events>.
    Return an <OK: status> for the <startup>.
}

(createSession: Session API) {
    Extract the <data> from the <request: body>.
    Store the <session> with <data> into the <sessions-repository>.
    (* ↑ This Store will be persisted to sessions.store *)
    Return a <Created: status> with <session>.
}

No new ARO syntax is required. The developer controls persistence entirely through the .store file header.

Design decisions

Why is read-only the default?
Most store files are reference data or config. Write-back should be an opt-in decision because it changes the semantics of the file from "source of truth I edit" to "runtime-managed data I don't hand-edit".

Why debounce instead of write-per-mutation?
A burst of Store operations (e.g., seeding 100 items in a loop) would cause 100 disk writes. Debouncing collapses them into one. The 1-second window balances durability against I/O cost.

Why not WAL / append-only log?
A write-ahead log would give better crash recovery, but adds significant complexity (log compaction, replay logic). .store files are meant to be simple and human-readable. Apps needing database-grade durability should use a database plugin (e.g., SQLite).

Why not .env?
.env files are flat key-value pairs. .store files hold structured data (lists of objects) that map directly to repository semantics. Environment variables via <env> remain for actual environment config.

Why YAML, not JSON?
Consistency with existing ARO config files (aro.yaml, plugin.yaml). YAML supports comments, which is useful for documenting read-only store files (comments are preserved in read-only stores since the file is never rewritten).

Implementation outline

  1. DiscoveryApplicationLoader scans for *.store alongside *.aro files
  2. Parsing — Detect format (plain list → readonly; object with mode → parse header). Validate each entry has an id field (or auto-generate UUID).
  3. Seeding — Call RepositoryStorage.store() for each entry before Application-Start fires
  4. Events — Emit RepositoryChangedEvent for each seeded entry (observers work as expected)
  5. Write-back registration — For writable stores, register a StoreFlushService that:
    • Subscribes to RepositoryChangedEvent for the backing repository
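    • With flush: on-change: starts or resets a 1-second debounce timer on each event, then performs an atomic write when the timer fires
    • With either flush setting: registers with the shutdown handler to flush before Application-End, since every writable store flushes on graceful shutdown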
  6. Atomic write: Serialize repository state to YAML → write <name>.store.tmp → rename() to <name>.store
  7. Build validation: aro build rejects writable stores with a clear error message
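
Steps 2 to 4 and step 7 can be sketched together. This is illustrative Python with hypothetical names (`seed_repository`, `check_embeddable`); the `store` and `emit` callables stand in for RepositoryStorage.store() and the RepositoryChangedEvent emission, and the sketch takes the auto-generate route for entries without an id.

```python
import uuid


def seed_repository(entries, store, emit):
    """Store each entry and emit one change event per seeded entry."""
    for entry in entries:
        if not isinstance(entry, dict):
            raise ValueError(f"store entry must be an object: {entry!r}")
        record = dict(entry)     # copy so the parsed input is not mutated
        if "id" not in record:
            # Assumption: a missing id gets a generated UUID.
            record["id"] = str(uuid.uuid4())
        store(record)            # stands in for RepositoryStorage.store()
        emit(record)             # stands in for RepositoryChangedEvent


def check_embeddable(name, mode):
    """Build-time check: only read-only stores may be embedded."""
    if mode == "writable":
        raise ValueError(
            f"{name}.store is writable and cannot be embedded in a binary"
        )
```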

Out of scope (future work)

  • Hot-reload on external file change (file watcher for .store files)
  • Encrypted .store files for secrets
  • Conflict resolution for concurrent file access across processes
  • Custom flush intervals (currently fixed at 1s debounce)
