Problem
The ingest_bus_events() function uses MAX(event_id) from the bus_events table as a high-water mark. On the first run, when the table is empty, it applies a timestamp-based cutoff (days=7 by default), ingests only recent events, and the high-water mark becomes the latest event ID. All subsequent runs, even with days=365, only look for events with id > high_water_mark, so events older than the first run's cutoff are never picked up.
This was discovered immediately after merging #107: startup ingestion with days=7 ingested 109 of 2,696 events, and a follow-up days=365 call found nothing new.
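The behaviour can be reproduced in isolation. The following is a minimal sketch, not the actual implementation: ingest_buggy(), the two-column schema, and the in-memory source list are hypothetical stand-ins, assuming ingest_bus_events() only falls back to the days cutoff when the table is empty.

```python
import sqlite3
from datetime import datetime, timedelta


def ingest_buggy(conn, source_events, days, now):
    """Hypothetical stand-in for ingest_bus_events(); returns rows ingested."""
    (hwm,) = conn.execute("SELECT MAX(event_id) FROM bus_events").fetchone()
    if hwm is None:
        # First run: table is empty, so only the timestamp cutoff applies.
        cutoff = now - timedelta(days=days)
        batch = [e for e in source_events if e[1] >= cutoff]
    else:
        # Later runs: the days argument is never consulted.
        batch = [e for e in source_events if e[0] > hwm]
    conn.executemany(
        "INSERT INTO bus_events (event_id, ts) VALUES (?, ?)",
        [(eid, ts.isoformat()) for eid, ts in batch],
    )
    conn.commit()
    return len(batch)


now = datetime(2024, 1, 31)
# 30 events; event i is (30 - i) days old, so IDs increase with time.
source = [(i, now - timedelta(days=30 - i)) for i in range(1, 31)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bus_events (event_id INTEGER PRIMARY KEY, ts TEXT)")

first = ingest_buggy(conn, source, days=7, now=now)     # only the recent events
second = ingest_buggy(conn, source, days=365, now=now)  # nothing new is found
print(first, second)
```

The second call returns 0 even though 22 older events are still missing, which mirrors the 109-of-2,696 result above.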
Workaround
Clear bus_events and re-ingest:
storage.execute_write('DELETE FROM bus_events')
ingest_bus_events(storage, days=365)
Fix options
- Use a larger startup default: change startup ingestion to days=365 so the first run captures the full history
- Separate first-run logic: when bus_events is empty, ignore the days parameter and ingest everything
- Always use timestamp cutoff: don't use the high-water mark at all; rely on INSERT OR IGNORE for dedup (slower but correct)
Option 2 seems cleanest: if the table is empty, ingest everything regardless of days.
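A rough sketch of what option 2 could look like, under the same caveats as any isolated example: ingest_first_run_full(), the schema, and the in-memory source list are hypothetical, and the real function presumably reads from the bus rather than a Python list. The days parameter is kept in the signature only for compatibility with existing callers.

```python
import sqlite3
from datetime import datetime, timedelta


def ingest_first_run_full(conn, source_events, days, now):
    """Hypothetical option-2 variant; returns the number of rows ingested."""
    (hwm,) = conn.execute("SELECT MAX(event_id) FROM bus_events").fetchone()
    if hwm is None:
        # Empty table: ignore days and take the full history.
        batch = list(source_events)
    else:
        # Incremental run: the high-water mark is now safe to rely on,
        # because everything at or below it has already been ingested.
        batch = [e for e in source_events if e[0] > hwm]
    conn.executemany(
        "INSERT OR IGNORE INTO bus_events (event_id, ts) VALUES (?, ?)",
        [(eid, ts.isoformat()) for eid, ts in batch],
    )
    conn.commit()
    return len(batch)


now = datetime(2024, 1, 31)
source = [(i, now - timedelta(days=30 - i)) for i in range(1, 31)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bus_events (event_id INTEGER PRIMARY KEY, ts TEXT)")

first = ingest_first_run_full(conn, source, days=7, now=now)   # full history
source.append((31, now))                                       # a new event arrives
later = ingest_first_run_full(conn, source, days=7, now=now)   # only the new one
print(first, later)
```

One consequence of this design is that days no longer affects any run, so it may be worth deciding whether the parameter should be removed or documented as a no-op.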