RFC: Complete event-bus ingestion integration (MCP tools, background ingest, multi-machine analysis) #106

Description

@evansenter

Summary

The event-bus ingestion plumbing from issue #54 was built but never fully connected. The bus_events table, bus_ingest.py, query_bus_events(), and the CLI bus-events command all exist, but bus events are not accessible through MCP and are not automatically ingested by the server's background loop. This RFC covers the remaining work to make event-bus data a first-class queryable data source.

Problem / Motivation

Sessions currently cannot query cross-session knowledge events (gotchas, patterns, improvement suggestions) through MCP. The get_insights() tool already reads from bus_events and reports has_bus_events: true/false, but that table is never populated by the server — only the CLI's bus-events command manually calls ingest_bus_events() before querying. This means:

  1. The ~54 knowledge events (gotcha_discovered, pattern_found, improvement_suggested) spanning Dec 2025 - Feb 2026 are invisible to MCP clients.
  2. The /improve-workflow skill cannot surface cross-session learnings because get_insights() always finds an empty bus_events table.
  3. The background ingestion loop (5-min timer in server.py) only calls ingest.ingest_logs(), missing ingest_bus_events() entirely.

Context

Proposed Solution

Phase 1: Server-side integration (trivial, low risk)

Add ingest_bus_events() to the existing background ingestion loop in server.py. Both the event-bus DB (~/.claude/contrib/agent-event-bus/data.db) and the analytics DB are co-located on speck-vm, so the existing bus_ingest.py works as-is.

Changes to server.py:

  • Import bus_ingest module
  • Add ingest_bus_events(storage) call in server_lifespan() (startup)
  • Add ingest_bus_events(storage) call in _periodic_ingest() (5-min loop)
  • Wrap in try/except like the existing ingest_logs() call so bus ingestion failures don't block session ingestion
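As a rough illustration of the Phase 1 changes, the sketch below shows one way to run both ingestion steps in a periodic loop while isolating failures from each other. It does not reproduce server.py: periodic_ingest, run_ingest_step, the steps argument, and max_cycles are hypothetical names, and the real _periodic_ingest() may be structured differently.

```python
import asyncio
import logging

logger = logging.getLogger("ingest-sketch")


def run_ingest_step(name, func, *args):
    """Run one ingestion step; log and swallow failures so later steps still run."""
    try:
        return func(*args)
    except Exception:
        logger.exception("%s failed; continuing with remaining steps", name)
        return None


async def periodic_ingest(steps, storage, interval_seconds=300, max_cycles=None):
    """Run every (name, func) step once per cycle, isolating failures per step.

    max_cycles is a hypothetical knob that makes the loop testable;
    None means run until cancelled.
    """
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        for name, func in steps:
            # to_thread keeps blocking SQLite work off the event loop
            await asyncio.to_thread(run_ingest_step, name, func, storage)
        cycles += 1
        if max_cycles is None or cycles < max_cycles:
            await asyncio.sleep(interval_seconds)
    return cycles
```

In this shape, the server would pass something like [("ingest_logs", ingest.ingest_logs), ("ingest_bus_events", bus_ingest.ingest_bus_events)] as steps, assuming both functions accept the storage object as their only required argument.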

Phase 2: MCP tools

Expose two new MCP tools:

get_bus_events(days, event_type, repo, session_id, limit) — Query tool wrapping query_bus_events(). This is the primary tool sessions will use to discover cross-session knowledge.

ingest_bus_events(days) — Optional trigger tool wrapping bus_ingest.ingest_bus_events(). Useful for forcing immediate ingestion rather than waiting for the 5-min background cycle. Low priority since background ingestion handles the common case.

Both need:

  • MCP tool registration in server.py
  • CLI formatter for bus_events output (currently falls through to JSON)
  • Entry in guide.md
  • Entry in cmd_benchmark() tool list
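To make the get_bus_events parameters concrete, here is a minimal sketch of the filtering logic such a tool could apply. It is written against plain sqlite3 rather than the project's storage layer or any MCP framework, and the bus_events column names (event_type, repo, session_id, timestamp, payload), the ISO-8601 timestamp format, the default values, and the 500-row cap are all assumptions, not the actual schema or the behavior of query_bus_events().

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def get_bus_events(conn, days=7, event_type=None, repo=None, session_id=None, limit=50):
    """Return recent bus events as dicts, newest first, with optional filters."""
    cutoff = (datetime.now(timezone.utc) - timedelta(days=days)).isoformat()
    clauses, params = ["timestamp >= ?"], [cutoff]
    # Column names come from this fixed tuple, never from user input.
    for column, value in (("event_type", event_type), ("repo", repo), ("session_id", session_id)):
        if value is not None:
            clauses.append(f"{column} = ?")
            params.append(value)
    sql = (
        "SELECT event_type, repo, session_id, timestamp, payload FROM bus_events "
        f"WHERE {' AND '.join(clauses)} ORDER BY timestamp DESC LIMIT ?"
    )
    # Clamp limit to a sane range (the cap of 500 is an arbitrary assumption).
    params.append(max(1, min(int(limit), 500)))
    conn.row_factory = sqlite3.Row
    rows = conn.execute(sql, params).fetchall()
    return {"count": len(rows), "events": [dict(row) for row in rows]}
```

An empty table or a filter that matches nothing yields {"count": 0, "events": []} rather than an error, which lines up with the empty-table edge case listed under Test Requirements.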

Phase 3: Improve-workflow integration

Once bus events are queryable, update the /improve-workflow skill (in dotfiles, not this repo) to call get_bus_events(event_type="gotcha_discovered") and get_bus_events(event_type="pattern_found") to surface cross-session learnings alongside the existing get_insights() data.

Phase 4 (Deferred): Client push support for bus events

Analysis of current topology:

  • speck-vm (GCP): Runs both agent-session-analytics server AND agent-event-bus server. Both SQLite databases are local. Direct file read works.
  • Evans-Personal-Pro (Mac): Connects to both servers via Tailscale MCP. It has no local event-bus DB, but it does not run the analytics server either; it pushes session data to speck-vm via upload_entries/finalize_sync.

Conclusion: Client push support is NOT needed for the current topology. The event-bus DB lives on the same machine as the analytics server, so bus_ingest.py's direct SQLite read works. The Mac never needs to push bus events because the bus server IS on speck-vm.

When push support would be needed: If someone runs agent-event-bus on a different machine than agent-session-analytics, they would need an upload_bus_events() MCP endpoint (analogous to upload_entries()). This can be built when the need arises, following the established pattern from PR #104.

Assumptions

Assumption | Confidence | Impact if Wrong
Event-bus DB will remain co-located with analytics DB on speck-vm | High | Would need Phase 4 push support
bus_ingest.py read-only SQLite access is safe concurrent with event-bus writes | High | Could get SQLITE_BUSY errors; add retry logic
~54 knowledge events is representative; volume won't explode | Medium | May need rate limiting or aggregation in queries
The /improve-workflow skill is the primary consumer of bus events | Medium | Other skills may need different query patterns
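If the concurrent-access assumption turns out to be wrong, the retry logic mentioned above could look roughly like the following. This is a sketch, not what bus_ingest.py does today: read_with_retry and its parameters are hypothetical, and it assumes the path contains no characters that need escaping in a SQLite URI.

```python
import sqlite3
import time


def read_with_retry(db_path, sql, params=(), attempts=3, backoff_seconds=0.2):
    """Run a read query on a read-only connection, retrying while the DB is locked."""
    uri = f"file:{db_path}?mode=ro"  # mode=ro: never create or write the file
    for attempt in range(attempts):
        try:
            conn = sqlite3.connect(uri, uri=True, timeout=1.0)
            try:
                return conn.execute(sql, params).fetchall()
            finally:
                conn.close()
        except sqlite3.OperationalError as exc:
            message = str(exc).lower()
            locked = "locked" in message or "busy" in message
            # Re-raise anything that is not a lock conflict, or the final failure.
            if not locked or attempt == attempts - 1:
                raise
            time.sleep(backoff_seconds * (2 ** attempt))  # exponential backoff
```

Opening the file with mode=ro means a missing database raises an error instead of silently creating an empty one, so the caller can still decide to skip ingestion gracefully.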

Open Questions

  1. Knowledge event taxonomy: The current event types (gotcha_discovered, pattern_found, improvement_suggested) emerged organically. Should we define a formal taxonomy for knowledge events, or let it evolve?
  2. Deduplication: If the same gotcha is published multiple times across sessions, should get_bus_events deduplicate by payload similarity, or return all instances (showing frequency)?
  3. Retention: Bus events accumulate indefinitely. Should there be a TTL or archiving strategy, or is the volume low enough (~54 knowledge events in ~6 weeks) that it does not matter?
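To ground the deduplication question, the sketch below illustrates one possible middle path: collapse events whose payload text matches after whitespace and case normalization, and report how often each one occurred. It is only an illustration of the trade-off, not a proposed answer; group_by_payload and the event dict shape (event_type and payload keys) are hypothetical.

```python
from collections import Counter


def group_by_payload(events):
    """Collapse events with the same type and normalized payload, counting repeats."""
    counts = Counter()
    first_seen = {}
    for event in events:
        # Normalize case and whitespace so trivially different copies match.
        normalized = " ".join(str(event["payload"]).lower().split())
        key = (event["event_type"], normalized)
        counts[key] += 1
        first_seen.setdefault(key, event)  # keep the first instance as the exemplar
    # Most frequent first; each result carries an "occurrences" count.
    return [{**first_seen[key], "occurrences": n} for key, n in counts.most_common()]
```

Exact-match normalization like this would miss paraphrased gotchas; true similarity matching would need something heavier, which is part of what the open question has to weigh.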

Actionable Requirements

# | Requirement | Owner | Blocked By
1 | Add ingest_bus_events() to _periodic_ingest() and server_lifespan() | - | -
2 | Add get_bus_events MCP tool in server.py | - | -
3 | Add ingest_bus_events MCP tool in server.py (optional trigger) | - | -
4 | Add CLI formatter for bus events output | - | -
5 | Add bus events tools to cmd_benchmark() | - | #2
6 | Add bus events section to guide.md | - | #2
7 | Update /improve-workflow skill to query bus events | - | #2

Test Requirements

  • Unit: test_query_bus_events() with mocked bus_events data; test_ingest_bus_events() with temp SQLite
  • Integration: End-to-end test that ingests events from a fixture DB, queries them via the MCP tool, and verifies the results
  • Edge cases:
    • Event-bus DB does not exist (graceful skip, already handled in bus_ingest.py)
    • Event-bus DB is locked by another process (SQLITE_BUSY handling)
    • Empty bus_events table (should return empty results, not error)
    • get_insights() with populated bus_events (verify cross_session_activity populated)

Implementation Checklist

  • Phase 1: Add bus ingestion to server background loop
    • Import bus_ingest in server.py
    • Add to server_lifespan() startup
    • Add to _periodic_ingest() loop
    • Test with make logs after restart
  • Phase 2: MCP tools
    • Add get_bus_events tool to server.py
    • Add ingest_bus_events tool to server.py
    • Add _format_bus_events formatter to cli.py
    • Add to cmd_benchmark() tool list
    • Add to guide.md
    • Run make check
  • Phase 3: Improve-workflow integration (separate PR, different repo)
  • Phase 4: Deferred — client push support if topology changes
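For the _format_bus_events item in the checklist, a formatter could be as small as the sketch below. The input shape (a dict with an "events" list whose items have timestamp, event_type, repo, and payload keys) is an assumption about what the query tool returns, and the column widths and 80-character payload cutoff are arbitrary choices.

```python
def format_bus_events(result):
    """Render a bus-events query result as aligned, human-readable lines."""
    events = result.get("events", [])
    if not events:
        return "No bus events found."
    lines = [f"{len(events)} bus event(s):"]
    for event in events:
        stamp = str(event.get("timestamp", ""))[:16]  # trim to minute precision
        payload = str(event.get("payload", ""))
        if len(payload) > 80:
            payload = payload[:77] + "..."  # keep each event on one line
        event_type = event.get("event_type", "?")
        repo = event.get("repo") or "-"
        lines.append(f"  {stamp}  {event_type:<22} {repo}  {payload}")
    return "\n".join(lines)
```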
