From cae6baf8b1937bd3ea2e23ceaa4f9984c3efd5ef Mon Sep 17 00:00:00 2001
From: Nick Sullivan <nick@technick.ai>
Date: Sun, 29 Mar 2026 12:57:11 -0500
Subject: [PATCH 1/5] Workflow builder v0.2.0: SQLite tracking, sub-agent
 loops, check-work tiering
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Four new best practices for workflow design:

- Pattern 3: Never loop over collections in orchestrator — spawn sub-agents per item
- Pattern 3: Check-work tiering — cheap model checks if work exists, expensive model acts
- Pattern 4: Split contextual state (markdown) vs tracking state (SQLite), ban JSON
- Pattern 4: Schema versioning mechanism with per-workflow db-setup.md

Contact steward migrated from processed.md to processed.db with schema versioning,
legacy migration path, and automatic initialization for new installs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---
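For illustration only (text below the `---` is not part of the commit): a minimal Python sketch of the tracking-state pattern this patch describes, assuming the `processed` table defined in db-setup.md. The helper names `filter_unprocessed` and `mark_processed` are hypothetical, and the steward itself is expected to write its own queries, typically through the `sqlite3` CLI.

```python
import sqlite3
import time

# Table definition copied from the Target Schema in db-setup.md.
SCHEMA = """
CREATE TABLE IF NOT EXISTS processed (
  platform TEXT NOT NULL,
  contact_id TEXT NOT NULL,
  status TEXT NOT NULL,
  last_checked INTEGER NOT NULL,
  metadata TEXT,
  PRIMARY KEY (platform, contact_id)
);
"""


def filter_unprocessed(conn, platform, contact_ids):
    """Return the contact_ids that have no tracking row yet for this platform."""
    new_ids = []
    for contact_id in contact_ids:
        row = conn.execute(
            "SELECT 1 FROM processed WHERE platform = ? AND contact_id = ?",
            (platform, contact_id),
        ).fetchone()
        if row is None:
            new_ids.append(contact_id)
    return new_ids


def mark_processed(conn, platform, contact_id, status, metadata=None, now=None):
    """Insert or overwrite the tracking row for one contact (an upsert)."""
    ts = int(time.time()) if now is None else now
    conn.execute(
        "INSERT OR REPLACE INTO processed "
        "(platform, contact_id, status, last_checked, metadata) "
        "VALUES (?, ?, ?, ?, ?)",
        (platform, contact_id, status, ts, metadata),
    )
    conn.commit()


# Hypothetical usage with an in-memory database and made-up identifiers.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
mark_processed(conn, "whatsapp", "+15550001", "classified")
print(filter_unprocessed(conn, "whatsapp", ["+15550001", "+15550002"]))
```

Only the unseen identifiers come back to the orchestrator, so the processed history never has to be loaded into context.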
 skills/workflow-builder/SKILL.md      | 195 ++++++++++++++++++++++----
 workflows/contact-steward/AGENT.md    |  71 ++++++----
 workflows/contact-steward/db-setup.md | 126 +++++++++++++++++
 3 files changed, 337 insertions(+), 55 deletions(-)
 create mode 100644 workflows/contact-steward/db-setup.md

diff --git a/skills/workflow-builder/SKILL.md b/skills/workflow-builder/SKILL.md
index 4ca935e..7934021 100644
--- a/skills/workflow-builder/SKILL.md
+++ b/skills/workflow-builder/SKILL.md
@@ -1,12 +1,12 @@
 ---
 name: workflow-builder
-version: 0.1.0
+version: 0.2.0
 description:
   Design, build, and maintain autonomous OpenClaw workflows (stewards). Use when
   creating new workflow agents, improving existing ones, evaluating automation
   opportunities, or debugging workflow reliability. Triggers on "build a workflow",
   "create a steward", "automate this process", "workflow audit", "what should I
-  automate".
+  automate", "create a cron job", "schedule a recurring task", "build a scheduled job".
 metadata:
   openclaw:
     emoji: "🏗️"
@@ -219,10 +219,68 @@ Write confidence thresholds to `rules.md` so the user can tune them.
 
 ### Pattern 3: Sub-Agent Orchestration
 
-Match intelligence to task complexity:
+Match intelligence to task complexity, and **always use sub-agents for loops.**
+
+#### Rule: Never Loop Over Collections in the Orchestrator
+
+**Any time you iterate over a list (contacts, emails, tasks, records), spawn a sub-agent
+per item.** This preserves the parent context for coordination and prevents pollution.
+
+**Pattern:**
+
+```
+Orchestrator (parent):
+1. Fetch the list (from API, file, database)
+2. Query tracking state to filter already-processed items
+3. FOR EACH new item: Spawn a sub-agent with that item's details
+4. Sub-agent processes one item, returns structured result
+5. Parent collects results, updates tracking state, alerts if needed
+
+Sub-agent:
+- Receives: One item + context needed for that item
+- Does: All the reasoning, decision-making, work
+- Returns: Structured summary (status, action taken, errors, alerts)
+- Never accesses parent's full context
+```
+
+**Why:** Each sub-agent gets a fresh context window. Parent stays clean for
+orchestration logic. No pollution from per-item reasoning.
+
+#### Model Selection: Check-Work Tiering for High-Frequency Jobs
+
+For jobs running every few minutes (e.g., every 5 min, every 15 min):
 
+**Two-stage pattern:**
+
+```
+Stage 1 (Cheap): Use Haiku to ask "Is there any work to do?"
+  - Cheap to run often
+  - Quick predicate check (yes/no)
+  - Examples: "Any new emails?", "Any cron job failures?", "Any security alerts?"
+
+Stage 2 (Expensive): If yes, spawn Opus/Sonnet to do the actual work
+  - Only spawned when there's real work
+  - Has full context for reasoning/decisions
+  - Saves tokens on empty runs
+```
+
+**Example:**
+
+```
+Cron job runs every 5 minutes:
+1. Haiku runs: "Are there any unprocessed emails in my inbox?"
+   → Returns boolean (with brief explanation)
+2. If yes: Spawn Sonnet to "Process and categorize these 3 emails"
+   → Does the actual work
+3. If no: Skip expensive processing, return early
+   → Skips the expensive model entirely on empty runs
 ```
-Obvious/routine items → Spawn sub-agent (cheaper model: Haiku/Sonnet)
+
+**Model selection for different complexities:**
+
+```
+High-frequency checks (every 5-15 min) → Haiku to check, Sonnet/Opus to act
+Obvious/routine items → Spawn sub-agent (cheaper model: Sonnet)
 Important/nuanced items → Handle yourself or spawn a powerful sub-agent (Opus)
 Quality verification → Can use a strong model as QA reviewer (Opus as sub-agent)
 Uncertain items → Sub-agents escalate to you rather than guessing
@@ -231,21 +289,84 @@ Uncertain items → Sub-agents escalate to you rather than guessing
 **Note:** Don't hardcode model IDs (they go stale fast). Use aliases like `sonnet`,
 `opus`, `haiku` or reference the model by capability level.
 
-### Pattern 4: State Externalization (Compaction-Safe)
+### Pattern 4: State Externalization — Contextual State vs Tracking State
 
 **Critical:** Chat history is a cache, not the source of truth. After every meaningful
-step, write state to disk.
+step, write state to disk. But distinguish between two types:
+
+#### 4a. Contextual State (Markdown only)
+
+**What:** Information the agent reasons about or learns over time. **Examples:**
+`agent_notes.md`, `rules.md`, daily logs, decision summaries. **Format:** Markdown.
+Always human-readable. **Why markdown:** These belong in context so the agent can reason
+about them.
 
 ```markdown
-# state/active-work.json (or inline in agent_notes.md)
+# agent_notes.md
 
-{ "current_phase": "processing", "next_action": "Review batch 2 of inbox",
-"last_completed": "Batch 1: archived 12, deleted 3", "resume_prompt": "Continue inbox
-processing from message ID xyz", "updated_at": "2026-02-18T14:30:00Z" }
+## Patterns Observed
+
+- Contact X always sends updates on Tuesdays
+- Task type Y typically needs 2-hour blocks
+
+## Mistakes Made
+
+- Once skipped important sender — now review sender importance before filtering
 ```
 
-**Rule in AGENT.md:** "On every run, read state first. Either advance it or explicitly
-conclude it."
+#### 4b. Tracking State (SQLite only)
+
+**What:** Deduplication, "have I seen this?", processed IDs, state queries.
+**Examples:** `processed.db` with tables for seen IDs, statuses, timestamps. **Format:**
+SQLite database with structured queries. **Why SQLite:** The agent doesn't reason about
+this; it only queries it. SQLite gives indexed (B-tree) lookups without loading the
+entire history into context.
+
+⚠️ **NEVER use JSON for state files.** You are an LLM, not a JSON parser. JSON is useful
+for API responses and tool output flags, but state files should be markdown
+(human-readable) or SQLite (queryable). JSON state files create noise, parsing errors,
+and waste context on structure rather than content.
+
+The workflow's `db-setup.md` defines the specific schema. The calling LLM writes the SQL
+— don't over-prescribe queries in AGENT.md. Just describe what should happen (e.g.,
+"check if already processed", "mark as classified", "clean up entries older than 90
+days") and let the LLM write the appropriate queries.
+
+#### Schema Versioning & Migration
+
+Every workflow that uses SQLite must track schema versions so upgrades happen
+automatically:
+
+1. **Store version in the database** via a `schema_meta` table
+2. **Declare the expected version in AGENT.md** (e.g., `Schema version: 1`)
+3. **Each run checks with one query:** `SELECT version FROM schema_meta LIMIT 1`
+   - Matches → proceed (99% of runs, no extra reads)
+   - Lower → read `db-setup.md` for migration steps
+   - Missing → run inline initialization SQL
+4. **Keep initialization SQL inline in AGENT.md** (idempotent `CREATE IF NOT EXISTS`)
+5. **Keep migration steps in a separate `db-setup.md`** — only read on version mismatch
+   or legacy conversion
+
+**Per-workflow `db-setup.md`** contains:
+
+- Target schema with column reference
+- Schema version history table
+- Legacy migration instructions (e.g., `processed.md` → `processed.db`)
+- Versioned migration blocks (e.g., "Version 1 → 2: ALTER TABLE ADD COLUMN ...")
+- Common queries for reference
+
+This pattern handles all scenarios automatically:
+
+- **New server:** No database → initialization SQL creates it
+- **Legacy server:** `processed.md` exists → db-setup.md migration
+- **Schema upgrade pushed:** Version mismatch detected → db-setup.md migration
+- **Normal run:** Version matches → zero overhead
+
+See `workflows/contact-steward/db-setup.md` for a reference implementation.
+
+**Rule in AGENT.md:** "On every run, read contextual state first (agent_notes.md,
+rules.md). Query tracking state via SQLite — one version check, then targeted queries.
+After processing, update both as needed. Never load tracking history into context."
 
 ### Pattern 5: Error Handling & Alerting
 
@@ -300,12 +421,21 @@ openclaw cron add \
 
 ### Cron Configuration Guidelines
 
-| Workflow Type                                | Schedule                    | Model           | Session          |
-| -------------------------------------------- | --------------------------- | --------------- | ---------------- |
-| High-frequency triage (email, notifications) | Every 15-30 min             | Sonnet          | Isolated         |
-| Daily reports/summaries                      | Once daily at fixed time    | Opus            | Isolated         |
-| Weekly reviews/audits                        | Weekly cron                 | Opus + thinking | Isolated         |
-| Reactive (triggered by events)               | Via webhook or system event | Varies          | Main or Isolated |
+| Workflow Type                                | Schedule                    | Model Pattern                | Session  |
+| -------------------------------------------- | --------------------------- | ---------------------------- | -------- |
+| High-frequency checks (every 5-15 min)       | Every 5-15 min              | Haiku (check) → Sonnet (act) | Isolated |
+| High-frequency triage (email, notifications) | Every 15-30 min             | Sonnet                       | Isolated |
+| Daily reports/summaries                      | Once daily at fixed time    | Opus                         | Isolated |
+| Weekly reviews/audits                        | Weekly cron                 | Opus + thinking              | Isolated |
+| Reactive (triggered by events)               | Via webhook or system event | Varies                       | Isolated |
+
+**Note on Check-Work Tiering:**
+
+- If a job runs multiple times per hour, use the two-stage pattern: cheap check (Haiku)
+  → expensive work (Sonnet/Opus)
+- This cuts token costs on empty runs (when there's no work to do)
+- Example: "Email arrived?" (Haiku) → "Process these 5 emails" (Sonnet) only if yes
+- Apply to: health checks, inbox scans, notification monitors, cron job monitors
 
 ### Delivery
 
@@ -380,6 +510,13 @@ If `rules.md` doesn't exist or is empty:
 
 <Summarize in plain language, save rules.md.>
 
+## Database (only if this workflow tracks processed items)
+
+**Schema version: 1** — See `db-setup.md` for full schema.
+
+Before processing, verify schema_meta.version matches the version above. If missing,
+mismatched, or legacy state files exist → read `db-setup.md`.
+
 ## Regular Operation
 
 ### Your Tools
@@ -390,11 +527,15 @@ If `rules.md` doesn't exist or is empty:
 
 1. Read `rules.md` for preferences
 2. Read `agent_notes.md` for learned patterns (if exists)
-3. <Scan/fetch new items>
-4. <Process items based on rules>
-5. Alert if anything needs attention
-6. Append to today's log in `logs/`
-7. Update `agent_notes.md` if you learned something
+3. Ensure database is ready (see Database section — one quick version check)
+4. <Scan/fetch new items>
+5. Query `processed.db` to filter items already handled
+6. FOR EACH new item: Spawn a sub-agent to process it (see Sub-Agent Orchestration)
+7. After each item, update `processed.db` with status
+8. Collect sub-agent results
+9. Alert if anything needs attention
+10. Append to today's log in `logs/`
+11. Update `agent_notes.md` if you learned something new about patterns/mistakes
 
 ### Judgment Guidelines
 
@@ -416,7 +557,13 @@ If `rules.md` doesn't exist or is empty:
 - [ ] Setup interview creates rules.md with all needed preferences
 - [ ] Has clear judgment guidelines (when to act vs leave alone)
 - [ ] Error handling: logs errors, alerts on critical failures
-- [ ] Housekeeping: auto-prunes old logs
+- [ ] **Tracking state:** If workflow queries "have I seen this?", uses `processed.db`
+      (SQLite), not markdown lists
+- [ ] **Sub-agents:** Any loop over a collection spawns sub-agents per item, not in
+      orchestrator
+- [ ] **Contextual state:** agent_notes.md and rules.md are markdown, not JSON
+- [ ] Housekeeping: auto-prunes old logs and cleans up stale tracking entries (e.g.,
+      `DELETE FROM processed WHERE last_checked < ...`)
 - [ ] Integration points documented
 - [ ] Cron job configured with appropriate schedule/model
 - [ ] First week monitoring plan in place
diff --git a/workflows/contact-steward/AGENT.md b/workflows/contact-steward/AGENT.md
index f5c027b..297cbcf 100644
--- a/workflows/contact-steward/AGENT.md
+++ b/workflows/contact-steward/AGENT.md
@@ -172,7 +172,7 @@ the detective work.
 - Filtering out spam, automated messages, businesses
 - Cross-platform lookups to gather context (e.g. `wacli contacts search` for a number)
 - Detecting enrichment opportunities (new details in recent messages)
-- Updating `processed.md` with scan results
+- Updating `processed.db` with scan results (via SQLite queries)
 - Deciding whether to spawn Opus
 
 **You NEVER:** add, update, or modify contacts. All writes go through Opus.
@@ -190,7 +190,7 @@ the detective work.
 - Contact already exists and no new info in recent messages
 - Obvious spam, OTP codes, delivery notifications, automated alerts
 - Your human didn't reply (no reply = no signal that this person matters)
-- Business/automated accounts (log in processed.md and move on)
+- Business/automated accounts (mark as `skipped` in processed.db and move on)
 
 ## The Trigger
 
@@ -211,23 +211,38 @@ the log how many remain. They'll get picked up on subsequent runs.
 This means the first few runs after setup will be catching up on the backlog. That's
 expected — don't try to process everything at once.
 
+## Database
+
+**Schema version: 1** — See `db-setup.md` for the full schema definition.
+
+Tracking state lives in `processed.db` (SQLite). Before first scan, check:
+
+- If `processed.db` doesn't exist or `schema_meta` table is missing → create the
+  database using the schema in `db-setup.md`
+- If `processed.md` exists (legacy) → read `db-setup.md` for migration instructions
+- If `schema_meta.version` is lower than the version above → read `db-setup.md` for
+  upgrade steps
+- If version matches → proceed normally
+
 ## Each Run
 
 1. Read `preferences.md` — know which platforms to scan and how to notify
-2. Read `processed.md` — know what you've already looked at
+2. Ensure database is ready (see Database section above)
 3. Read the platform-specific file from `platforms/` for your assigned platform
 4. Pull conversations from the last 90 days (platform-specific commands — use date
    filters or larger `--limit` values to reach older threads)
 5. For each conversation where your human replied (oldest unprocessed first, max 10 Opus
-   spawns per run — enrichment checks and skips don't count toward the cap): a. Is the
-   other party a saved contact on this platform? If yes, check for enrichment (new
-   messages with contact-relevant info since last processed). If no new info, skip. b.
-   Not a saved contact? Cross-reference the phone number on other platforms (especially
-   `wacli contacts search <number>`) c. Found info (cross-reference match, profile name,
+   spawns per run — enrichment checks and skips don't count toward the cap): a. Check
+   processed.db for this platform + contact_id. b. If found and no new messages since
+   last_checked → skip. c. If found with status `error` → retry (counts toward cap). d.
+   Not in database and saved contact on platform? Check for enrichment (new messages
+   with contact-relevant info). If no new info, skip. e. Not a saved contact?
+   Cross-reference the phone number on other platforms (especially
+   `wacli contacts search <number>`) f. Found info (cross-reference match, profile name,
    or conversation clues)? Spawn Opus with everything you gathered. Opus verifies and
-   writes the contact. d. No match anywhere? Spawn Opus with full conversation context
+   writes the contact. g. No match anywhere? Spawn Opus with full conversation context
    for detective work.
-6. Update `processed.md` with what you checked and the outcome
+6. After each contact, upsert into processed.db with the outcome status and timestamp
 7. Notify your human with a batch summary of what was added and what needs their input
 8. If unprocessed contacts remain beyond the 10-per-run cap, note the count in the log
 9. Append to today's log in `logs/` (see Log Format below)
@@ -300,9 +315,9 @@ that's an Opus job.
 ## Businesses vs People
 
 Detect obvious businesses (rental companies, delivery services, support lines). Skip
-them by default, but log them in processed.md so we don't re-check. If your human is
-having a genuine ongoing relationship with a business contact (e.g. a specific person at
-a company), treat them as a person.
+them by default, but mark them as `skipped` in processed.db so we don't re-check. If
+your human is having a genuine ongoing relationship with a business contact (e.g. a
+specific person at a company), treat them as a person.
 
 ## Notifications
 
@@ -356,7 +371,7 @@ If a platform CLI command fails (non-zero exit, timeout, empty response):
 If an Opus sub-agent fails or times out:
 
 - Log the identifier it was working on
-- Mark it as "error" in processed.md (will be retried next run)
+- Mark it as `error` in processed.db (will be retried next run)
 - Continue with remaining contacts
 
 ## Log Format
@@ -392,28 +407,22 @@ spawns, the Classification Result block from the sub-agent]
 
 ## State
 
-`processed.md` is the only state file. It's natural language, not structured data. You
-read it, you update it. Create it on first run if it doesn't exist.
-
-Format: grouped by platform. Each entry has the identifier, name if known, date last
-checked, and a status:
+`processed.db` is the tracking state (SQLite). It stores which contacts have been seen
+and their status. The database schema and setup instructions are in the Database section
+above. For migration and upgrade details, see `db-setup.md`.
 
-- **classified** — identity resolved, contact added
-- **asked human** — couldn't resolve, asked human, awaiting response
-- **skipped** — spam, business, automated, or human didn't reply
-- **enriched** — existing contact updated with new details
-- **error** — processing failed, retry next run
+Status values: `classified`, `asked_human`, `skipped`, `enriched`, `error`.
 
-Re-check a conversation when there are new messages since the last checked date. Expire
-"asked human" entries after 14 days with no response — downgrade to skipped. Clean up
-"classified" entries older than 90 days.
+Re-check a conversation when there are new messages since `last_checked`. Expiry and
+maintenance queries run during housekeeping (see below).
 
 ## Housekeeping
 
-First run each day: clean up `processed.md` entries older than 90 days that are marked
-as classified (they're stable). Keep "asked human" entries until resolved.
+First run each day:
 
-Delete logs older than 30 days.
+- Expire `asked_human` entries older than 14 days → downgrade to `skipped`
+- Delete `classified` entries older than 90 days (they're stable, no need to track)
+- Delete logs older than 30 days
 
 ## Cron Setup
 
@@ -439,7 +448,7 @@ specific platform name.
 
 This file (`AGENT.md`) and the workflow logic files (`classifier.md`, `platforms/`) are
 maintained upstream and update on deploy. User-specific configuration lives in
-`preferences.md` and `processed.md`, which are **never overwritten** by updates.
+`preferences.md` and `processed.db`, which are **never overwritten** by updates.
 
 ## Security Checklist (Every Run)
 
diff --git a/workflows/contact-steward/db-setup.md b/workflows/contact-steward/db-setup.md
new file mode 100644
index 0000000..3049a6f
--- /dev/null
+++ b/workflows/contact-steward/db-setup.md
@@ -0,0 +1,126 @@
+# Contact Steward — Database Setup & Migration
+
+Only read this file when AGENT.md directs you here — during first-time setup, legacy
+migration, or schema upgrade. Do not read on normal runs.
+
+## Prerequisites
+
+Verify sqlite3 is available:
+
+```bash
+which sqlite3
+```
+
+If not found:
+
+- **macOS:** Already installed at `/usr/bin/sqlite3`. If missing: `brew install sqlite`
+- **Ubuntu/Debian:** `sudo apt install sqlite3`
+
+## Schema Version History
+
+| Version | Changes                                                                                    | Date       |
+| ------- | ------------------------------------------------------------------------------------------ | ---------- |
+| 1       | Initial schema — processed table with platform, contact_id, status, last_checked, metadata | 2026-03-29 |
+
+## Target Schema (Current: Version 1)
+
+```sql
+CREATE TABLE IF NOT EXISTS schema_meta (version INTEGER NOT NULL);
+
+CREATE TABLE IF NOT EXISTS processed (
+  platform TEXT NOT NULL,
+  contact_id TEXT NOT NULL,
+  status TEXT NOT NULL,
+  last_checked INTEGER NOT NULL,
+  metadata TEXT,
+  PRIMARY KEY (platform, contact_id)
+);
+
+CREATE INDEX IF NOT EXISTS idx_status ON processed(status);
+CREATE INDEX IF NOT EXISTS idx_last_checked ON processed(last_checked);
+```
+
+### Column Reference
+
+| Column       | Type    | Description                                                           |
+| ------------ | ------- | --------------------------------------------------------------------- |
+| platform     | TEXT    | `whatsapp`, `imessage`, or `quo`                                      |
+| contact_id   | TEXT    | Phone number, JID, or platform-specific identifier                    |
+| status       | TEXT    | One of: `classified`, `asked_human`, `skipped`, `enriched`, `error`   |
+| last_checked | INTEGER | Unix timestamp of last processing                                     |
+| metadata     | TEXT    | Brief notes (e.g., "enriched from WhatsApp", "spam — pizza delivery") |
+
+### Status Values
+
+- **classified** — Identity resolved, contact added to platform
+- **asked_human** — Couldn't resolve, asked human, awaiting response
+- **skipped** — Spam, business, automated, or human didn't reply
+- **enriched** — Existing contact updated with new details
+- **error** — Processing failed, will retry next run
+
+## Scenario: New Installation (No Database)
+
+Run the initialization SQL from AGENT.md. It's inline there and fully idempotent. You
+don't need this file for new installations — AGENT.md has everything.
+
+## Scenario: Legacy Migration (processed.md exists)
+
+If `processed.md` exists from a previous version, migrate its entries to SQLite.
+
+**Step 1:** Create the database from the Target Schema section above.
+
+**Step 2:** Read `processed.md`. It's natural language grouped by platform. Each entry
+has an identifier, optional name, date, and status. For each entry, insert:
+
+```bash
+sqlite3 workflows/contact-steward/processed.db \
+  "INSERT OR IGNORE INTO processed (platform, contact_id, status, last_checked, metadata) \
+   VALUES ('<platform>', '<contact_id>', '<status>', <unix_timestamp>, '<notes>')"
+```
+
+Use `INSERT OR IGNORE` to skip duplicates safely. Map the natural language statuses to
+the standard values: classified, asked_human, skipped, enriched, error.
+
+**Step 3:** Verify by comparing counts:
+
+```bash
+sqlite3 workflows/contact-steward/processed.db \
+  "SELECT platform, COUNT(*) FROM processed GROUP BY platform"
+```
+
+**Step 4:** Archive the old file:
+
+```bash
+mv workflows/contact-steward/processed.md workflows/contact-steward/processed.md.migrated
+```
+
+Keep `.migrated` for a few weeks as a safety net, then delete it.
+
+## Scenario: Schema Upgrade (Version Mismatch)
+
+When AGENT.md's `schema_version` is higher than the database's version, apply migrations
+in order. Each migration block is idempotent — safe to re-run.
+
+### Migrating from Version 0 → 1 (No schema_meta table)
+
+If `SELECT version FROM schema_meta` errors (table doesn't exist), the database was
+created before version tracking. The processed table likely already exists with the
+correct columns. Run:
+
+```bash
+sqlite3 workflows/contact-steward/processed.db <<'SQL'
+CREATE TABLE IF NOT EXISTS schema_meta (version INTEGER NOT NULL);
+INSERT INTO schema_meta VALUES (1);
+CREATE INDEX IF NOT EXISTS idx_status ON processed(status);
+CREATE INDEX IF NOT EXISTS idx_last_checked ON processed(last_checked);
+SQL
+```
+
+<!-- Future migrations go here, labeled clearly:
+### Migrating from Version 1 → 2
+Description of what changed and why.
+```sql
+ALTER TABLE processed ADD COLUMN new_column TEXT;
+UPDATE schema_meta SET version = 2;
+```
+-->

From 86a9b398830ddcf16ee1426ae6bca2968e6aa7e0 Mon Sep 17 00:00:00 2001
From: Nick Sullivan <nick@technick.ai>
Date: Sun, 29 Mar 2026 12:58:46 -0500
Subject: [PATCH 2/5] Fix three issues from code review
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- schema_meta: Add PRIMARY KEY + CHECK constraint to prevent duplicate rows
- Cleanup window: 90 days → 120 days to avoid race with 90-day scan window
- SKILL.md: Pattern 4 no longer says to inline initialization SQL in AGENT.md;
  the schema lives in db-setup.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---
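For illustration only (text below the `---` is not part of the commit): a minimal Python sketch of the daily housekeeping rules after this change, assuming the `processed` table from db-setup.md with `last_checked` stored as a Unix timestamp. The function name `run_housekeeping` and the constants are hypothetical. The sketch also assumes `last_checked` on an `asked_human` row records when the human was asked.

```python
import sqlite3

DAY = 86_400  # seconds in a day

SCAN_WINDOW_DAYS = 90            # conversations from the last 90 days are scanned
CLASSIFIED_RETENTION_DAYS = 120  # must exceed the scan window (this patch's fix)
ASKED_HUMAN_EXPIRY_DAYS = 14


def run_housekeeping(conn, now):
    """Apply the daily maintenance rules to the processed table."""
    # If retention were <= the scan window, a deleted row could belong to a
    # conversation that is still visible and would be processed again.
    assert CLASSIFIED_RETENTION_DAYS > SCAN_WINDOW_DAYS

    # Unanswered questions are downgraded to skipped after 14 days.
    conn.execute(
        "UPDATE processed SET status = 'skipped' "
        "WHERE status = 'asked_human' AND last_checked < ?",
        (now - ASKED_HUMAN_EXPIRY_DAYS * DAY,),
    )
    # Stable classified rows are dropped once they fall outside retention.
    conn.execute(
        "DELETE FROM processed WHERE status = 'classified' AND last_checked < ?",
        (now - CLASSIFIED_RETENTION_DAYS * DAY,),
    )
    conn.commit()
```

A `classified` row that is 100 days old survives this pass; with the previous 90-day cutoff it would have been deleted.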
 skills/workflow-builder/SKILL.md      | 11 ++++++-----
 workflows/contact-steward/AGENT.md    |  3 ++-
 workflows/contact-steward/db-setup.md | 12 +++++++++---
 3 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/skills/workflow-builder/SKILL.md b/skills/workflow-builder/SKILL.md
index 7934021..05230b5 100644
--- a/skills/workflow-builder/SKILL.md
+++ b/skills/workflow-builder/SKILL.md
@@ -339,13 +339,14 @@ automatically:
 
 1. **Store version in the database** via a `schema_meta` table
 2. **Declare the expected version in AGENT.md** (e.g., `Schema version: 1`)
-3. **Each run checks with one query:** `SELECT version FROM schema_meta LIMIT 1`
+3. **Each run checks with one query:** `SELECT version FROM schema_meta`
    - Matches → proceed (99% of runs, no extra reads)
    - Lower → read `db-setup.md` for migration steps
-   - Missing → run inline initialization SQL
-4. **Keep initialization SQL inline in AGENT.md** (idempotent `CREATE IF NOT EXISTS`)
-5. **Keep migration steps in a separate `db-setup.md`** — only read on version mismatch
-   or legacy conversion
+   - Missing or error → read `db-setup.md` for initialization
+4. **Keep the schema definition in `db-setup.md`** — the calling LLM creates tables from
+   the schema, no need to inline SQL in AGENT.md
+5. **Keep migration steps in `db-setup.md`** — only read on version mismatch, missing
+   database, or legacy conversion
 
 **Per-workflow `db-setup.md`** contains:
 
diff --git a/workflows/contact-steward/AGENT.md b/workflows/contact-steward/AGENT.md
index 297cbcf..60b7f04 100644
--- a/workflows/contact-steward/AGENT.md
+++ b/workflows/contact-steward/AGENT.md
@@ -421,7 +421,8 @@ maintenance queries run during housekeeping (see below).
 First run each day:
 
 - Expire `asked_human` entries older than 14 days → downgrade to `skipped`
-- Delete `classified` entries older than 90 days (they're stable, no need to track)
+- Delete `classified` entries older than 120 days (must exceed the 90-day scan window to
+  avoid re-processing contacts whose conversations are still visible)
 - Delete logs older than 30 days
 
 ## Cron Setup
diff --git a/workflows/contact-steward/db-setup.md b/workflows/contact-steward/db-setup.md
index 3049a6f..5d6d38b 100644
--- a/workflows/contact-steward/db-setup.md
+++ b/workflows/contact-steward/db-setup.md
@@ -25,7 +25,10 @@ If not found:
 ## Target Schema (Current: Version 1)
 
 ```sql
-CREATE TABLE IF NOT EXISTS schema_meta (version INTEGER NOT NULL);
+CREATE TABLE IF NOT EXISTS schema_meta (
+  id INTEGER PRIMARY KEY CHECK(id = 1),
+  version INTEGER NOT NULL
+);
 
 CREATE TABLE IF NOT EXISTS processed (
   platform TEXT NOT NULL,
@@ -109,8 +112,11 @@ correct columns. Run:
 
 ```bash
 sqlite3 workflows/contact-steward/processed.db <<'SQL'
-CREATE TABLE IF NOT EXISTS schema_meta (version INTEGER NOT NULL);
-INSERT INTO schema_meta VALUES (1);
+CREATE TABLE IF NOT EXISTS schema_meta (
+  id INTEGER PRIMARY KEY CHECK(id = 1),
+  version INTEGER NOT NULL
+);
+INSERT OR REPLACE INTO schema_meta (id, version) VALUES (1, 1);
 CREATE INDEX IF NOT EXISTS idx_status ON processed(status);
 CREATE INDEX IF NOT EXISTS idx_last_checked ON processed(last_checked);
 SQL

From fda5a06a0ada177c57f6733e134b976b4d16f2eb Mon Sep 17 00:00:00 2001
From: Nick Sullivan <nick@technick.ai>
Date: Sun, 29 Mar 2026 13:00:25 -0500
Subject: [PATCH 3/5] Replace schema_meta table with PRAGMA user_version

SQLite has a built-in integer for version tracking in the database header.
No extra table, no constraints, no duplicate row risks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---
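For illustration only (text below the `---` is not part of the commit): a minimal Python sketch of the version check this patch switches to. `PRAGMA user_version` is a real SQLite pragma that reads or writes an integer in the database header and defaults to 0. The names `schema_action`, `set_version`, and `EXPECTED_VERSION` are hypothetical, and the `"unexpected"` branch (database newer than AGENT.md) is an assumption the patch does not cover.

```python
import sqlite3

EXPECTED_VERSION = 1  # mirrors "Schema version: 1" declared in AGENT.md


def schema_action(conn, expected=EXPECTED_VERSION):
    """Decide what a run should do based on PRAGMA user_version."""
    (current,) = conn.execute("PRAGMA user_version").fetchone()
    if current == expected:
        return "proceed"     # normal run, no extra reads
    if current < expected:
        return "migrate"     # includes 0: fresh or pre-versioning database
    return "unexpected"      # database is newer than this AGENT.md


def set_version(conn, version):
    """Record the schema version in the database header."""
    # PRAGMA does not accept bound parameters, so the value is coerced to int
    # before being formatted into the statement.
    conn.execute(f"PRAGMA user_version = {int(version)}")
```

A freshly created database reports 0, so a new install and a pre-versioning database both land in the `"migrate"` branch, which matches the "Version 0 → 1" block in db-setup.md.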
 skills/workflow-builder/SKILL.md      | 11 ++++-----
 workflows/contact-steward/AGENT.md    | 10 ++++-----
 workflows/contact-steward/db-setup.md | 32 ++++++++-------------------
 3 files changed, 20 insertions(+), 33 deletions(-)

diff --git a/skills/workflow-builder/SKILL.md b/skills/workflow-builder/SKILL.md
index 05230b5..8e21c09 100644
--- a/skills/workflow-builder/SKILL.md
+++ b/skills/workflow-builder/SKILL.md
@@ -337,12 +337,13 @@ days") and let the LLM write the appropriate queries.
 Every workflow that uses SQLite must track schema versions so upgrades happen
 automatically:
 
-1. **Store version in the database** via a `schema_meta` table
+1. **Use SQLite's built-in `PRAGMA user_version`** to track schema version (no extra
+   tables needed)
 2. **Declare the expected version in AGENT.md** (e.g., `Schema version: 1`)
-3. **Each run checks with one query:** `SELECT version FROM schema_meta`
+3. **Each run checks:** `PRAGMA user_version`
    - Matches → proceed (99% of runs, no extra reads)
    - Lower → read `db-setup.md` for migration steps
-   - Missing or error → read `db-setup.md` for initialization
+   - Database missing → read `db-setup.md` for initialization
 4. **Keep the schema definition in `db-setup.md`** — the calling LLM creates tables from
    the schema, no need to inline SQL in AGENT.md
 5. **Keep migration steps in `db-setup.md`** — only read on version mismatch, missing
@@ -515,8 +516,8 @@ If `rules.md` doesn't exist or is empty:
 
 **Schema version: 1** — See `db-setup.md` for full schema.
 
-Before processing, verify schema_meta.version matches the version above. If missing,
-mismatched, or legacy state files exist → read `db-setup.md`.
+Before processing, check `PRAGMA user_version`. If it doesn't match the version above,
+or the database is missing → read `db-setup.md`.
 
 ## Regular Operation
 
diff --git a/workflows/contact-steward/AGENT.md b/workflows/contact-steward/AGENT.md
index 60b7f04..5fcac8f 100644
--- a/workflows/contact-steward/AGENT.md
+++ b/workflows/contact-steward/AGENT.md
@@ -215,14 +215,14 @@ expected — don't try to process everything at once.
 
 **Schema version: 1** — See `db-setup.md` for the full schema definition.
 
-Tracking state lives in `processed.db` (SQLite). Before first scan, check:
+Tracking state lives in `processed.db` (SQLite). Before first scan, check
+`PRAGMA user_version` on the database:
 
-- If `processed.db` doesn't exist or `schema_meta` table is missing → create the
-  database using the schema in `db-setup.md`
+- If `processed.db` doesn't exist → read `db-setup.md` to create it
 - If `processed.md` exists (legacy) → read `db-setup.md` for migration instructions
-- If `schema_meta.version` is lower than the version above → read `db-setup.md` for
+- If `user_version` is lower than the schema version above → read `db-setup.md` for
   upgrade steps
-- If version matches → proceed normally
+- If `user_version` matches → proceed normally
 
 ## Each Run
 
diff --git a/workflows/contact-steward/db-setup.md b/workflows/contact-steward/db-setup.md
index 5d6d38b..aa68716 100644
--- a/workflows/contact-steward/db-setup.md
+++ b/workflows/contact-steward/db-setup.md
@@ -25,11 +25,6 @@ If not found:
 ## Target Schema (Current: Version 1)
 
 ```sql
-CREATE TABLE IF NOT EXISTS schema_meta (
-  id INTEGER PRIMARY KEY CHECK(id = 1),
-  version INTEGER NOT NULL
-);
-
 CREATE TABLE IF NOT EXISTS processed (
   platform TEXT NOT NULL,
   contact_id TEXT NOT NULL,
@@ -63,8 +58,11 @@ CREATE INDEX IF NOT EXISTS idx_last_checked ON processed(last_checked);
 
 ## Scenario: New Installation (No Database)
 
-Run the initialization SQL from AGENT.md. It's inline there and fully idempotent. You
-don't need this file for new installations — AGENT.md has everything.
+Create the database using the target schema above, then set the version:
+
+```sql
+PRAGMA user_version = 1;
+```
 
 ## Scenario: Legacy Migration (processed.md exists)
 
@@ -104,23 +102,11 @@ Keep `.migrated` for a few weeks as a safety net, then delete it.
 When AGENT.md's `schema_version` is higher than the database's version, apply migrations
 in order. Each migration block is idempotent — safe to re-run.
 
-### Migrating from Version 0 → 1 (No schema_meta table)
-
-If `SELECT version FROM schema_meta` errors (table doesn't exist), the database was
-created before version tracking. The processed table likely already exists with the
-correct columns. Run:
+### Migrating from Version 0 → 1 (user_version is 0)
 
-```bash
-sqlite3 workflows/contact-steward/processed.db <<'SQL'
-CREATE TABLE IF NOT EXISTS schema_meta (
-  id INTEGER PRIMARY KEY CHECK(id = 1),
-  version INTEGER NOT NULL
-);
-INSERT OR REPLACE INTO schema_meta (id, version) VALUES (1, 1);
-CREATE INDEX IF NOT EXISTS idx_status ON processed(status);
-CREATE INDEX IF NOT EXISTS idx_last_checked ON processed(last_checked);
-SQL
-```
+If `PRAGMA user_version` returns 0, the database was created before version tracking or
+is brand new. Ensure the processed table and indexes exist (the CREATE IF NOT EXISTS
+statements are idempotent), then set `PRAGMA user_version = 1`.
 
 <!-- Future migrations go here, labeled clearly:
 ### Migrating from Version 1 → 2

From 002510e31d7f4bb62a3062cd9ab2d7bed60dc255 Mon Sep 17 00:00:00 2001
From: Nick Sullivan <nick@technick.ai>
Date: Sun, 29 Mar 2026 13:01:41 -0500
Subject: [PATCH 4/5] Inline database schema into AGENT.md, remove db-setup.md
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The LLM needs the schema in context to write queries, so a separate file only
adds a read with no savings. Schema, setup, and migration instructions all live
in AGENT.md now.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---
 skills/workflow-builder/SKILL.md      |  43 +++-------
 workflows/contact-steward/AGENT.md    |  43 +++++++---
 workflows/contact-steward/db-setup.md | 118 --------------------------
 3 files changed, 45 insertions(+), 159 deletions(-)
 delete mode 100644 workflows/contact-steward/db-setup.md
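
The version check that the Setup & Migration section of AGENT.md describes can be
sketched as follows. This is a minimal illustration using Python's standard
sqlite3 module, not part of the patch. The names `ensure_schema`,
`EXPECTED_VERSION`, and `SCHEMA` are hypothetical, the handling of a database
newer than expected is an assumption the patch does not cover, and the legacy
`processed.md` migration is left out.

```python
import sqlite3

EXPECTED_VERSION = 1  # hypothetical constant mirroring "PRAGMA user_version: 1"

# Schema copied from the Database section of AGENT.md in this patch.
SCHEMA = """
CREATE TABLE IF NOT EXISTS processed (
  platform TEXT NOT NULL,
  contact_id TEXT NOT NULL,
  status TEXT NOT NULL,
  last_checked INTEGER NOT NULL,
  metadata TEXT,
  PRIMARY KEY (platform, contact_id)
);
CREATE INDEX IF NOT EXISTS idx_status ON processed(status);
CREATE INDEX IF NOT EXISTS idx_last_checked ON processed(last_checked);
"""


def ensure_schema(conn: sqlite3.Connection) -> str:
    """Bring the database up to EXPECTED_VERSION and report what was done."""
    version = conn.execute("PRAGMA user_version").fetchone()[0]
    if version == EXPECTED_VERSION:
        return "ok"
    if version > EXPECTED_VERSION:
        # Assumption: refuse to touch a database written by a newer workflow.
        raise RuntimeError(f"database is newer than expected: {version}")
    if version == 0:
        # A new file and a pre-versioning database both report 0;
        # the CREATE ... IF NOT EXISTS statements are safe in either case.
        conn.executescript(SCHEMA)
    # Future versions would apply their ALTER TABLE steps here, in order.
    # PRAGMA does not accept bound parameters, so format the trusted integer.
    conn.execute(f"PRAGMA user_version = {EXPECTED_VERSION:d}")
    conn.commit()
    return "initialized" if version == 0 else "migrated"
```

Calling `ensure_schema(sqlite3.connect("processed.db"))` at the start of a run
would cover the "database missing", "user_version = 0", and "matches" cases with
a single query, because opening a missing file creates an empty database whose
user_version is 0.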

diff --git a/skills/workflow-builder/SKILL.md b/skills/workflow-builder/SKILL.md
index 8e21c09..ccf6f63 100644
--- a/skills/workflow-builder/SKILL.md
+++ b/skills/workflow-builder/SKILL.md
@@ -334,37 +334,17 @@ days") and let the LLM write the appropriate queries.
 
 #### Schema Versioning & Migration
 
-Every workflow that uses SQLite must track schema versions so upgrades happen
-automatically:
+Every workflow that uses SQLite should track schema versions using SQLite's built-in
+`PRAGMA user_version` (an integer stored in the database header — no extra tables):
 
-1. **Use SQLite's built-in `PRAGMA user_version`** to track schema version (no extra
-   tables needed)
-2. **Declare the expected version in AGENT.md** (e.g., `Schema version: 1`)
+1. **Put the schema inline in AGENT.md** — the LLM needs it to write queries anyway
+2. **Declare the expected version** (e.g., `PRAGMA user_version: 1`)
 3. **Each run checks:** `PRAGMA user_version`
-   - Matches → proceed (99% of runs, no extra reads)
-   - Lower → read `db-setup.md` for migration steps
-   - Database missing → read `db-setup.md` for initialization
-4. **Keep the schema definition in `db-setup.md`** — the calling LLM creates tables from
-   the schema, no need to inline SQL in AGENT.md
-5. **Keep migration steps in `db-setup.md`** — only read on version mismatch, missing
-   database, or legacy conversion
+   - Matches → proceed
+   - Lower or missing → create tables / apply migrations / set user_version
+4. **If legacy state files exist** (e.g., `processed.md`), migrate entries and archive
 
-**Per-workflow `db-setup.md`** contains:
-
-- Target schema with column reference
-- Schema version history table
-- Legacy migration instructions (e.g., `processed.md` → `processed.db`)
-- Versioned migration blocks (e.g., "Version 1 → 2: ALTER TABLE ADD COLUMN ...")
-- Common queries for reference
-
-This pattern handles all scenarios automatically:
-
-- **New server:** No database → initialization SQL creates it
-- **Legacy server:** `processed.md` exists → db-setup.md migration
-- **Schema upgrade pushed:** Version mismatch detected → db-setup.md migration
-- **Normal run:** Version matches → zero overhead
-
-See `workflows/contact-steward/db-setup.md` for a reference implementation.
+See `workflows/contact-steward/AGENT.md` for a reference implementation.
 
 **Rule in AGENT.md:** "On every run, read contextual state first (agent_notes.md,
 rules.md). Query tracking state via SQLite — one version check, then targeted queries.
@@ -514,10 +494,11 @@ If `rules.md` doesn't exist or is empty:
 
 ## Database (only if this workflow tracks processed items)
 
-**Schema version: 1** — See `db-setup.md` for full schema.
+**PRAGMA user_version: 1**
 
-Before processing, check `PRAGMA user_version`. If it doesn't match the version above,
-or the database is missing → read `db-setup.md`.
+<Schema definition inline — CREATE TABLE, indexes, column descriptions.> <Setup &
+migration instructions — what to do if database is missing, version is lower, or legacy
+state files exist.>
 
 ## Regular Operation
 
diff --git a/workflows/contact-steward/AGENT.md b/workflows/contact-steward/AGENT.md
index 5fcac8f..bc02e24 100644
--- a/workflows/contact-steward/AGENT.md
+++ b/workflows/contact-steward/AGENT.md
@@ -213,16 +213,40 @@ expected — don't try to process everything at once.
 
 ## Database
 
-**Schema version: 1** — See `db-setup.md` for the full schema definition.
+Tracking state lives in `processed.db` (SQLite). **PRAGMA user_version: 1**
+
+### Schema
+
+```sql
+CREATE TABLE IF NOT EXISTS processed (
+  platform TEXT NOT NULL,
+  contact_id TEXT NOT NULL,
+  status TEXT NOT NULL,
+  last_checked INTEGER NOT NULL,
+  metadata TEXT,
+  PRIMARY KEY (platform, contact_id)
+);
+
+CREATE INDEX IF NOT EXISTS idx_status ON processed(status);
+CREATE INDEX IF NOT EXISTS idx_last_checked ON processed(last_checked);
+```
+
+**Columns:** `platform` (whatsapp/imessage/quo), `contact_id` (phone/JID), `status`
+(classified/asked_human/skipped/enriched/error), `last_checked` (unix timestamp),
+`metadata` (brief notes).
+
+### Setup & Migration
 
-Tracking state lives in `processed.db` (SQLite). Before first scan, check
-`PRAGMA user_version` on the database:
+Before first scan, check `PRAGMA user_version`:
 
-- If `processed.db` doesn't exist → read `db-setup.md` to create it
-- If `processed.md` exists (legacy) → read `db-setup.md` for migration instructions
-- If `user_version` is lower than the schema version above → read `db-setup.md` for
-  upgrade steps
-- If `user_version` matches → proceed normally
+- **Database missing** → create it with the schema above, set `PRAGMA user_version = 1`
+- **user_version = 0** → tables may exist without version tracking. Run the CREATE IF
+  NOT EXISTS statements (idempotent), set `PRAGMA user_version = 1`
+- **user_version matches** → proceed
+- **user_version lower than current** → apply any needed ALTER TABLE changes for the new
+  version, then update user_version
+- **`processed.md` exists (legacy)** → create the database, migrate entries from the
+  markdown file into the processed table, archive as `processed.md.migrated`
 
 ## Each Run
 
@@ -408,8 +432,7 @@ spawns, the Classification Result block from the sub-agent]
 ## State
 
 `processed.db` is the tracking state (SQLite). It stores which contacts have been seen
-and their status. The database schema and setup instructions are in the Database section
-above. For migration and upgrade details, see `db-setup.md`.
+and their status. Schema and setup instructions are in the Database section above.
 
 Status values: `classified`, `asked_human`, `skipped`, `enriched`, `error`.
 
diff --git a/workflows/contact-steward/db-setup.md b/workflows/contact-steward/db-setup.md
deleted file mode 100644
index aa68716..0000000
--- a/workflows/contact-steward/db-setup.md
+++ /dev/null
@@ -1,118 +0,0 @@
-# Contact Steward — Database Setup & Migration
-
-Only read this file when AGENT.md directs you here — during first-time setup, legacy
-migration, or schema upgrade. Do not read on normal runs.
-
-## Prerequisites
-
-Verify sqlite3 is available:
-
-```bash
-which sqlite3
-```
-
-If not found:
-
-- **macOS:** Already installed at `/usr/bin/sqlite3`. If missing: `brew install sqlite`
-- **Ubuntu/Debian:** `sudo apt install sqlite3`
-
-## Schema Version History
-
-| Version | Changes                                                                                    | Date       |
-| ------- | ------------------------------------------------------------------------------------------ | ---------- |
-| 1       | Initial schema — processed table with platform, contact_id, status, last_checked, metadata | 2026-03-29 |
-
-## Target Schema (Current: Version 1)
-
-```sql
-CREATE TABLE IF NOT EXISTS processed (
-  platform TEXT NOT NULL,
-  contact_id TEXT NOT NULL,
-  status TEXT NOT NULL,
-  last_checked INTEGER NOT NULL,
-  metadata TEXT,
-  PRIMARY KEY (platform, contact_id)
-);
-
-CREATE INDEX IF NOT EXISTS idx_status ON processed(status);
-CREATE INDEX IF NOT EXISTS idx_last_checked ON processed(last_checked);
-```
-
-### Column Reference
-
-| Column       | Type    | Description                                                           |
-| ------------ | ------- | --------------------------------------------------------------------- |
-| platform     | TEXT    | `whatsapp`, `imessage`, or `quo`                                      |
-| contact_id   | TEXT    | Phone number, JID, or platform-specific identifier                    |
-| status       | TEXT    | One of: `classified`, `asked_human`, `skipped`, `enriched`, `error`   |
-| last_checked | INTEGER | Unix timestamp of last processing                                     |
-| metadata     | TEXT    | Brief notes (e.g., "enriched from WhatsApp", "spam — pizza delivery") |
-
-### Status Values
-
-- **classified** — Identity resolved, contact added to platform
-- **asked_human** — Couldn't resolve, asked human, awaiting response
-- **skipped** — Spam, business, automated, or human didn't reply
-- **enriched** — Existing contact updated with new details
-- **error** — Processing failed, will retry next run
-
-## Scenario: New Installation (No Database)
-
-Create the database using the target schema above, then set the version:
-
-```sql
-PRAGMA user_version = 1;
-```
-
-## Scenario: Legacy Migration (processed.md exists)
-
-If `processed.md` exists from a previous version, migrate its entries to SQLite.
-
-**Step 1:** Run the initialization SQL from AGENT.md to create the database.
-
-**Step 2:** Read `processed.md`. It's natural language grouped by platform. Each entry
-has an identifier, optional name, date, and status. For each entry, insert:
-
-```bash
-sqlite3 workflows/contact-steward/processed.db \
-  "INSERT OR IGNORE INTO processed (platform, contact_id, status, last_checked, metadata) \
-   VALUES ('<platform>', '<contact_id>', '<status>', <unix_timestamp>, '<notes>')"
-```
-
-Use `INSERT OR IGNORE` to skip duplicates safely. Map the natural language statuses to
-the standard values: classified, asked_human, skipped, enriched, error.
-
-**Step 3:** Verify by comparing counts:
-
-```bash
-sqlite3 workflows/contact-steward/processed.db \
-  "SELECT platform, COUNT(*) FROM processed GROUP BY platform"
-```
-
-**Step 4:** Archive the old file:
-
-```bash
-mv workflows/contact-steward/processed.md workflows/contact-steward/processed.md.migrated
-```
-
-Keep `.migrated` for a few weeks as a safety net, then delete it.
-
-## Scenario: Schema Upgrade (Version Mismatch)
-
-When AGENT.md's `schema_version` is higher than the database's version, apply migrations
-in order. Each migration block is idempotent — safe to re-run.
-
-### Migrating from Version 0 → 1 (user_version is 0)
-
-If `PRAGMA user_version` returns 0, the database was created before version tracking or
-is brand new. Ensure the processed table and indexes exist (the CREATE IF NOT EXISTS
-statements are idempotent), then set `PRAGMA user_version = 1`.
-
-<!-- Future migrations go here, labeled clearly:
-### Migrating from Version 1 → 2
-Description of what changed and why.
-```sql
-ALTER TABLE processed ADD COLUMN new_column TEXT;
-UPDATE schema_meta SET version = 2;
-```
--->

From bffd52ed162705eec3a4f5d3eee3eb1f6059ec03 Mon Sep 17 00:00:00 2001
From: Nick Sullivan <nick@technick.ai>
Date: Sun, 29 Mar 2026 13:05:14 -0500
Subject: [PATCH 5/5] Fix enrichment gap for tracked contacts with new messages
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Step 5d previously required "not in database AND saved contact", so contacts
already in processed.db with new messages since last_checked had no explicit path.
The enrichment check now applies to all saved contacts regardless of DB state.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---
 workflows/contact-steward/AGENT.md | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)
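
One reading of the revised branching in steps 5b to 5e, sketched as a small
Python function. This is an illustration only, not part of the patch: the
function name `triage`, its parameters, and the returned action labels are
hypothetical, and the real workflow expresses this logic as prose instructions
for the LLM.

```python
from typing import Optional


def triage(row_status: Optional[str], has_new_messages: bool,
           is_saved_contact: bool) -> str:
    """Return the next action for one conversation, following steps 5b-5e.

    row_status is the status stored in processed.db for this
    platform + contact_id, or None if there is no row yet.
    """
    if row_status is not None and row_status != "error" and not has_new_messages:
        return "skip"              # 5b: already tracked, nothing new
    if row_status == "error":
        return "retry_as_new"      # 5c: treat as new, counts toward the cap
    if is_saved_contact:
        return "check_enrichment"  # 5d: applies whether or not a row exists
    return "cross_reference"       # 5e: not saved, look on other platforms
```

The case this patch fixes is `triage("classified", True, True)`: a tracked,
saved contact with new messages now reaches the enrichment check instead of
falling through without an explicit path.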

diff --git a/workflows/contact-steward/AGENT.md b/workflows/contact-steward/AGENT.md
index bc02e24..f8e223e 100644
--- a/workflows/contact-steward/AGENT.md
+++ b/workflows/contact-steward/AGENT.md
@@ -257,15 +257,15 @@ Before first scan, check `PRAGMA user_version`:
    filters or larger `--limit` values to reach older threads)
 5. For each conversation where your human replied (oldest unprocessed first, max 10 Opus
    spawns per run — enrichment checks and skips don't count toward the cap): a. Check
-   processed.db for this platform + contact_id. b. If found and no new messages since
-   last_checked → skip. c. If found with status `error` → retry (counts toward cap). d.
-   Not in database and saved contact on platform? Check for enrichment (new messages
-   with contact-relevant info). If no new info, skip. e. Not a saved contact?
-   Cross-reference the phone number on other platforms (especially
-   `wacli contacts search <number>`) f. Found info (cross-reference match, profile name,
-   or conversation clues)? Spawn Opus with everything you gathered. Opus verifies and
-   writes the contact. g. No match anywhere? Spawn Opus with full conversation context
-   for detective work.
+   processed.db for this platform + contact_id. b. If found, not an `error`, and no new
+   messages since last_checked → skip. c. If found with status `error` → treat as new,
+   retry (counts toward cap). d. Is the other party a saved contact on this platform?
+   Check for enrichment (new messages with contact-relevant info). If no new info,
+   update last_checked and skip. e. Not a saved contact? Cross-reference the phone
+   number on other platforms (especially `wacli contacts search <number>`). f. Found info
+   (cross-reference match, profile name, or conversation clues)? Spawn Opus with
+   everything you gathered. Opus verifies and writes the contact. g. No match anywhere?
+   Spawn Opus with full conversation context for detective work.
 6. After each contact, upsert into processed.db with the outcome status and timestamp
 7. Notify your human with a batch summary of what was added and what needs their input
 8. If unprocessed contacts remain beyond the 10-per-run cap, note the count in the log