From 52b1b64359f3d727d1f58c14f9baff438c43875a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E8=B6=85=E6=B8=A1=E6=B3=95=E5=B8=AB?=
Date: Wed, 13 May 2026 22:09:10 +0000
Subject: [PATCH 01/10] docs(adr): goal-driven cronjob with disable_on_success

---
 docs/adr/goal-driven-cronjob.md | 224 ++++++++++++++++++++++++++++++++
 1 file changed, 224 insertions(+)
 create mode 100644 docs/adr/goal-driven-cronjob.md

diff --git a/docs/adr/goal-driven-cronjob.md b/docs/adr/goal-driven-cronjob.md
new file mode 100644
index 00000000..f8e56ac9
--- /dev/null
+++ b/docs/adr/goal-driven-cronjob.md
@@ -0,0 +1,224 @@
+# ADR: Goal-Driven CronJob (disable_on_success)
+
+- **Status:** Proposed
+- **Date:** 2026-05-13
+- **Author:** @chaodu-agent
+- **Related:** [Basic CronJob ADR](./basic-cronjob.md), [CronJob Docs](../cronjob.md)
+
+---
+
+## 1. User Story & Requirements
+
+As an OpenAB operator, I want to define a **goal** that agents must achieve, where a CronJob periodically checks if the goal is met and keeps prompting agents until it is — so that I can assign persistent objectives without manually following up.
+
+As a team lead, I want agents to self-organize ("escape room" mode) — I tell them the goal, not the steps.
+
+Requirements:
+- Extend existing `[[cron.jobs]]` with a `disable_on_success` field
+- Before sending the scheduled message, run the specified command
+- If command exits 0 → goal achieved, auto-disable the job, do NOT send message
+- If command exits non-zero → goal not met, send message as normal (agents continue working)
+- Auto-disable state must persist across restarts
+- Human can re-enable a completed goal by bumping `generation`
+- All communication stays in a single stable thread
+
+---
+
+## 2. Context & Decision Drivers
+
+### The "Escape Room" Pattern
+
+Traditional agent interaction is reactive: human sends message, agent responds. This ADR introduces **goal-driven** interaction: human sets an objective, agents work autonomously across multiple rounds until the objective is met.
+
+The key insight: we don't need a complex goal orchestrator for Phase 1. The existing CronJob scheduler already provides periodic execution — we just need to add a "stop condition."
+
+### Why Extend CronJob (Not a New System)
+
+We considered two approaches:
+
+| Approach | Pros | Cons |
+|----------|------|------|
+| New `[[goals]]` config section | Clean separation, dedicated semantics | New scheduler, new state machine, large MVP |
+| Extend `[[cron.jobs]]` | Minimal change, reuses existing infra | Slightly overloaded config section |
+
+**Decision: Extend `[[cron.jobs]]`** — Phase 1 is literally "cron + exit check + auto-disable." The existing scheduler, channel routing, and thread handling all apply. A full goal runner with state delta detection and escalation is deferred to Phase 2.
+
+### Design Principle: Smallest Useful Increment
+
+> "Don't build a goal orchestrator when a conditional cron job will do."
+
+Phase 1 proves the concept. Phase 2 adds sophistication only after validation.
+
+---
+
+## 3. Design
+
+### Configuration
+
+```toml
+[[cron.jobs]]
+id = "unit-tests-pass" # required for disable_on_success jobs
+schedule = "*/10 * * * *"
+channel = "123456789012345678"
+thread_id = "" # auto-created on first fire if empty
+message = "Goal not met: all unit tests must pass. Please continue working."
+disable_on_success = "npm test" # command to evaluate goal
+disable_on_success_timeout_secs = 60 # command timeout
+disable_on_success_working_dir = "/repo" # working directory
+generation = 1 # bump to re-enable after auto-disable
+```
+
+### New Fields
+
+| Field | Required | Default | Description |
+|-------|----------|---------|-------------|
+| `id` | ✅ (when `disable_on_success` set) | — | Stable unique identifier for state persistence |
+| `disable_on_success` | | — | Shell command; exit 0 = goal achieved, auto-disable |
+| `disable_on_success_timeout_secs` | | `60` | Max seconds before command is killed |
+| `disable_on_success_working_dir` | | — | Working directory for command execution |
+| `generation` | | `1` | Bump to re-enable an auto-disabled job |
+
+### Execution Flow
+
+```
+CronJob schedule fires
+        │
+        ▼
+  Is job auto-disabled AND config generation == persisted generation?
+        │
+   ┌────┴────┐
+  Yes        No
+   │         │
+   ▼         ▼
+ Skip     Run disable_on_success command
+ (done)      │
+        ┌────┴────┐
+        │ exit 0? │
+        └────┬────┘
+       Yes   │   No / Timeout
+        │    │    │
+        ▼    │    ▼
+  Auto-disable    Send message
+  job, persist    to channel/thread
+  state           (agents keep working)
+```
+
+### State Persistence
+
+Persisted in `cron-state.json`, keyed by job `id`:
+
+```json
+{
+  "unit-tests-pass": {
+    "generation": 1,
+    "auto_disabled": true,
+    "auto_disabled_at": "2026-05-13T22:00:00Z",
+    "thread_id": "1504239931940409587"
+  }
+}
+```
+
+Loaded on startup, written on state change.
+
+### Re-enable Logic
+
+```
+config.generation > persisted.generation?
+  │
+  Yes → Clear auto_disabled state, job runs again
+  No  → Job stays disabled
+```
+
+Human bumps `generation = 2` in config → job reactivates. No ambiguity, no conflict with `enabled` field.
+
+### Thread Lifecycle
+
+| Scenario | Behavior |
+|----------|----------|
+| `thread_id` provided in config | Use that thread for all fires |
+| `thread_id` empty | Auto-create thread on first fire, persist in state |
+
+All messages go to the **same thread** — agents need conversation history as context across rounds.
+
+### Security
+
+| Concern | Mitigation |
+|---------|-----------|
+| Arbitrary shell execution | Trust config source (same as existing cron). Only maintainers edit config. |
+| Runaway commands | `disable_on_success_timeout_secs` kills long-running processes |
+| Command injection | Config is static TOML, not user-input at runtime |
+
+Future phases may add container isolation or command whitelists.
+
+---
+
+## 4. Implementation Plan
+
+### Phase 1 (This ADR)
+
+1. Parse new fields from `[[cron.jobs]]` config
+2. On cron fire, if `disable_on_success` is set:
+   - Check persisted state (generation match → skip if auto-disabled)
+   - Execute command with timeout and working_dir
+   - exit 0 → persist auto-disabled state, skip message
+   - exit != 0 / timeout → send message as normal
+3. Thread auto-creation: if `thread_id` empty, create thread on first fire, persist
+4. State file: read/write `cron-state.json`
+
+### Phase 2 (Future — Not This ADR)
+
+Introduce `[[goals]]` config section with:
+- `progress_check` — state delta detection between rounds
+- `stuck_threshold` — escalate after N rounds without progress
+- `max_rounds` — hard cap
+- LLM judge — tie-breaker after command passes
+- Escalation messages with decision options
+- Round counter and progress reporting
+
+Phase 1 `[[cron.jobs]]` entries with `disable_on_success` remain valid and coexist with Phase 2 `[[goals]]` — no migration required.
+
+---
+
+## 5. Test Scenarios
+
+### Happy Path
+
+1. Repo has one failing test
+2. Cron fires every 10 min with `disable_on_success = "npm test"`
+3. `npm test` fails → message sent → agents discuss and fix
+4. Next fire → `npm test` passes → job auto-disables, no message
+
+### Restart Resilience
+
+1. Job is auto-disabled (goal achieved)
+2. Process restarts
+3. State loaded from `cron-state.json` → job stays disabled
+
+### Re-enable
+
+1. Job is auto-disabled (`generation: 1` in state)
+2. Human bumps config to `generation = 2`
+3. Next fire → generation mismatch → clear auto-disable → run command again
+
+### Timeout
+
+1. `disable_on_success` command hangs
+2. After `disable_on_success_timeout_secs` → killed
+3. Treated as failure → message sent
+
+---
+
+## 6. Open Questions
+
+1. **Multi-agent coordination** — How do agents avoid conflicting actions when self-organizing?
+2. **Observability** — Should we log command output / exit codes for debugging?
+3. **Context overflow** — Long-running goals accumulate thread history; summarization strategy TBD
+4. **Notification on success** — Should auto-disable post a "✅ Goal achieved" message, or silently stop?
+
+---
+
+## 7. References
+
+- [Basic CronJob ADR](./basic-cronjob.md)
+- [CronJob Docs](../cronjob.md)
+- [Design Discussion (Discord)](https://discord.com/channels/1491295327620169908/1504239931940409587)

From bbc472639ecee8e77ecae8a55bf6c86cc987e62d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E8=B6=85=E6=B8=A1=E6=B3=95=E5=B8=AB?=
Date: Wed, 13 May 2026 22:10:07 +0000
Subject: [PATCH 02/10] docs(adr): fix stale field references

---
 docs/adr/goal-driven-cronjob.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/adr/goal-driven-cronjob.md b/docs/adr/goal-driven-cronjob.md
index f8e56ac9..7dc47b61 100644
--- a/docs/adr/goal-driven-cronjob.md
+++ b/docs/adr/goal-driven-cronjob.md
@@ -129,7 +129,7 @@ config.generation > persisted.generation?
   No  → Job stays disabled
 ```

-Human bumps `generation = 2` in config → job reactivates. No ambiguity, no conflict with `enabled` field.
+Human bumps `generation = 2` in config → job reactivates. No ambiguity, no conflict with existing fields.

 ### Thread Lifecycle

@@ -159,9 +159,9 @@ Future phases may add container isolation or command whitelists.
 1. Parse new fields from `[[cron.jobs]]` config
 2. On cron fire, if `disable_on_success` is set:
    - Check persisted state (generation match → skip if auto-disabled)
-   - Execute command with timeout and working_dir
+   - Execute command with `disable_on_success_timeout_secs` and `disable_on_success_working_dir`
    - exit 0 → persist auto-disabled state, skip message
-   - exit != 0 / timeout → send message as normal
+   - exit != 0 / timeout exceeded → send message as normal
 3. Thread auto-creation: if `thread_id` empty, create thread on first fire, persist
 4. State file: read/write `cron-state.json`

From 6fd31f1caf0edf3629d132aa541ba450e984e1bc Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E8=B6=85=E6=B8=A1=E6=B3=95=E5=B8=AB?=
Date: Wed, 13 May 2026 22:11:10 +0000
Subject: [PATCH 03/10] docs(adr): usercron-only constraint, no separate state file

---
 docs/adr/goal-driven-cronjob.md | 57 ++++++++++++++++-----------------
 1 file changed, 28 insertions(+), 29 deletions(-)

diff --git a/docs/adr/goal-driven-cronjob.md b/docs/adr/goal-driven-cronjob.md
index 7dc47b61..7e12d85d 100644
--- a/docs/adr/goal-driven-cronjob.md
+++ b/docs/adr/goal-driven-cronjob.md
@@ -55,7 +55,10 @@ Phase 1 proves the concept. Phase 2 adds sophistication only after validation.

 ### Configuration

+`disable_on_success` is **only supported in usercron** (`.openab/usercron/cronjob.toml`), NOT in global config. This is because auto-disable needs to write state back to the file, and only usercron is agent-writable.
+
 ```toml
+# .openab/usercron/cronjob.toml
 [[cron.jobs]]
 id = "unit-tests-pass" # required for disable_on_success jobs
 schedule = "*/10 * * * *"
 channel = "123456789012345678"
@@ -66,6 +69,7 @@ disable_on_success = "npm test" # command to evaluate goal
 disable_on_success_timeout_secs = 60 # command timeout
 disable_on_success_working_dir = "/repo" # working directory
 generation = 1 # bump to re-enable after auto-disable
+enabled = true # agent sets to false on success
 ```

 ### New Fields
@@ -105,29 +109,23 @@ CronJob schedule fires

 ### State Persistence

-Persisted in `cron-state.json`, keyed by job `id`:
-
-```json
-{
-  "unit-tests-pass": {
-    "generation": 1,
-    "auto_disabled": true,
-    "auto_disabled_at": "2026-05-13T22:00:00Z",
-    "thread_id": "1504239931940409587"
-  }
-}
-```
+No separate state file needed. When goal is achieved, agent writes `enabled = false` directly to `.openab/usercron/cronjob.toml`. State lives in the config itself.
+
+| Event | Action |
+|-------|--------|
+| Goal achieved (exit 0) | Agent sets `enabled = false` in usercron file |
+| Human re-enables | Human sets `enabled = true` (or bumps `generation`) |
+| Thread auto-created | Agent writes `thread_id` back to usercron file |

-Loaded on startup, written on state change.
+This works because usercron is designed to be agent-writable, unlike global config.

 ### Re-enable Logic

-```
-config.generation > persisted.generation?
-  │
-  Yes → Clear auto_disabled state, job runs again
-  No  → Job stays disabled
-```
+Human edits `.openab/usercron/cronjob.toml`:
+- Set `enabled = true`, or
+- Bump `generation` (e.g. `generation = 2`)
+
+Either action signals intentional re-activation. No ambiguity, no separate state file to manage.

 Human bumps `generation = 2` in config → job reactivates. No ambiguity, no conflict with existing fields.

@@ -156,14 +154,15 @@ Future phases may add container isolation or command whitelists.

 ### Phase 1 (This ADR)

-1. Parse new fields from `[[cron.jobs]]` config
+1. Parse new fields from usercron `[[cron.jobs]]` (`.openab/usercron/cronjob.toml`)
 2. On cron fire, if `disable_on_success` is set:
-   - Check persisted state (generation match → skip if auto-disabled)
+   - Check `enabled` — if false, skip
+   - Check `generation` — if config > last known, treat as re-enabled
    - Execute command with `disable_on_success_timeout_secs` and `disable_on_success_working_dir`
-   - exit 0 → persist auto-disabled state, skip message
+   - exit 0 → write `enabled = false` to usercron file, skip message
    - exit != 0 / timeout exceeded → send message as normal
-3. Thread auto-creation: if `thread_id` empty, create thread on first fire, persist
-4. State file: read/write `cron-state.json`
+3. Thread auto-creation: if `thread_id` empty, create thread on first fire, write back to usercron file
+4. No separate state file — usercron IS the state

 ### Phase 2 (Future — Not This ADR)

@@ -190,15 +189,15 @@ Phase 1 `[[cron.jobs]]` entries with `disable_on_success` remain valid and coexi

 ### Restart Resilience

-1. Job is auto-disabled (goal achieved)
+1. Job is auto-disabled (agent wrote `enabled = false` to usercron)
 2. Process restarts
-3. State loaded from `cron-state.json` → job stays disabled
+3. Usercron loaded → `enabled = false` → job stays disabled

 ### Re-enable

-1. Job is auto-disabled (`generation: 1` in state)
-2. Human bumps config to `generation = 2`
-3. Next fire → generation mismatch → clear auto-disable → run command again
+1. Job is disabled (`enabled = false` in usercron)
+2. Human edits usercron: sets `enabled = true` (or bumps `generation`)
+3. Next fire → job runs again

 ### Timeout

From ed5c3dc90b36fe90076858153a7014a3a1600905 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E8=B6=85=E6=B8=A1=E6=B3=95=E5=B8=AB?=
Date: Wed, 13 May 2026 22:12:08 +0000
Subject: [PATCH 04/10] docs(adr): success notification mandatory, id validation is startup error

---
 docs/adr/goal-driven-cronjob.md | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/docs/adr/goal-driven-cronjob.md b/docs/adr/goal-driven-cronjob.md
index 7e12d85d..18a27941 100644
--- a/docs/adr/goal-driven-cronjob.md
+++ b/docs/adr/goal-driven-cronjob.md
@@ -76,7 +76,7 @@ enabled = true # agent sets to false on succe
 ### New Fields

 | Field | Required | Default | Description |
 |-------|----------|---------|-------------|
-| `id` | ✅ (when `disable_on_success` set) | — | Stable unique identifier for state persistence |
+| `id` | ✅ (when `disable_on_success` set) | — | Stable unique identifier for state persistence. Missing `id` on a job with `disable_on_success` is a **startup error**. |
 | `disable_on_success` | | — | Shell command; exit 0 = goal achieved, auto-disable |
 | `disable_on_success_timeout_secs` | | `60` | Max seconds before command is killed |
 | `disable_on_success_working_dir` | | — | Working directory for command execution |
 | `generation` | | `1` | Bump to re-enable an auto-disabled job |
@@ -113,7 +113,7 @@ No separate state file needed. When goal is achieved, agent writes `enabled = fa

 | Event | Action |
 |-------|--------|
-| Goal achieved (exit 0) | Agent sets `enabled = false` in usercron file |
+| Goal achieved (exit 0) | Post `✅ Goal achieved: ` to thread, then set `enabled = false` in usercron file |
 | Human re-enables | Human sets `enabled = true` (or bumps `generation`) |
 | Thread auto-created | Agent writes `thread_id` back to usercron file |

@@ -159,7 +159,7 @@ Future phases may add container isolation or command whitelists.
    - Check `enabled` — if false, skip
    - Check `generation` — if config > last known, treat as re-enabled
    - Execute command with `disable_on_success_timeout_secs` and `disable_on_success_working_dir`
-   - exit 0 → write `enabled = false` to usercron file, skip message
+   - exit 0 → post `✅ Goal achieved` to thread, write `enabled = false` to usercron file
    - exit != 0 / timeout exceeded → send message as normal
 3. Thread auto-creation: if `thread_id` empty, create thread on first fire, write back to usercron file
 4. No separate state file — usercron IS the state
@@ -212,7 +212,6 @@ Phase 1 `[[cron.jobs]]` entries with `disable_on_success` remain valid and coexi
 1. **Multi-agent coordination** — How do agents avoid conflicting actions when self-organizing?
 2. **Observability** — Should we log command output / exit codes for debugging?
 3. **Context overflow** — Long-running goals accumulate thread history; summarization strategy TBD
-4. **Notification on success** — Should auto-disable post a "✅ Goal achieved" message, or silently stop?

 ---

From d930465899336c0066face89cc46cab5fd1e58f9 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E8=B6=85=E6=B8=A1=E6=B3=95=E5=B8=AB?=
Date: Wed, 13 May 2026 22:13:58 +0000
Subject: [PATCH 05/10] =?UTF-8?q?docs(adr):=20align=20with=20existing=20us?=
 =?UTF-8?q?ercron=20contract=20=E2=80=94=20correct=20path,=20format,=20act?=
 =?UTF-8?q?or?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 docs/adr/goal-driven-cronjob.md | 46 +++++++++++++++++----------------
 1 file changed, 24 insertions(+), 22 deletions(-)

diff --git a/docs/adr/goal-driven-cronjob.md b/docs/adr/goal-driven-cronjob.md
index 18a27941..fe907b94 100644
--- a/docs/adr/goal-driven-cronjob.md
+++ b/docs/adr/goal-driven-cronjob.md
@@ -55,11 +55,11 @@ Phase 1 proves the concept. Phase 2 adds sophistication only after validation.

 ### Configuration

-`disable_on_success` is **only supported in usercron** (`.openab/usercron/cronjob.toml`), NOT in global config. This is because auto-disable needs to write state back to the file, and only usercron is agent-writable.
+`disable_on_success` is **only supported in usercron** (`$HOME/.openab/cronjob.toml`), NOT in global config. This is because auto-disable needs to write state back to the file, and only usercron is writable by the OpenAB scheduler at runtime.

 ```toml
-# .openab/usercron/cronjob.toml
-[[cron.jobs]]
+# $HOME/.openab/cronjob.toml (usercron format uses [[jobs]])
+[[jobs]]
 id = "unit-tests-pass" # required for disable_on_success jobs
 schedule = "*/10 * * * *"
 channel = "123456789012345678"
@@ -69,7 +69,7 @@ disable_on_success = "npm test" # command to evaluate goal
 disable_on_success_timeout_secs = 60 # command timeout
 disable_on_success_working_dir = "/repo" # working directory
 generation = 1 # bump to re-enable after auto-disable
-enabled = true # agent sets to false on success
+enabled = true # scheduler sets to false on success
 ```

 ### New Fields
@@ -109,25 +109,28 @@ CronJob schedule fires

 ### State Persistence

-No separate state file needed. When goal is achieved, agent writes `enabled = false` directly to `.openab/usercron/cronjob.toml`. State lives in the config itself.
+No separate state file needed. When goal is achieved, the **OpenAB scheduler** writes `enabled = false` directly to `$HOME/.openab/cronjob.toml`. State lives in the config itself.

 | Event | Action |
 |-------|--------|
-| Goal achieved (exit 0) | Post `✅ Goal achieved: ` to thread, then set `enabled = false` in usercron file |
-| Human re-enables | Human sets `enabled = true` (or bumps `generation`) |
-| Thread auto-created | Agent writes `thread_id` back to usercron file |
+| Goal achieved (exit 0) | Scheduler posts `✅ Goal achieved: ` to thread, then sets `enabled = false` in usercron file |
+| Human re-enables | Human sets `enabled = true` and/or bumps `generation` in usercron file |
+| Thread auto-created | Scheduler writes `thread_id` back to usercron file |

-This works because usercron is designed to be agent-writable, unlike global config.
+This works because usercron is designed to be runtime-writable (hot-reloaded by the scheduler), unlike global config.

-### Re-enable Logic
+**`enabled` vs `generation` interaction:**
+- Scheduler checks `enabled` first — if `false`, job is skipped entirely
+- `generation` is checked only when `enabled = true` — if config `generation` > last-known generation at time of auto-disable, the job is treated as re-enabled
+- To re-enable: human sets `enabled = true` (scheduler won't auto-re-enable a disabled job just by bumping generation alone)

-Human edits `.openab/usercron/cronjob.toml`:
-- Set `enabled = true`, or
-- Bump `generation` (e.g. `generation = 2`)
+### Re-enable Logic

-Either action signals intentional re-activation. No ambiguity, no separate state file to manage.
+Human edits `$HOME/.openab/cronjob.toml`:
+1. Set `enabled = true` (required — this is what the scheduler checks first)
+2. Optionally bump `generation` (signals a fresh start, resets any internal tracking)

-Human bumps `generation = 2` in config → job reactivates. No ambiguity, no conflict with existing fields.
+The scheduler hot-reloads the file, sees `enabled = true`, and resumes firing.

 ### Thread Lifecycle

@@ -154,14 +157,13 @@ Future phases may add container isolation or command whitelists.

 ### Phase 1 (This ADR)

-1. Parse new fields from usercron `[[cron.jobs]]` (`.openab/usercron/cronjob.toml`)
+1. Parse new fields from usercron `[[jobs]]` (`$HOME/.openab/cronjob.toml`)
 2. On cron fire, if `disable_on_success` is set:
    - Check `enabled` — if false, skip
-   - Check `generation` — if config > last known, treat as re-enabled
    - Execute command with `disable_on_success_timeout_secs` and `disable_on_success_working_dir`
-   - exit 0 → post `✅ Goal achieved` to thread, write `enabled = false` to usercron file
+   - exit 0 → scheduler posts `✅ Goal achieved` to thread, writes `enabled = false` to usercron file
    - exit != 0 / timeout exceeded → send message as normal
-3. Thread auto-creation: if `thread_id` empty, create thread on first fire, write back to usercron file
+3. Thread auto-creation: if `thread_id` empty, create thread on first fire, scheduler writes back to usercron file
 4. No separate state file — usercron IS the state

 ### Phase 2 (Future — Not This ADR)
@@ -189,15 +191,15 @@ Phase 1 `[[cron.jobs]]` entries with `disable_on_success` remain valid and coexi

 ### Restart Resilience

-1. Job is auto-disabled (agent wrote `enabled = false` to usercron)
+1. Job is auto-disabled (scheduler wrote `enabled = false` to usercron)
 2. Process restarts
 3. Usercron loaded → `enabled = false` → job stays disabled

 ### Re-enable

 1. Job is disabled (`enabled = false` in usercron)
-2. Human edits usercron: sets `enabled = true` (or bumps `generation`)
-3. Next fire → job runs again
+2. Human edits `$HOME/.openab/cronjob.toml`: sets `enabled = true`
+3. Scheduler hot-reloads → job runs again

 ### Timeout

From 8194ed6339a3bccde119d2be83497f352ea6c4c3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E8=B6=85=E6=B8=A1=E6=B3=95=E5=B8=AB?=
Date: Wed, 13 May 2026 22:15:24 +0000
Subject: [PATCH 06/10] docs(adr): remove generation, simplify re-enable to enabled=true only, fix flow diagram

---
 docs/adr/goal-driven-cronjob.md | 26 +++++++++-----------------
 1 file changed, 9 insertions(+), 17 deletions(-)

diff --git a/docs/adr/goal-driven-cronjob.md b/docs/adr/goal-driven-cronjob.md
index fe907b94..fc5a9cce 100644
--- a/docs/adr/goal-driven-cronjob.md
+++ b/docs/adr/goal-driven-cronjob.md
@@ -19,7 +19,7 @@ Requirements:
 - If command exits 0 → goal achieved, auto-disable the job, do NOT send message
 - If command exits non-zero → goal not met, send message as normal (agents continue working)
 - Auto-disable state must persist across restarts
-- Human can re-enable a completed goal by bumping `generation`
+- Human can re-enable a completed goal by setting `enabled = true`
 - All communication stays in a single stable thread

 ---
@@ -68,7 +68,6 @@ message = "Goal not met: all unit tests must pass. Please continue working."
 disable_on_success = "npm test" # command to evaluate goal
 disable_on_success_timeout_secs = 60 # command timeout
 disable_on_success_working_dir = "/repo" # working directory
-generation = 1 # bump to re-enable after auto-disable
 enabled = true # scheduler sets to false on success
 ```

@@ -80,7 +79,6 @@ enabled = true # scheduler sets to false on s
 | `disable_on_success` | | — | Shell command; exit 0 = goal achieved, auto-disable |
 | `disable_on_success_timeout_secs` | | `60` | Max seconds before command is killed |
 | `disable_on_success_working_dir` | | — | Working directory for command execution |
-| `generation` | | `1` | Bump to re-enable an auto-disabled job |

 ### Execution Flow

@@ -88,7 +86,7 @@ enabled = true # scheduler sets to false on s
 CronJob schedule fires
         │
         ▼
-  Is job auto-disabled AND config generation == persisted generation?
+  Is enabled = false in usercron?
         │
    ┌────┴────┐
   Yes        No
@@ -102,9 +100,9 @@ CronJob schedule fires
        Yes   │   No / Timeout
         │    │    │
         ▼    │    ▼
-  Auto-disable    Send message
-  job, persist    to channel/thread
-  state           (agents keep working)
+  Post ✅,         Send message
+  set enabled     to channel/thread
+  = false         (agents keep working)
 ```

 ### State Persistence
@@ -114,23 +112,17 @@ No separate state file needed. When goal is achieved, the **OpenAB scheduler** w
 | Event | Action |
 |-------|--------|
 | Goal achieved (exit 0) | Scheduler posts `✅ Goal achieved: ` to thread, then sets `enabled = false` in usercron file |
-| Human re-enables | Human sets `enabled = true` and/or bumps `generation` in usercron file |
+| Human re-enables | Human sets `enabled = true` in usercron file |
 | Thread auto-created | Scheduler writes `thread_id` back to usercron file |

 This works because usercron is designed to be runtime-writable (hot-reloaded by the scheduler), unlike global config.

-**`enabled` vs `generation` interaction:**
-- Scheduler checks `enabled` first — if `false`, job is skipped entirely
-- `generation` is checked only when `enabled = true` — if config `generation` > last-known generation at time of auto-disable, the job is treated as re-enabled
-- To re-enable: human sets `enabled = true` (scheduler won't auto-re-enable a disabled job just by bumping generation alone)
-
 ### Re-enable Logic

 Human edits `$HOME/.openab/cronjob.toml`:
-1. Set `enabled = true` (required — this is what the scheduler checks first)
-2. Optionally bump `generation` (signals a fresh start, resets any internal tracking)
+- Set `enabled = true`

-The scheduler hot-reloads the file, sees `enabled = true`, and resumes firing.
+That's it. Scheduler hot-reloads the file, sees `enabled = true`, and resumes firing. No generation counter, no state comparison needed.

 ### Thread Lifecycle

@@ -199,7 +191,7 @@ Phase 1 `[[cron.jobs]]` entries with `disable_on_success` remain valid and coexi

 1. Job is disabled (`enabled = false` in usercron)
 2. Human edits `$HOME/.openab/cronjob.toml`: sets `enabled = true`
-3. Scheduler hot-reloads → job runs again
+3. Scheduler hot-reloads → job fires again on next schedule

 ### Timeout

From 6aaa75088acb08872c1bdb64ecd10750a229cbde Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E8=B6=85=E6=B8=A1=E6=B3=95=E5=B8=AB?=
Date: Wed, 13 May 2026 22:16:10 +0000
Subject: [PATCH 07/10] =?UTF-8?q?docs(adr):=20fix=20Happy=20Path=20step=20?=
 =?UTF-8?q?4=20=E2=80=94=20success=20notification?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 docs/adr/goal-driven-cronjob.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/adr/goal-driven-cronjob.md b/docs/adr/goal-driven-cronjob.md
index fc5a9cce..c4ed168c 100644
--- a/docs/adr/goal-driven-cronjob.md
+++ b/docs/adr/goal-driven-cronjob.md
@@ -179,7 +179,7 @@ Phase 1 `[[cron.jobs]]` entries with `disable_on_success` remain valid and coexi
 1. Repo has one failing test
 2. Cron fires every 10 min with `disable_on_success = "npm test"`
 3. `npm test` fails → message sent → agents discuss and fix
-4. Next fire → `npm test` passes → job auto-disables, no message
+4. Next fire → `npm test` passes → scheduler posts `✅ Goal achieved`, sets `enabled = false`

 ### Restart Resilience

From 294f443d509e5dee000de95e6f0e8f7b627965a6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E8=B6=85=E6=B8=A1=E6=B3=95=E5=B8=AB?=
Date: Wed, 13 May 2026 22:17:24 +0000
Subject: [PATCH 08/10] docs(adr): fix remaining [[cron.jobs]] refs, clarify success notification

---
 docs/adr/goal-driven-cronjob.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/adr/goal-driven-cronjob.md b/docs/adr/goal-driven-cronjob.md
index c4ed168c..00aced63 100644
--- a/docs/adr/goal-driven-cronjob.md
+++ b/docs/adr/goal-driven-cronjob.md
@@ -14,9 +14,9 @@ As an OpenAB operator, I want to define a **goal** that agents must achieve, whe
 As a team lead, I want agents to self-organize ("escape room" mode) — I tell them the goal, not the steps.

 Requirements:
-- Extend existing `[[cron.jobs]]` with a `disable_on_success` field
+- Extend existing usercron `[[jobs]]` with a `disable_on_success` field
 - Before sending the scheduled message, run the specified command
-- If command exits 0 → goal achieved, auto-disable the job, do NOT send message
+- If command exits 0 → goal achieved, post `✅ Goal achieved` to thread, auto-disable the job, do NOT send the regular failure message
 - If command exits non-zero → goal not met, send message as normal (agents continue working)
 - Auto-disable state must persist across restarts
 - Human can re-enable a completed goal by setting `enabled = true`
@@ -39,9 +39,9 @@ We considered two approaches:
 | Approach | Pros | Cons |
 |----------|------|------|
 | New `[[goals]]` config section | Clean separation, dedicated semantics | New scheduler, new state machine, large MVP |
-| Extend `[[cron.jobs]]` | Minimal change, reuses existing infra | Slightly overloaded config section |
+| Extend usercron `[[jobs]]` | Minimal change, reuses existing infra | Slightly overloaded config section |

-**Decision: Extend `[[cron.jobs]]`** — Phase 1 is literally "cron + exit check + auto-disable." The existing scheduler, channel routing, and thread handling all apply. A full goal runner with state delta detection and escalation is deferred to Phase 2.
+**Decision: Extend usercron `[[jobs]]`** — Phase 1 is literally "cron + exit check + auto-disable." The existing scheduler, channel routing, and thread handling all apply. A full goal runner with state delta detection and escalation is deferred to Phase 2.

 ### Design Principle: Smallest Useful Increment

@@ -168,7 +168,7 @@ Introduce `[[goals]]` config section with:
 - Escalation messages with decision options
 - Round counter and progress reporting

-Phase 1 `[[cron.jobs]]` entries with `disable_on_success` remain valid and coexist with Phase 2 `[[goals]]` — no migration required.
+Phase 1 usercron `[[jobs]]` entries with `disable_on_success` remain valid and coexist with Phase 2 `[[goals]]` — no migration required.

 ---

From 8b900ad0c1bdf485ee732fcdf6119ce946a64496 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=E8=B6=85=E6=B8=A1=E6=B3=95=E5=B8=AB?=
Date: Wed, 13 May 2026 22:58:27 +0000
Subject: [PATCH 09/10] docs(adr): add disable_on_success_match for explicit goal confirmation

---
 docs/adr/goal-driven-cronjob.md | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/docs/adr/goal-driven-cronjob.md b/docs/adr/goal-driven-cronjob.md
index 00aced63..bc60dd17 100644
--- a/docs/adr/goal-driven-cronjob.md
+++ b/docs/adr/goal-driven-cronjob.md
@@ -16,7 +16,7 @@ As a team lead, I want agents to self-organize ("escape room" mode) — I tell t
 Requirements:
 - Extend existing usercron `[[jobs]]` with a `disable_on_success` field
 - Before sending the scheduled message, run the specified command
-- If command exits 0 → goal achieved, post `✅ Goal achieved` to thread, auto-disable the job, do NOT send the regular failure message
+- If command exits 0 (and stdout contains `disable_on_success_match` if set) → goal achieved, post `✅ Goal achieved` to thread, auto-disable the job, do NOT send the regular failure message
 - If command exits non-zero → goal not met, send message as normal (agents continue working)
 - Auto-disable state must persist across restarts
 - Human can re-enable a completed goal by setting `enabled = true`
@@ -66,6 +66,7 @@ channel = "123456789012345678"
 thread_id = "" # auto-created on first fire if empty
 message = "Goal not met: all unit tests must pass. Please continue working."
disable_on_success = "npm test" # command to evaluate goal +disable_on_success_match = "SUCCESS" # optional: stdout must contain this string disable_on_success_timeout_secs = 60 # command timeout disable_on_success_working_dir = "/repo" # working directory enabled = true # scheduler sets to false on success @@ -76,7 +77,8 @@ enabled = true # scheduler sets to false on s | Field | Required | Default | Description | |-------|----------|---------|-------------| | `id` | ✅ (when `disable_on_success` set) | — | Stable unique identifier for state persistence. Missing `id` on a job with `disable_on_success` is a **startup error**. | -| `disable_on_success` | | — | Shell command; exit 0 = goal achieved, auto-disable | +| `disable_on_success` | | — | Shell command; exit 0 + match = goal achieved, auto-disable | +| `disable_on_success_match` | | — | If set, stdout must contain this string (in addition to exit 0) for goal to be considered achieved | | `disable_on_success_timeout_secs` | | `60` | Max seconds before command is killed | | `disable_on_success_working_dir` | | — | Working directory for command execution | @@ -100,9 +102,22 @@ CronJob schedule fires Yes │ No / Timeout │ │ │ ▼ │ ▼ - Post ✅, Send message - set enabled to channel/thread - = false (agents keep working) + match set? Send message + │ │ to channel/thread + Yes No (agents keep working) + │ │ + ▼ ▼ + stdout Post ✅, + contains set enabled + match? 
= false + │ + ┌───┴───┐ + Yes No + │ │ + ▼ ▼ +Post ✅, Send message +set enabled (goal not confirmed) += false ``` ### State Persistence From b7c5281eaf4aa796810671ad5d8701bdd58204f6 Mon Sep 17 00:00:00 2001 From: chaodufashi Date: Wed, 13 May 2026 22:59:14 +0000 Subject: [PATCH 10/10] docs(adr): require success marker for goal completion --- docs/adr/goal-driven-cronjob.md | 52 +++++++++++++++------------------ 1 file changed, 24 insertions(+), 28 deletions(-) diff --git a/docs/adr/goal-driven-cronjob.md b/docs/adr/goal-driven-cronjob.md index bc60dd17..1f6c61db 100644 --- a/docs/adr/goal-driven-cronjob.md +++ b/docs/adr/goal-driven-cronjob.md @@ -14,9 +14,10 @@ As an OpenAB operator, I want to define a **goal** that agents must achieve, whe As a team lead, I want agents to self-organize ("escape room" mode) — I tell them the goal, not the steps. Requirements: -- Extend existing usercron `[[jobs]]` with a `disable_on_success` field +- Extend existing usercron `[[jobs]]` with `disable_on_success` fields - Before sending the scheduled message, run the specified command -- If command exits 0 (and stdout contains `disable_on_success_match` if set) → goal achieved, post `✅ Goal achieved` to thread, auto-disable the job, do NOT send the regular failure message +- If command exits 0 and prints the configured `disable_on_success_match` string to stdout/stderr → goal achieved, post `✅ Goal achieved` to thread, auto-disable the job, do NOT send the regular failure message +- If command exits 0 without the required match string → goal not met, send message as normal - If command exits non-zero → goal not met, send message as normal (agents continue working) - Auto-disable state must persist across restarts - Human can re-enable a completed goal by setting `enabled = true` @@ -65,8 +66,8 @@ schedule = "*/10 * * * *" channel = "123456789012345678" thread_id = "" # auto-created on first fire if empty message = "Goal not met: all unit tests must pass. Please continue working." 
-disable_on_success = "npm test" # command to evaluate goal -disable_on_success_match = "SUCCESS" # optional: stdout must contain this string +disable_on_success = "npm test && echo GOAL_ACHIEVED" # command to evaluate goal +disable_on_success_match = "GOAL_ACHIEVED" # required marker in command output disable_on_success_timeout_secs = 60 # command timeout disable_on_success_working_dir = "/repo" # working directory enabled = true # scheduler sets to false on success @@ -77,8 +78,8 @@ enabled = true # scheduler sets to false on s | Field | Required | Default | Description | |-------|----------|---------|-------------| | `id` | ✅ (when `disable_on_success` set) | — | Stable unique identifier for state persistence. Missing `id` on a job with `disable_on_success` is a **startup error**. | -| `disable_on_success` | | — | Shell command; exit 0 + match = goal achieved, auto-disable | -| `disable_on_success_match` | | — | If set, stdout must contain this string (in addition to exit 0) for goal to be considered achieved | +| `disable_on_success` | | — | Shell command that evaluates the goal | +| `disable_on_success_match` | ✅ (when `disable_on_success` set) | — | Required marker string that must appear in command stdout/stderr, in addition to exit 0, before the goal is considered achieved | | `disable_on_success_timeout_secs` | | `60` | Max seconds before command is killed | | `disable_on_success_working_dir` | | — | Working directory for command execution | @@ -97,27 +98,15 @@ CronJob schedule fires Skip Run disable_on_success command (done) │ ┌────┴────┐ - │ exit 0? │ + │ exit 0 │ + │ + marker? │ └────┬────┘ Yes │ No / Timeout │ │ │ ▼ │ ▼ - match set? Send message - │ │ to channel/thread - Yes No (agents keep working) - │ │ - ▼ ▼ - stdout Post ✅, - contains set enabled - match? 
= false - │ - ┌───┴───┐ - Yes No - │ │ - ▼ ▼ -Post ✅, Send message -set enabled (goal not confirmed) -= false + Post ✅, Send message + set enabled to channel/thread + = false (agents keep working) ``` ### State Persistence @@ -126,7 +115,7 @@ No separate state file needed. When goal is achieved, the **OpenAB scheduler** w | Event | Action | |-------|--------| -| Goal achieved (exit 0) | Scheduler posts `✅ Goal achieved: ` to thread, then sets `enabled = false` in usercron file | +| Goal achieved (exit 0 + marker) | Scheduler posts `✅ Goal achieved: ` to thread, then sets `enabled = false` in usercron file | | Human re-enables | Human sets `enabled = true` in usercron file | | Thread auto-created | Scheduler writes `thread_id` back to usercron file | @@ -153,6 +142,7 @@ All messages go to the **same thread** — agents need conversation history as c | Concern | Mitigation | |---------|-----------| | Arbitrary shell execution | Trust config source (same as existing cron). Only maintainers edit config. | +| False-positive success | Require both exit 0 and an explicit `disable_on_success_match` in command stdout/stderr | | Runaway commands | `disable_on_success_timeout_secs` kills long-running processes | | Command injection | Config is static TOML, not user-input at runtime | @@ -167,9 +157,10 @@ Future phases may add container isolation or command whitelists. 1. Parse new fields from usercron `[[jobs]]` (`$HOME/.openab/cronjob.toml`) 2. 
On cron fire, if `disable_on_success` is set: - Check `enabled` — if false, skip + - Validate `id` and `disable_on_success_match` are present - Execute command with `disable_on_success_timeout_secs` and `disable_on_success_working_dir` - - exit 0 → scheduler posts `✅ Goal achieved` to thread, writes `enabled = false` to usercron file - - exit != 0 / timeout exceeded → send message as normal + - exit 0 and stdout/stderr contains `disable_on_success_match` → scheduler posts `✅ Goal achieved` to thread, writes `enabled = false` to usercron file + - exit != 0 / timeout exceeded / marker missing → send message as normal 3. Thread auto-creation: if `thread_id` empty, create thread on first fire, scheduler writes back to usercron file 4. No separate state file — usercron IS the state @@ -192,9 +183,9 @@ Phase 1 usercron `[[jobs]]` entries with `disable_on_success` remain valid and c ### Happy Path 1. Repo has one failing test -2. Cron fires every 10 min with `disable_on_success = "npm test"` +2. Cron fires every 10 min with `disable_on_success = "npm test && echo GOAL_ACHIEVED"` and `disable_on_success_match = "GOAL_ACHIEVED"` 3. `npm test` fails → message sent → agents discuss and fix -4. Next fire → `npm test` passes → scheduler posts `✅ Goal achieved`, sets `enabled = false` +4. Next fire → `npm test` passes and output contains `GOAL_ACHIEVED` → scheduler posts `✅ Goal achieved`, sets `enabled = false` ### Restart Resilience @@ -214,6 +205,11 @@ Phase 1 usercron `[[jobs]]` entries with `disable_on_success` remain valid and c 2. After `disable_on_success_timeout_secs` → killed 3. Treated as failure → message sent +### Missing Marker + +1. `disable_on_success` exits 0 but does not print `disable_on_success_match` +2. Treated as failure → regular message sent + --- ## 6. Open Questions
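The Phase 1 success rule described in the patches above (exit 0 plus the `disable_on_success_match` marker in stdout/stderr; anything else, including a timeout, counts as "goal not met") can be sketched as follows. This is a minimal illustration under stated assumptions, not the OpenAB scheduler's code: the function name `goal_achieved` and its parameters are hypothetical, Python is used only for illustration, and the marker check is assumed to be a plain substring match.

```python
import subprocess
from typing import Optional


def goal_achieved(command: str, match: str, timeout_secs: float = 60,
                  working_dir: Optional[str] = None) -> bool:
    """Hypothetical helper: True only if `command` exits 0 AND its
    stdout or stderr contains `match`."""
    if not match:
        # The ADR makes the marker mandatory whenever disable_on_success is set.
        raise ValueError("disable_on_success_match is required")
    try:
        result = subprocess.run(
            command,
            shell=True,            # the ADR describes a shell command
            cwd=working_dir,       # disable_on_success_working_dir
            capture_output=True,
            text=True,
            timeout=timeout_secs,  # disable_on_success_timeout_secs
        )
    except subprocess.TimeoutExpired:
        # Timeout is treated as failure: the regular message is sent.
        return False
    if result.returncode != 0:
        return False
    # Exit 0 alone is not enough; the marker must appear in the output.
    return match in result.stdout or match in result.stderr
```

With the example configuration, a caller would evaluate `goal_achieved("npm test && echo GOAL_ACHIEVED", "GOAL_ACHIEVED", 60, "/repo")`. On `True` it would post `✅ Goal achieved` and write `enabled = false`; on `False` it would send the regular message.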