feat: deterministic Stop hook for executor completion (#129) #137

Closed
sriumcp wants to merge 1 commit into AI-native-Systems-Research:reflective from sriumcp:feat/129-execute-stop-hook

sriumcp commented May 24, 2026

Summary

  • New bin/nous-execute-stop Python entrypoint suitable for use as a Claude Code Stop hook.
  • Allows the executor session to terminate only when both principle_updates.json exists and nous validate execution passes — schema-driven, no LLM judgment.
  • On block, writes a structured reason to stderr; Claude Code feeds that back into the agent's conversation so it fixes the artifact instead of restarting.
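A minimal sketch of the decision the bullets above describe, not the script shipped in this PR. The function names (`run_validator`, `decide`, `main`), the wording of the reasons, and the assumption that `nous validate execution` signals a pass with a zero exit status are all illustrative; only the exit codes, the `NOUS_ITER_DIR` variable, the `principle_updates.json` file name, and the validator command line come from the PR text.

```python
import os
import subprocess
import sys
from pathlib import Path

ALLOW_STOP = 0  # the session may terminate
BLOCK_STOP = 2  # the stop is blocked and stderr is fed back to the agent


def run_validator(iter_dir):
    """Run the validator CLI and return (passed, combined output)."""
    # Command line as quoted in the commit message. Treating exit status 0
    # as "pass" is an assumption of this sketch.
    proc = subprocess.run(
        ["nous", "validate", "execution", "--dir", str(iter_dir)],
        capture_output=True,
        text=True,
    )
    return proc.returncode == 0, (proc.stdout + proc.stderr).strip()


def decide(env, validate=run_validator):
    """Return (exit_code, reason) using only evidence on disk."""
    raw = env.get("NOUS_ITER_DIR")
    if not raw:
        return BLOCK_STOP, "config error: NOUS_ITER_DIR is not set"
    iter_dir = Path(raw)
    if not iter_dir.is_dir():
        return BLOCK_STOP, f"NOUS_ITER_DIR does not exist: {iter_dir}"
    if not (iter_dir / "principle_updates.json").is_file():
        return BLOCK_STOP, f"missing principle_updates.json in {iter_dir}"
    passed, detail = validate(iter_dir)
    if not passed:
        return BLOCK_STOP, f"nous validate execution failed:\n{detail}"
    return ALLOW_STOP, ""


def main():
    """Hypothetical entrypoint body: print the reason, return the exit code."""
    code, reason = decide(os.environ)
    if reason:
        print(reason, file=sys.stderr)
    return code

# A script entrypoint would finish with: sys.exit(main())
```

Passing the validator in as a parameter keeps the file checks testable without the `nous` CLI installed; that is a design choice of this sketch, not something the PR states.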

Why deterministic over Haiku

The /goal evaluator (#124) is right for fuzzy success criteria, but execution completion is a schema check — a shell-out that runs validate_execution is cheaper, faster, and immune to evaluator drift. The two coexist.

Wire-up

The orchestrator exports NOUS_ITER_DIR before launching the executor session; the per-campaign .claude/settings.json (lands in #135) registers this script under hooks.Stop. This PR ships just the script — installation today is manual via that settings file.
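For the manual installation, a hedged sketch of what the settings entry might look like, generated from Python so it can be merged into an existing file. The nesting follows the general shape of Claude Code hook settings as I understand it; the actual file from #135 is not shown in this PR, and both the helper name `stop_hook_settings` and the relative command path are assumptions.

```python
import json


def stop_hook_settings(command):
    """Build a settings fragment that registers `command` under hooks.Stop."""
    # Assumed shape: each Stop entry holds a list of command hooks.
    return {
        "hooks": {
            "Stop": [
                {"hooks": [{"type": "command", "command": command}]}
            ]
        }
    }


# The script path is illustrative; adjust it to where the repo is checked out.
fragment = stop_hook_settings("bin/nous-execute-stop")
print(json.dumps(fragment, indent=2))
```

The hook reads `NOUS_ITER_DIR` from its environment, so the variable has to be exported by whatever launches the executor session.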

Behavioral tests

Five cases in tests/test_execute_stop_hook.py:

| Scenario | Expected |
|---|---|
| Valid iter dir + principle_updates.json present | exit 0, no stderr |
| principle_updates.json missing | exit 2, stderr names the file |
| findings.json missing required field | exit 2, stderr has schema diff |
| NOUS_ITER_DIR points at non-existent dir | exit 2, reason given |
| NOUS_ITER_DIR unset | exit 2, config-error reason |

Tests use StubDispatcher to populate a known-passing iter_dir, then mutate it to simulate failure modes. Assertions describe what the hook emits (exit code + stderr substrings) — never which functions it called or how it organized internal work.
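The black-box style described above can be sketched as follows. This is not the code in tests/test_execute_stop_hook.py: it does not use StubDispatcher, and `STAND_IN_HOOK` is a hypothetical stand-in for bin/nous-execute-stop so the example runs without the repository. It covers three of the five scenarios and asserts only on exit code and stderr.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for the real hook; it skips the schema validation step.
STAND_IN_HOOK = """\
import os, sys
from pathlib import Path

def block(reason):
    print(reason, file=sys.stderr)
    sys.exit(2)

raw = os.environ.get("NOUS_ITER_DIR")
if not raw:
    block("config error: NOUS_ITER_DIR is not set")
if not Path(raw, "principle_updates.json").is_file():
    block("missing principle_updates.json")
"""


def run_hook(script, iter_dir=None):
    """Run the hook as a subprocess, with NOUS_ITER_DIR set or left unset."""
    env = {k: v for k, v in os.environ.items() if k != "NOUS_ITER_DIR"}
    if iter_dir is not None:
        env["NOUS_ITER_DIR"] = str(iter_dir)
    return subprocess.run(
        [sys.executable, str(script)], env=env, capture_output=True, text=True
    )


def check_hook_behaviour():
    """Start from a passing directory, then mutate it into failure modes."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp, "hook.py")
        script.write_text(STAND_IN_HOOK)
        iter_dir = Path(tmp, "iter")
        iter_dir.mkdir()
        (iter_dir / "principle_updates.json").write_text("{}")

        ok = run_hook(script, iter_dir)
        assert ok.returncode == 0 and ok.stderr == ""

        (iter_dir / "principle_updates.json").unlink()
        missing = run_hook(script, iter_dir)
        assert missing.returncode == 2
        assert "principle_updates.json" in missing.stderr

        unset = run_hook(script)
        assert unset.returncode == 2 and "NOUS_ITER_DIR" in unset.stderr
    return True
```

Because the assertions touch only the process boundary, the hook's internals can be reorganized without breaking the tests.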

Test plan

Closes #129.
Refs #120.

🤖 Generated with Claude Code

…Systems-Research#129)

Ship bin/nous-execute-stop, a Python entrypoint suitable for use as a
Claude Code Stop hook. It tells the harness whether the executor agent
is allowed to terminate, based on objective evidence on disk:

  * exit 0 (allow stop) when:
      - principle_updates.json exists in $NOUS_ITER_DIR
      - `nous validate execution --dir $NOUS_ITER_DIR` returns pass
  * exit 2 (block stop) otherwise, with a structured reason on stderr
    so Claude Code feeds it back into the agent's conversation and the
    next turn fixes the artifact rather than restarting.

Why deterministic over probabilistic: the existing /goal evaluator (Haiku
post-turn) is right for fuzzy success criteria, but execution completion
is a schema check, so a deterministic shell-out is cheaper, faster, and
immune to evaluator drift. The two coexist: AI-native-Systems-Research#124
wires /goal for fuzzy gating, and this hook handles the schema gate.

Wire-up: the orchestrator exports NOUS_ITER_DIR before launching the
executor session, and the per-campaign .claude/settings.json (which
lands in AI-native-Systems-Research#135) registers this script under hooks.Stop. This PR ships
just the script so it can be installed manually today.

Behavioral tests (5):
  * pass case: valid iter dir + principle_updates.json -> exit 0, no stderr
  * block: principle_updates.json missing -> exit 2, stderr names the file
  * block: corrupted findings.json -> exit 2, stderr includes the schema diff
  * block: NOUS_ITER_DIR points at non-existent dir -> exit 2 with reason
  * block: NOUS_ITER_DIR unset -> exit 2 with config-error reason

Tests use StubDispatcher to populate a known-passing iter dir, then
mutate it to simulate failure modes. Assertions describe what the hook
emits (exit code + stderr substrings) — never which functions it called.

Test suite: 338 baseline + 5 new = 343 passing.

Closes AI-native-Systems-Research#129.
Refs AI-native-Systems-Research#120.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

sriumcp commented May 24, 2026

Superseded by #153 — the consolidated tracking-120 PR carrying all 17 commits in merge order. Closing this in favor of that single PR per project owner's request.

sriumcp closed this May 24, 2026
sriumcp deleted the feat/129-execute-stop-hook branch May 25, 2026 00:01