A reusable long-running agent harness skill: turn a long task into a recoverable loop of
init -> run (small step) -> verify (gates) -> retry/rollback -> escalate -> handoff.
This repo contains a single skill folder (originally developed under .claude/skills/).
SKILL.md: Skill entrypoint (trigger + workflow)agents/openai.yaml: UI metadata (optional but included)script/: Executable harness toolsreference/: Playbooks / guidanceassets/: Icons
- Initialize a task directory:
bash script/init_harness.sh \
--task "Fix XXX and keep regression green" \
--verify-cmd "./gradlew test --tests com.foo.BarTest"This prints a task dir like .agent-harness/<task-id> and creates durable artifacts:
task.json,feature_list.json,claude-progress.txt,init.sh
- (Recommended) Get bearings at the start of each session:
bash script/get_bearings.sh --task-dir .agent-harness/<task-id>- Run attempts with verify gates:
bash script/run_attempt.sh \
--task-dir .agent-harness/<task-id> \
--run-cmd "<your incremental command>" \
--verify-cmd "./gradlew test --tests com.foo.BarTest" \
--durability sync \
--heartbeat-seconds 15 \
--max-attempts 12cat >/tmp/verify.plan <<'EOF_PLAN'
unit::./gradlew test --tests com.foo.UnitTest
integration::./gradlew integrationTest
EOF_PLAN
bash script/run_attempt.sh \
--task-dir .agent-harness/<task-id> \
--run-cmd "<your incremental command>" \
--verify-plan /tmp/verify.plan \
--escalate-after 2 \
--stop-on-escalation \
--max-attempts 12Generate a handoff file (also generated automatically on success/escalation/failure):
bash script/write_handoff.sh --task-dir .agent-harness/<task-id>Run the full maintenance gate (validate + self-test + lint):
bash script/release_check.sh--rollback gitis destructive and requires--allow-hard-reset.- By default
run_attempt.shlocks the task dir to prevent concurrent writers.