GitHub - peter-stratton/dark-factory: Human constraints, interactive planning, autonomous execution.

     _            _           __            _
  __| | __ _ _ __| | __      / _| __ _  ___| |_ ___  _ __ _   _
 / _` |/ _` | '__| |/ /_____| |_ / _` |/ __| __/ _ \| '__| | | |
| (_| | (_| | |  |   <______|  _| (_| | (__| || (_) | |  | |_| |
 \__,_|\__,_|_|  |_|\_\     |_|  \__,_|\___|\__\___/|_|   \__, |
                                                           |___/

A Go CLI built for Claude Code that orchestrates autonomous AI agents to implement GitHub issues, review their own work, and merge — without human intervention.

Documentation · Getting Started · Releases

Philosophy

The hard part of software engineering isn't typing code — it's deciding what to build and how it fits. Dark Factory keeps those decisions with humans. Engineers write the roadmap, define architecture layers, design conventions, and author issue specs. Agents operate within those constraints. The harness is the design.

This is a collaborative architecture tool, not a "throw a ticket at an AI and hope for the best" system. The adversarial review model reinforces this: a separate reviewer agent checks whether the code respects the architecture a human defined, follows conventions a human wrote, and meets acceptance criteria a human specified. Every judgment call that shapes a codebase stays with the humans who understand it.

Dark Factory has been built entirely by its own agent pipeline — every feature was implemented, reviewed, and merged by godark run. The humans write specs and design harnesses; the agents write code.

Install

Homebrew (macOS):

brew install peter-stratton/dark-factory/godark

Go install:

go install github.com/peter-stratton/dark-factory/cmd/godark@latest

Binary download: grab a pre-built binary from GitHub Releases.

Platform support

Dark Factory is built for Claude Code and GitHub. The architecture is designed around Claude Code's specific capabilities — session resumption, CLAUDE.md as a control surface, slash command skills, and sandboxed execution.

Layer	Supported
AI agent	Claude Code (Anthropic)
Version control	GitHub

Features

Three-agent pipeline — implementer, quality reviewer, and functional reviewer are independent agents with isolated permissions; reviewers literally cannot edit files
Specification-driven quality gates — human-authored scenario specs define "done"; the functional reviewer generates ephemeral integration tests from those specs rather than rubber-stamping the diff
Architecture-as-code enforcement — machine-readable layer definitions validated by godark vet; reviewers check architectural compliance, not just correctness
Structured agent dialogue — implementer posts reasoning as PR comments, reviewers challenge it; the PR thread is an auditable record of adversarial design review
Full run observability — local web dashboard with review chain timelines, quality flags, tool traces, and agent dialogue history for every issue
Harness engineering lifecycle — scaffold, validate, and enforce project constraints with godark new, godark init, godark vet, and six harness types
Auto-detected multi-language support — detects project type from marker files and configures the sandbox, build, and test commands automatically
Fully sandboxed agent runs by default — agents execute inside ephemeral Docker containers with no access to the host filesystem or network beyond what's explicitly configured
Single binary, runs on a laptop — no infrastructure fleet, no MCP server farm; just a Go binary and Docker

How it works

Given a GitHub repo and a milestone, godark runs a three-agent development loop:

Fetch open issues from the milestone, sorted by priority (p1 → p2 → p3 → unlabeled)
Resolve dependencies — issues declare Blocked by: #N or Depends on: #N in their body; skip any whose dependencies are still open
Implementer — Claude Code implements the issue, writes unit tests, and opens a PR
Guard rails — verify the PR exists, contains Closes #N, and didn't touch protected files
Quality reviewer — a separate Claude Code instance audits the PR for security, performance, and code quality issues; if it requests changes, the implementer retries before functional review begins
Functional reviewer — another Claude Code instance reviews the PR against human-authored scenario specs, generates ephemeral integration tests, and approves or requests changes
Retry loop — if either reviewer rejects, the implementer reads the review comments and pushes fixes (max N retries per gate)
Merge or escalate — approved PRs are squash-merged; failed PRs are labeled needs-human-review
Punchlist — for each merged PR, a tool-less punchlist agent generates 3-5 concrete manual acceptance tests (specific config values, commands, expected outcomes) rendered as checkboxes alongside the existing punchlist output
Repeat — move to the next unblocked issue
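
The first two steps above (priority ordering and dependency resolution) can be sketched in Go as follows. This is a rough illustration, not the project's code: the `Issue` type and the `priority`, `dependencies`, and `nextIssues` helpers are hypothetical names. It also assumes that labels are literally `p1`, `p2`, and `p3`, that the dependency phrases are matched case-insensitively, and that any dependency missing from the open set counts as closed.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strconv"
)

// Issue is a hypothetical, trimmed-down view of a GitHub issue.
type Issue struct {
	Number int
	Labels []string
	Body   string
}

// depRe matches "Blocked by: #N" and "Depends on: #N" in an issue body.
var depRe = regexp.MustCompile(`(?i)(?:blocked by|depends on):\s*#(\d+)`)

// priority maps labels to a sort rank: p1 < p2 < p3 < unlabeled.
func priority(labels []string) int {
	ranks := map[string]int{"p1": 1, "p2": 2, "p3": 3}
	best := 4 // unlabeled issues sort last
	for _, l := range labels {
		if r, ok := ranks[l]; ok && r < best {
			best = r
		}
	}
	return best
}

// dependencies extracts the issue numbers referenced in body.
func dependencies(body string) []int {
	var deps []int
	for _, m := range depRe.FindAllStringSubmatch(body, -1) {
		if n, err := strconv.Atoi(m[1]); err == nil {
			deps = append(deps, n)
		}
	}
	return deps
}

// nextIssues returns the issues whose dependencies are no longer open,
// ordered by priority. open must hold every open issue in the milestone.
func nextIssues(open []Issue) []Issue {
	isOpen := make(map[int]bool, len(open))
	for _, is := range open {
		isOpen[is.Number] = true
	}
	var ready []Issue
	for _, is := range open {
		blocked := false
		for _, d := range dependencies(is.Body) {
			if isOpen[d] {
				blocked = true
				break
			}
		}
		if !blocked {
			ready = append(ready, is)
		}
	}
	// Stable sort keeps the fetch order among issues of equal priority.
	sort.SliceStable(ready, func(i, j int) bool {
		return priority(ready[i].Labels) < priority(ready[j].Labels)
	})
	return ready
}

func main() {
	open := []Issue{
		{Number: 12, Labels: []string{"p2"}, Body: "Depends on: #10"},
		{Number: 10, Labels: []string{"p3"}},
		{Number: 11, Labels: []string{"p1"}, Body: "Blocked by: #7"},
		{Number: 13},
	}
	for _, is := range nextIssues(open) {
		fmt.Println(is.Number) // prints 11, 10, 13 on separate lines
	}
}
```

In this example, issue 12 is skipped because #10 is still open, while issue 11 proceeds because #7 is not in the open set. A real runner would need to check the state of dependencies outside the milestone as well.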

Quick start

# New project
godark new my-project --repo owner/my-project

# Existing project
godark init --repo owner/my-project

Then open the project in Claude Code and use the built-in skills to define your architecture, conventions, and roadmap. See the Getting Started guide for a full walkthrough.

Documentation

Full documentation is available at godarkfactory.com:

Getting Started — installation, setup, and tutorial
CLI Reference — all commands, flags, and usage examples
Configuration — godark.yaml deep dive
Skills — slash commands for roadmaps, planning, issues, and more
Licensing & Adoption — commercial use, data privacy, and FAQ

Phase overviews

Each completed phase has a practical overview with real-world examples showing what was built and how users experience it. These live in docs/phase-overviews/:

Phase	Overview
1	Skeleton & Orchestration — CLI scaffold, config, deps, dry-run
2	Quality & Vetting — `godark vet` validation framework
3	Docker Sandbox — container isolation, auth, cloning
4	Agent Execution — implementer, reviewer, guard rails, retry loop
5	Agent SDK Migration — SDK wrapper, role permissions, session resumption
6	Multi-Language Support — auto-detect, runtime config, pluggable Dockerfiles
7	Review Quality & Dashboard — run data, quality flags, web dashboard
8	Harness Engineering — harness templates, `godark new`, vet architecture
9	Harness-Aware Agent Execution — harness injection, dialogue, enforcement
10	Deterministic Verification Pipeline — verify step, auto-fix, bash deny-list
11	Run Analysis & Prompt Feedback — `godark analyze`, trends, prompt gaps
12	Complex Project Support — multi-module, codegen, secrets, CI checks
13	Human-in-the-Loop Review — graduated auto-merge, watch command, risk classifier, notifications
14	Bounded Concurrency — wave-barrier dispatcher, RunMode, serial post-wave merge, rate-limit batching, per-issue logs
15	Deferred — Server Mode & Centralized Operation
16	Public Release — ELv2 license, GoReleaser, Homebrew tap, release workflow, CONTRIBUTING.md
17	Configurable Base Branch — base branch config, PR targeting, prompt safety, run data tracking
18	Adaptive Agent Loop — recon agent, hybrid retry strategy, handoff context
19	Spring Cleaning — unified verdict parsing, typed constants, shared helpers, CLI consolidation
20	Terminal UI — Bubble Tea TUI, progress reporter, adaptive colors, hybrid output mode
21	Analytics Persistence — SQLite stats store, retry recovery rate, cost/duration breakdown, repo stats, flag-based prompt gaps
22	Analytics Overhaul — first-pass rate, wasted cost, failure reasons, per-repo breakdown, sprint report command
23	Watch & Daemon Mode — shared watch package, daemon mode, external merge detection, watch TUI and dashboard
24	Container Resource Tracking — Docker stats capture, per-step memory/CPU, analyze output, dashboard columns, host mode
25	Docker Socket Mount & Compose Lifecycle — compose config, socket mount, up/down lifecycle, env forwarding, doctor checks
26	Merge Coordinator Agent - dedicated conflict resolver, per-issue and rollup integration, telemetry, dashboard step
27	Agent Efficiency & Resilience - per-role judge thresholds, benign kill handling, model overrides, handoff context, generalized recon
28	Container Health Judge — real-time log streaming, idle/thrash/transport rules, container retry, intervention flow
29	Complete CLI Migration - delete Python runner, simplify Run(), remove --no-sandbox, unconditional Docker, test migration
30	Spec Tightening - GIVEN/WHEN/THEN validation, phase-scoped vet, spec delta generation, pipeline integration
31	Planner Agent - structured implementation plans, non-blocking pipeline step, implementer prompt injection, model override
32	Decision Flow Tracing - trace ID generation, SQLite persistence, `godark trace` CLI, dashboard copy button, TUI column
33	Semi-Structured Review - semi-formal reviewer prompt, config toggle, consistency quality gate, automatic re-run on contradiction

To generate an overview for a newly completed phase, use /godark-create-phase-overview <phase-number>.

Building

go build -o bin/godark ./cmd/godark
go test ./...

Status

See docs/roadmap/ for the full development roadmap.

License

Dark Factory is licensed under the Elastic License 2.0. Free for commercial use — the only restriction is you can't resell it as a hosted service. See the Licensing & Adoption page for details.

