VictorVVedtion · Apr 7, 2026
diff --git a/.selfmodel/playbook/evolution-protocol.md b/.selfmodel/playbook/evolution-protocol.md
diff --git a/.selfmodel/playbook/orchestration-loop.md b/.selfmodel/playbook/orchestration-loop.md
@@ -220,6 +220,16 @@ LOOP:
      - Append to quality.jsonl
      - Append to orchestration.log
 
+  8.5. EVOLUTION CHECK (every 10 MERGED Sprints)
+       a. Read team.json → evolution.last_review_sprint
+       b. Count MERGED sprints since last review (from quality.jsonl or plan.md)
+       c. If count >= 10:
+          i.   Run evolution detection (equivalent to selfmodel evolve --detect)
+          ii.  Log: phase=<N> event=evolution_detect candidates=<N>
+          iii. If candidates > 0: notify user "N evolution candidates. Run /selfmodel:evolve"
+          iv.  Update team.json: evolution.last_review_sprint = current_sprint
+       d. If count < 10: skip
+
   9. CHECK context health
      - Phase boundary (all sprints in current phase MERGED) → Phase Gate → FORCE RESET
      - Context > 70% → FORCE RESET
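
Illustrative sketch (not part of the patch): the Step 8.5 check added in the hunk above can be read as the Python below. It is a minimal sketch under stated assumptions. `evolution_check`, `CheckResult`, the `merged_sprints` list, and the `detect` callable are hypothetical names; `detect` stands in for whatever runs `selfmodel evolve --detect`, and the list stands in for the count derived from quality.jsonl or plan.md. The log line from step c.ii is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

REVIEW_INTERVAL = 10  # "every 10 MERGED Sprints" per Step 8.5


@dataclass
class CheckResult:
    ran_detection: bool
    notification: Optional[str]
    last_review_sprint: int


def evolution_check(
    team: Dict,
    merged_sprints: List[int],
    current_sprint: int,
    detect: Callable[[], int],
) -> CheckResult:
    """Hypothetical reading of Step 8.5 (a through d)."""
    # a. Read team.json -> evolution.last_review_sprint (assumed to default to 0)
    last = team.get("evolution", {}).get("last_review_sprint", 0)
    # b. Count MERGED sprints since the last review
    count = sum(1 for sprint in merged_sprints if sprint > last)
    # d. Fewer than 10: skip without touching state
    if count < REVIEW_INTERVAL:
        return CheckResult(False, None, last)
    # c.i. Run detection; the callable returns the number of candidates
    candidates = detect()
    # c.iii. Notify only when something was found
    note = None
    if candidates > 0:
        note = f"{candidates} evolution candidates. Run /selfmodel:evolve"
    # c.iv. Record the review point even when no candidates were found
    team.setdefault("evolution", {})["last_review_sprint"] = current_sprint
    return CheckResult(True, note, current_sprint)
```

Note that in this reading `last_review_sprint` advances whenever detection runs, including runs that find nothing, so the next check fires ten MERGED sprints later rather than on every loop iteration.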

diff --git a/.selfmodel/state/evolution.jsonl b/.selfmodel/state/evolution.jsonl
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -248,6 +248,7 @@ Contract template → read `.selfmodel/playbook/sprint-template.md`
 | Quality review + scoring | `.selfmodel/playbook/quality-gates.md` |
 | Sprint contract creation | `.selfmodel/playbook/sprint-template.md` |
 | Lessons learned + evolution | `.selfmodel/playbook/lessons-learned.md` |
+| Evolution pipeline + upstream PR | `.selfmodel/playbook/evolution-protocol.md` |
 | Independent evaluation + skeptical prompt | `.selfmodel/playbook/evaluator-prompt.md` |
 | Automated orchestration loop (large projects) | `.selfmodel/playbook/orchestration-loop.md` |
 | E2E verification protocol v2 | `.selfmodel/playbook/e2e-protocol-v2.md` |
@@ -295,8 +296,9 @@ Contract template → read `.selfmodel/playbook/sprint-template.md`
 
 ## Evolution
 
-**Trigger**: Every 10 Sprints completed
+**Trigger**: Every 10 Sprints completed (auto-detected at orchestration-loop Step 8.5)
 **Cycle**: `MEASURE → DIAGNOSE → PROPOSE → EXPERIMENT → EVALUATE → SELECT`
+**Pipeline**: `DETECT → STAGE → SUBMIT → TRACK` (for upstream contributions)
 
 1. **MEASURE** — Extract trends from quality.jsonl
 2. **DIAGNOSE** — Identify systemic bottlenecks
@@ -305,6 +307,10 @@ Contract template → read `.selfmodel/playbook/sprint-template.md`
 5. **EVALUATE** — Validate with data
 6. **SELECT** — Effective → write to lessons-learned.md | Ineffective → discard with record
 
+**Upstream contribution**: Validated improvements (Result: improved) are candidates
+for upstream PRs. Run `selfmodel evolve --detect` or `/selfmodel:evolve` to scan.
+Human approval required before any PR submission. Full protocol: `playbook/evolution-protocol.md`.
+
 **Skill discovery**: New need → try existing skill → evaluate → keep or discard
 
 ## Danger Zones
@@ -314,6 +320,7 @@ Contract template → read `.selfmodel/playbook/sprint-template.md`
 - Modifying `CLAUDE.md` (this file)
 - Modifying `.selfmodel/playbook/` rule files
 - Deleting `.selfmodel/state/` state files
+- Submitting evolution PRs to upstream (`selfmodel evolve --submit`)
 - Force push to main
 
 ### ABSOLUTELY FORBIDDEN
@@ -347,6 +354,7 @@ selfmodel/
     ├── state/dispatch-config.json     # Dispatch gate config (cap, convergence files)
     ├── state/quality.jsonl            # Quality score history
     ├── state/evolution.jsonl          # Evolution log
+    ├── state/evolution-staging/       # Staged evolution patches (pre-PR)
     ├── state/orchestration.log        # Orchestration loop event log
     ├── reviews/                       # Review records
     └── playbook/                      # On-demand loaded rules

diff --git a/README.md b/README.md
@@ -189,6 +189,15 @@ Evaluator         E2E Agent v2
 - **Self-evolution** — Every 10 sprints: MEASURE → DIAGNOSE → PROPOSE → EXPERIMENT → EVALUATE → SELECT. Hook interception logs feed into evolution analysis.
 - **Chaos testing (/rampage)** — "Be Water" philosophy. 4 surface engines (WEB, CLI, API, LIB) × 7 user personas (Impatient, Confused, Explorer, Multitasker, Edge Case, Abandoner, Speedrunner). Maps all user journeys, then walks each with chaotic behaviors. Advisory quality gate after E2E pass.
 
+## Evolution Pipeline
+
+Every 10 completed sprints, selfmodel can turn validated local process improvements into upstream contributions through the Evolution Pipeline. It scans local diffs and lessons learned for reusable changes, stages only generalizable patches, and records pipeline state in `.selfmodel/state/evolution.jsonl`. Run `/selfmodel:evolve` for the guided workflow; full protocol: [`.selfmodel/playbook/evolution-protocol.md`](.selfmodel/playbook/evolution-protocol.md).
+
+- **DETECT** — Compare local playbook, hook, script, and lessons-learned changes against the upstream baseline to create CANDIDATE entries.
+- **STAGE** — Interactively classify candidates, strip project-specific details, and generate patch files in `.selfmodel/state/evolution-staging/`.
+- **SUBMIT** — Package staged patches into an upstream PR after path audits and applicability checks. Human approval is required before any submission.
+- **TRACK** — Monitor open PRs and sync ACCEPTED, REJECTED, or CONFLICT states back into `evolution.jsonl`.
+
 ## Chaos Testing: /rampage
 
 `/rampage` is a standalone Claude Code skill that acts as the most chaotic, boundary-pushing user imaginable. It finds bugs that systematic QA never catches: race conditions, state corruption, navigation traps, input edge cases.
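
Illustrative sketch (not part of the patch): the README hunk above says SUBMIT runs "path audits" before packaging staged patches, but the diff does not show how. One plausible reading is a scan of the lines a patch adds for absolute, machine-specific paths. The function name `audit_patch` and the regular expression are assumptions made for this example; in particular, treating only `/home/...` and `/Users/...` as project-specific is a simplification, not the project's rule.

```python
import re
from typing import List, Tuple

# Assumed definition of a "project-specific" path: an absolute path under a
# user home directory. The real audit may use a broader or different rule.
HOME_PATH = re.compile(r"(?<![\w.])/(?:home|Users)/[^\s'\"`]+")


def audit_patch(patch_text: str) -> List[Tuple[int, str]]:
    """Return (line_number, path) for each suspicious path on an added line."""
    findings = []
    for lineno, line in enumerate(patch_text.splitlines(), start=1):
        # Only inspect added lines; skip the "+++ b/file" header line.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for match in HOME_PATH.finditer(line[1:]):
            findings.append((lineno, match.group(0)))
    return findings
```

A non-empty result would block submission in this sketch; relative paths such as `.selfmodel/state/orchestration.log` pass through untouched.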

diff --git a/commands/evolve.md b/commands/evolve.md
@@ -0,0 +1,100 @@
+---
+description: "Evolution-to-PR pipeline: detect local improvements, classify, and submit upstream"
+allowed-tools: ["Read", "Write", "Edit", "Bash", "Glob", "Grep"]
+argument-hint: "[--detect] [--stage] [--submit] [--track] [--status]"
+---
+
+# /selfmodel:evolve
+
+Run the Evolution-to-PR Pipeline per `{baseDir}/references/evolution-protocol.md`.
+
+## Prerequisites
+- Git repo with selfmodel initialized (`.selfmodel/` exists)
+- Upstream baseline available (git remote `upstream` or `.selfmodel/state/upstream-baseline.sha`)
+- For `--submit`: `gh` CLI authenticated with upstream repo access
+
+## Modes
+
+### Default (no flags)
+Run full interactive pipeline: DETECT → STAGE (interactive) → offer SUBMIT.
+
+### `--detect`
+Detection only. Scan local diffs against upstream baseline. Append CANDIDATE entries
+to `evolution.jsonl`. Read-only except for evolution.jsonl writes. Safe to run anytime.
+
+### `--stage`
+Interactive classification. Walk through CANDIDATE entries, display diffs, recommend
+classification based on generalizability heuristics. User decides: Stage / Reject / Keep.
+Generate patch files for STAGED entries in `.selfmodel/state/evolution-staging/`.
+
+### `--submit`
+Create upstream PR from STAGED patches. Pre-submission checks: shellcheck, path audit,
+patch applicability. **Requires explicit human approval** before `gh pr create`.
+
+### `--track`
+Monitor submitted PRs. Query status via `gh pr view`, update evolution.jsonl entries
+to ACCEPTED / REJECTED_UPSTREAM / CONFLICT. Handle CONFLICT by creating SUPERSEDED
+entries and new CANDIDATE entries with updated diffs.
+
+### `--status`
+Display pipeline status summary without running any phase:
+```
+Evolution Pipeline Status
+─────────────────────────
+CANDIDATE:                 3
+STAGED:                    2
+SUBMITTED:                 1 (PR #42 open)
+ACCEPTED:                  5
+REJECTED_PROJECT_SPECIFIC: 4
+REJECTED_UPSTREAM:         0
+CONFLICT:                  0
+SUPERSEDED:                1
+─────────────────────────
+Last detect: Sprint 30 (2026-04-01)
+Last submit: Sprint 20 (2026-03-15)
+```
+
+## Pipeline Steps
+
+1. **DETECT**: Compare local playbook/hooks/scripts against upstream baseline.
+   Sources: playbook diffs, hook diffs, script diffs, validated lessons (Result: improved),
+   hook intercept patterns, quality trends. Output: CANDIDATE entries in evolution.jsonl.
+
+2. **STAGE**: Interactive classification. Each CANDIDATE presented with diff preview
+   and heuristic recommendation. User decides: [S]tage / [R]eject / [K]eep / [E]dit.
+   STAGED entries produce patches in `.selfmodel/state/evolution-staging/<evo-id>/`.
+
+3. **SUBMIT**: Human-approved PR creation. Pre-checks (shellcheck, path audit, patch
+   applicability) → PR preview → human approval gate → `gh pr create`. PR template
+   includes evidence table from evolution.jsonl entries.
+
+4. **TRACK**: Monitor submitted PRs. ACCEPTED / REJECTED_UPSTREAM / CONFLICT.
+   CONFLICT triggers SUPERSEDE flow: old entry marked, new CANDIDATE created.
+
+## Generalizability Heuristics
+
+Five heuristics score each candidate (0.0 to 1.0):
+
+1. **PATH_DETECTION** — Absolute paths outside examples → project-specific
+2. **PROJECT_NAME_DETECTION** — Project name in logic/strings → project-specific
+3. **GENERIC_PATTERN** — New section without project nouns → generalizable
+4. **HOOK_FIX** — Hook change + intercept log false positives → generalizable
+5. **SCORING_CALIBRATION** — Threshold change + quality.jsonl trend → generalizable
+
+## Safety Rules
+
+- Human MUST approve before any PR submission (SUBMIT has mandatory gate)
+- Detection is read-only apart from appending CANDIDATE entries to evolution.jsonl
+- Never submit project-specific paths, names, or credentials
+- All .sh patches must pass shellcheck
+- evolution.jsonl is append-only (no deletions)
+- Upstream conflict → SUPERSEDE, never force push
+
+## State Files
+
+| File | Purpose |
+|------|---------|
+| `.selfmodel/state/evolution.jsonl` | All evolution entries (append-only) |
+| `.selfmodel/state/evolution-staging/<evo-id>/` | Patch files for STAGED entries |
+| `.selfmodel/state/upstream-baseline.sha` | Upstream reference point |
+| `.selfmodel/state/team.json` → `evolution` | Persistent counters and timestamps |
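
Illustrative sketch (not part of the patch): the `--status` summary described in `commands/evolve.md` could be derived from the append-only `evolution.jsonl` roughly as follows. The entry schema is not shown in this diff, so the field names `id` and `status` are hypothetical, as is the rule that the most recent line for a given `id` carries its current state. That rule is one way to reconcile "append-only" with the status updates TRACK performs. `tally_status` and `render` are names invented for this example.

```python
import json
from collections import Counter
from typing import Dict

# Lifecycle states, in the order the --status example lists them.
STATES = [
    "CANDIDATE",
    "STAGED",
    "SUBMITTED",
    "ACCEPTED",
    "REJECTED_PROJECT_SPECIFIC",
    "REJECTED_UPSTREAM",
    "CONFLICT",
    "SUPERSEDED",
]


def tally_status(jsonl_text: str) -> Dict[str, int]:
    """Count entries per state, letting later lines override earlier ones."""
    latest = {}
    for raw in jsonl_text.splitlines():
        raw = raw.strip()
        if not raw:
            continue
        entry = json.loads(raw)
        # Assumed schema: {"id": ..., "status": ...}
        latest[entry["id"]] = entry["status"]
    counts = Counter(latest.values())
    return {state: counts.get(state, 0) for state in STATES}


def render(counts: Dict[str, int]) -> str:
    """Format counts as aligned 'STATE: n' lines."""
    width = max(len(state) for state in counts) + 1  # +1 for the colon
    return "\n".join(
        f"{(state + ':').ljust(width)} {n}" for state, n in counts.items()
    )
```

Because nothing is rewritten in place, the full history of each entry stays in the file, and the summary is a pure function of the log.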