
Spec 22: outcome attribution v2 — link sessions → commits → PRs → CI → downstream touches #94

@0bserver07

Goal

Build the link table that turns "this session existed" into "this session produced commit X, in PR Y, which CI passed, which has been touched 3 times since (twice to fix bugs)." The single feature that makes outcome metrics trustworthy.

Why now

Yield tab today is per-cwd git log correlation — coarse, no PR awareness, no downstream-touch awareness. Comparative benchmark (Spec 26) and risk recommender (Spec 16) both need real attribution to mean anything.

Schema

v019: adds commit_session_link and extends pr_outcomes:

CREATE TABLE commit_session_link (
  commit_sha TEXT NOT NULL,
  repo_slug TEXT NOT NULL,
  session_id TEXT NOT NULL,
  link_type TEXT NOT NULL,        -- 'authored' | 'co-authored' | 'inferred-by-touch'
  confidence REAL NOT NULL,       -- [0, 1]
  evidence_json TEXT,             -- which files / messages established the link
  established_ts TEXT NOT NULL,
  PRIMARY KEY (commit_sha, repo_slug, session_id)
);
CREATE INDEX idx_csl_session ON commit_session_link(session_id);
CREATE INDEX idx_csl_repo_commit ON commit_session_link(repo_slug, commit_sha);

-- Extend pr_outcomes from spec 20:
ALTER TABLE pr_outcomes ADD COLUMN linked_session_ids TEXT;  -- JSON array
ALTER TABLE pr_outcomes ADD COLUMN downstream_touch_count INTEGER DEFAULT 0;
ALTER TABLE pr_outcomes ADD COLUMN last_downstream_touch_ts TEXT;

_ADD_COLUMN_GUARDS entry: 19: ("pr_outcomes", "linked_session_ids").
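
A minimal sketch of how the v019 migration could be made safe to re-run, assuming the migration runner uses the guard entry to skip the ALTER statements once the guarded column exists. The helper names (`column_exists`, `apply_v019`) and the dict shape of `_ADD_COLUMN_GUARDS` are hypothetical; only the SQL comes from the schema above.

```python
import sqlite3

# Assumed shape: schema version -> (table, column) that proves the ALTERs already ran.
_ADD_COLUMN_GUARDS = {19: ("pr_outcomes", "linked_session_ids")}

V019_CREATE = """
CREATE TABLE IF NOT EXISTS commit_session_link (
  commit_sha TEXT NOT NULL,
  repo_slug TEXT NOT NULL,
  session_id TEXT NOT NULL,
  link_type TEXT NOT NULL,
  confidence REAL NOT NULL,
  evidence_json TEXT,
  established_ts TEXT NOT NULL,
  PRIMARY KEY (commit_sha, repo_slug, session_id)
);
CREATE INDEX IF NOT EXISTS idx_csl_session ON commit_session_link(session_id);
CREATE INDEX IF NOT EXISTS idx_csl_repo_commit ON commit_session_link(repo_slug, commit_sha);
"""

V019_ALTERS = [
    "ALTER TABLE pr_outcomes ADD COLUMN linked_session_ids TEXT",
    "ALTER TABLE pr_outcomes ADD COLUMN downstream_touch_count INTEGER DEFAULT 0",
    "ALTER TABLE pr_outcomes ADD COLUMN last_downstream_touch_ts TEXT",
]


def column_exists(conn, table, column):
    """Return True if `table` already has `column` (PRAGMA table_info row[1] is the name)."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


def apply_v019(conn):
    """Create the link table and add the pr_outcomes columns at most once."""
    conn.executescript(V019_CREATE)
    table, column = _ADD_COLUMN_GUARDS[19]
    if not column_exists(conn, table, column):
        for statement in V019_ALTERS:
            conn.execute(statement)
    conn.commit()
```

SQLite has no `ADD COLUMN IF NOT EXISTS`, which is why the guard checks `PRAGMA table_info` before altering.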

User-visible surface

  • CLI: stackunderflow attribute commit <sha> → "this commit came from session X (worked, agent: claude-opus-4-7, $0.42)".
  • CLI: stackunderflow attribute session <id> → "this session produced commits A, B; merged in PR #42; CI passed; touched 3 times since (1 revert)".
  • API: GET /api/attribution/commit/{sha} and /api/attribution/session/{id}.
  • Meta-agent tools: attribute_commit(sha) and attribute_session(id).
  • UI: extend Yield tab with PR / CI / "still good after N days" columns.
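
The read path behind these surfaces might look roughly like the sketch below: a lookup over commit_session_link that hides links under the 0.5 default threshold. The function name `lookup_commit_links`, the explicit `repo_slug` argument, and the returned dict shape are assumptions for illustration, not the planned API.

```python
import json
import sqlite3

MIN_CONFIDENCE = 0.5  # default threshold for user-facing surfaces


def lookup_commit_links(conn, sha, repo_slug, min_confidence=MIN_CONFIDENCE):
    """Return the sessions linked to a commit, strongest link first."""
    rows = conn.execute(
        """
        SELECT session_id, link_type, confidence, evidence_json
        FROM commit_session_link
        WHERE commit_sha = ? AND repo_slug = ? AND confidence >= ?
        ORDER BY confidence DESC, session_id
        """,
        (sha, repo_slug, min_confidence),
    ).fetchall()
    return [
        {
            "session_id": session_id,
            "link_type": link_type,
            "confidence": confidence,
            # evidence_json is nullable in the schema, so decode only when present.
            "evidence": json.loads(evidence) if evidence else None,
        }
        for session_id, link_type, confidence, evidence in rows
    ]
```

Passing `min_confidence=0.0` would expose every stored link, which may be useful for debugging the heuristics.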

Implementation plan

  1. v019 migration.
  2. New service stackunderflow/services/attribution.py:
    • link_session_to_commit(conn, session_id) — heuristic: walk session messages, extract any git commit -m Bash calls + their resulting commit shas (pull from messages.raw_json's tool_result blocks); for sessions without explicit commits, infer from "session edited file F at time T; commit C touched F at time T+ε".
    • link_pr_to_sessions(conn, pr_id) — match by branch name (sessions that ran git checkout -b <branch> or git push origin <branch>).
    • downstream_touch_count(conn, repo_slug, file_path, since_ts) — via the same git data Yield uses.
  3. Periodic backfill (CLI: stackunderflow attribute backfill --since 30d) — runs on every PR row + every commit-bearing session.
  4. Surface in API + meta-agent + UI.
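
The text-matching part of step 2 could be sketched as follows. It assumes the Bash command string and the tool_result text have already been pulled out of messages.raw_json (that layout is not specified here, so the sketch takes plain strings). `extract_commits` and `extract_branches` are hypothetical helper names.

```python
import re

# `git commit` prints a summary line such as "[feat/x 1a2b3c4] message" or
# "[main (root-commit) 1a2b3c4] message". The sha is abbreviated; resolving it
# to a full sha (e.g. via `git rev-parse`) is left to the caller.
_COMMIT_LINE = re.compile(
    r"^\[(?P<branch>[^\]]+?)(?: \(root-commit\))? (?P<sha>[0-9a-f]{7,40})\] ",
    re.MULTILINE,
)

# Only the two command forms named in the plan are recognised.
_BRANCH_COMMANDS = (
    re.compile(r"\bgit\s+checkout\s+-b\s+(?P<branch>[^\s;&|]+)"),
    re.compile(
        r"\bgit\s+push\s+(?:(?:-u|--set-upstream)\s+)?origin\s+(?P<branch>[^\s;&|]+)"
    ),
)


def extract_commits(tool_output):
    """Return (branch, abbreviated_sha) pairs found in `git commit` output."""
    return [(m["branch"], m["sha"]) for m in _COMMIT_LINE.finditer(tool_output)]


def extract_branches(command):
    """Return branch names a shell command created or pushed, without duplicates."""
    found = []
    for pattern in _BRANCH_COMMANDS:
        for match in pattern.finditer(command):
            if match["branch"] not in found:
                found.append(match["branch"])
    return found
```

Regex matching like this will miss aliases, `git switch -c`, and refspecs such as `HEAD:feat/x`; those cases would presumably fall through to the inferred-by-touch path.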

Tests

  • Authored-link: session that runs git commit -m "msg" → assert link with confidence 1.0.
  • Inferred-link: session edits foo.py, commit C touches foo.py 5 min later, and no other session touched it → assert link with confidence ≥ 0.7.
  • PR-link: session pushes branch feat/x, PR opened from feat/x → assert link.
  • Downstream-touch: PR merged, file touched 3 times in next 30 days → count = 3.
  • Idempotency: re-running attribution doesn't duplicate links.
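
One way the idempotency requirement could be met is an upsert keyed on the table's primary key, sketched below. Keeping the strongest link on conflict is an assumed policy, and `record_link` is a hypothetical helper. The `ON CONFLICT ... DO UPDATE` syntax needs SQLite 3.24 or newer.

```python
import json
import sqlite3

UPSERT_LINK = """
INSERT INTO commit_session_link
  (commit_sha, repo_slug, session_id, link_type, confidence, evidence_json, established_ts)
VALUES (?, ?, ?, ?, ?, ?, ?)
ON CONFLICT (commit_sha, repo_slug, session_id) DO UPDATE SET
  link_type = excluded.link_type,
  confidence = excluded.confidence,
  evidence_json = excluded.evidence_json
WHERE excluded.confidence >= commit_session_link.confidence
"""


def record_link(conn, commit_sha, repo_slug, session_id, link_type,
                confidence, evidence, established_ts):
    """Insert a link, or refresh it when the new evidence is at least as strong.

    established_ts is deliberately not updated, so it keeps the first-seen time.
    """
    conn.execute(
        UPSERT_LINK,
        (commit_sha, repo_slug, session_id, link_type, confidence,
         json.dumps(evidence, sort_keys=True), established_ts),
    )
```

Re-running the backfill then leaves one row per (commit, repo, session), and a later low-confidence inference cannot overwrite an earlier authored link.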

Hard parts

  • Heuristic linking is squishy. Document the confidence rubric explicitly, and apply a default minimum-confidence threshold of 0.5 on user-facing surfaces.
  • Session-to-PR linking via branch name fails if the user works on main directly; in that case, fall back to link_type='inferred-by-touch' with low confidence.
  • Performance: git log per file scales poorly. Cache aggressively; backfill incrementally.
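
To make the rubric concrete, here is one possible scoring function for inferred-by-touch links. Every constant in it (the 30-minute window, the 0.8 to 0.5 linear decay, dividing by the number of competing sessions) is a placeholder chosen only so that the inferred-link test case above (5 minutes, no competing session) scores at least 0.7. It is not a rubric this spec has settled on.

```python
from datetime import datetime, timedelta

MIN_CONFIDENCE = 0.5              # default threshold for user-facing surfaces
MAX_GAP = timedelta(minutes=30)   # hypothetical: later commits are not linked at all


def inferred_confidence(edit_ts, commit_ts, competing_sessions):
    """Score an edit-then-commit pair in [0, 1].

    competing_sessions counts *other* sessions that edited the same file
    inside the window; each one dilutes the score.
    """
    gap = commit_ts - edit_ts
    if gap < timedelta(0) or gap > MAX_GAP:
        return 0.0
    # Linear decay from 0.8 (commit immediately after the edit) to 0.5 (at MAX_GAP).
    base = 0.8 - 0.3 * (gap / MAX_GAP)
    return round(base / (1 + competing_sessions), 3)
```

Under these placeholder numbers a single competing session is enough to push an inferred link below the 0.5 threshold, so it would be stored but hidden by default.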

Out of scope

  • Cross-repo attribution (defer — most users have one repo per session).
  • Auto-revert detection beyond the existing git log --grep "Revert" heuristic.
  • Real-time attribution as commits land (this is offline backfill).

Dependencies

  • Blocked by: Spec 20 (PR/CI ingest must be in place).
  • Builds on Yield tab's git helpers in services/yield_tracker.py.
  • Consumed by Spec 26 (comparative benchmark).

Estimated effort

Size XL — single agent, ~3-4 hr. The heuristics + their test coverage are the bulk.

Hard rules

  • DO NOT touch versions / CHANGELOG headings.
  • Pre-assigned schema slot: v019.
  • Branch: feat/outcome-attribution-v2 off main.

Metadata

Labels: size-xl (~3-4 hr agent run), spec (Spec/feature for an agent to implement), wave-3 (Wave 3: outcome attribution + grading)
