Northeastern-Electric-Racing · bracyw · Jun 1, 2026 · Jun 1, 2026 · Jun 1, 2026 · Jun 1, 2026
diff --git a/.claude/skills/address-pr-comments/SKILL.md b/.claude/skills/address-pr-comments/SKILL.md
@@ -0,0 +1,112 @@
+---
+name: address-pr-comments
+description: Fetch review comments on the current branch's PR, judge whether each (including outdated ones) still applies, and walk through fixes
+allowed-tools: Bash(git:*), Bash(gh pr:*), Bash(gh api:*)
+user-invocable: true
+---
+
+## Your task
+
+Walk through unresolved review feedback on the current branch's PR. For every comment — including ones GitHub has marked outdated — judge whether the underlying concern still applies to the current code, then propose and implement fixes. **Do not reply to comment threads.** The user will handle replies manually.
+
+Run each step in order. Stop and report if any step fails.
+
+### 1. Identify the PR
+
+```bash
+gh pr view --json number,title,headRefName,baseRefName
+```
+
+If no PR exists for the current branch, stop and tell the user there's nothing to address.
+
+Capture the PR number — you'll need it for every subsequent `gh api` call.
+
+### 2. Fetch all review feedback
+
+Pull the three comment surfaces GitHub exposes. You need all of them — reviewers use different ones and they don't overlap:
+
+```bash
+# Inline review comments (line-level, "Files changed" tab) — includes outdated ones
+gh api "repos/{owner}/{repo}/pulls/<PR_NUMBER>/comments" --paginate
+
+# Top-level PR conversation comments ("Conversation" tab)
+gh api "repos/{owner}/{repo}/issues/<PR_NUMBER>/comments" --paginate
+
+# Review summaries (approve/request-changes/comment with a top-level body)
+gh api "repos/{owner}/{repo}/pulls/<PR_NUMBER>/reviews" --paginate
+```
+
+Derive `{owner}/{repo}` from `gh repo view --json nameWithOwner -q .nameWithOwner`.
+
+For each inline review comment, note: `id`, `path`, `line` (or `original_line`), `body`, `user.login`, `commit_id`, `original_commit_id`, `diff_hunk`, `in_reply_to_id`, and the `position` / `original_position` fields. `position: null` means GitHub considers the comment **outdated**.
+
+### 3. Filter to unresolved, unique threads
+
+- Collapse reply chains by `in_reply_to_id` — keep the whole chain together and treat the latest message as the current state of the thread.
+- Drop threads that have been resolved (check the `resolved` field on review threads via GraphQL if needed — see fallback below).
+- Skip bot comments (dependabot, codecov, etc.) unless they flag something a human should act on.
+- Deduplicate comments that overlap with the Phase 1/2/3 review findings you already fixed — if a fix is already in the latest commit, note it as already-addressed and move on.
+
+**GraphQL fallback for resolved state** (REST doesn't expose it directly):
+
+```bash
+gh api graphql -f query='
+query($owner:String!, $repo:String!, $num:Int!) {
+  repository(owner:$owner, name:$repo) {
+    pullRequest(number:$num) {
+      reviewThreads(first:100) {
+        nodes { id isResolved isOutdated comments(first:20) { nodes { id databaseId body path line originalLine author { login } } } }
+      }
+    }
+  }
+}' -f owner=OWNER -f repo=REPO -F num=PR_NUMBER
+```
+
+This is the most reliable source: `isResolved` and `isOutdated` on each thread, plus all comments in order.
+
+### 4. Understand outdated comments (CRITICAL)
+
+An **outdated** comment is one whose original line no longer exists at the same position in HEAD — usually because the code around it was rewritten, reformatted, or moved. GitHub hides these behind a "show outdated" toggle, but **the reviewer's concern may still apply**. Do not auto-dismiss them.
+
+For every outdated comment, do this analysis:
+
+1. **Read the `diff_hunk`** — that's the exact code context the reviewer saw. It shows the lines immediately around the comment.
+2. **Search the current working tree** for the code the reviewer referenced. Use content-based search (Grep for a function name, variable, specific expression from the hunk), NOT line numbers. The code may have moved files.
+3. **Compare what the reviewer said to the current code** and classify the comment into one of:
+   - **Already addressed** — the specific change the reviewer wanted is present in HEAD. Mark it done, move on.
+   - **Still applies, same location** — the code exists nearby, reviewer's concern is still valid. Fix it.
+   - **Still applies, moved** — the same pattern exists elsewhere in the diff. Fix it at the new location.
+   - **Obsoleted by a larger rewrite** — the code the reviewer commented on no longer exists and the replacement doesn't have the same issue. Mark it done with a brief note.
+   - **Unclear** — can't tell without more context. Flag for the user to decide.
+
+State your classification explicitly for each outdated comment before proposing a fix. Do not silently skip outdated comments.
+
+### 5. Present the comment list
+
+Before making any changes, show the user a structured summary:
+
+- Group comments by `path`, then by reviewer
+- For each comment: quote the body (1-2 line excerpt is fine), show your classification, and propose the fix you'd make
+- Ask the user to confirm, skip, or modify each proposed fix before you edit code
+
+Use this format:
+
+```
+<file>:<line> — <reviewer>
+Comment: "<excerpt from body>"
+Status: <active | outdated>
+Classification: <one of the 5 above>
+Proposed fix: <what you'd change>
+```
+
+If there are many comments (>8), ask the user whether they want to go one-by-one or have you implement all the clearly-valid ones in a batch and only interrupt on unclear ones.
+
+### 6. Apply fixes
+
+Make the code changes. Commit at logical boundaries (not one commit per comment — group related fixes). Use the repo's commit convention (`/commit` skill).
+
+**Do not post replies to any comment thread.** The user will reply manually after reviewing your fixes. Your job ends at the code change + commit.
+
+### 7. Report
+
+List the comments addressed (with file:line and one-line summary of the fix), the ones you classified as already-addressed or obsoleted (with reasoning), and any you flagged as unclear. Include the commit SHAs so the user can reference them in their manual replies.
diff --git a/.claude/skills/caveman/SKILL.md b/.claude/skills/caveman/SKILL.md
@@ -0,0 +1,49 @@
+---
+name: caveman
+description: >
+  Ultra-compressed communication mode. Cuts token usage ~75% by dropping
+  filler, articles, and pleasantries while keeping full technical accuracy.
+  Use when user says "caveman mode", "talk like caveman", "use caveman",
+  "less tokens", "be brief", or invokes /caveman.
+---
+
+Respond terse like smart caveman. All technical substance stay. Only fluff die.
+
+## Persistence
+
+ACTIVE EVERY RESPONSE once triggered. No revert after many turns. No filler drift. Still active if unsure. Off only when user says "stop caveman" or "normal mode".
+
+## Rules
+
+Drop: articles (a/an/the), filler (just/really/basically/actually/simply), pleasantries (sure/certainly/of course/happy to), hedging. Fragments OK. Short synonyms (big not extensive, fix not "implement a solution for"). Abbreviate common terms (DB/auth/config/req/res/fn/impl). Strip conjunctions. Use arrows for causality (X -> Y). One word when one word enough.
+
+Technical terms stay exact. Code blocks unchanged. Errors quoted exact.
+
+Pattern: `[thing] [action] [reason]. [next step].`
+
+Not: "Sure! I'd be happy to help you with that. The issue you're experiencing is likely caused by..."
+Yes: "Bug in auth middleware. Token expiry check use `<` not `<=`. Fix:"
+
+### Examples
+
+**"Why React component re-render?"**
+
+> Inline obj prop -> new ref -> re-render. `useMemo`.
+
+**"Explain database connection pooling."**
+
+> Pool = reuse DB conn. Skip handshake -> fast under load.
+
+## Auto-Clarity Exception
+
+Drop caveman temporarily for: security warnings, irreversible action confirmations, multi-step sequences where fragment order risks misread, user asks to clarify or repeats question. Resume caveman after clear part done.
+
+Example -- destructive op:
+
+> **Warning:** This will permanently delete all rows in the `users` table and cannot be undone.
+>
+> ```sql
+> DROP TABLE users;
+> ```
+>
+> Caveman resume. Verify backup exist first.
diff --git a/.claude/skills/commit/SKILL.md b/.claude/skills/commit/SKILL.md
@@ -0,0 +1,36 @@
+---
+name: commit
+description: Stage and commit using this repo's commit message convention
+allowed-tools: Bash(git add:*), Bash(git status:*), Bash(git commit:*), Bash(git diff:*)
+user-invocable: true
+---
+
+## Context
+
+Current branch:
+`!git branch --show-current`
+
+Git status:
+`!git status`
+
+Staged + unstaged diff:
+`!git diff HEAD`
+
+Recent commits:
+`!git log --oneline -5`
+
+## Task
+
+1. Extract the ticket number from the current branch name using pattern: `{number}-{description}` — the leading number is the GitHub issue (e.g., branch `533-csv-upload-download-rules` → ticket `#533`)
+2. Review the diff and stage all relevant changed files (avoid staging unrelated or generated files)
+3. Write a commit message in the format: `#{ticket_number} - {concise description of changes}` — example: `#501 - fmt`
+
+   Keep the {concise description} caveman-terse (borrowed from the caveman skill): imperative, drop articles and filler, fragments fine, abbreviate common terms (fmt, refactor, deps, config), aim for 2-8 words. Prefer `#533 - add CSV upload endpoint` over `#533 - this commit adds a new endpoint for uploading CSV files`.
+4. Commit in a single operation using a HEREDOC for the message:
+   ```bash
+   git commit -m "$(cat <<'EOF'
+   #NNN - description here
+   EOF
+   )"
+   ```
+5. Run `git status` to confirm the commit succeeded
diff --git a/.claude/skills/diagnose/SKILL.md b/.claude/skills/diagnose/SKILL.md
@@ -0,0 +1,117 @@
+---
+name: diagnose
+description: Disciplined diagnosis loop for hard bugs and performance regressions. Reproduce → minimise → hypothesise → instrument → fix → regression-test. Use when user says "diagnose this" / "debug this", reports a bug, says something is broken/throwing/failing, or describes a performance regression.
+---
+
+# Diagnose
+
+A discipline for hard bugs. Skip phases only when explicitly justified.
+
+When exploring the codebase, use the project's domain glossary to get a clear mental model of the relevant modules, and check ADRs in the area you're touching.
+
+## Phase 1 — Build a feedback loop
+
+**This is the skill.** Everything else is mechanical. If you have a fast, deterministic, agent-runnable pass/fail signal for the bug, you will find the cause — bisection, hypothesis-testing, and instrumentation all just consume that signal. If you don't have one, no amount of staring at code will save you.
+
+Spend disproportionate effort here. **Be aggressive. Be creative. Refuse to give up.**
+
+### Ways to construct one — try them in roughly this order
+
+1. **Failing test** at whatever seam reaches the bug — unit, integration, e2e.
+2. **Curl / HTTP script** against a running dev server.
+3. **CLI invocation** with a fixture input, diffing stdout against a known-good snapshot.
+4. **Headless browser script** (Playwright / Puppeteer) — drives the UI, asserts on DOM/console/network.
+5. **Replay a captured trace.** Save a real network request / payload / event log to disk; replay it through the code path in isolation.
+6. **Throwaway harness.** Spin up a minimal subset of the system (one service, mocked deps) that exercises the bug code path with a single function call.
+7. **Property / fuzz loop.** If the bug is "sometimes wrong output", run 1000 random inputs and look for the failure mode.
+8. **Bisection harness.** If the bug appeared between two known states (commit, dataset, version), automate "boot at state X, check, repeat" so you can `git bisect run` it.
+9. **Differential loop.** Run the same input through old-version vs new-version (or two configs) and diff outputs.
+10. **HITL bash script.** Last resort. If a human must click, drive _them_ with `scripts/hitl-loop.template.sh` so the loop is still structured. Captured output feeds back to you.
+
+Build the right feedback loop, and the bug is 90% fixed.
+
+### Iterate on the loop itself
+
+Treat the loop as a product. Once you have _a_ loop, ask:
+
+- Can I make it faster? (Cache setup, skip unrelated init, narrow the test scope.)
+- Can I make the signal sharper? (Assert on the specific symptom, not "didn't crash".)
+- Can I make it more deterministic? (Pin time, seed RNG, isolate filesystem, freeze network.)
+
+A 30-second flaky loop is barely better than no loop. A 2-second deterministic loop is a debugging superpower.
+
+### Non-deterministic bugs
+
+The goal is not a clean repro but a **higher reproduction rate**. Loop the trigger 100×, parallelise, add stress, narrow timing windows, inject sleeps. A 50%-flake bug is debuggable; 1% is not — keep raising the rate until it's debuggable.
+
+### When you genuinely cannot build a loop
+
+Stop and say so explicitly. List what you tried. Ask the user for: (a) access to whatever environment reproduces it, (b) a captured artifact (HAR file, log dump, core dump, screen recording with timestamps), or (c) permission to add temporary production instrumentation. Do **not** proceed to hypothesise without a loop.
+
+Do not proceed to Phase 2 until you have a loop you believe in.
+
+## Phase 2 — Reproduce
+
+Run the loop. Watch the bug appear.
+
+Confirm:
+
+- [ ] The loop produces the failure mode the **user** described — not a different failure that happens to be nearby. Wrong bug = wrong fix.
+- [ ] The failure is reproducible across multiple runs (or, for non-deterministic bugs, reproducible at a high enough rate to debug against).
+- [ ] You have captured the exact symptom (error message, wrong output, slow timing) so later phases can verify the fix actually addresses it.
+
+Do not proceed until you reproduce the bug.
+
+## Phase 3 — Hypothesise
+
+Generate **3–5 ranked hypotheses** before testing any of them. Single-hypothesis generation anchors on the first plausible idea.
+
+Each hypothesis must be **falsifiable**: state the prediction it makes.
+
+> Format: "If <X> is the cause, then <changing Y> will make the bug disappear / <changing Z> will make it worse."
+
+If you cannot state the prediction, the hypothesis is a vibe — discard or sharpen it.
+
+**Show the ranked list to the user before testing.** They often have domain knowledge that re-ranks instantly ("we just deployed a change to #3"), or know hypotheses they've already ruled out. Cheap checkpoint, big time saver. Don't block on it — proceed with your ranking if the user is AFK.
+
+## Phase 4 — Instrument
+
+Each probe must map to a specific prediction from Phase 3. **Change one variable at a time.**
+
+Tool preference:
+
+1. **Debugger / REPL inspection** if the env supports it. One breakpoint beats ten logs.
+2. **Targeted logs** at the boundaries that distinguish hypotheses.
+3. Never "log everything and grep".
+
+**Tag every debug log** with a unique prefix, e.g. `[DEBUG-a4f2]`. Cleanup at the end becomes a single grep. Untagged logs survive; tagged logs die.
+
+**Perf branch.** For performance regressions, logs are usually wrong. Instead: establish a baseline measurement (timing harness, `performance.now()`, profiler, query plan), then bisect. Measure first, fix second.
+
+## Phase 5 — Fix + regression test
+
+Write the regression test **before the fix** — but only if there is a **correct seam** for it.
+
+A correct seam is one where the test exercises the **real bug pattern** as it occurs at the call site. If the only available seam is too shallow (single-caller test when the bug needs multiple callers, unit test that can't replicate the chain that triggered the bug), a regression test there gives false confidence.
+
+**If no correct seam exists, that itself is the finding.** Note it. The codebase architecture is preventing the bug from being locked down. Flag this for the next phase.
+
+If a correct seam exists:
+
+1. Turn the minimised repro into a failing test at that seam.
+2. Watch it fail.
+3. Apply the fix.
+4. Watch it pass.
+5. Re-run the Phase 1 feedback loop against the original (un-minimised) scenario.
+
+## Phase 6 — Cleanup + post-mortem
+
+Required before declaring done:
+
+- [ ] Original repro no longer reproduces (re-run the Phase 1 loop)
+- [ ] Regression test passes (or absence of seam is documented)
+- [ ] All `[DEBUG-...]` instrumentation removed (`grep` the prefix)
+- [ ] Throwaway prototypes deleted (or moved to a clearly-marked debug location)
+- [ ] The hypothesis that turned out correct is stated in the commit / PR message — so the next debugger learns
+
+**Then ask: what would have prevented this bug?** If the answer involves architectural change (no good test seam, tangled callers, hidden coupling) hand off to the `/improve-codebase-architecture` skill with the specifics. Make the recommendation **after** the fix is in, not before — you have more information now than when you started.
diff --git a/.claude/skills/diagnose/scripts/hitl-loop.template.sh b/.claude/skills/diagnose/scripts/hitl-loop.template.sh
@@ -0,0 +1,41 @@
+#!/usr/bin/env bash
+# Human-in-the-loop reproduction loop.
+# Copy this file, edit the steps below, and run it.
+# The agent runs the script; the user follows prompts in their terminal.
+#
+# Usage:
+#   bash hitl-loop.template.sh
+#
+# Two helpers:
+#   step "<instruction>"          → show instruction, wait for Enter
+#   capture VAR "<question>"      → show question, read response into VAR
+#
+# At the end, captured values are printed as KEY=VALUE for the agent to parse.
+
+set -euo pipefail
+
+step() {
+  printf '\n>>> %s\n' "$1"
+  read -r -p "    [Enter when done] " _
+}
+
+capture() {
+  local var="$1" question="$2" answer
+  printf '\n>>> %s\n' "$question"
+  read -r -p "    > " answer
+  printf -v "$var" '%s' "$answer"
+}
+
+# --- edit below ---------------------------------------------------------
+
+step "Open the app at http://localhost:3000 and sign in."
+
+capture ERRORED "Click the 'Export' button. Did it throw an error? (y/n)"
+
+capture ERROR_MSG "Paste the error message (or 'none'):"
+
+# --- edit above ---------------------------------------------------------
+
+printf '\n--- Captured ---\n'
+printf 'ERRORED=%s\n' "$ERRORED"
+printf 'ERROR_MSG=%s\n' "$ERROR_MSG"