fix(rfspec): persist results and support fire-and-forget polling

TheFactoriousDROID · TheFactoriousDROID · commit a7e2bf0bec19 · 2026-03-09T20:11:55.000-07:00
The run.sh script spawns three droid exec calls that take several minutes,
but the Execute tool times out at 60s. When that happens the script's EXIT
trap removes the temp dir and the results are lost.

Changes:
- Write model outputs to persistent ~/.factory/rfspec/runs/<id>/ instead of
  a temp dir
- Print RFSPEC_RUN_DIR path immediately so the agent captures it before timeout
- Write a done sentinel (STATUS=complete|failed) for polling
- Update SKILL.md (v1.3.0) with fire-and-forget + poll workflow instructions
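
The caller side of the fire-and-forget + poll workflow can be sketched roughly as below. This is a minimal illustration, not part of this commit: the function names (`rfspec_run_dir_from_log`, `rfspec_status`, `rfspec_wait`) and the interval/timeout arguments are hypothetical, and it assumes the Execute log file contains the script's stdout verbatim.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for the polling side; none of these names exist in run.sh.

# Extract the run directory from a log file containing the line
# RFSPEC_RUN_DIR=<path> (assumes the log holds run.sh stdout verbatim).
rfspec_run_dir_from_log() {
  local log_file="$1"
  sed -n 's/^RFSPEC_RUN_DIR=//p' "$log_file" | head -n 1
}

# Print the sentinel state for a run directory: PENDING, complete, or failed.
rfspec_status() {
  local run_dir="$1"
  local sentinel="${run_dir}/done"
  if [ ! -f "$sentinel" ]; then
    echo "PENDING"
    return 0
  fi
  # The sentinel holds a single line: STATUS=complete or STATUS=failed.
  local line
  line="$(head -n 1 "$sentinel")"
  echo "${line#STATUS=}"
}

# Poll until the sentinel appears or the timeout (in seconds) elapses.
# Returns 0 on complete, 1 on failed, 2 on timeout.
rfspec_wait() {
  local run_dir="$1"
  local interval="${2:-30}"
  local timeout="${3:-900}"
  local waited=0
  local status
  while :; do
    status="$(rfspec_status "$run_dir")"
    case "$status" in
      complete) return 0 ;;
      failed) return 1 ;;
    esac
    if [ "$waited" -ge "$timeout" ]; then
      return 2
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
}
```

A caller would typically pass the log path reported by Execute to `rfspec_run_dir_from_log`, then call `rfspec_wait "$run_dir" 30 900` and read `$run_dir/results.md` once it returns 0. The explicit timeout guards against a run that exits without ever writing the sentinel.
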
diff --git a/plugins/rfspec/skills/rfspec/SKILL.md b/plugins/rfspec/skills/rfspec/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: rfspec
-version: 1.2.0
+version: 1.3.0
 description: |
   Multi-model spec generation and synthesis. Use when the user wants to:
   - Get competing proposals from different AI models
@@ -17,20 +17,63 @@ Fan out a prompt to multiple models, compare their responses, and help the user
 
 ## Quick Reference
 
-| Task | Action |
-|------|--------|
-| Generate competing specs | `/rfspec <prompt>` |
-| Pick one result | Select via AskUser after comparison |
-| Synthesize results | Combine strongest elements when user chooses synthesis |
-| Save final spec | Write to `specs/active/YYYY-MM-DD-<slug>.md` |
+| Task                     | Action                                                 |
+| ------------------------ | ------------------------------------------------------ |
+| Generate competing specs | `/rfspec <prompt>` (background)                        |
+| Poll for results         | Check `<run_dir>/done` sentinel                        |
+| Pick one result          | Select via AskUser after comparison                    |
+| Synthesize results       | Combine strongest elements when user chooses synthesis |
+| Save final spec          | Write to `specs/active/YYYY-MM-DD-<slug>.md`           |
 
 ## Workflow
 
-1. Run `/rfspec <user's prompt>` -- fires parallel model calls, returns labeled options (A, B, C).
-2. Evaluate the results -- see [references/evaluation-guide.md](references/evaluation-guide.md).
-3. Present the choice to the user via AskUser.
-4. Present the selected or synthesized spec via ExitSpecMode for user review.
-5. Save to `specs/active/` only after the user approves in spec mode.
+The `/rfspec` command spawns three `droid exec` calls in parallel. These take
+several minutes, far exceeding the Execute tool timeout. You MUST use the
+fire-and-forget + poll pattern.
+
+### Step 1 -- Launch (background)
+
+Run the command with `fireAndForget=true`:
+
+```
+Execute: /rfspec <user's prompt>
+  fireAndForget: true
+```
+
+The script immediately prints `RFSPEC_RUN_DIR=<path>` to stdout, which Execute
+writes to a log file. Read that log (path printed by Execute) for the run directory.
+
+### Step 2 -- Poll for completion
+
+Tell the user the models are running and you will check back. Then poll:
+
+```
+Execute: cat <run_dir>/done 2>/dev/null || echo "PENDING"
+```
+
+Poll every 30-60 seconds. The sentinel contains `STATUS=complete` or
+`STATUS=failed`. While waiting, you can do other work or update the user on
+progress.
+
+### Step 3 -- Read results
+
+Once `done` exists, read the results:
+
+```
+Read: <run_dir>/results.md
+```
+
+This file contains all three model outputs as markdown sections (Option A, B, C).
+
+### Step 4 -- Evaluate and present
+
+Evaluate the results -- see [references/evaluation-guide.md](references/evaluation-guide.md).
+Present the choice to the user via AskUser.
+
+### Step 5 -- Finalize
+
+Present the selected or synthesized spec via ExitSpecMode for user review.
+Save to `specs/active/` only after the user approves in spec mode.
 
 ## Saving
 
@@ -63,29 +106,31 @@ Example 1: User wants competing specs
 User says: "Get me specs from multiple models for adding a dark mode toggle"
 Actions:
 
-1. Run `/rfspec add a dark mode toggle to the settings page with persistent user preference`
-2. Read Options A, B, C
-3. Compare: "Option A uses CSS variables with a React context, Option B uses Tailwind's dark class with localStorage, Option C uses a theme provider with system preference detection."
-4. Present choice via AskUser
-Result: User picks Option B, saved to `specs/active/2026-03-06-dark-mode-toggle.md`
+1. Execute `/rfspec add a dark mode toggle ...` with `fireAndForget=true`
+2. Read the background log to get `RFSPEC_RUN_DIR`
+3. Tell user: "Models are running, I'll check back shortly."
+4. Poll `<run_dir>/done` until `STATUS=complete`
+5. Read `<run_dir>/results.md`, compare Options A, B, C
+6. Present choice via AskUser
+   Result: User picks Option B, saved to `specs/active/2026-03-06-dark-mode-toggle.md`
 
 Example 2: User wants synthesis
 User says: "rfspec this: refactor the auth module to use JWT"
 Actions:
 
-1. Run `/rfspec refactor the auth module to use JWT`
-2. Compare results, noting Option A has better token rotation but Option C has cleaner middleware
+1. Launch background, poll for completion
+2. Read results, compare -- Option A has better token rotation, Option C has cleaner middleware
 3. User selects "Synthesize"
 4. Combine Option A's rotation logic with Option C's middleware structure
-Result: Synthesized spec saved to `specs/active/2026-03-06-auth-jwt-refactor.md`
+   Result: Synthesized spec saved to `specs/active/2026-03-06-auth-jwt-refactor.md`
 
 Example 3: All options rejected
 User says: "None of these work, they all miss the caching layer"
 Actions:
 
 1. Ask what's missing -- user explains the Redis caching requirement
 2. Offer to re-run: `/rfspec refactor auth module to use JWT with Redis session caching`
-Result: New round of specs generated with caching addressed
+   Result: New round of specs generated with caching addressed
 
 ## References
 
diff --git a/plugins/rfspec/skills/rfspec/scripts/run.sh b/plugins/rfspec/skills/rfspec/scripts/run.sh
@@ -2,8 +2,14 @@
 set -euo pipefail
 
 # ── guard: dependencies ──────────────────────────────────────────────
-command -v jq  >/dev/null 2>&1 || { echo "Error: jq is required but not installed. Install it with: brew install jq"; exit 1; }
-command -v droid >/dev/null 2>&1 || { echo "Error: droid CLI is required but not found on PATH."; exit 1; }
+command -v jq >/dev/null 2>&1 || {
+  echo "Error: jq is required but not installed. Install it with: brew install jq"
+  exit 1
+}
+command -v droid >/dev/null 2>&1 || {
+  echo "Error: droid CLI is required but not found on PATH."
+  exit 1
+}
 
 PROMPT="$*"
 
@@ -15,23 +21,42 @@ if [ -z "$PROMPT" ]; then
   exit 1
 fi
 
-# ── prompt ────────────────────────────────────────────────────────────
+# ── persistent output directory ──────────────────────────────────────
+# Results go to a stable path so the calling session can poll for them.
+# The temp dir is only used for the prompt file passed to droid exec.
+RFSPEC_HOME="${HOME}/.factory/rfspec/runs"
+RUN_ID="$(date +%Y%m%d-%H%M%S)-$$"
+OUTDIR="${RFSPEC_HOME}/${RUN_ID}"
+mkdir -p "$OUTDIR"
+
 TMPDIR=$(mktemp -d)
 trap 'rm -rf "$TMPDIR"' EXIT
 
-echo "$PROMPT" > "$TMPDIR/prompt.md"
+echo "$PROMPT" >"$TMPDIR/prompt.md"
+cp "$TMPDIR/prompt.md" "$OUTDIR/prompt.md"
+
+# Print the output path IMMEDIATELY so the calling agent can capture it
+# even if the Execute call times out before the models finish.
+echo "RFSPEC_RUN_DIR=${OUTDIR}"
+echo "Firing three model calls in parallel. Poll ${OUTDIR}/done for completion, then read ${OUTDIR}/results.md."
 
 # ── models (id, label, max reasoning) ────────────────────────────────
-MODEL_A="claude-opus-4-6";  LABEL_A="Opus 4.6";       RE_A="max"
-MODEL_B="gpt-5.4";          LABEL_B="GPT-5.4";        RE_B="xhigh"
-MODEL_C="gemini-3.1-pro-preview"; LABEL_C="Gemini 3.1 Pro"; RE_C="high"
+MODEL_A="claude-opus-4-6"
+LABEL_A="Opus 4.6"
+RE_A="max"
+MODEL_B="gpt-5.4"
+LABEL_B="GPT-5.4"
+RE_B="xhigh"
+MODEL_C="gemini-3.1-pro-preview"
+LABEL_C="Gemini 3.1 Pro"
+RE_C="high"
 
 # ── fire all three in parallel ───────────────────────────────────────
-droid exec -m "$MODEL_A" -r "$RE_A" --auto medium -f "$TMPDIR/prompt.md" -o json 2>/dev/null > "$TMPDIR/a.json" &
+droid exec -m "$MODEL_A" -r "$RE_A" --auto medium -f "$TMPDIR/prompt.md" -o json 2>/dev/null >"$OUTDIR/a.json" &
 PID_A=$!
-droid exec -m "$MODEL_B" -r "$RE_B" --auto medium -f "$TMPDIR/prompt.md" -o json 2>/dev/null > "$TMPDIR/b.json" &
+droid exec -m "$MODEL_B" -r "$RE_B" --auto medium -f "$TMPDIR/prompt.md" -o json 2>/dev/null >"$OUTDIR/b.json" &
 PID_B=$!
-droid exec -m "$MODEL_C" -r "$RE_C" --auto medium -f "$TMPDIR/prompt.md" -o json 2>/dev/null > "$TMPDIR/c.json" &
+droid exec -m "$MODEL_C" -r "$RE_C" --auto medium -f "$TMPDIR/prompt.md" -o json 2>/dev/null >"$OUTDIR/c.json" &
 PID_C=$!
 
 FAIL=""
@@ -47,24 +72,29 @@ extract() {
   fi
 }
 
-RESULT_A=$(extract "$TMPDIR/a.json")
-RESULT_B=$(extract "$TMPDIR/b.json")
-RESULT_C=$(extract "$TMPDIR/c.json")
+RESULT_A=$(extract "$OUTDIR/a.json")
+RESULT_B=$(extract "$OUTDIR/b.json")
+RESULT_C=$(extract "$OUTDIR/c.json")
 
-# ── present results ──────────────────────────────────────────────────
-echo "=== RFSPEC RESULTS ==="
-echo ""
-echo "User request: ${PROMPT}"
-echo ""
+# ── write results to persistent file ─────────────────────────────────
+{
+  echo "# rfspec results"
+  echo ""
+  echo "User request: ${PROMPT}"
+  echo ""
 
-[ -n "$RESULT_A" ] && printf '### Option A -- %s\n\n%s\n\n' "$LABEL_A" "$RESULT_A"
-[ -n "$RESULT_B" ] && printf '### Option B -- %s\n\n%s\n\n' "$LABEL_B" "$RESULT_B"
-[ -n "$RESULT_C" ] && printf '### Option C -- %s\n\n%s\n\n' "$LABEL_C" "$RESULT_C"
+  [ -n "$RESULT_A" ] && printf '## Option A -- %s\n\n%s\n\n' "$LABEL_A" "$RESULT_A"
+  [ -n "$RESULT_B" ] && printf '## Option B -- %s\n\n%s\n\n' "$LABEL_B" "$RESULT_B"
+  [ -n "$RESULT_C" ] && printf '## Option C -- %s\n\n%s\n\n' "$LABEL_C" "$RESULT_C"
 
-if [ -n "$FAIL" ]; then
-  echo "Note: The following models encountered errors: ${FAIL}"
-  echo ""
-fi
+  if [ -n "$FAIL" ]; then
+    echo "> **Note:** The following models encountered errors: ${FAIL}"
+    echo ""
+  fi
+} >"$OUTDIR/results.md"
+
+# ── also print to stdout (for cases where timeout is large enough) ───
+cat "$OUTDIR/results.md"
 
 SUCCESS=0
 [ -n "$RESULT_A" ] && SUCCESS=$((SUCCESS + 1))
@@ -74,20 +104,11 @@ SUCCESS=0
 if [ "$SUCCESS" -eq 0 ]; then
   echo "Error: All three models failed. Check that your droid CLI is authenticated"
   echo "and the models (${MODEL_A}, ${MODEL_B}, ${MODEL_C}) are available."
+  echo "STATUS=failed" >"$OUTDIR/done"
   exit 1
 fi
 
-echo "=== AGENT INSTRUCTIONS ==="
-echo "Analyze the specs above. Provide a brief comparison of each model's"
-echo "strengths and weaknesses. Then use the AskUser tool to offer:"
-echo "- Use Option A (${LABEL_A}) as-is"
-echo "- Use Option B (${LABEL_B}) as-is"
-echo "- Use Option C (${LABEL_C}) as-is"
-echo "- Synthesize a refined spec combining the best of all three"
-echo "- No -- none of these work (explain why)"
+# ── write completion sentinel ────────────────────────────────────────
+echo "STATUS=complete" >"$OUTDIR/done"
 echo ""
-echo "CRITICAL: Do NOT save the spec directly. After the user picks an option"
-echo "or requests synthesis, use the ExitSpecMode tool to present the final"
-echo "spec content for review. Only save to specs/active/YYYY-MM-DD-<slug>.md"
-echo "AFTER the user approves the spec in spec mode. If rejected, gather"
-echo "feedback and revise."
+echo "Results written to: ${OUTDIR}/results.md"