Commit a7e2bf0

fix(rfspec): persist results and support fire-and-forget polling
The run.sh script spawns three droid exec calls that take several minutes, but the Execute tool times out at 60s. When that happens the temp dir self-destructs and results are lost.

Changes:
- Write model outputs to persistent ~/.factory/rfspec/runs/<id>/ instead of a temp dir
- Print RFSPEC_RUN_DIR path immediately so the agent captures it before timeout
- Write a done sentinel (STATUS=complete|failed) for polling
- Update SKILL.md (v1.3.0) with fire-and-forget + poll workflow instructions
1 parent 6959152 commit a7e2bf0

2 files changed: +125 -59 lines changed

plugins/rfspec/skills/rfspec/SKILL.md

Lines changed: 66 additions & 21 deletions
@@ -1,6 +1,6 @@
 ---
 name: rfspec
-version: 1.2.0
+version: 1.3.0
 description: |
   Multi-model spec generation and synthesis. Use when the user wants to:
   - Get competing proposals from different AI models
@@ -17,20 +17,63 @@ Fan out a prompt to multiple models, compare their responses, and help the user
 
 ## Quick Reference
 
-| Task | Action |
-|------|--------|
-| Generate competing specs | `/rfspec <prompt>` |
-| Pick one result | Select via AskUser after comparison |
-| Synthesize results | Combine strongest elements when user chooses synthesis |
-| Save final spec | Write to `specs/active/YYYY-MM-DD-<slug>.md` |
+| Task                     | Action                                                 |
+| ------------------------ | ------------------------------------------------------ |
+| Generate competing specs | `/rfspec <prompt>` (background)                        |
+| Poll for results         | Check `<run_dir>/done` sentinel                        |
+| Pick one result          | Select via AskUser after comparison                    |
+| Synthesize results       | Combine strongest elements when user chooses synthesis |
+| Save final spec          | Write to `specs/active/YYYY-MM-DD-<slug>.md`           |
 
 ## Workflow
 
-1. Run `/rfspec <user's prompt>` -- fires parallel model calls, returns labeled options (A, B, C).
-2. Evaluate the results -- see [references/evaluation-guide.md](references/evaluation-guide.md).
-3. Present the choice to the user via AskUser.
-4. Present the selected or synthesized spec via ExitSpecMode for user review.
-5. Save to `specs/active/` only after the user approves in spec mode.
+The `/rfspec` command spawns three `droid exec` calls in parallel. These take
+several minutes, far exceeding the Execute tool timeout. You MUST use the
+fire-and-forget + poll pattern.
+
+### Step 1 -- Launch (background)
+
+Run the command with `fireAndForget=true`:
+
+```
+Execute: /rfspec <user's prompt>
+fireAndForget: true
+```
+
+The script immediately prints `RFSPEC_RUN_DIR=<path>` to its log file.
+Read the log file (path printed by Execute) to capture the run directory.
+
+### Step 2 -- Poll for completion
+
+Tell the user the models are running and you will check back. Then poll:
+
+```
+Execute: cat <run_dir>/done 2>/dev/null || echo "PENDING"
+```
+
+Poll every 30-60 seconds. The sentinel contains `STATUS=complete` or
+`STATUS=failed`. While waiting, you can do other work or let the user know
+progress.
+
+### Step 3 -- Read results
+
+Once `done` exists, read the results:
+
+```
+Read: <run_dir>/results.md
+```
+
+This file contains all three model outputs as markdown sections (Option A, B, C).
+
+### Step 4 -- Evaluate and present
+
+Evaluate the results -- see [references/evaluation-guide.md](references/evaluation-guide.md).
+Present the choice to the user via AskUser.
+
+### Step 5 -- Finalize
+
+Present the selected or synthesized spec via ExitSpecMode for user review.
+Save to `specs/active/` only after the user approves in spec mode.
 
 ## Saving
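The polling step in the workflow above can be sketched as a small shell loop. This is a minimal sketch and not part of the commit: the function name `wait_for_done`, its default timeout and interval, and the `timeout` status word are assumptions introduced for illustration. Only the `done` file name and the `STATUS=complete|failed` format come from the diff. The 45-second default is chosen to stay under the 60s Execute limit mentioned in the commit message.

```shell
#!/usr/bin/env bash
# Hypothetical helper: wait for <run_dir>/done and print its status word.
# Usage: wait_for_done <run_dir> [timeout_seconds] [interval_seconds]
wait_for_done() {
  local run_dir="$1"
  local timeout="${2:-45}"   # assumed default, kept below the 60s tool limit
  local interval="${3:-15}"  # assumed default for a single short poll window
  local waited=0
  local line

  while :; do
    # -s: the sentinel exists and is non-empty, so a half-created file is skipped.
    if [ -s "$run_dir/done" ]; then
      line="$(head -n 1 "$run_dir/done")"
      printf '%s\n' "${line#STATUS=}"   # prints "complete" or "failed"
      return 0
    fi
    # Give up once the time budget is spent; the caller can simply call again.
    [ "$waited" -ge "$timeout" ] && break
    sleep "$interval"
    waited=$((waited + interval))
  done

  printf 'timeout\n'
  return 1
}
```

A caller might run `status="$(wait_for_done "$RFSPEC_RUN_DIR")"` repeatedly and read `results.md` only once the status is `complete`.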

@@ -63,29 +106,31 @@ Example 1: User wants competing specs
 User says: "Get me specs from multiple models for adding a dark mode toggle"
 Actions:
 
-1. Run `/rfspec add a dark mode toggle to the settings page with persistent user preference`
-2. Read Options A, B, C
-3. Compare: "Option A uses CSS variables with a React context, Option B uses Tailwind's dark class with localStorage, Option C uses a theme provider with system preference detection."
-4. Present choice via AskUser
-Result: User picks Option B, saved to `specs/active/2026-03-06-dark-mode-toggle.md`
+1. Execute `/rfspec add a dark mode toggle ...` with `fireAndForget=true`
+2. Read the background log to get `RFSPEC_RUN_DIR`
+3. Tell user: "Models are running, I'll check back shortly."
+4. Poll `<run_dir>/done` until `STATUS=complete`
+5. Read `<run_dir>/results.md`, compare Options A, B, C
+6. Present choice via AskUser
+   Result: User picks Option B, saved to `specs/active/2026-03-06-dark-mode-toggle.md`
 
 Example 2: User wants synthesis
 User says: "rfspec this: refactor the auth module to use JWT"
 Actions:
 
-1. Run `/rfspec refactor the auth module to use JWT`
-2. Compare results, noting Option A has better token rotation but Option C has cleaner middleware
+1. Launch background, poll for completion
+2. Read results, compare -- Option A has better token rotation, Option C has cleaner middleware
 3. User selects "Synthesize"
 4. Combine Option A's rotation logic with Option C's middleware structure
-Result: Synthesized spec saved to `specs/active/2026-03-06-auth-jwt-refactor.md`
+   Result: Synthesized spec saved to `specs/active/2026-03-06-auth-jwt-refactor.md`
 
 Example 3: All options rejected
 User says: "None of these work, they all miss the caching layer"
 Actions:
 
 1. Ask what's missing -- user explains the Redis caching requirement
 2. Offer to re-run: `/rfspec refactor auth module to use JWT with Redis session caching`
-Result: New round of specs generated with caching addressed
+   Result: New round of specs generated with caching addressed
 
 ## References
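Step 2 of Example 1 ("Read the background log to get `RFSPEC_RUN_DIR`") could look roughly like the following. It is a sketch under assumptions: `extract_run_dir` is a hypothetical name, and the log is assumed to be a plain-text file holding the script's stdout, with the `RFSPEC_RUN_DIR=<path>` line appearing once at the start of a line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the run directory recorded in a captured log.
# Usage: extract_run_dir <log_file>
extract_run_dir() {
  local log_file="$1"
  local line
  # Take the first matching line; return non-zero if the marker is absent,
  # for example when the log is read before the script has printed anything.
  line="$(grep -m 1 '^RFSPEC_RUN_DIR=' "$log_file")" || return 1
  # Strip the key and keep only the path.
  printf '%s\n' "${line#RFSPEC_RUN_DIR=}"
}
```

A non-zero return lets the caller tell "marker not printed yet" apart from an empty path, so it can re-read the log a moment later.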

plugins/rfspec/skills/rfspec/scripts/run.sh

Lines changed: 59 additions & 38 deletions
@@ -2,8 +2,14 @@
 set -euo pipefail
 
 # ── guard: dependencies ──────────────────────────────────────────────
-command -v jq >/dev/null 2>&1 || { echo "Error: jq is required but not installed. Install it with: brew install jq"; exit 1; }
-command -v droid >/dev/null 2>&1 || { echo "Error: droid CLI is required but not found on PATH."; exit 1; }
+command -v jq >/dev/null 2>&1 || {
+  echo "Error: jq is required but not installed. Install it with: brew install jq"
+  exit 1
+}
+command -v droid >/dev/null 2>&1 || {
+  echo "Error: droid CLI is required but not found on PATH."
+  exit 1
+}
 
 PROMPT="$*"

@@ -15,23 +21,42 @@ if [ -z "$PROMPT" ]; then
   exit 1
 fi
 
-# ── prompt ────────────────────────────────────────────────────────────
+# ── persistent output directory ──────────────────────────────────────
+# Results go to a stable path so the calling session can poll for them.
+# The temp dir is only used for the prompt file passed to droid exec.
+RFSPEC_HOME="${HOME}/.factory/rfspec/runs"
+RUN_ID="$(date +%Y%m%d-%H%M%S)-$$"
+OUTDIR="${RFSPEC_HOME}/${RUN_ID}"
+mkdir -p "$OUTDIR"
+
 TMPDIR=$(mktemp -d)
 trap 'rm -rf "$TMPDIR"' EXIT
 
-echo "$PROMPT" > "$TMPDIR/prompt.md"
+echo "$PROMPT" >"$TMPDIR/prompt.md"
+cp "$TMPDIR/prompt.md" "$OUTDIR/prompt.md"
+
+# Print the output path IMMEDIATELY so the calling agent can capture it
+# even if the Execute call times out before the models finish.
+echo "RFSPEC_RUN_DIR=${OUTDIR}"
+echo "Firing three model calls in parallel. Poll ${OUTDIR}/results.md for output."
 
 # ── models (id, label, max reasoning) ────────────────────────────────
-MODEL_A="claude-opus-4-6"; LABEL_A="Opus 4.6"; RE_A="max"
-MODEL_B="gpt-5.4"; LABEL_B="GPT-5.4"; RE_B="xhigh"
-MODEL_C="gemini-3.1-pro-preview"; LABEL_C="Gemini 3.1 Pro"; RE_C="high"
+MODEL_A="claude-opus-4-6"
+LABEL_A="Opus 4.6"
+RE_A="max"
+MODEL_B="gpt-5.4"
+LABEL_B="GPT-5.4"
+RE_B="xhigh"
+MODEL_C="gemini-3.1-pro-preview"
+LABEL_C="Gemini 3.1 Pro"
+RE_C="high"
 
 # ── fire all three in parallel ───────────────────────────────────────
-droid exec -m "$MODEL_A" -r "$RE_A" --auto medium -f "$TMPDIR/prompt.md" -o json 2>/dev/null > "$TMPDIR/a.json" &
+droid exec -m "$MODEL_A" -r "$RE_A" --auto medium -f "$TMPDIR/prompt.md" -o json 2>/dev/null >"$OUTDIR/a.json" &
 PID_A=$!
-droid exec -m "$MODEL_B" -r "$RE_B" --auto medium -f "$TMPDIR/prompt.md" -o json 2>/dev/null > "$TMPDIR/b.json" &
+droid exec -m "$MODEL_B" -r "$RE_B" --auto medium -f "$TMPDIR/prompt.md" -o json 2>/dev/null >"$OUTDIR/b.json" &
 PID_B=$!
-droid exec -m "$MODEL_C" -r "$RE_C" --auto medium -f "$TMPDIR/prompt.md" -o json 2>/dev/null > "$TMPDIR/c.json" &
+droid exec -m "$MODEL_C" -r "$RE_C" --auto medium -f "$TMPDIR/prompt.md" -o json 2>/dev/null >"$OUTDIR/c.json" &
 PID_C=$!
 
 FAIL=""
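The run-directory scheme in the hunk above (second-resolution timestamp plus the shell's PID under a fixed base path) can be isolated into a helper for experimentation. This is only a sketch: `make_run_dir` and its optional base-directory argument are hypothetical and do not appear in run.sh, which sets `RFSPEC_HOME`, `RUN_ID`, and `OUTDIR` inline.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the RUN_ID scheme used in run.sh.
# Usage: make_run_dir [base_dir]
make_run_dir() {
  # Default base matches RFSPEC_HOME in the diff; the override is an assumption.
  local base="${1:-${HOME}/.factory/rfspec/runs}"
  local run_id
  # Timestamp to the second plus the shell PID, so two runs started in the
  # same second by different processes still get distinct directories.
  run_id="$(date +%Y%m%d-%H%M%S)-$$"
  mkdir -p "${base}/${run_id}"
  printf '%s\n' "${base}/${run_id}"
}
```

One detail worth noting from the same hunk: the startup banner tells the reader to poll `results.md`, whereas SKILL.md polls the `done` sentinel. Since `results.md` is written before `done`, following the sentinel appears to be the safer signal.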
@@ -47,24 +72,29 @@ extract() {
   fi
 }
 
-RESULT_A=$(extract "$TMPDIR/a.json")
-RESULT_B=$(extract "$TMPDIR/b.json")
-RESULT_C=$(extract "$TMPDIR/c.json")
+RESULT_A=$(extract "$OUTDIR/a.json")
+RESULT_B=$(extract "$OUTDIR/b.json")
+RESULT_C=$(extract "$OUTDIR/c.json")
 
-# ── present results ──────────────────────────────────────────────────
-echo "=== RFSPEC RESULTS ==="
-echo ""
-echo "User request: ${PROMPT}"
-echo ""
+# ── write results to persistent file ─────────────────────────────────
+{
+  echo "# rfspec results"
+  echo ""
+  echo "User request: ${PROMPT}"
+  echo ""
 
-[ -n "$RESULT_A" ] && printf '### Option A -- %s\n\n%s\n\n' "$LABEL_A" "$RESULT_A"
-[ -n "$RESULT_B" ] && printf '### Option B -- %s\n\n%s\n\n' "$LABEL_B" "$RESULT_B"
-[ -n "$RESULT_C" ] && printf '### Option C -- %s\n\n%s\n\n' "$LABEL_C" "$RESULT_C"
+  [ -n "$RESULT_A" ] && printf '## Option A -- %s\n\n%s\n\n' "$LABEL_A" "$RESULT_A"
+  [ -n "$RESULT_B" ] && printf '## Option B -- %s\n\n%s\n\n' "$LABEL_B" "$RESULT_B"
+  [ -n "$RESULT_C" ] && printf '## Option C -- %s\n\n%s\n\n' "$LABEL_C" "$RESULT_C"
 
-if [ -n "$FAIL" ]; then
-  echo "Note: The following models encountered errors: ${FAIL}"
-  echo ""
-fi
+  if [ -n "$FAIL" ]; then
+    echo "> **Note:** The following models encountered errors: ${FAIL}"
+    echo ""
+  fi
+} >"$OUTDIR/results.md"
+
+# ── also print to stdout (for cases where timeout is large enough) ───
+cat "$OUTDIR/results.md"
 
 SUCCESS=0
 [ -n "$RESULT_A" ] && SUCCESS=$((SUCCESS + 1))
@@ -74,20 +104,11 @@ SUCCESS=0
 if [ "$SUCCESS" -eq 0 ]; then
   echo "Error: All three models failed. Check that your droid CLI is authenticated"
   echo "and the models (${MODEL_A}, ${MODEL_B}, ${MODEL_C}) are available."
+  echo "STATUS=failed" >"$OUTDIR/done"
   exit 1
 fi
 
-echo "=== AGENT INSTRUCTIONS ==="
-echo "Analyze the specs above. Provide a brief comparison of each model's"
-echo "strengths and weaknesses. Then use the AskUser tool to offer:"
-echo "- Use Option A (${LABEL_A}) as-is"
-echo "- Use Option B (${LABEL_B}) as-is"
-echo "- Use Option C (${LABEL_C}) as-is"
-echo "- Synthesize a refined spec combining the best of all three"
-echo "- No -- none of these work (explain why)"
+# ── write completion sentinel ────────────────────────────────────────
+echo "STATUS=complete" >"$OUTDIR/done"
 echo ""
-echo "CRITICAL: Do NOT save the spec directly. After the user picks an option"
-echo "or requests synthesis, use the ExitSpecMode tool to present the final"
-echo "spec content for review. Only save to specs/active/YYYY-MM-DD-<slug>.md"
-echo "AFTER the user approves the spec in spec mode. If rejected, gather"
-echo "feedback and revise."
+echo "Results written to: ${OUTDIR}/results.md"
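The sentinel is written on two explicit paths: all models failed, or normal completion. Because the script runs under `set -euo pipefail`, any other early exit would leave no `done` file, and a poller would see `PENDING` indefinitely. One possible hardening, shown here purely as a sketch and not as part of this commit, is to fold a fallback sentinel into the exit handler. The function name `finish_run` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical exit handler: guarantee a sentinel, then clean up the temp dir.
# Usage: finish_run <out_dir> <tmp_dir>
finish_run() {
  local out_dir="$1"
  local tmp_dir="$2"
  # Leave an existing sentinel alone; only fill in the missing case.
  [ -e "$out_dir/done" ] || echo "STATUS=failed" >"$out_dir/done"
  rm -rf "$tmp_dir"
}

# In run.sh this would take the place of the existing trap line, since a
# second `trap ... EXIT` replaces the first rather than adding to it:
#   trap 'finish_run "$OUTDIR" "$TMPDIR"' EXIT
```

With this in place the explicit `STATUS=failed` and `STATUS=complete` writes keep working unchanged, and the handler only acts when neither has run.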
