Commit d007c4a

fix(rfspec): persist results and support fire-and-forget polling
The run.sh script spawns three droid exec calls that take several minutes, but the Execute tool times out at 60s. When that happens the temp dir self-destructs and results are lost.

Changes:
- Write model outputs to persistent ~/.factory/rfspec/runs/<id>/ instead of a temp dir
- Print RFSPEC_RUN_DIR path immediately so the agent captures it before timeout
- Write a done sentinel (STATUS=complete|failed) for polling
- Update SKILL.md (v1.3.0) with fire-and-forget + poll workflow instructions
1 parent 6959152 commit d007c4a
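
The run.sh diff is not shown on this page, so the following is only a minimal sketch of how the producer side described in the commit message could look. The function names, the `RFSPEC_HOME` override, and the timestamp-plus-PID run id are assumptions made for illustration; only the `~/.factory/rfspec/runs/<id>/` layout, the `RFSPEC_RUN_DIR=` line, and the `done` sentinel contents come from the commit message.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the run.sh side; not the code from this commit.

# start_run: create a persistent run directory and announce it right away.
start_run() {
  # RFSPEC_HOME is an assumed override so the sketch can run in a sandbox.
  local root="${RFSPEC_HOME:-$HOME/.factory/rfspec}/runs"
  # Assumed id scheme: timestamp plus the shell's PID.
  local run_id
  run_id="$(date +%Y%m%d-%H%M%S)-$$"
  RUN_DIR="${root}/${run_id}"
  mkdir -p "$RUN_DIR" || return 1
  # Print the path before any slow work so the caller can capture it
  # even if the calling tool times out later.
  echo "RFSPEC_RUN_DIR=${RUN_DIR}"
}

# finish_run EXIT_STATUS: write the done sentinel for pollers.
finish_run() {
  local status="failed"
  if [ "$1" -eq 0 ]; then
    status="complete"
  fi
  # Write to a temp name and rename, so a poller never reads a
  # half-written sentinel.
  printf 'STATUS=%s\n' "$status" >"${RUN_DIR}/done.tmp"
  mv "${RUN_DIR}/done.tmp" "${RUN_DIR}/done"
}
```

In such a layout the script would call `start_run` first, run the three model calls, and then call `finish_run` with their combined exit status. A caller that lost its connection can still find the results under the printed directory.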

3 files changed

Lines changed: 259 additions & 63 deletions

plugins/rfspec/commands/rfspec

Lines changed: 79 additions & 1 deletion
@@ -1,2 +1,80 @@
 #!/usr/bin/env bash
-exec "$(dirname "$0")/../skills/rfspec/scripts/run.sh" "$@"
+# Launch rfspec in background and return polling instructions immediately.
+# This avoids the Execute tool timeout killing the long-running model calls.
+
+SCRIPT_DIR="$(dirname "$0")"
+RUN_SH="${SCRIPT_DIR}/../skills/rfspec/scripts/run.sh"
+
+if [ $# -eq 0 ]; then
+  exec "$RUN_SH"
+fi
+
+# Run the script in background, capturing output to its own log.
+# run.sh prints RFSPEC_RUN_DIR=<path> as its first line, so we wait
+# just long enough to capture that, then return control to the agent.
+BGLOG=$(mktemp /tmp/rfspec-bg-XXXXXXXX)
+nohup "$RUN_SH" "$@" >"$BGLOG" 2>&1 &
+BG_PID=$!
+
+# Wait briefly for run.sh to create the output dir and print the path
+sleep 2
+
+# Extract the run dir from the early output
+RUN_DIR=$(grep -m1 'RFSPEC_RUN_DIR=' "$BGLOG" 2>/dev/null | cut -d= -f2-)
+
+echo "User prompt: $*"
+echo ""
+
+if [ -z "$RUN_DIR" ]; then
+  echo "rfspec launched (PID ${BG_PID}), but run dir not yet available."
+  echo "Check log: ${BGLOG}"
+else
+  echo "RFSPEC_RUN_DIR=${RUN_DIR}"
+fi
+
+echo ""
+echo "rfspec is running in background (PID ${BG_PID})."
+echo "Background log: ${BGLOG}"
+echo ""
+cat <<'WORKFLOW'
+=== RFSPEC WORKFLOW ===
+
+Three models (Opus, GPT-5.4, Gemini) are generating competing spec proposals.
+
+STEP 1: Tell the user the models are running and results will be ready
+in a few minutes.
+
+STEP 2: Poll for completion every 30-60 seconds:
+WORKFLOW
+echo " cat ${RUN_DIR:-<run_dir>}/done 2>/dev/null || echo PENDING"
+cat <<'WORKFLOW'
+
+STEP 3: When done, read the results file:
+WORKFLOW
+echo " Read: ${RUN_DIR:-<run_dir>}/results.md"
+cat <<'WORKFLOW'
+
+STEP 4: EVALUATE -- compare the three specs against each other:
+- Architectural choices (patterns, libraries, data flow)
+- Scope differences (what each included or excluded)
+- Concrete vs. vague (which named actual files, functions, steps)
+- Risk areas (where one flagged something the others missed)
+Write a 2-4 sentence comparison per option. Compare, don't summarize.
+
+STEP 5: PRESENT the choice using AskUser with these options:
+- Use Option A as-is
+- Use Option B as-is
+- Use Option C as-is
+- Synthesize a refined spec combining the best of all three
+- None of these work
+
+STEP 6: FINALIZE based on user's choice:
+- If user picks one option: present it via ExitSpecMode for review.
+- If user picks synthesis: start from the strongest option as base,
+pull specific elements from others (name what and why), resolve
+contradictions. The result must be a single coherent document.
+- If user rejects all: ask what's missing, refine prompt, re-run.
+
+Only save to specs/active/YYYY-MM-DD-<slug>.md AFTER user approves
+the spec in spec mode. Do NOT save without approval.
+WORKFLOW
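
The wrapper above waits a fixed two seconds before looking for the run directory, then prints a one-shot polling command for the agent. As a hedged sketch of the same two steps with a bounded retry, the helpers below are hypothetical (`wait_for_run_dir` and `poll_done` do not appear in this commit) and assume the log format and sentinel path shown in the diff.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and defaults are illustrative only.

# wait_for_run_dir LOGFILE [TIMEOUT_SECONDS]
# Prints the run dir once an RFSPEC_RUN_DIR= line appears in LOGFILE.
# Returns 1 if the line has not appeared after TIMEOUT_SECONDS.
wait_for_run_dir() {
  local log="$1" timeout="${2:-10}" waited=0 dir=""
  while :; do
    # cut -f2- keeps any '=' characters that are part of the path itself.
    dir="$(grep -m1 '^RFSPEC_RUN_DIR=' "$log" 2>/dev/null | cut -d= -f2-)" || true
    if [ -n "$dir" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
}

# poll_done RUN_DIR
# Prints the sentinel contents (STATUS=complete or STATUS=failed),
# or PENDING while the sentinel file does not exist yet.
poll_done() {
  cat "$1/done" 2>/dev/null || echo PENDING
}
```

A caller could use `RUN_DIR="$(wait_for_run_dir "$BGLOG" 10)"` in place of the fixed sleep, which returns as soon as the path is printed and still gives up after a known bound. `poll_done` is the same one-line check the workflow text asks the agent to run every 30-60 seconds.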

plugins/rfspec/skills/rfspec/SKILL.md

Lines changed: 79 additions & 21 deletions
@@ -1,6 +1,6 @@
 ---
 name: rfspec
-version: 1.2.0
+version: 1.3.0
 description: |
   Multi-model spec generation and synthesis. Use when the user wants to:
   - Get competing proposals from different AI models
@@ -17,20 +17,63 @@ Fan out a prompt to multiple models, compare their responses, and help the user
 
 ## Quick Reference
 
-| Task | Action |
-|------|--------|
-| Generate competing specs | `/rfspec <prompt>` |
-| Pick one result | Select via AskUser after comparison |
-| Synthesize results | Combine strongest elements when user chooses synthesis |
-| Save final spec | Write to `specs/active/YYYY-MM-DD-<slug>.md` |
+| Task | Action |
+| ------------------------ | ------------------------------------------------------ |
+| Generate competing specs | `/rfspec <prompt>` (background) |
+| Poll for results | Check `<run_dir>/done` sentinel |
+| Pick one result | Select via AskUser after comparison |
+| Synthesize results | Combine strongest elements when user chooses synthesis |
+| Save final spec | Write to `specs/active/YYYY-MM-DD-<slug>.md` |
 
 ## Workflow
 
-1. Run `/rfspec <user's prompt>` -- fires parallel model calls, returns labeled options (A, B, C).
-2. Evaluate the results -- see [references/evaluation-guide.md](references/evaluation-guide.md).
-3. Present the choice to the user via AskUser.
-4. Present the selected or synthesized spec via ExitSpecMode for user review.
-5. Save to `specs/active/` only after the user approves in spec mode.
+The `/rfspec` command spawns three `droid exec` calls in parallel. These take
+several minutes, far exceeding the Execute tool timeout. You MUST use the
+fire-and-forget + poll pattern.
+
+### Step 1 -- Launch (background)
+
+Run the command with `fireAndForget=true`:
+
+```
+Execute: /rfspec <user's prompt>
+fireAndForget: true
+```
+
+The script immediately prints `RFSPEC_RUN_DIR=<path>` to its log file.
+Read the log file (path printed by Execute) to capture the run directory.
+
+### Step 2 -- Poll for completion
+
+Tell the user the models are running and you will check back. Then poll:
+
+```
+Execute: cat <run_dir>/done 2>/dev/null || echo "PENDING"
+```
+
+Poll every 30-60 seconds. The sentinel contains `STATUS=complete` or
+`STATUS=failed`. While waiting, you can do other work or let the user know
+progress.
+
+### Step 3 -- Read results
+
+Once `done` exists, read the results:
+
+```
+Read: <run_dir>/results.md
+```
+
+This file contains all three model outputs as markdown sections (Option A, B, C).
+
+### Step 4 -- Evaluate and present
+
+Evaluate the results -- see [references/evaluation-guide.md](references/evaluation-guide.md).
+Present the choice to the user via AskUser.
+
+### Step 5 -- Finalize
+
+Present the selected or synthesized spec via ExitSpecMode for user review.
+Save to `specs/active/` only after the user approves in spec mode.
 
 ## Saving
 
@@ -43,6 +86,19 @@ specs/active/YYYY-MM-DD-<slug>.md
 
 Where `<slug>` is a short kebab-case name derived from the topic.
 
+## Resuming from slash command
+
+If you are loading this skill after `/rfspec` already ran (the slash command told
+you to invoke `Skill: rfspec`), you already have the run directory. Pick up from
+Step 3:
+
+1. Read `<run_dir>/results.md` to get the model outputs.
+2. Follow Step 4 (evaluate and present) and Step 5 (finalize) below.
+
+The `results.md` file includes embedded agent instructions as a fallback, but
+prefer the full workflow in this document -- it covers the evaluation guide,
+saving rules, and rejection handling that the embedded version omits.
+
 ## Pitfalls
 
 - Don't summarize each option individually -- compare them against each other.
@@ -63,29 +119,31 @@ Example 1: User wants competing specs
 User says: "Get me specs from multiple models for adding a dark mode toggle"
 Actions:
 
-1. Run `/rfspec add a dark mode toggle to the settings page with persistent user preference`
-2. Read Options A, B, C
-3. Compare: "Option A uses CSS variables with a React context, Option B uses Tailwind's dark class with localStorage, Option C uses a theme provider with system preference detection."
-4. Present choice via AskUser
-Result: User picks Option B, saved to `specs/active/2026-03-06-dark-mode-toggle.md`
+1. Execute `/rfspec add a dark mode toggle ...` with `fireAndForget=true`
+2. Read the background log to get `RFSPEC_RUN_DIR`
+3. Tell user: "Models are running, I'll check back shortly."
+4. Poll `<run_dir>/done` until `STATUS=complete`
+5. Read `<run_dir>/results.md`, compare Options A, B, C
+6. Present choice via AskUser
+Result: User picks Option B, saved to `specs/active/2026-03-06-dark-mode-toggle.md`
 
 Example 2: User wants synthesis
 User says: "rfspec this: refactor the auth module to use JWT"
 Actions:
 
-1. Run `/rfspec refactor the auth module to use JWT`
-2. Compare results, noting Option A has better token rotation but Option C has cleaner middleware
+1. Launch background, poll for completion
+2. Read results, compare -- Option A has better token rotation, Option C has cleaner middleware
 3. User selects "Synthesize"
 4. Combine Option A's rotation logic with Option C's middleware structure
-Result: Synthesized spec saved to `specs/active/2026-03-06-auth-jwt-refactor.md`
+Result: Synthesized spec saved to `specs/active/2026-03-06-auth-jwt-refactor.md`
 
 Example 3: All options rejected
 User says: "None of these work, they all miss the caching layer"
 Actions:
 
 1. Ask what's missing -- user explains the Redis caching requirement
 2. Offer to re-run: `/rfspec refactor auth module to use JWT with Redis session caching`
-Result: New round of specs generated with caching addressed
+Result: New round of specs generated with caching addressed
 
 ## References
 