Spawned from the claude-code-002 fix: when the test instructs the human to "click into the test file (cc-002) to focus it" and "select exactly lines 1-2 (click line 1, shift-click end of line 2)", those steps are friction the test inflicts without test value. They can be replaced with three lines of code: open the document, show it, and set editor.selection programmatically (the After snippet under Recommended conversion pattern shows the full shape).
The survey looked at all 147 [assisted] tests across 20 files for the same anti-pattern. The pool is smaller than expected: only 7 tests have this problem. Most assisted tests already automate setup correctly and ask the human only for the action that genuinely needs them.
The 7 affected tests
builtInAiAssistants.test.ts
claude-code-004 at line 162 — file content 'line 1\nline 2\nline 3\n'. Selection: new vscode.Selection(0, 0, 1, 6). "Click into the test file (cc-004) to focus it" + "Select exactly lines 1-2".
claude-code-005 (cold step only) at line 205 — file content 'line 1\nline 2\nline 3\nline 4\n'. Selection: new vscode.Selection(0, 0, 1, 6). "Click into the test file (cc-005)" + "Select exactly lines 1-2".
The warm step of claude-code-005 already does it right — it sets editor.selection = new vscode.Selection(2, 0, 3, 7) programmatically. The cold step was missed when the file was written.
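For reference, the zero-based coordinate math behind these Selections can be sketched as a standalone helper. This is illustrative only: selectionArgsForLines is a hypothetical name, not a suite helper. It maps the 1-based "select lines X-Y" phrasing of the human instructions to the four arguments new vscode.Selection(...) takes.

```typescript
// Hypothetical helper: convert a 1-based inclusive line range (as phrased in
// the human instructions) to zero-based Selection constructor arguments.
function selectionArgsForLines(
  content: string,
  firstLine: number, // 1-based, inclusive
  lastLine: number,  // 1-based, inclusive
): [number, number, number, number] {
  const lines = content.split('\n');
  const endLine = lastLine - 1;          // zero-based index of the last line
  const endChar = lines[endLine].length; // extend to the end of that line
  return [firstLine - 1, 0, endLine, endChar];
}

// 'line 2' is 6 characters, so "lines 1-2" of the cc-004 fixture becomes
// (0, 0, 1, 6), the exact Selection named above for claude-code-004 and the
// cold step of claude-code-005.
const args = selectionArgsForLines('line 1\nline 2\nline 3\n', 1, 2); // [0, 0, 1, 6]
```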
clipboardPreservation.test.ts
clipboard-preservation-001 at line 134 — file content is 10 lines "line N content" (N=1..10). Selection: lines 2-4 → new vscode.Selection(1, 0, 3, document.lineAt(3).text.length) or simpler (1, 0, 4, 0).
clipboard-preservation-004 at line 198 — same 10-line content. Selection: lines 1-3 → (0, 0, 3, 0).
clipboard-preservation-005 at line 236 — same 10-line content. Instruction is non-deterministic ("Select 2 or 3 lines"); pin to lines 1-3 → (0, 0, 3, 0).
clipboard-preservation-009 at line 302 — same 10-line content. Instruction is non-deterministic ("Select a few lines"); pin to lines 1-3 → (0, 0, 3, 0).
This file is the largest cluster — 4 of the 5 manual clipboard-preservation tests share the same pattern. clipboard-preservation-005 and -009 are particularly notable because the human is told to "select 2 or 3 lines" / "select a few lines" — non-deterministic instructions. Pin to a specific range so assertions are stable across runs.
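To make the "or simpler (1, 0, 4, 0)" equivalence concrete, here is a standalone sketch (extract and Pos are hypothetical names for illustration, not suite helpers) showing that the two end positions differ only by the trailing newline:

```typescript
type Pos = { line: number; char: number };

// Hypothetical text-extraction helper: slice the flat document string between
// two zero-based (line, char) positions.
function extract(content: string, start: Pos, end: Pos): string {
  const lines = content.split('\n');
  const offset = (p: Pos) =>
    lines.slice(0, p.line).reduce((n, l) => n + l.length + 1, 0) + p.char;
  return content.slice(offset(start), offset(end));
}

// The 10-line fixture shared by the clipboard-preservation tests.
const content = Array.from({ length: 10 }, (_, i) => `line ${i + 1} content`).join('\n');

// Lines 2-4, two ways: end at the last character of line 4, or at column 0 of line 5.
const precise = extract(content, { line: 1, char: 0 }, { line: 3, char: 'line 4 content'.length });
const simpler = extract(content, { line: 1, char: 0 }, { line: 4, char: 0 });
// simpler === precise + '\n': both read as "lines 2-4"; (1, 0, 4, 0) is easier to write.
```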
coreSendCommands.test.ts
core-send-commands-r-l-003 at line 97 — file content is 10 lines "line N content". Selection: lines 1-3 → (0, 0, 3, 0). The expected link is ${relPath}#L1-L3, so the selection range must produce that. Pinning to exactly (0, 0, 3, 0) gives #L1-L3 deterministically.
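The selection-to-anchor mapping can be sketched as follows. lineAnchor is a hypothetical helper, and the convention that an end position at column 0 counts as ending on the previous line is an assumption inferred from (0, 0, 3, 0) needing to yield #L1-L3; the real formatting lives in the extension.

```typescript
// Hypothetical sketch: derive the #Lx-Ly anchor from zero-based selection
// coordinates. Assumption: an end at column 0 of line N ends on line N-1.
function lineAnchor(startLine: number, endLine: number, endChar: number): string {
  const lastLine = endChar === 0 && endLine > startLine ? endLine - 1 : endLine;
  return `#L${startLine + 1}-L${lastLine + 1}`;
}

const anchor = lineAnchor(0, 3, 0); // '#L1-L3', as core-send-commands-r-l-003 expects
```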
What the survey looked at and excluded
The survey explicitly excluded:
Pressing keybinding chords (e.g., "Press Cmd+R Cmd+L"). The chord itself can be argued to be part of the test contract for some TCs — though for claude-code-002 we replaced it with executeCommand(CMD_COPY_LINK_RELATIVE) and the test's contract is unchanged. There's a separate latent question of which TCs would benefit from this swap; out of scope here.
Visual verification (e.g., "confirm the link appears in Claude Code"). That IS the test's assertion when paired with waitForHumanVerdict.
Tests already covered by the closeQuickOpen automation survey (#557), which found 41 assisted tests convertible to fully automated; that work is tracked separately.
The bigger fish — a related finding worth flagging
While running these surveys, a more serious issue surfaced. claude-code-004 passed an integration run on issues/547 even though the human watched Claude Code receive nothing — because claude-code-004 uses waitForHuman ("click Cancel when done"), not waitForHumanVerdict ("click PASS or FAIL"). The log-based assertions only proved that ComposablePasteDestination.pasteLink and VscodeAdapter.pasteTextFromClipboard fired — not that content arrived in the webview. The human's "yeah I clicked Cancel" was treated as "yeah the test passed".
This is the silent-pass trap, and it's a different bug than the boring-setup one — but it's the same kind of test-rigor failure. Any [assisted] test that uses waitForHuman to verify a visible outcome is susceptible to it: the human can be looking at a broken UI and the test will still go green.
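The trap can be reduced to a toy sketch. These are NOT the real suite helpers (which are async and take a test id plus console steps); the synchronous stand-ins below only show where the assertion lives in each shape.

```typescript
type Verdict = 'PASS' | 'FAIL';

// Stand-in for waitForHuman: the human clicks Cancel, and whatever they
// observed is discarded. The test goes green unconditionally.
function waitForHumanStandIn(onCancel: () => void): 'green' {
  onCancel();
  return 'green';
}

// Stand-in for waitForHumanVerdict: the human's PASS/FAIL click IS the assertion.
function waitForHumanVerdictStandIn(click: () => Verdict): 'green' {
  if (click() === 'FAIL') {
    throw new Error('human reported the visible outcome is broken');
  }
  return 'green';
}

// Broken UI, human sees nothing arrive: waitForHuman still resolves green,
// while waitForHumanVerdict would throw and surface the failure.
const outcome = waitForHumanStandIn(() => {}); // 'green' — the silent pass
```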
Tests likely affected (rough scan, not a full audit):
claude-code-004 (confirmed silent-pass; the cold-paste test passed against a webview that received nothing)
claude-code-005 (warm step) — same waitForHuman pattern, same risk
clipboard-preservation-001/004/005/009 — instruction asks "verify the link should appear in terminal/Dummy AI" but only requires Cancel click; the clipboard assertion is partial coverage at best
All [assisted] tests in builtInAiAssistants.test.ts other than claude-code-002 (which uses waitForHumanVerdict)
A separate, focused audit is worth doing. Not folding it into this issue because the scope is different — but it should not get lost.
Recommended conversion pattern
For each of the 7 affected tests, the fix is mechanical:
Before:
```typescript
const fileUri = createWorkspaceFile('cc-004', 'line 1\nline 2\nline 3\n');
tmpFileUris.push(fileUri);
await openEditor(fileUri);
await settle();
// ... bind setup ...
await waitForHuman(
  'claude-code-004',
  'Cold paste: select lines 1-2 and press Cmd+R Cmd+L, verify link appears in Claude Code chat, then Cancel',
  [
    '1. Click into the test file (cc-004) to focus it',
    '2. Select exactly lines 1-2 (click line 1, shift-click end of line 2)',
    '3. Press Cmd+R Cmd+L — the RangeLink should appear in Claude Code chat input',
    '4. Visually confirm the link appears in Claude Code',
    '5. Press Cancel to continue (assertions happen automatically)',
  ],
);
```
After:
```typescript
const fileUri = createWorkspaceFile('cc-004', 'line 1\nline 2\nline 3\n');
tmpFileUris.push(fileUri);
const doc = await vscode.workspace.openTextDocument(fileUri);
const editor = await vscode.window.showTextDocument(doc);
editor.selection = new vscode.Selection(0, 0, 1, 6);
await settle();
// ... bind setup ...
await waitForHumanVerdict( // also flip waitForHuman → waitForHumanVerdict for a real assertion
  'claude-code-004',
  'Cold paste: press Cmd+R Cmd+L. Did the link appear in Claude Code chat?',
  [
    '1. Lines 1-2 are already selected in cc-004',
    '2. Press Cmd+R Cmd+L',
    '3. Click PASS if the RangeLink appeared in Claude Code chat input, FAIL otherwise',
  ],
);
```
(The waitForHumanVerdict swap is recommended but separately scoped — it fixes the silent-pass trap for these tests; the boring-setup fix is the smaller, more contained change.)
Implementer's checklist (per test)
For each of the 7 tests above:
Pre-focus and pre-select before any waitForHuman / waitForHumanVerdict call:
Place this AFTER createWorkspaceFile/openEditor and AFTER any bind setup (executeCommand(CMD_BIND_TO_*)), but BEFORE the human prompt.
Strip the boring instructions from consoleSteps. Remove the "Click into the test file" and "Select exactly lines X-Y" lines. Renumber the remaining steps.
Update the action string to reflect what the human still needs to do (typically "Press Cmd+R Cmd+L. Did the link appear in X?" or similar).
STRONGLY RECOMMENDED — flip waitForHuman to waitForHumanVerdict for these specific tests. All 7 are testing visual outcomes ("did the link appear in Claude Code / terminal / Dummy AI?"). Without waitForHumanVerdict they silent-pass when the visible outcome is broken — claude-code-004 did exactly this against an issues/547 regression. The broader waitForHuman audit is out of scope (separate issue), but these 7 specifically should be flipped as part of THIS change because they're the same set being touched.
Verify with pnpm test:release:with-extensions --grep "<test-id>" (per release-test-requirement in CLAUDE.md). The human still needs to interact (press the chord, click PASS/FAIL) — but the setup friction is gone.
Definition of done
All 7 tests above use programmatic focus + selection
All 7 tests use waitForHumanVerdict (not waitForHuman)
Each test's consoleSteps array is free of any "Click into the file" / "Select lines X-Y" instructions
pnpm test:release:with-extensions --grep "claude-code-004|claude-code-005|clipboard-preservation-001|clipboard-preservation-004|clipboard-preservation-005|clipboard-preservation-009|core-send-commands-r-l-003" passes when the human clicks PASS for each visual check
No QA YAML changes needed — these tests stay automated: assisted (the human verdict is still the assertion)
Reference implementation
The canonical "right way" to do this is claude-code-002 in packages/rangelink-vscode-extension/src/__integration-tests__/suite/builtInAiAssistants.test.ts (around line 94). It demonstrates:
Programmatic vscode.window.showTextDocument + editor.selection setup
waitForHumanVerdict with a PASS/FAIL contract
Log-based pre-assertions that fire before the verdict prompt for additional diagnostic value
Direct executeCommand(CMD_COPY_LINK_RELATIVE) invocation (replacing "press Cmd+R Cmd+L") — note that this last bit is OPTIONAL for the 7 tests in this issue; the keybinding-press can stay if it's part of the TC's intent. The setup automation is the focus here.
Out of scope
The closeQuickOpen automation survey (#557), which covers 41 assisted tests convertible to fully automated via closeQuickOpen.
The broader waitForHuman → waitForHumanVerdict audit across all 147 assisted tests (separate issue worth filing).
The underlying issues/547 regression that made claude-code-004 silent-pass against a broken UI (separate issue worth filing).
Bottom line
7 tests, 8 boring instructions, all clustered in 3 files. Mechanical fix. The boring-setup category is much smaller than expected — claude-code-002 was an outlier, not the tip of an iceberg.
The bigger lever from the same investigation is the waitForHuman → waitForHumanVerdict audit, which would catch tests that currently silent-pass against broken UIs. That deserves its own survey and is not the same change as the boring-setup conversion — but for these 7 tests specifically, the two fixes should land together (see Implementer's checklist step 4).
Pointers
The fix pattern in production: claude-code-002 in packages/rangelink-vscode-extension/src/__integration-tests__/suite/builtInAiAssistants.test.ts
The already-correct warm step: claude-code-005, which sets editor.selection = new vscode.Selection(2, 0, 3, 7) programmatically
The closeQuickOpen automation survey (#557): 41 tests convertible from assisted to fully automated via closeQuickOpen