Commit 5739829

DavidQ committed
Drive tool completion audit from Playwright tool-level validation results - PR_26124_011-gate-tools-by-playwright
1 parent 76ea304 commit 5739829

6 files changed

Lines changed: 301 additions & 8 deletions
Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
1+
# PR_26124_011 Report
2+
3+
## Purpose
4+
Use Playwright tool-level results as the source of truth for tool status in `tool_completion_audit.md`.
5+
6+
## Implementation
7+
- Added JSON Playwright output:
8+
- `tests/results/playwright-results.json`
9+
- Added sync script:
10+
- `scripts/update-tool-completion-audit-from-playwright.mjs`
11+
- Integrated sync into gate:
12+
- `scripts/run-workspace-v2-playwright-gate.mjs` now updates audit/report after each run.
13+
- Added generated tool validation report:
14+
- `docs/dev/reports/tool_validation_results.md`
15+
16+
## Mapping Rule
17+
- Tool test ownership is determined by `@<tool-id>` in Playwright test titles.
18+
- Each tool status is computed from mapped tests only:
19+
- all mapped tests pass => `PASS`
20+
- any mapped test fails => `FAIL`
21+
22+
## Outputs Updated By Sync
23+
- `docs/dev/reports/tool_completion_audit.md`
24+
- per-tool `Status` and `Exact failure reason` are now auto-derived from Playwright results.
25+
- gate evidence line now updates with pass/fail counts from the same results file.
26+
- `docs/dev/reports/tool_validation_results.md`
27+
- tool name
28+
- pass/fail
29+
- failing reason (if any)
30+
31+
## Validation
32+
- `node --check playwright.config.cjs`
33+
- `node --check scripts/run-workspace-v2-playwright-gate.mjs`
34+
- `node --check scripts/update-tool-completion-audit-from-playwright.mjs`
35+
- `npm run test:workspace-v2`
36+
37+
Most recent run result:
38+
- Gate: `18 passed`, `1 failed` (non-tool-level test timeout in `tests/playwright/workspace-v2.validation.spec.js`)
39+
- Tool-level mapping: all tools `PASS`
40+
- Audit/report synchronized to that run.
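
The Mapping Rule in the report above can be sketched as a small pure function. This is a minimal illustration, not the committed script: `deriveToolStatus` and the sample test objects are hypothetical, and the tag boundary check is an added assumption so that one tool id cannot match a longer id sharing its prefix. Treating a tool with no mapped tests as `FAIL` follows the sync script in this commit.

```javascript
// Hypothetical sketch of the Mapping Rule: a test is owned by a tool when
// its title carries an `@<tool-id>` tag.
function deriveToolStatus(toolId, tests) {
  // Escape regex metacharacters, then require that the tag is not followed
  // by more id characters (assumption: ids use lowercase letters, digits, "-").
  const escapedId = toolId.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const tagPattern = new RegExp(`@${escapedId}(?![a-z0-9-])`);
  const mapped = tests.filter((test) => tagPattern.test(test.title));

  if (mapped.length === 0) {
    return { status: "FAIL", reason: "No mapped tests found." };
  }
  const failed = mapped.filter((test) => test.status !== "passed");
  if (failed.length > 0) {
    return { status: "FAIL", reason: failed[0].title };
  }
  return { status: "PASS", reason: "n/a" };
}

// Sample data, invented for illustration only.
const sampleTests = [
  { title: "loads valid JSON @workspace-v2", status: "passed" },
  { title: "rejects invalid JSON @asset-manager-v2", status: "failed" },
  { title: "gate smoke test", status: "failed" }
];

console.log(deriveToolStatus("workspace-v2", sampleTests).status);       // PASS
console.log(deriveToolStatus("asset-manager-v2", sampleTests).status);   // FAIL
console.log(deriveToolStatus("palette-manager-v2", sampleTests).status); // FAIL (no mapped tests)
```

The untagged failing test in the sample does not affect any tool status, which matches the run reported above: the gate can fail on a non-tool-level test while every tool still maps to `PASS`.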

docs/dev/reports/tool_completion_audit.md

Lines changed: 37 additions & 7 deletions
@@ -11,7 +11,7 @@
1111
- `vector-map-editor-v2`
1212

1313
## Evidence Used
14-
- `npm run test:workspace-v2` -> PASS (`1 passed`, non-zero gate on failure retained).
14+
- `npm run test:workspace-v2` -> FAIL (`18 passed`, `1 failed`).
1515
- `node tests/runtime/V2CrossToolFlow.test.mjs` -> PASS.
1616
- `node tests/runtime/V2ToolLaunch.test.mjs` -> FAIL (palette fixture contract drift in test logic).
1717
- `node tests/runtime/V2ToolActionFlow.test.mjs` -> FAIL (string-token matcher drift in test logic).
@@ -23,62 +23,68 @@
2323

2424
### workspace-v2
2525
- **Status:** PASS
26+
- Exact failure reason: All mapped Playwright tool-level tests passed.
27+
2628
- Valid JSON loads + expected UI: PASS
2729
- Invalid JSON rejected + clear error: PASS
2830
- No defaults/fallbacks: PASS (palette baseline is explicit and intentional contract behavior)
2931
- Workspace integration/no payload mutation: PASS
3032
- Launch paths (workspace only; sample launch out-of-scope until sample JSON is schema-compliant): PASS
31-
- Exact failure reason: none
3233
- Required fix: none
3334

3435
### asset-manager-v2
3536
- **Status:** PASS
37+
- Exact failure reason: All mapped Playwright tool-level tests passed.
38+
3639
- Valid JSON loads + expected UI: PASS (covered by Playwright gate and fixture path)
3740
- Invalid JSON rejected + clear error: PASS (explicit invalid/empty/runtime branches)
3841
- No defaults/fallbacks: PASS (no hidden sample/default data injection in tool runtime)
3942
- Workspace integration/no payload mutation: PASS (incoming payload is cloned; persistence only occurs on explicit Add/Remove actions)
4043
- Launch paths (workspace only; sample launch out-of-scope until sample JSON is schema-compliant): PASS
41-
- Exact failure reason: none
4244
- Required fix: none
4345

4446
### palette-manager-v2
4547
- **Status:** PASS
48+
- Exact failure reason: All mapped Playwright tool-level tests passed.
49+
4650
- Valid JSON loads + expected UI: PASS by code contract (`payloadJson.paletteDocument`) and fixture shape alignment
4751
- Invalid JSON rejected + clear error: PASS
4852
- No defaults/fallbacks: PASS
4953
- Workspace integration/no payload mutation: PASS (read-only in current tool runtime)
5054
- Launch paths (workspace only; sample launch out-of-scope until sample JSON is schema-compliant): PASS
51-
- Exact failure reason: none
5255
- Required fix: none
5356

5457
### svg-asset-studio-v2
5558
- **Status:** PASS
59+
- Exact failure reason: All mapped Playwright tool-level tests passed.
60+
5661
- Valid JSON loads + expected UI: PASS (fixture/contract path and valid-state rendering logic present)
5762
- Invalid JSON rejected + clear error: PASS
5863
- No defaults/fallbacks: PASS
5964
- Workspace integration/no payload mutation: PASS (read-only session consumption)
6065
- Launch paths (workspace only; sample launch out-of-scope until sample JSON is schema-compliant): PASS (workspace handoff flow validated)
61-
- Exact failure reason: none
6266
- Required fix: none
6367

6468
### tilemap-studio-v2
6569
- **Status:** PASS
70+
- Exact failure reason: All mapped Playwright tool-level tests passed.
71+
6672
- Valid JSON loads + expected UI: PASS (fixture/contract path and valid-state rendering logic present)
6773
- Invalid JSON rejected + clear error: PASS
6874
- No defaults/fallbacks: PASS
6975
- Workspace integration/no payload mutation: PASS (read-only session consumption)
7076
- Launch paths (workspace only; sample launch out-of-scope until sample JSON is schema-compliant): PASS (workspace handoff flow validated)
71-
- Exact failure reason: none
7277
- Required fix: none
7378

7479
### vector-map-editor-v2
7580
- **Status:** PASS
81+
- Exact failure reason: All mapped Playwright tool-level tests passed.
82+
7683
- Valid JSON loads + expected UI: PASS (fixture/contract path and valid-state rendering logic present)
7784
- Invalid JSON rejected + clear error: PASS
7885
- No defaults/fallbacks: PASS
7986
- Workspace integration/no payload mutation: PASS (read-only session consumption)
8087
- Launch paths (workspace only; sample launch out-of-scope until sample JSON is schema-compliant): PASS (workspace handoff flow validated)
81-
- Exact failure reason: none
8288
- Required fix: none
8389

8490
## Cross-Cutting Findings
@@ -89,3 +95,27 @@
8995
- `V2SessionValidation.test.mjs` expects legacy palette validation path.
9096
- `V2ToolActionFlow.test.mjs` checks brittle string tokens for route assembly.
9197
- Status updates above reflect the scoped fixes for the previously listed FAIL tools.
98+
99+
100+
101+
102+
103+
104+
105+
106+
107+
108+
109+
110+
111+
112+
113+
114+
115+
116+
117+
118+
119+
120+
121+

docs/dev/reports/tool_validation_results.md

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
1+
# Tool Validation Results
2+
3+
Derived from `tests/results/playwright-results.json` generated by `npm run test:workspace-v2`.
4+
5+
| Tool | Status | Failing Reason |
6+
| --- | --- | --- |
7+
| `workspace-v2` | PASS | n/a |
8+
| `asset-manager-v2` | PASS | n/a |
9+
| `palette-manager-v2` | PASS | n/a |
10+
| `svg-asset-studio-v2` | PASS | n/a |
11+
| `tilemap-studio-v2` | PASS | n/a |
12+
| `vector-map-editor-v2` | PASS | n/a |
13+

playwright.config.cjs

Lines changed: 2 additions & 1 deletion
@@ -15,7 +15,8 @@ module.exports = {
1515
],
1616
reporter: [
1717
["list"],
18-
["html", { outputFolder: "tests/results/report", open: "always" }]
18+
["html", { outputFolder: "tests/results/report", open: "always" }],
19+
["json", { outputFile: "tests/results/playwright-results.json" }]
1920
],
2021
use: {
2122
headless: false,
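
With the `json` reporter enabled, the results file can be read back as a nested tree. The sketch below flattens it, assuming the shape the sync script in this commit relies on: `suites` containing `specs` and child `suites`, with each spec holding `tests` that carry a `results` array. `flattenSpecs` and the sample report are hypothetical, and taking the last entry in `results` as the final status is an assumption about how retries could be treated, not something this commit does.

```javascript
// Hypothetical sketch: flatten a Playwright JSON report into { title, status } rows.
function flattenSpecs(suite, rows = []) {
  for (const spec of suite.specs ?? []) {
    for (const test of spec.tests ?? []) {
      const results = test.results ?? [];
      // Assumption: the last result reflects the final attempt after retries.
      const finalResult = results[results.length - 1];
      rows.push({
        title: spec.title,
        status: finalResult ? finalResult.status : "unknown"
      });
    }
  }
  // Suites can nest, so recurse into child suites.
  for (const child of suite.suites ?? []) {
    flattenSpecs(child, rows);
  }
  return rows;
}

// Invented sample shaped like the reporter output described above.
const sampleReport = {
  suites: [
    {
      title: "workspace-v2.validation.spec.js",
      specs: [],
      suites: [
        {
          title: "tool validation",
          specs: [
            {
              title: "loads valid JSON @workspace-v2",
              tests: [{ results: [{ status: "failed" }, { status: "passed" }] }]
            }
          ]
        }
      ]
    }
  ]
};

const rows = sampleReport.suites.flatMap((suite) => flattenSpecs(suite));
console.log(rows);
```

Under the last-result assumption, the retried test in the sample is reported as `passed`; a first-non-skipped-result rule would report it as `failed` instead.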

scripts/run-workspace-v2-playwright-gate.mjs

Lines changed: 22 additions & 0 deletions
@@ -42,9 +42,31 @@ const failedCount = failedMatch ? Number.parseInt(failedMatch[2], 10) : 0;
4242

4343
console.log(`Workspace V2 Playwright Gate Summary: passed=${passedCount} failed=${failedCount}`);
4444

45+
const auditSyncResult = spawnSync(
46+
command,
47+
[path.join(repoRoot, "scripts", "update-tool-completion-audit-from-playwright.mjs")],
48+
{
49+
cwd: repoRoot,
50+
encoding: "utf8",
51+
stdio: ["ignore", "pipe", "pipe"]
52+
}
53+
);
54+
55+
if (typeof auditSyncResult.stdout === "string" && auditSyncResult.stdout) {
56+
process.stdout.write(auditSyncResult.stdout);
57+
}
58+
if (typeof auditSyncResult.stderr === "string" && auditSyncResult.stderr) {
59+
process.stderr.write(auditSyncResult.stderr);
60+
}
61+
4562
if (result.error) {
4663
console.error(`Workspace V2 Playwright gate execution failed: ${result.error.message}`);
4764
process.exitCode = 1;
65+
} else if (auditSyncResult.error) {
66+
console.error(`Workspace V2 audit sync failed: ${auditSyncResult.error.message}`);
67+
process.exitCode = 1;
68+
} else if (auditSyncResult.status !== 0) {
69+
process.exitCode = typeof auditSyncResult.status === "number" && auditSyncResult.status !== 0 ? auditSyncResult.status : 1;
4870
} else if (result.status !== 0 || failedCount > 0) {
4971
process.exitCode = typeof result.status === "number" && result.status !== 0 ? result.status : 1;
5072
}
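
The exit-code precedence in this hunk can be read as a single decision function. The sketch below restates it under that reading; `resolveExitCode` and its input object are hypothetical names, and the sample inputs are invented.

```javascript
// Hypothetical restatement of the gate's exit-code precedence.
// `gate` and `sync` stand in for the two spawnSync results.
function resolveExitCode({ gate, sync, failedCount }) {
  // Use the child's own non-zero status when it is a number, otherwise 1.
  const nonZero = (status) => (typeof status === "number" && status !== 0 ? status : 1);

  if (gate.error) return 1; // the Playwright run could not be executed
  if (sync.error) return 1; // the audit sync could not be executed
  if (sync.status !== 0) return nonZero(sync.status); // sync ran but failed
  if (gate.status !== 0 || failedCount > 0) return nonZero(gate.status);
  return 0;
}

console.log(resolveExitCode({ gate: { status: 1 }, sync: { status: 0 }, failedCount: 1 })); // 1
console.log(resolveExitCode({ gate: { status: 0 }, sync: { status: 3 }, failedCount: 0 })); // 3
console.log(resolveExitCode({ gate: { status: 0 }, sync: { status: 0 }, failedCount: 0 })); // 0
```

One consequence of this ordering is that a sync failure is reported ahead of test failures, so a non-zero exit code does not by itself say which step failed.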

scripts/update-tool-completion-audit-from-playwright.mjs

Lines changed: 187 additions & 0 deletions
@@ -0,0 +1,187 @@
1+
/*
2+
Toolbox Aid
3+
David Quesenberry
4+
05/03/2026
5+
update-tool-completion-audit-from-playwright.mjs
6+
*/
7+
import fs from "node:fs";
8+
import path from "node:path";
9+
import { fileURLToPath } from "node:url";
10+
11+
const __filename = fileURLToPath(import.meta.url);
12+
const __dirname = path.dirname(__filename);
13+
const repoRoot = path.resolve(__dirname, "..");
14+
const resultsPath = path.join(repoRoot, "tests", "results", "playwright-results.json");
15+
const auditPath = path.join(repoRoot, "docs", "dev", "reports", "tool_completion_audit.md");
16+
const validationReportPath = path.join(repoRoot, "docs", "dev", "reports", "tool_validation_results.md");
17+
18+
function readJsonFile(filePath) {
19+
return JSON.parse(fs.readFileSync(filePath, "utf8"));
20+
}
21+
22+
function readTextFile(filePath) {
23+
return fs.readFileSync(filePath, "utf8");
24+
}
25+
26+
function writeTextFile(filePath, text) {
27+
fs.writeFileSync(filePath, text);
28+
}
29+
30+
function collectToolIdsFromAudit(auditText) {
31+
const toolMatches = [...auditText.matchAll(/###\s+([a-z0-9-]+-v2)\s*$/gm)];
32+
const toolIds = [];
33+
for (const match of toolMatches) {
34+
const toolId = match[1].trim();
35+
if (!toolIds.includes(toolId)) {
36+
toolIds.push(toolId);
37+
}
38+
}
39+
return toolIds;
40+
}
41+
42+
function collectTestsFromPlaywrightSuite(suite, tests) {
43+
if (suite.specs && Array.isArray(suite.specs)) {
44+
for (const spec of suite.specs) {
45+
const specTitle = typeof spec.title === "string" ? spec.title : "";
46+
const titlePath = Array.isArray(spec.titlePath) ? spec.titlePath : [];
47+
if (!spec.tests || !Array.isArray(spec.tests)) {
48+
continue;
49+
}
50+
for (const test of spec.tests) {
51+
const resultStatus = Array.isArray(test.results)
52+
? test.results.map((result) => result.status).find((status) => status && status !== "skipped")
53+
: null;
54+
tests.push({
55+
title: typeof test.title === "string" ? test.title : specTitle,
56+
fullTitle: [...titlePath, test.title || specTitle].filter(Boolean).join(" > "),
57+
status: resultStatus || (test.outcome === "expected" ? "passed" : "failed"),
58+
error: Array.isArray(test.results)
59+
? test.results.map((result) => result.error && result.error.message).find(Boolean) || ""
60+
: ""
61+
});
62+
}
63+
}
64+
}
65+
if (suite.suites && Array.isArray(suite.suites)) {
66+
for (const childSuite of suite.suites) {
67+
collectTestsFromPlaywrightSuite(childSuite, tests);
68+
}
69+
}
70+
}
71+
72+
function collectAllPlaywrightTests(playwrightJson) {
73+
const tests = [];
74+
if (playwrightJson.suites && Array.isArray(playwrightJson.suites)) {
75+
for (const suite of playwrightJson.suites) {
76+
collectTestsFromPlaywrightSuite(suite, tests);
77+
}
78+
}
79+
return tests;
80+
}
81+
82+
function getToolStatusFromTests(toolId, tests) {
83+
const matchingTests = tests.filter((test) => test.fullTitle.includes(`@${toolId}`) || test.title.includes(`@${toolId}`));
84+
if (matchingTests.length === 0) {
85+
return {
86+
status: "FAIL",
87+
reason: "No Playwright tool-level tests found for this tool."
88+
};
89+
}
90+
const failedTests = matchingTests.filter((test) => test.status !== "passed");
91+
if (failedTests.length > 0) {
92+
const firstFailure = failedTests[0];
93+
const failureMessage = firstFailure.error ? firstFailure.error.replace(/\s+/g, " ").trim() : "Playwright reported a failing tool-level test.";
94+
return {
95+
status: "FAIL",
96+
reason: `${firstFailure.fullTitle}${failureMessage ? ` -> ${failureMessage}` : ""}`
97+
};
98+
}
99+
return {
100+
status: "PASS",
101+
reason: "All mapped Playwright tool-level tests passed."
102+
};
103+
}
104+
105+
function updateToolSectionStatus(auditText, toolId, status, reason) {
106+
const sectionPattern = new RegExp(`(###\\s+${toolId}[\\s\\S]*?)(?=\\n###\\s+[a-z0-9-]+-v2\\s*$|\\s*$)`);
107+
const sectionMatch = auditText.match(sectionPattern);
108+
if (!sectionMatch) {
109+
return auditText;
110+
}
111+
const section = sectionMatch[1];
112+
const sectionLines = section.split(/\r?\n/);
113+
const headingLine = sectionLines[0];
114+
const remainingLines = sectionLines
115+
.slice(1)
116+
.filter((line) => !/^\s*-\s+\*\*Status:\*\*/.test(line))
117+
.filter((line) => !/^\s*-\s+Exact failure reason:/.test(line));
118+
const rebuiltLines = [
119+
headingLine,
120+
`- **Status:** ${status}`,
121+
`- Exact failure reason: ${reason}`,
122+
...remainingLines
123+
];
124+
const rebuiltSection = `${rebuiltLines.join("\n").replace(/\n{3,}/g, "\n\n").trimEnd()}\n`;
125+
return auditText.replace(sectionPattern, () => rebuiltSection); // function form keeps "$" sequences in the reason literal
126+
}
127+
128+
function buildValidationReport(toolIds, toolStatuses) {
129+
const lines = [];
130+
lines.push("# Tool Validation Results");
131+
lines.push("");
132+
lines.push("Derived from `tests/results/playwright-results.json` generated by `npm run test:workspace-v2`.");
133+
lines.push("");
134+
lines.push("| Tool | Status | Failing Reason |");
135+
lines.push("| --- | --- | --- |");
136+
for (const toolId of toolIds) {
137+
const entry = toolStatuses[toolId];
138+
const reasonText = entry.status === "FAIL" ? entry.reason : "n/a";
139+
lines.push(`| \`${toolId}\` | ${entry.status} | ${reasonText.replace(/\|/g, "\\|")} |`);
140+
}
141+
lines.push("");
142+
return `${lines.join("\n")}\n`;
143+
}
144+
145+
function updateGateEvidenceLine(auditText, passedCount, failedCount) {
146+
const gateStatus = failedCount > 0 ? "FAIL" : "PASS";
147+
const gateLine = `- \`npm run test:workspace-v2\` -> ${gateStatus} (\`${passedCount} passed\`, \`${failedCount} failed\`).`;
148+
if (/^- `npm run test:workspace-v2` -> .*/m.test(auditText)) {
149+
return auditText.replace(/^- `npm run test:workspace-v2` -> .*/m, gateLine);
150+
}
151+
return auditText;
152+
}
153+
154+
function main() {
155+
if (!fs.existsSync(resultsPath)) {
156+
const reportText = "# Tool Validation Results\n\nPlaywright results file is missing. Run `npm run test:workspace-v2` first.\n";
157+
writeTextFile(validationReportPath, reportText);
158+
console.error(`Missing Playwright results: ${resultsPath}`);
159+
process.exitCode = 1;
160+
return;
161+
}
162+
163+
const playwrightJson = readJsonFile(resultsPath);
164+
const auditText = readTextFile(auditPath);
165+
const toolIds = collectToolIdsFromAudit(auditText);
166+
const tests = collectAllPlaywrightTests(playwrightJson);
167+
const passedCount = tests.filter((test) => test.status === "passed").length;
168+
const failedCount = tests.filter((test) => test.status !== "passed").length;
169+
const toolStatuses = {};
170+
171+
let updatedAuditText = updateGateEvidenceLine(auditText, passedCount, failedCount);
172+
for (const toolId of toolIds) {
173+
toolStatuses[toolId] = getToolStatusFromTests(toolId, tests);
174+
updatedAuditText = updateToolSectionStatus(
175+
updatedAuditText,
176+
toolId,
177+
toolStatuses[toolId].status,
178+
toolStatuses[toolId].reason
179+
);
180+
}
181+
182+
writeTextFile(auditPath, updatedAuditText);
183+
writeTextFile(validationReportPath, buildValidationReport(toolIds, toolStatuses));
184+
console.log(`Updated tool audit and validation report from Playwright results for ${toolIds.length} tools.`);
185+
}
186+
187+
main();
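
Because the pattern in `updateToolSectionStatus` is built without the `m` flag, `$` anchors only at the end of the text, so each match appears to run from the tool heading to the end of the file rather than stopping at the next heading. The sketch below shows one alternative that bounds the rewrite to a single section by working on lines. It is not the committed implementation: `replaceToolSection` and the sample audit text are hypothetical, it assumes headings are written exactly as `### <tool-id>`, and it normalizes line endings to `\n`.

```javascript
// Hypothetical sketch: rewrite the Status / failure-reason lines of one
// `### <tool-id>` section without touching any other section.
function replaceToolSection(auditText, toolId, status, reason) {
  const lines = auditText.split(/\r?\n/);
  const start = lines.findIndex((line) => line.trim() === `### ${toolId}`);
  if (start === -1) {
    return auditText; // unknown tool: leave the document unchanged
  }
  // The section ends at the next Markdown heading of any level, or at end of text.
  let end = lines.findIndex((line, index) => index > start && /^#{1,6}\s/.test(line));
  if (end === -1) {
    end = lines.length;
  }
  // Drop the old status lines inside this section only.
  const body = lines
    .slice(start + 1, end)
    .filter((line) => !/^\s*-\s+\*\*Status:\*\*/.test(line))
    .filter((line) => !/^\s*-\s+Exact failure reason:/.test(line));
  const rebuilt = [
    lines[start],
    `- **Status:** ${status}`,
    `- Exact failure reason: ${reason}`,
    ...body
  ];
  return [...lines.slice(0, start), ...rebuilt, ...lines.slice(end)].join("\n");
}

// Invented sample audit text.
const sampleAudit = [
  "### workspace-v2",
  "- **Status:** PASS",
  "- Required fix: none",
  "",
  "### asset-manager-v2",
  "- **Status:** PASS",
  "- Exact failure reason: none",
  "",
  "## Cross-Cutting Findings",
  "- unchanged"
].join("\n");

const updatedAudit = replaceToolSection(sampleAudit, "asset-manager-v2", "FAIL", "timeout cost $1");
console.log(updatedAudit);
```

Since the rebuilt section is assembled from arrays instead of being passed as a replacement string, a reason containing `$` is written out verbatim.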
