test(ensemble): extract compose CSV + academic verdict parsers to bin/ + bats #17
Merged
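Extract compose's --lens-file CSV parsing and academic's --auto-iterate verdict tag parsing
from inline SKILL.md snippets into shipped scripts (single source of truth) with bats coverage.
Dogfooding during the extraction caught one real bug in each.
Added:
- bin/pai-parse-lens-csv (Python csv module) + tests (14): focus values containing commas/quotes/newlines,
needsSrt variants, empty fields, BOM, CRLF, missing file/missing column.
- bin/pai-parse-verdict (bash grep, takes the last match) + tests (11): last-match,
{N} placeholder, strict uppercase, no tag found, stdin/file.
- compose/academic SKILL.md now call the scripts; CI/run.sh adds py_compile and extends shellcheck coverage.
bats 25 → 50.
Fixed (extraction dogfood):
- compose CSV silently dropped rows on BOM: utf-8 → utf-8-sig (a negative control confirms the old code outputs [] for a BOM CSV).
- academic verdict first-match false convergence: example tags in echoed instructions were picked up
by first-match → switched to last-match (the verdict is at the very end of the review).
bump 2.14.1 → 2.15.0 (plugin.json + marketplace.json).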
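Summary
Extracts the last two parsing snippets that still lived inside SKILL.md markdown (compose's --lens-file CSV parsing and academic's --auto-iterate Codex verdict tag parsing) into shipped scripts with bats coverage. This continues the "inline → single source of truth + automated tests" structure of the previous few PRs. Dogfooding during the extraction caught one real bug in each.
Added
- bin/pai-parse-lens-csv (python)
- bin/pai-parse-verdict (bash)
- test/pai-parse-lens-csv.bats (14)
- test/pai-parse-verdict.bats (11): {N} placeholder, strict uppercase, no tag found, stdin/file.
- compose / academic SKILL.md now call the scripts; CI and test/run.sh add py_compile (python lint) and extend shellcheck to pai-parse-verdict.
- Total bats count 25 → 50.
Two real bugs caught by dogfooding the extraction
1. BOM in the compose CSV silently drops rows (fail-silent): the original inline snippet used encoding='utf-8'. When a CSV saved by Excel/Windows carries a UTF-8 BOM, the first header becomes \ufeffkey, so r.get('key') returns None for every row, "if not k: continue" fires, and the whole batch of lenses is silently discarded. The user's CSV is effectively ignored (the harness then reports "no active lenses", so it at least fails closed instead of falsely passing, but the user has no indication that the CSV had no effect). Changed to utf-8-sig.
2. First-match false convergence in the academic verdict (fail-open): the original regex took the first match. Codex output may, however, begin by echoing the example tags from the "output format" instructions, where <verdict>CONVERGED</verdict> / <verdict>PERMANENT_CONVERGENCE</verdict> appear literally. First-match picks up those examples, producing a false convergence that halts the autonomous loop early. The contract explicitly places the verdict "at the very end", so the parser now takes the last match.
Both bugs are now locked in as bats regressions (CSV: "BOM does not drop rows"; verdict: "echoed examples first, real verdict last → take the last one").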
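For illustration, the two failure modes described above can be reproduced in a few lines of Python. This is a minimal sketch under stated assumptions, not the shipped scripts: the helper names (parse_lens_rows, first_verdict, last_verdict), the sample CSV, the regex, and the NOT_CONVERGED tag are all hypothetical, and the real bin/pai-parse-verdict is bash grep rather than Python.

```python
import csv
import io
import re

# Hypothetical CSV as Excel on Windows might save it: UTF-8 with a leading BOM.
# Only the `key` column is named in the PR text; `focus` and the row are made up.
CSV_BYTES = 'key,focus\nsecurity,"auth, tokens"\n'.encode("utf-8-sig")


def parse_lens_rows(raw, encoding):
    """Decode CSV bytes and keep the rows that have a non-empty `key`."""
    reader = csv.DictReader(io.StringIO(raw.decode(encoding), newline=""))
    rows = []
    for r in reader:
        k = (r.get("key") or "").strip()
        if not k:
            continue  # the skip rule that silently dropped every row
        rows.append({"key": k, "focus": r.get("focus") or ""})
    return rows


# Assumed tag shape: uppercase letters and underscores only (strict uppercase).
VERDICT_RE = re.compile(r"<verdict>([A-Z_]+)</verdict>")

# Hypothetical review: the instructions are echoed first, the real verdict is last.
# NOT_CONVERGED is an invented tag used only to make the difference visible.
REVIEW = (
    "Reply with <verdict>CONVERGED</verdict> or "
    "<verdict>PERMANENT_CONVERGENCE</verdict> at the very end.\n"
    "...review body...\n"
    "<verdict>NOT_CONVERGED</verdict>\n"
)


def first_verdict(review):
    """Old behaviour: take the first tag, which may be an echoed example."""
    m = VERDICT_RE.search(review)
    return m.group(1) if m else None


def last_verdict(review):
    """Fixed behaviour: take the last tag, where the contract puts the verdict."""
    matches = VERDICT_RE.findall(review)
    return matches[-1] if matches else None


print(parse_lens_rows(CSV_BYTES, "utf-8"))      # [] : header is '\ufeffkey'
print(parse_lens_rows(CSV_BYTES, "utf-8-sig"))  # BOM stripped, row kept
print(first_verdict(REVIEW))                    # CONVERGED (false convergence)
print(last_verdict(REVIEW))                     # NOT_CONVERGED
```

With plain utf-8 the BOM stays attached to the first header, so the lookup for key misses on every row; utf-8-sig strips it on decode and is harmless for files without a BOM. Taking the last match relies on the contract that the verdict is the final tag in the review.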
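Verification
Version
2.14.1 → 2.15.0 (plugin.json and marketplace.json kept in sync). CHANGELOG updated with [2.15.0].
Coverage status (after this PR)
The four most bug-prone non-LLM logic surfaces now all have automated regressions. What remains (multi-round apply-fix, focus-rotation, git-commit-per-round) is LLM-driven orchestration and is not suited to unit testing.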