test(academic): extract auto-iterate loop state machine to bin/ + tests #18
Merged
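This answers the question "how do you test LLM-driven orchestration logic?". Once the Phase 5b loop is split into layers, about 80% of it is deterministic glue (Functional Core, Imperative Shell). That part is extracted so it can be tested exhaustively, and the untested surface shrinks to the single Edit performed by apply_fixes (eval territory).

Added:
- bin/pai-iterate-decide: the main loop's transition function (a pure state machine, JSON in/out): halt (converged/max-rounds), fixes still applied on the final round, mode parity, rotation when the last 3 rounds on the same focus were CONVERGED, pool wrap-around, max-rounds clamped to [1,30]. test/pai-iterate-decide.test.mjs (17) covers the transitions exhaustively.
- bin/pai-iter-commit: per-round checkpoint commit plus an empty-round guard (no empty iter-N commit is left behind). test/pai-iter-commit.bats (9) asserts on the commit graph of a fixture repo.
- academic SKILL.md Phase 5b rewritten: decisions are delegated to the two scripts, and the only remaining LLM step is apply_fixes.
- test/README.md: added a section on the "deterministic core + model seam" testing philosophy.
- CI/run.sh: shellcheck coverage extended, node runs all *.test.mjs. bats 50→59, node 8→25.

bump 2.15.0 → 2.16.0 (plugin.json + marketplace.json).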
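Summary
This PR answers the question "how do you test LLM-driven orchestration logic such as multi-round apply-fix, focus-rotation, and git-commit-per-round?". The answer: once the loop is split into layers, most of it is not LLM logic. About 80% of the Phase 5b --auto-iterate main loop is deterministic glue (halt decisions, mode alternation, the focus-rotation heuristic, commits); only the single Edit made by apply_fixes is genuinely non-deterministic. This PR extracts the deterministic core into two shipped scripts plus 26 new tests (Functional Core, Imperative Shell), so the untested surface shrinks to apply_fixes alone (eval territory).

Added
- bin/pai-iterate-decide (node, JSON in/out): {round, maxRounds, verdict, convergeOn, currentFocus, focusHistory, focusPool} → {halt, reason, applyFixes, nextRound, nextMode, nextFocus, rotated}
- bin/pai-iter-commit (bash): iter-N: commit message plus an empty-round guard (no empty commit when nothing changed)

The Phase 5b main loop in academic SKILL.md is rewritten to delegate to these two scripts. What remains of the loop is "run one ensemble round → parse verdict → ask the decider → (if applyFixes) apply + commit → advance as the decider says", and the only non-deterministic step is apply_fixes.

Semantics covered by the tests (enumerated from the original pseudo-code in SKILL.md)
- verdict==convergeOn → converged (no fix applied); nextRound > maxRounds → max-rounds (this round's fix is still applied: the original loop only breaks on converge, and this subtle behavior is now locked in by tests)
- CONVERGED ≠ PERMANENT_CONVERGENCE (the former does not halt by default); a custom --converge-on takes effect
- --max-rounds clamp
- changes included via add -A, empty rounds skipped with no empty commit left behind, round validation (0 / leading zero / non-numeric / 11-digit overflow), non-repo → exit 1

Testing philosophy (written into test/README.md)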
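As an illustration of the "deterministic core" idea, the halt, clamp, and rotation rules listed in this PR could be expressed as a pure function along the following lines. This is a minimal sketch and not the shipped bin/pai-iterate-decide: the shape of focusHistory (an array of {focus, verdict} entries that already includes the current round), the null values returned on halt, and the "odd"/"even" mode labels are all assumptions made for the example.

```javascript
// Hypothetical sketch of a pure transition function. It is NOT the shipped
// bin/pai-iterate-decide; field shapes and mode labels are assumptions.
const MIN_ROUNDS = 1;
const MAX_ROUNDS = 30;

// Clamp --max-rounds into [1, 30], the range stated in the PR.
// Assumes n is already a number.
function clampMaxRounds(n) {
  return Math.min(MAX_ROUNDS, Math.max(MIN_ROUNDS, n));
}

// Assumption: focusHistory is [{ focus, verdict }, ...] and already
// includes the current round.
function shouldRotate(focusHistory, currentFocus) {
  const lastThree = focusHistory.slice(-3);
  return (
    lastThree.length === 3 &&
    lastThree.every((h) => h.focus === currentFocus && h.verdict === "CONVERGED")
  );
}

function decide(state) {
  const {
    round,
    verdict,
    currentFocus,
    convergeOn = "PERMANENT_CONVERGENCE", // CONVERGED alone does not halt by default
    focusHistory = [],
    focusPool = [],
  } = state;
  const maxRounds = clampMaxRounds(state.maxRounds);
  const stop = (reason, applyFixes) => ({
    halt: true, reason, applyFixes,
    nextRound: null, nextMode: null, nextFocus: null, rotated: false,
  });

  // Converged: halt without applying fixes.
  if (verdict === convergeOn) return stop("converged", false);

  // Out of rounds: halt, but this round's fixes are still applied.
  const nextRound = round + 1;
  if (nextRound > maxRounds) return stop("max-rounds", true);

  // Rotate to the next focus in the pool, wrapping around at the end.
  const rotated = focusPool.length > 0 && shouldRotate(focusHistory, currentFocus);
  const nextFocus = rotated
    ? focusPool[(focusPool.indexOf(currentFocus) + 1) % focusPool.length]
    : currentFocus;

  return {
    halt: false,
    reason: null,
    applyFixes: true,
    nextRound,
    // Placeholder labels: the PR only says the mode follows round parity.
    nextMode: nextRound % 2 === 1 ? "odd" : "even",
    nextFocus,
    rotated,
  };
}
```

Because the function has no I/O and no hidden state, each transition can be checked with a table of input/output pairs, which is the property that makes exhaustive testing practical.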
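Verification
Version
2.15.0 → 2.16.0 (plugin.json and marketplace.json kept in sync). CHANGELOG updated with [2.16.0].
Coverage status (after this PR)
- pai-build-diff
- ensemble-workflow.js
- pai-parse-lens-csv
- pai-parse-verdict
- pai-iterate-decide
- pai-iter-commit

The only part still without unit tests is apply_fixes (a real LLM Edit), which belongs to eval territory.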
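To show why apply_fixes can be the only remaining seam, the rewritten loop (run one ensemble round, parse the verdict, ask the decider, apply and commit if applyFixes is set, then advance) might be sketched as a driver whose collaborators are all injected. This is an illustrative sketch only: the function name runAutoIterate, the deps and start parameter shapes, and the synchronous style are assumptions, and the actual loop lives in SKILL.md rather than in a script like this.

```javascript
// Hypothetical sketch of the loop shell. Every collaborator is injected, so
// the model call (applyFixes) is the only step a unit test has to fake.
// Kept synchronous for brevity; a real shell would likely await the model call.
function runAutoIterate(deps, start) {
  const { runEnsemble, parseVerdict, decide, applyFixes, commitRound } = deps;
  let { round, mode, focus } = start;
  const focusHistory = [];

  for (;;) {
    const report = runEnsemble({ round, mode, focus });
    const verdict = parseVerdict(report);
    focusHistory.push({ focus, verdict });

    const decision = decide({
      ...start.config, // e.g. maxRounds, convergeOn, focusPool
      round,
      verdict,
      currentFocus: focus,
      focusHistory: [...focusHistory],
    });

    // Fixes are applied before the halt check, so a max-rounds halt
    // still applies and commits the final round's fixes.
    if (decision.applyFixes) {
      applyFixes(report, round); // the model seam: non-deterministic in production
      commitRound(round);        // per-round checkpoint commit
    }
    if (decision.halt) return { reason: decision.reason, rounds: round };

    round = decision.nextRound;
    mode = decision.nextMode;
    focus = decision.nextFocus;
  }
}
```

With this shape, a test can pass a scripted sequence of verdicts and a recording stub for applyFixes, then assert on the order of calls without invoking a model.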