fix(specialist): cap max specialist nesting depth#491
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
WalkthroughAdds a hard maximum specialist nesting depth in ChangesDepth limit enforcement
Estimated code review effort: 2 (Simple) | ~10 minutes Possibly related issues
Possibly related PRs
Suggested reviewers: 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Vasanthdev2004 asked for this PR to be split since it bundled several independent static-analysis fixes. Remove the specialist depth cap, bash output OOM cap, and persistence error logging: each now has its own PR (Gitlawb#491, Gitlawb#492, Gitlawb#493). Also drop the "cat" addition to the Windows POSIX-utility detection in shell_runtime.go. PR Gitlawb#476 already covers cat detection comprehensively under the MSYS/sandbox angle, so keeping it here would duplicate that work. What remains: the Windows cmd.exe quoting-guidance rewrite, the Windows-specific interactive-command suggestions (steering away from Unix head/tail/ps toward native or PowerShell alternatives), the clipboard PowerShell -Command fix, and the CI test-slack change in exec_command_test.go. Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
Vasanthdev2004
left a comment
There was a problem hiding this comment.
Clean little guard, and it's the specialist-depth split from #468 done exactly as asked. Cap's at 8, the check fires before any specialist work starts, and the comparison is right — it rejects only past the cap, so legit nesting up to it still works. And the cap is actually reachable: CurrentDepth propagates through --depth, so a runaway chain does hit it rather than this being dead code. Test covers it, CodeRabbit's clean. Closes #470. Approving.
jatmn
left a comment
There was a problem hiding this comment.
I found an issue that needs to be addressed before this is ready.
Findings
-
[P2] Stop launching the over-cap specialist child
internal/specialist/exec.go:222
The new guard rejects only when the parent run'sCurrentDepthis already greater thanmaxSpecialistDepth, but the child process is launched with--depth CurrentDepth+1a few lines later. A specialist currently at depth 8 therefore still passes this check and starts another specialist session at depth 9, so work begins beyond the advertised maximum before the next Task call is rejected. Please reject when the next child would exceed the cap, and cover that boundary for both fresh and resumed specialist runs. -
[P2] Demonstrate the recursive Task path before closing #470
internal/cli/app.go:984
The PR body says this closes #470 because "a chain of specialists recursively spawning further specialists via--depth" can recurse indefinitely, but the normal child path appears to suppress that chain:BuildArgs/BuildResumeArgsalways launch the child with--tag specialist, andshouldRegisterExecSpecialistToolsreturns false for that tag, with an existing--list-toolstest asserting that a specialist child does not get the Task tool even under--auto high. Please either add evidence/tests for the reachable recursive Task path this cap is meant to stop, or update the PR/issue-closing claim so #470 is not closed by a guard that does not cover the reported behavior.
Addresses jatmn's review on PR Gitlawb#491: - Executor.Run checked `CurrentDepth > maxSpecialistDepth`, but this call is what launches the child at CurrentDepth+1. A parent already AT the cap passed the check and started a child one level past it, only getting rejected on that child's own next call. Change to `>=` so the guard fires before the over-cap child is launched. - Add TestRunRejectsDepthAtMax to cover the boundary explicitly.
jatmn
left a comment
There was a problem hiding this comment.
Thanks for the update. I rechecked the previously discussed paths and found one issue that still needs to be addressed.
Findings
-
[P2] Do not auto-close #470 from this defense-in-depth guard
PR description
The code now fixes the off-by-one by rejectingCurrentDepth >= maxSpecialistDepth, so the first review item is addressed. The PR body also now agrees with the second review item that today's normal specialist/swarm child path cannot recursively reach Task because child sessions are launched with--tag specialistandshouldRegisterExecSpecialistToolssuppresses Task for that tag. However, the body still saysCloses #470, which would close the issue describing an actually recursive Task chain even though the PR now says this is only a future-proof guard and not an active-exploit fix. Please complete the prior review request by removing the closing keyword from this PR, or add concrete evidence/tests for a currently reachable recursive path that this cap actually stops. -
[P3] Cover the resume boundary that was requested
internal/specialist/exec_test.go:348
The previous review asked for the over-cap boundary to be covered for both fresh and resumed specialist runs.TestRunRejectsDepthAtMaxdoes verify the shared pre-branch guard today, but it only passes a plain prompt and never exercises theResumeparameters, so a future refactor could move the guard into the fresh path and still leave the requested resume boundary uncovered. Please add the explicit resume-at-cap case as well, or otherwise make the test table prove both fresh and resume calls reject before launching a child.
jatmn
left a comment
There was a problem hiding this comment.
Thanks for the update. I rechecked the previously discussed paths and found one issue that still needs to be addressed.
Findings
- [P2] Remove the closing keyword from the commit history too
548cf478
The PR body now correctly says this is only related to #470 and is not closing it, but the first commit in this PR still containsCloses #470. That leaves the same issue-closing risk in the branch metadata: if this is merged in a way that preserves the original commit message, #470 can still be closed even though the current PR description says this defense-in-depth guard is not an active fix for the recursive Task path. Please complete the prior review request by rewriting/removing that closing keyword from the PR's commits, or add concrete evidence/tests for a currently reachable recursive path that this cap actually stops.
A Task specialist can recursively spawn further specialists via --depth. Without a ceiling, a runaway or malicious chain of Task calls can recurse indefinitely, exhausting process and memory resources. Reject any run whose depth exceeds a hard cap of 8. Related to Gitlawb#470, not closing it: today's normal specialist/swarm path cannot reach this recursively, since child sessions are launched with --tag specialist and shouldRegisterExecSpecialistTools suppresses the Task tool for that tag (see PR description for details). This is a defense-in-depth guard, not a fix for an active exploit. Split out of Gitlawb#468 at reviewer request. Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
Addresses jatmn's review on PR Gitlawb#491: - Executor.Run checked `CurrentDepth > maxSpecialistDepth`, but this call is what launches the child at CurrentDepth+1. A parent already AT the cap passed the check and started a child one level past it, only getting rejected on that child's own next call. Change to `>=` so the guard fires before the over-cap child is launched. - Add TestRunRejectsDepthAtMax to cover the boundary explicitly.
The pre-branch depth guard in Run sits before the fresh/resume split, but the boundary test only exercised the fresh path, leaving a future refactor free to move the guard without any test catching a resume regression.
087e3c9 to
c89032f
Compare
jatmn
left a comment
There was a problem hiding this comment.
Thanks for the update. I rechecked the previously discussed paths and do not see any remaining actionable issues from my side.
Summary
Adds a hard cap on specialist (Task tool) nesting depth, as a defense-in-depth safeguard.
Related to #470. Not closing it: as explained below, today's normal specialist/swarm path can't actually reach this recursively, so this cap is a future-proof guard rather than a fix for an active exploit of the recursive chain #470 describes.
Changes
maxSpecialistDepth = 8ininternal/specialist/exec.go.Executor.Runnow rejects a call whoseCurrentDepthis already at or past the cap with a clear error, before any specialist work starts. (Fixed from an earlier off-by-one: the check must trigger atCurrentDepth >= maxSpecialistDepth, not>, since this call is what launches the child atCurrentDepth+1— checking only>let a parent already at the cap start one more level before the guard caught it.)Why this is defense-in-depth, not an active-exploit fix
Per jatmn's review: today, a specialist (or swarm member) can't actually reach this code path recursively. Every specialist/swarm child is launched with
--tag specialist(internal/specialist/exec.go, the only place--tagis set on a spawned child), andshouldRegisterExecSpecialistTools(internal/cli/app.go) refuses to register the Task tool whenevertag == "specialist", regardless of--auto high— already covered by the existingexec_test.gocase asserting a specialist child does not get the Task tool. So a specialist cannot itself call Task to spawn a grandchild, andCurrentDepthcan't structurally exceed 1 through today's normal flow.The cap is still worth having: it's cheap, matches the issue's suggested fix (hard
MaxDepthwith a clear error), and protects against a future change to that tag-gating logic silently reopening the recursive path. It is not closing an actively exploitable bug today.Test plan
TestRunRejectsDepthExceedingMaxandTestRunRejectsDepthAtMaxininternal/specialist/exec_test.go, covering both the already-past-cap case and the exact-boundary case the off-by-one fix addresses.TestRunRejectsDepthAtMaxis now a table covering both a fresh call and aResumecall, since the guard sits before the fresh/resume split inRun.go build ./...go vet ./internal/specialist/...go test ./internal/specialist/... -count=1(all pass)This was split out of #468 at reviewer request (Vasanthdev2004 asked for the bundled fixes to be split into independent PRs).
🤖 Generated with Claude Code
Summary by CodeRabbit