feat(sdk): thread reasoningEffort through agent spawning pipeline by ahhlun · Pull Request #1148 · bradygaster/squad

ahhlun · 2026-05-22T00:03:19Z

Summary

Threads reasoningEffort through the full agent spawning pipeline, enabling per-agent configuration of reasoning effort levels (low, medium, high, xhigh).

Problem

SquadSessionConfig already has a reasoningEffort?: SquadReasoningEffort field that passes through to the Copilot SDK, but Squad's spawning pipeline never sets it. Users cannot configure reasoning effort per-agent through charter files, config, or SDK builders.

Solution

Thread reasoningEffort through all 3 configuration layers and both spawning paths, mirroring the existing pattern for model:

Configuration Layers

Charter markdown — Parse **Reasoning Effort:** from ## Model section in charter.md
Config persistence — defaultReasoningEffort and agentReasoningEffortOverrides in .squad/config.json
SDK builders — reasoningEffort on defineAgent() and defineDefaults()

Spawning Paths

Lifecycle manager — SpawnAgentOptions.reasoningEffortOverride → SquadSessionConfig.reasoningEffort
Fan-out pipeline — AgentSpawnConfig.reasoningEffortOverride → createSession({reasoningEffort})

Resolution

New resolveReasoningEffort() with layered priority hierarchy matching model resolution:

Layer 0a: Per-agent config override
Layer 0b: Global config default
Layer 1: Spawn-time override
Layer 2: Charter preference
Layer 3: Default (undefined — let SDK/API decide)

The value auto is a sentinel meaning "let the downstream SDK/API decide" and resolves to undefined.

Effort Clamping

New clampReasoningEffort() caps requested effort to the model's maximum supported level. This prevents API errors when a user requests e.g. xhigh on a model that only supports up to high.

Behavior:

xhigh on a model with [low,medium,high] → high (clamped)
xhigh on a model with no effort support → undefined (stripped)
max and xhigh treated as equivalent rank
resolveReasoningEffort() accepts optional supportedEfforts from listModels() and auto-clamps

Usage Example

In charter.md:

## Model

- **Preferred:** claude-opus-4.7-1m-internal
- **Reasoning Effort:** xhigh

In .squad/config.json:

{
  "defaultReasoningEffort": "high",
  "agentReasoningEffortOverrides": {
    "fenster": "xhigh"
  }
}

In squad.config.ts:

defineAgent({
  name: 'fenster',
  role: 'Core Dev',
  model: 'claude-opus-4.7-1m-internal',
  reasoningEffort: 'xhigh',
});

Files Changed (13)

File	Change
`packages/squad-sdk/src/agents/charter-compiler.ts`	Parse `Reasoning Effort:`, add to `ParsedCharter`, `CharterConfigOverrides`, `CompiledCharter`
`packages/squad-sdk/src/agents/index.ts`	Add `reasoningEffort` to `AgentCharter` interface, wire into compile methods
`packages/squad-sdk/src/agents/lifecycle.ts`	Add `reasoningEffortOverride` to `SpawnAgentOptions`, pass to `SquadSessionConfig`
`packages/squad-sdk/src/coordinator/fan-out.ts`	Add `reasoningEffortOverride` to `AgentSpawnConfig`, optional `resolveReasoningEffort` on `FanOutDependencies`
`packages/squad-sdk/src/config/models.ts`	Add `read/writeReasoningEffort()`, `resolveReasoningEffort()`, `clampReasoningEffort()`
`packages/squad-sdk/src/config/schema.ts`	Add `reasoningEffort` to `AgentConfig`, `defaultReasoningEffort` to `ModelConfig`
`packages/squad-sdk/src/builders/types.ts`	Add `reasoningEffort` to `AgentDefinition` and `DefaultsDefinition`
`packages/squad-sdk/src/builders/index.ts`	Validate `reasoningEffort` in `defineAgent()` and `defineDefaults()`
`packages/squad-sdk/templates/charter.md`	Add `Reasoning Effort: auto` to `## Model` section
`test/charter-compiler.test.ts`	Tests for parsing reasoning effort from charter
`test/fan-out.test.ts`	Tests for passing reasoning effort through spawn
`test/model-preference.test.ts`	Tests for config persistence, resolution hierarchy, and clamping
`.changeset/reasoning-effort.md`	Minor bump for squad-sdk

Testing

Unit Tests

✅ 183 tests pass across 5 test files (charter-compiler, fan-out, model-preference, builders, lifecycle)
✅ Build passes (npm run build)
✅ Backwards compatible — resolveReasoningEffort on FanOutDependencies is optional

Live Integration Test (claude-opus-4.7-1m-internal)

Ran a live test creating real Copilot sessions to verify reasoning effort is actually applied at the API level:

	`xhigh`	default (medium)
Reasoning delta events	8 ✅	0
`assistant.reasoning` event	✅ present	❌ absent
Output tokens	128	206

Key findings:

reasoningEffort: "xhigh" on claude-opus-4.7-1m-internal produced 8 reasoning/thinking delta events with actual thinking content. The default (medium) produced zero.
The model spends tokens on internal reasoning with higher effort, resulting in more concise output (128 vs 206 tokens).
Confirmed the full pipeline works: SquadSessionConfig.reasoningEffort → Copilot SDK → Copilot CLI → API → model thinking tokens.

Live Model Capabilities (from `listModels()` API)

Model	Supported Efforts	Default
claude-sonnet-4.6	`[none,low,medium,high]`	medium
claude-opus-4.6	`[none,low,medium,high]`	medium
claude-opus-4.7	`[none,medium]`	medium
claude-opus-4.7-1m-internal	`[none,low,medium,high,xhigh]`	medium
gpt-5.5	`[none,low,medium,high,xhigh]`	medium
gpt-5.4	`[none,low,medium,high,xhigh]`	medium
gpt-5.2-codex	`[none,low,medium,high,xhigh]`	high
gpt-5.2	`[none,low,medium,high]`	medium
gpt-5-mini	`[none,low,medium,high]`	medium

Clamping Validation

Tested sending xhigh to claude-sonnet-4.6 (max: high) — API silently accepted (no crash), confirming clamping is a nice-to-have for transparency rather than crash prevention.

Add support for configuring reasoning effort level per-agent through: - Charter markdown: **Reasoning Effort:** field in ## Model section - Config persistence: defaultReasoningEffort and agentReasoningEffortOverrides in .squad/config.json - SDK builders: reasoningEffort on defineAgent() and defineDefaults() - Layered resolution: resolveReasoningEffort() with priority hierarchy matching model resolution - Lifecycle manager: SpawnAgentOptions.reasoningEffortOverride passes through to SquadSessionConfig - Fan-out pipeline: AgentSpawnConfig.reasoningEffortOverride passes through to createSession() The value 'auto' is treated as a sentinel meaning 'let the SDK/API decide' and resolves to undefined. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds end-to-end “reasoning effort” support across charter parsing, persistent config, spawn/fan-out session creation, and the public builder API.

Changes:

Parse **Reasoning Effort:** in charters and thread it through charter compilation + session creation.
Add persistent config support (defaultReasoningEffort, agentReasoningEffortOverrides) plus a layered resolveReasoningEffort() resolver.
Add tests covering config round-trips, charter parsing/compilation, and fan-out session parameter propagation.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 10 comments.

Show a summary per file

File	Description
test/model-preference.test.ts	Adds unit tests for reasoning-effort config read/write + layered resolution.
test/fan-out.test.ts	Verifies `reasoningEffort` propagation (override/charter/auto) into `createSession`.
test/charter-compiler.test.ts	Adds charter markdown parsing + compile-time resolution tests for reasoning effort.
packages/squad-sdk/templates/charter.md	Updates charter template to include “Reasoning Effort: auto”.
packages/squad-sdk/src/coordinator/fan-out.ts	Threads `reasoningEffortOverride` into spawn/session creation (with optional resolver).
packages/squad-sdk/src/config/schema.ts	Extends config schema types with reasoning-effort fields.
packages/squad-sdk/src/config/models.ts	Implements read/write + layered resolution for reasoning effort.
packages/squad-sdk/src/builders/types.ts	Exposes `reasoningEffort` in builder types.
packages/squad-sdk/src/builders/index.ts	Validates `reasoningEffort` in `defineAgent`/`defineDefaults`.
packages/squad-sdk/src/agents/lifecycle.ts	Threads reasoning effort through agent lifecycle session creation.
packages/squad-sdk/src/agents/index.ts	Adds `reasoningEffort` to `AgentCharter`.
packages/squad-sdk/src/agents/charter-compiler.ts	Parses and compiles a resolved reasoning effort from charter/config overrides.
.changeset/reasoning-effort.md	Declares a minor release and documents the feature.

Add clampReasoningEffort() that caps the requested effort to the highest level the model supports. This prevents API errors when a user requests e.g. 'xhigh' on a model that only supports up to 'high' (like GPT-5.5). resolveReasoningEffort() now accepts an optional supportedEfforts array (from listModels()) and auto-clamps the result. Behavior: - xhigh on GPT-5.5 [low,medium,high] -> high (clamped down) - xhigh on GPT-5.2-codex [low,medium,high,xhigh] -> xhigh (unchanged) - xhigh on gpt-4.1 (no effort support) -> undefined (stripped) - max and xhigh treated as equivalent rank Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Validate reasoning effort at parse time: charter compiler now rejects invalid values and normalizes 'auto' to undefined early, preventing invalid strings from reaching SquadSessionConfig - Validate readReasoningEffort() against allowed set (low/medium/high/xhigh/auto), matching the per-agent override validation for consistency - Add isValid() guard in resolveReasoningEffort() so invalid values at any layer (spawn override, charter, config) fall through gracefully - Remove duplicate extractReasoningEffort() regex in lifecycle.ts; use compileCharterFull() output instead (single source of truth) - Add plain-object check after JSON.parse in write functions to prevent corruption when config.json contains non-object JSON (e.g. array/number) - Fix changeset docs to match actual priority order: per-agent config > global config > spawn override > charter > undefined - Add tests for invalid inputs at all resolver layers Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

ahhlun · 2026-05-22T03:47:19Z

All 10 review comments addressed in commit 423a7b0:

Changeset priority order — Fixed docs to match implementation: per-agent config > global config > spawn override > charter > undefined.
Validation consistency — readReasoningEffort() now validates against [low, medium, high, xhigh, auto], matching per-agent override validation. Invalid values (e.g. typos) are ignored.
JSON.parse safety — Both write functions now verify parsed config is a plain object before mutating; falls back to { version: 1 } if config.json contains non-object JSON.
auto/invalid normalization in charter compiler — parseCharterMarkdown() now rejects auto and invalid values at parse time (normalized to undefined). compileCharterFull() validates config overrides separately so invalid overrides fall through to the charter value.
Removed duplicate parsing in lifecycle — Replaced extractReasoningEffort() regex with compileCharterFull() output. Single source of truth for charter parsing.
Invalid input tests — Added tests for invalid values at every resolver layer (spawnOverride: 'invalid', charterPreference: 'turbo', persisted defaultReasoningEffort: 'invalid'). All fall through to undefined.

190 tests pass (up from 183).

Add Per-Agent Reasoning Effort section to squad.agent.md so the coordinator knows to use reasoningEffort as a session parameter instead of switching to different model variants (e.g. using claude-opus-4.7-1m-internal with xhigh effort instead of switching to claude-opus-4.7-xhigh). Includes: - Layered resolution instructions matching the SDK implementation - Config persistence commands (always use xhigh, per-agent, clear) - Explicit rule: same model + different effort, NOT different model - Updated spawn output format showing effort annotation Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 7 comments.

- Centralize VALID_REASONING_EFFORTS as exported const from config/models.ts; charter-compiler and builders import it instead of maintaining duplicates - Validate reasoning effort in lifecycle before setting on SquadSessionConfig: reject auto and invalid values so they never reach the SDK - Validate in fan-out fallback path: invalid reasoningEffortOverride values are filtered out instead of passed through to createSession - Fix clampReasoningEffort to clamp UP to model's minimum when requested effort is below the supported range (e.g. low on a model that only supports [medium] returns medium instead of undefined) - Validate write APIs: writeReasoningEffort and writeAgentReasoningEffortOverrides now reject invalid values instead of persisting them to config.json - Add test for clamp-up behavior Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

ahhlun · 2026-05-22T05:33:09Z

All review feedback addressed across commits 423a7b0, b437d8a, and 046138a.

Round 1 (10 comments) — 423a7b0:

Fixed changeset priority order docs
Validated readReasoningEffort against allowed set
Added JSON.parse plain-object safety check
Normalized auto/invalid in charter compiler
Removed duplicate parsing in lifecycle
Added invalid-input tests at every layer

Round 2 (7 new comments) — 046138a:

Centralized VALID_REASONING_EFFORTS (single exported const, no duplication)
Validated in lifecycle and fan-out fallback path (auto/invalid never reach SDK)
clampReasoningEffort now clamps UP to model minimum (not just down)
Write APIs reject invalid values before persisting
Added clamp-up test

Coordinator update — b437d8a:

Taught coordinator about reasoning effort as a session parameter (not a model variant)

237 tests pass across 6 test files.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions · 2026-06-08T15:08:55Z

👋 Friendly nudge — this PR has had no activity for 16 days.

What needs attention:

👀 No approving reviews yet. Request a review from a teammate.

If this PR is abandoned, please close it. If it's blocked on something external, leave a comment so the team knows.
This is an automated check that runs on weekdays. It won't nudge the same PR more than once per week.

ahhlun · 2026-06-09T06:29:04Z

@bradygaster @tamirdresher can you help with review?

tamirdresher · 2026-06-09T08:43:55Z

@ahhlun nice work — the thing this PR sets out to do is done. reasoningEffort threads through SquadSessionConfig end-to-end: wiring lands in both spawn paths (lifecycle.ts:226 and fan-out.ts:143), charter parsing handles missing field, casing, whitespace, and the auto sentinel cleanly, backwards compat holds, and builder validation is symmetric across defineAgent / defineDefaults. ~30 unit tests across the 3 new test files cover the resolver, clamper, and persistence round-trips well. Approving.

On the integrator-injection shape

resolveReasoningEffort() and clampReasoningEffort() sitting as optional deps on FanOutDependencies — same shape as resolveModel — is the right SDK boundary. Downstream integrators (squad-cli, custom coordinators) wire them in; the SDK just exposes the hooks. That's the existing pattern and it's correct here too.

One testing question

The live integration test in the PR description confirms reasoningEffort: "xhigh" passed directly into SquadSessionConfig reaches the Copilot SDK → CLI → API and fires 8 reasoning delta events. That proves the spawn-time override path (Layer 1).

The other two paths this PR ships — charter **Reasoning Effort:** xhigh (Layer 2) and .squad/config.json (Layers 0a/0b, once the integrator wires the resolver) — weren't live-verified the same way. Worth one more pass on the existing test harness: drive a real Copilot CLI session with **Reasoning Effort: xhigh** in the charter and no spawn-time override, confirm you see the same 8 delta events. Closes the loop for Layer 2 end-to-end without needing the integrator work for Layers 0a/0b. (Those will live-verify naturally once squad-cli is wired in the follow-up.)

Smaller nits — not blockers

clampReasoningEffort() clamps UP as well as down — low against a model that only supports [high] returns high. Test enshrines it (model-preference.test.ts:646-650) so it''s deliberate, but a code comment would save the next reader from filing it as a bug.
writeReasoningEffort() / writeAgentReasoningEffortOverrides() silently DELETE valid prefs on invalid input — no way to distinguish a typo from an intentional clear. A console.warn or thrown error would help.
Invalid charter values drop silently in charter-compiler.ts:280-283 — editor gets zero feedback. Same fix.
''low'' | ''medium'' | ''high'' | ''xhigh'' is defined in 3 places (SquadReasoningEffort in adapter/types.ts:43, VALID_REASONING_EFFORTS in config/models.ts, inline in builders/types.ts:69,123) and then widened to bare string in schema.ts. Worth consolidating to one canonical type.

Address nits here or in a follow-up — your call. Looking forward to the integrator wiring PR.

@tamirdresher

Follow-up to @tamirdresher's review on PR bradygaster#1148: - clampReasoningEffort: document the deliberate upward clamp (JSDoc and inline comment) so a request below a model's minimum isn't mistaken for a bug. - writeReasoningEffort: warn and preserve the existing preference on invalid input instead of silently clearing it (only null clears). - writeAgentReasoningEffortOverrides: warn about dropped invalid entries instead of discarding them silently. - charter parsing: warn on an invalid Reasoning Effort value so charter authors get feedback instead of a silent drop. - Consolidate the reasoning-effort union on the canonical SquadReasoningEffort type: builders/types.ts references it and VALID_REASONING_EFFORTS is tied to it via satisfies. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions · 2026-06-12T00:47:33Z

🛫 PR Readiness Check

ℹ️ This comment updates on each push. Last checked: commit ff77480

PR Scope: 📦🔧 Mixed (product + infrastructure)

⚠️ 4 item(s) to address before review

Status	Check	Details
❌	Single commit	7 commits — consider squashing before review
✅	Not in draft	Ready for review
❌	Branch up to date	dev is 164 commit(s) ahead — rebase recommended
❌	Copilot review	No Copilot review yet — it may still be processing
✅	Changeset present	Changeset file found
✅	Scope clean	No .squad/ or docs/proposals/ files
✅	No merge conflicts	No merge conflicts
✅	Copilot threads resolved	12 active Copilot thread(s) resolved (5 outdated skipped)
❌	CI passing	6 check(s) still running

Files Changed (19 files, +1022 −6)

File	+/−
`.changeset/reasoning-effort.md`	+17 −0
`.github/agents/squad.agent.md`	+27 −0
`.squad-templates/squad.agent.md`	+27 −0
`packages/squad-cli/templates/squad.agent.md.template`	+27 −0
`packages/squad-sdk/src/adapter/types.ts`	+4 −0
`packages/squad-sdk/src/agents/charter-compiler.ts`	+32 −0
`packages/squad-sdk/src/agents/index.ts`	+5 −0
`packages/squad-sdk/src/agents/lifecycle.ts`	+20 −6
`packages/squad-sdk/src/builders/index.ts`	+8 −0
`packages/squad-sdk/src/builders/types.ts`	+8 −0
`packages/squad-sdk/src/config/models.ts`	+333 −0
`packages/squad-sdk/src/config/schema.ts`	+3 −0
`packages/squad-sdk/src/coordinator/fan-out.ts`	+16 −0
`packages/squad-sdk/templates/charter.md`	+1 −0
`packages/squad-sdk/templates/squad.agent.md.template`	+27 −0
`templates/squad.agent.md.template`	+27 −0
`test/charter-compiler.test.ts`	+66 −0
`test/fan-out.test.ts`	+72 −0
`test/model-preference.test.ts`	+302 −0

Total: +1022 −6

This check runs automatically on every push. Fix any ❌ items and push again.
See CONTRIBUTING.md and PR Requirements for details.

github-actions · 2026-06-12T00:47:38Z

🟠 Impact Analysis — PR #1148

Risk tier: 🟠 HIGH

📊 Summary

Metric	Count
Files changed	19
Files added	1
Files modified	18
Files deleted	0
Modules touched	6
Critical files	2

🎯 Risk Factors

19 files changed (6-20 → MEDIUM)
6 modules touched (5-8 → HIGH)
Critical files touched: packages/squad-sdk/src/agents/index.ts, packages/squad-sdk/src/builders/index.ts

📦 Modules Affected

ci-workflows (1 file)

.github/agents/squad.agent.md

root (2 files)

.changeset/reasoning-effort.md
templates/squad.agent.md.template

squad-cli (1 file)

packages/squad-cli/templates/squad.agent.md.template

squad-sdk (11 files)

packages/squad-sdk/src/adapter/types.ts
packages/squad-sdk/src/agents/charter-compiler.ts
packages/squad-sdk/src/agents/index.ts
packages/squad-sdk/src/agents/lifecycle.ts
packages/squad-sdk/src/builders/index.ts
packages/squad-sdk/src/builders/types.ts
packages/squad-sdk/src/config/models.ts
packages/squad-sdk/src/config/schema.ts
packages/squad-sdk/src/coordinator/fan-out.ts
packages/squad-sdk/templates/charter.md
packages/squad-sdk/templates/squad.agent.md.template

templates (1 file)

.squad-templates/squad.agent.md

tests (3 files)

test/charter-compiler.test.ts
test/fan-out.test.ts
test/model-preference.test.ts

⚠️ Critical Files

packages/squad-sdk/src/agents/index.ts
packages/squad-sdk/src/builders/index.ts

This report is generated automatically for every PR. See #733 for details.

ahhlun · 2026-06-12T00:48:03Z

@tamirdresher thanks for the thorough review! Addressed all four nits in ff77480:

clampReasoningEffort clamp-up — documented the deliberate upward clamp in both the JSDoc and an inline comment, so a request below a model''s minimum reads as intentional rather than a bug.
writeReasoningEffort — now console.warns on invalid input and preserves the existing preference instead of silently clearing it (only null clears).
writeAgentReasoningEffortOverrides — now console.warns listing any dropped invalid entries.
Charter parsing — console.warns on an invalid **Reasoning Effort:** value so charter authors get feedback instead of a silent drop.
Type consolidation — unified the union on the canonical SquadReasoningEffort (adapter/types.ts); builders/types.ts now references it and VALID_REASONING_EFFORTS is tied to it via satisfies. Left schema.ts as string to match the existing loosely-typed on-disk model field.

npm run lint is clean and all reasoning-effort tests pass. Leaving the Layer-2 charter live-verification for the follow-up wiring PR as you suggested.

Copilot AI review requested due to automatic review settings May 22, 2026 00:03

Copilot AI reviewed May 22, 2026

View reviewed changes

allai-dev and others added 2 commits May 21, 2026 17:13

ahhlun requested a review from Copilot May 22, 2026 05:23

Copilot AI reviewed May 22, 2026

View reviewed changes

Merge origin/dev into squad/reasoning-effort-support

74059de

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Conversation

ahhlun commented May 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Problem

Solution

Configuration Layers

Spawning Paths

Resolution

Effort Clamping

Usage Example

Files Changed (13)

Testing

Unit Tests

Live Integration Test (claude-opus-4.7-1m-internal)

Live Model Capabilities (from listModels() API)

Clamping Validation

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ahhlun commented May 22, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

ahhlun commented May 22, 2026

Uh oh!

github-actions Bot commented Jun 8, 2026

Uh oh!

ahhlun commented Jun 9, 2026

Uh oh!

tamirdresher commented Jun 9, 2026

Uh oh!

github-actions Bot commented Jun 12, 2026

🛫 PR Readiness Check

⚠️ 4 item(s) to address before review

Files Changed (19 files, +1022 −6)

Uh oh!

github-actions Bot commented Jun 12, 2026

🟠 Impact Analysis — PR #1148

📊 Summary

🎯 Risk Factors

📦 Modules Affected

⚠️ Critical Files

Uh oh!

ahhlun commented Jun 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

ahhlun commented May 22, 2026 •

edited

Loading

Live Model Capabilities (from `listModels()` API)