Skip to content

Fix/features completed too soon#777

Closed
gsxdsm wants to merge 94 commits intoAutoMaker-Org:mainfrom
gsxdsm:fix/features-completed-too-soon
Closed

Fix/features completed too soon#777
gsxdsm wants to merge 94 commits intoAutoMaker-Org:mainfrom
gsxdsm:fix/features-completed-too-soon

Conversation

@gsxdsm
Copy link
Collaborator

@gsxdsm gsxdsm commented Feb 17, 2026

Features are often moving to completed before all tasks are done. This improves that behavior increasing turns and adding automatic retries

Summary by CodeRabbit

  • New Features

    • Added Claude Opus 4.6 model with adaptive thinking capabilities replacing Opus 4.5
    • Introduced GPT-5.3-Codex model with enhanced reasoning for code generation
    • Multi-worktree auto-mode support for parallel feature development
  • Improvements

    • Enhanced auto-mode coordination and feature execution stability
    • Improved plan approval workflow with better recovery handling
    • Better error handling and user feedback throughout the platform
    • Updated Docker setup with Playwright browser testing support
  • Infrastructure

    • Comprehensive architectural documentation and testing guidelines
    • Extensive internal refactoring of auto-mode services for maintainability

Shironex and others added 30 commits February 2, 2026 15:47
…esting

Fixes AutoMaker-Org#725

AI agents in automated testing mode require Playwright to verify implementations,
but Docker containers had only system dependencies installed, not browser binaries.
This caused verification failures with permissions errors.

Changes:
- Install Playwright Chromium in Dockerfile (~300MB increase)
- Update docker-compose.override.yml.example with clearer Playwright documentation
- Add "Playwright for Automated Testing" section to README
- Document optional volume mount for persisting browsers across rebuilds

Browsers are now pre-installed and work out of the box for Docker users.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Clean up npx cache after Playwright installation to reduce image size
- Clarify README: volume mounts persist cache across container lifecycles,
  not image rebuilds
- Add first-use warning: empty volume overrides pre-installed browsers,
  users must re-install with docker exec command

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Move Playwright install after node_modules copy to use pinned version
- Use local playwright binary instead of npx to avoid registry fetch
- Add --user automaker -w /app flags to docker exec commands
- Change bold text to proper heading in README (MD036 lint fix)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Enhanced getClaudeAuthIndicators() to return detailed check information
  including file paths checked and specific error details for debugging
- Added debug logging to server startup credential detection for easier
  troubleshooting in Docker environments
- Show paths that were checked in the warning message to help users debug
  mount issues
- Added support for CLAUDE_CODE_OAUTH_TOKEN environment variable
- Return authType in verify-claude-auth response to distinguish between
  OAuth and CLI authentication methods
- Updated UI to show specific success messages for Claude Code subscription
  vs generic CLI auth
- Added Docker troubleshooting tips to sandbox risk dialog
- Added comprehensive unit tests for OAuth credential detection scenarios

Closes AutoMaker-Org#721

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Updated getClaudeAuthIndicators() to ensure that empty or token-less credential files do not prevent the detection of valid credentials in subsequent paths.
- Improved error handling for settings file readability checks, providing clearer feedback on file access issues.
- Added unit tests to validate the new behavior, ensuring that the system continues to check all credential paths even when some files are empty or invalid.

This change improves the robustness of the credential detection process and enhances user experience by allowing for more flexible credential management.
- Upgraded @openai/codex-sdk from version 0.77.0 to 0.98.0 in package-lock.json and package.json.
- Introduced new model 'GPT-5.3-Codex' with enhanced capabilities in codex-models.ts and related files.
- Updated descriptions for existing models to reflect their latest features and improvements.
- Adjusted Codex model configuration and display to include the new model and its attributes.

These changes enhance the Codex model offerings and ensure compatibility with the latest SDK version.
…upport

- Changed model identifier from `claude-opus-4-5-20251101` to `claude-opus-4-6` across various files, including documentation and code references.
- Updated the SDK to support adaptive thinking for Opus 4.6, allowing the model to determine its own reasoning depth.
- Enhanced the thinking level options to include 'adaptive' and adjusted related components to reflect this change.
- Updated tests to ensure compatibility with the new model and its features.

These changes improve the model's capabilities and user experience by leveraging adaptive reasoning.
- Added `isAdaptiveThinkingModel` utility to improve model identification logic in the AddFeatureDialog.
- Updated the ThinkingLevelSelector to conditionally display information based on available thinking levels.
- Enhanced model name formatting in agent-context-parser to include 'GPT-5.3 Codex' for better clarity.

These changes improve the user experience by refining model handling and UI feedback related to adaptive thinking capabilities.
- Introduced a new event type 'dev-server:url-detected' to enhance event handling for the development server.
- This addition allows for better tracking and response to URL detection during server operations.

These changes improve the event system's capability to manage server-related events effectively.
- STACK.md - Technologies and dependencies
- ARCHITECTURE.md - System design and patterns
- STRUCTURE.md - Directory layout
- CONVENTIONS.md - Code style and patterns
- TESTING.md - Test structure
- INTEGRATIONS.md - External services
- CONCERNS.md - Technical debt and issues
Refactoring auto-mode-service.ts (5k+ lines) into smaller, focused services with clear boundaries.
User preference: keep .planning/ local-only
- Test lease counting basics (acquire/release semantics)
- Test running count queries (project and worktree level)
- Test feature state queries (isRunning, getRunningFeature, getAllRunning)
- Test edge cases (multiple features, multiple worktrees)
- 36 test cases documenting expected behavior

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Lease-based reference counting for nested execution support
- acquire() creates entry with leaseCount: 1 or increments existing
- release() decrements leaseCount, deletes at 0 or with force:true
- Project and worktree-level running counts
- RunningFeature interface exported for type sharing

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- AutoModeService now delegates to ConcurrencyManager for all running feature tracking
- Constructor accepts optional ConcurrencyManager for dependency injection
- Remove local RunningFeature interface (imported from ConcurrencyManager)
- Migrate all this.runningFeatures usages to concurrencyManager methods
- Update tests to use concurrencyManager.acquire() instead of direct Map access
- ConcurrencyManager accepts getCurrentBranch function for testability

BREAKING: AutoModeService no longer exposes runningFeatures Map directly.
Tests must use concurrencyManager.acquire() to add running features.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add TypedEventBus as wrapper around EventEmitter
- Implement emitAutoModeEvent method for auto-mode event format
- Add emit, subscribe, getUnderlyingEmitter methods
- Create comprehensive test suite (20 tests)
- Verify exact event format for frontend compatibility
- Create WorktreeResolver class for git worktree discovery
- Extract getCurrentBranch, findWorktreeForBranch, listWorktrees methods
- Add WorktreeInfo interface for worktree metadata
- Always resolve paths to absolute for cross-platform compatibility
- Add 20 unit tests covering all worktree operations
- Import TypedEventBus into AutoModeService
- Add eventBus property initialized via constructor injection
- Remove private emitAutoModeEvent method (now in TypedEventBus)
- Update all 66 emitAutoModeEvent calls to use this.eventBus
- Constructor accepts optional TypedEventBus for testing
- Create FeatureStateManager class for feature status updates
- Extract updateFeatureStatus, markFeatureInterrupted, resetStuckFeatures
- Extract updateFeaturePlanSpec, saveFeatureSummary, updateTaskStatus
- Persist BEFORE emit pattern for data integrity (Pitfall 2)
- Handle corrupted JSON with readJsonWithRecovery backup support
- Preserve pipeline_* statuses in markFeatureInterrupted
- Fix bug: version increment now checks old content before applying updates
- Add 33 unit tests covering all state management operations
…utoModeService

- Add WorktreeResolver and FeatureStateManager as constructor parameters
- Remove top-level getCurrentBranch function (now in WorktreeResolver)
- Delegate loadFeature, updateFeatureStatus to FeatureStateManager
- Delegate markFeatureInterrupted, resetStuckFeatures to FeatureStateManager
- Delegate updateFeaturePlanSpec, saveFeatureSummary, updateTaskStatus
- Replace findExistingWorktreeForBranch calls with worktreeResolver
- Update tests to mock featureStateManager instead of internal methods
- All 89 tests passing across 3 service files
- Extract plan approval workflow from AutoModeService
- Timeout-wrapped Promise creation via waitForApproval()
- Resolution handling (approve/reject) with needsRecovery flag
- Cancellation support for stopped features
- Per-project configurable timeout (default 30 minutes)
- Event emission through TypedEventBus for plan_rejected
- 24 tests covering approval, rejection, timeout, cancellation, recovery
- Tests use Vitest fake timers for timeout testing
- Covers needsRecovery flag for server restart recovery
- Covers plan_rejected event emission
- Covers configurable timeout from project settings
- Add PlanApprovalService import and constructor parameter
- Delegate waitForPlanApproval, cancelPlanApproval, hasPendingApproval
- resolvePlanApproval checks needsRecovery flag and calls executeFeature
- Remove pendingApprovals Map (now in PlanApprovalService)
- Remove PendingApproval interface (moved to plan-approval-service.ts)
- Extract parseTasksFromSpec for parsing tasks from spec content
- Extract marker detection functions (task start/complete, phase complete)
- Extract detectSpecFallback for non-Claude model support
- Extract extractSummary with multi-format support and last-match behavior
- Add 65 unit tests covering all functions and edge cases
- Add import for all spec parsing functions from spec-parser.ts
- Remove 209 lines of function definitions (now imported)
- Functions extracted: parseTasksFromSpec, parseTaskLine, detectTaskStartMarker,
  detectTaskCompleteMarker, detectPhaseCompleteMarker, detectSpecFallback, extractSummary
- All server tests pass (1608 tests)
- Create AgentExecutor class with constructor injection for TypedEventBus,
  FeatureStateManager, PlanApprovalService, and SettingsService
- Extract streaming pipeline from AutoModeService.runAgent()
- Implement execute() with stream processing, marker detection, file output
- Support recovery path with executePersistedTasks()
- Handle spec generation and approval workflow
- Multi-agent task execution with progress events
- Single-agent continuation fallback
- Debounced file writes (500ms)
- Heartbeat logging for silent model calls
- Abort signal handling throughout execution

Key interfaces:
- AgentExecutionOptions: All execution parameters
- AgentExecutionResult: responseText, specDetected, tasksCompleted, aborted
- Callbacks: waitForApproval, saveFeatureSummary, updateFeatureSummary, buildTaskPrompt
- Test constructor injection with all dependencies
- Test interface exports (AgentExecutionOptions, AgentExecutionResult)
- Test callback type signatures (WaitForApprovalFn, SaveFeatureSummaryFn, etc.)
- Test dependency injection patterns with custom implementations
- Verify execute method signature

Note: Full integration tests for streaming/marker detection require
complex mocking of @automaker/utils module which has hoisting issues.
Integration testing covered in E2E and auto-mode-service tests.
- Add AgentExecutor import to auto-mode-service.ts
- Add agentExecutor as constructor parameter (optional, with default)
- Initialize AgentExecutor with TypedEventBus, FeatureStateManager,
  PlanApprovalService, and SettingsService dependencies

This enables constructor injection for testing and prepares for
incremental delegation of runAgent() logic to AgentExecutor.
The AgentExecutor contains the full execution pipeline;
runAgent() delegation will be done incrementally to ensure
stability.
- Replace stream processing loop with AgentExecutor.execute() delegation
- Build AgentExecutionOptions object from runAgent() parameters
- Create callbacks for waitForApproval, saveFeatureSummary, etc.
- Remove ~930 lines of duplicated stream processing code
- Progress events now flow through AgentExecutor

File: auto-mode-service.ts reduced from 5086 to 4157 lines
- Add 11 new test cases for execute() behavior
- Test callback invocation (progress events, tool events)
- Test error handling (API errors, auth failures)
- Test result structure and response accumulation
- Test abort signal propagation
- Test branchName propagation in event payloads

Test file: 388 -> 935 lines (+547 lines)
Shironex and others added 25 commits February 15, 2026 16:59
- Extract getNvmWindowsCliPaths() helper to DRY up NVM_SYMLINK logic
- Update DEFAULT_MODELS.codex to gpt53Codex
- Simplify redundant ternary in thinking-level-selector
- Replace local supportsReasoningEffort with shared import from @automaker/types
- Use model.id fallback in phase-model-selector thinking level resolution

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…de-and-codex-models

feat: add Claude Opus 4.6 and GPT-5.3-Codex model support
…tup-warning-ignores-claude-oauth-credenti-fuzx
- Fix warning box path lines being 2 chars too wide (BOX_CONTENT_WIDTH - 4)
- Wrap getClaudeAuthIndicators in try/catch to prevent 500 on auth success
- Convert dynamic import to static import for @automaker/platform
- Simplify verbose debug logging to log objects directly
- Remove unnecessary truthy checks on always-populated path strings

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…tartup-warning-ignores-claude-oauth-credenti-fuzx

fix(auth): Improve OAuth credential detection and startup warning
The Dockerfile's playwright install step requires the binary in
node_modules/.bin/, but playwright was only a UI dependency. This adds
@playwright/test to server devDependencies so the Docker build can
successfully run `./node_modules/.bin/playwright install chromium`.

Fixes the "playwright: not found" error during Docker image build.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- .gitignore: add missing trailing newline
- Dockerfile: remove misleading || echo fallback in Playwright install
- index.ts: truncate long paths from beginning instead of end in warning box
- verify-claude-auth.ts: use effectiveAuthMethod to prevent undefined authType
- agent-context-parser.ts: handle claude-opus alias as Opus 4.6
- thinking-level-selector.tsx: improve model prop documentation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…aywright-missing-browsers

fix(docker): Pre-install Playwright Chromium browsers for automated t…
- Added error handling for feature creation in BoardView component to log errors and display user-friendly messages.
- Updated persistFeatureCreate function to throw errors on failure, allowing for better state management.
- Introduced removal of features from state if server creation fails, improving user experience during conflicts.

Also added @playwright/test to devDependencies in package-lock.json for improved testing capabilities.
- Enhanced the Kanban card component to support additional feature statuses ('interrupted' and 'ready') in the backlog display logic.
- Updated relevant components to reflect these changes, ensuring consistent behavior across the UI.
Add missing 'adaptive' thinking level to kanban card labels and export
TerminalPromptTheme type from @automaker/types package.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ervice-gsxdsm

refactor: AutoModeService decomposition (Phases 1-6) AutoMaker-Org#733 + fixes
…ervice-gsxdsm

fix: Exclude waiting_approval cards from active running state display
…eature. Fix Claude weekly usage indicator. Fix mobile card drag
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@coderabbitai
Copy link

coderabbitai bot commented Feb 17, 2026

📝 Walkthrough

Walkthrough

This PR implements a comprehensive refactor of the monolithic AutoModeService (~5700 lines) into a modular, multi-service architecture with clear boundaries. It introduces new services (ConcurrencyManager, WorktreeResolver, FeatureStateManager, ExecutionService, RecoveryService, PipelineOrchestrator, PlanApprovalService, AgentExecutor, GlobalAutoModeService, and a compatibility shim), updates all auto-mode routes and providers, modernizes the Claude model to Opus 4.6 with adaptive thinking support, adds Playwright testing infrastructure to Docker, expands UI components for multi-worktree and drag-and-drop support, and includes comprehensive planning documentation and test coverage.

Changes

Cohort / File(s) Summary
Planning & Documentation
.gitignore, .planning/*
Added planning directories and comprehensive refactor documentation (architecture, conventions, stack, structure, testing, integrations, codebase concerns).
Model & Dependency Updates
CLAUDE.md, apps/server/package.json, apps/server/src/lib/sdk-options.ts, apps/server/src/providers/claude-provider.ts, apps/server/src/providers/codex-models.ts, apps/server/src/providers/provider-factory.ts
Updated Claude Opus from 4.5 to 4.6 with adaptive thinking; added GPT-5.3-Codex model; enhanced thinking-level handling (adaptive vs. manual budgeting); updated SDK dependencies.
Core Auto-Mode Architecture
apps/server/src/services/auto-mode/*, apps/server/src/services/agent-executor*, apps/server/src/services/auto-loop-coordinator.ts, apps/server/src/services/concurrency-manager.ts, apps/server/src/services/execution-service.ts, apps/server/src/services/execution-types.ts, apps/server/src/services/feature-state-manager.ts, apps/server/src/services/merge-service.ts, apps/server/src/services/pipeline-orchestrator.ts, apps/server/src/services/pipeline-types.ts, apps/server/src/services/plan-approval-service.ts, apps/server/src/services/recovery-service.ts, apps/server/src/services/spec-parser.ts, apps/server/src/services/typed-event-bus.ts, apps/server/src/services/worktree-resolver.ts
Comprehensive refactor: new modular services replacing monolithic AutoModeService. Includes concurrency management (per-worktree leasing), execution orchestration, feature state persistence, plan approval workflows, recovery mechanisms, pipeline coordination, spec parsing utilities, and type-safe event emission.
Route & Integration Updates
apps/server/src/index.ts, apps/server/src/routes/auto-mode/*, apps/server/src/routes/enhance-prompt/routes/enhance.ts, apps/server/src/routes/features/*, apps/server/src/routes/github/routes/validate-issue.ts, apps/server/src/routes/projects/*, apps/server/src/routes/running-agents/*, apps/server/src/routes/setup/routes/verify-claude-auth.ts
Updated all auto-mode routes to use AutoModeServiceCompat; added projectPath validation and explicit parameter ordering; enhanced error handling; added provider-aware model resolution; extended auth type detection (api_key, oauth, cli); improved orphan feature detection logging.
Service Layer Updates
apps/server/src/services/agent-service.ts, apps/server/src/services/claude-usage-service.ts, apps/server/src/services/ideation-service.ts
Updated provider/model routing for custom providers; enhanced Claude CLI auth detection and trust-prompt recognition; improved ANSI code handling for usage parsing; refined model selection logic.
Docker & Infrastructure
Dockerfile, README.md
Added Playwright Chromium pre-installation for E2E testing; documented browser caching and testing workflows.
Comprehensive Test Coverage
apps/server/tests/integration/services/auto-mode-service.integration.test.ts, apps/server/tests/unit/lib/*, apps/server/tests/unit/providers/*, apps/server/tests/unit/services/*
Removed old integration tests; added extensive unit test suites for all new services (agent-executor, auto-loop-coordinator, concurrency-manager, execution-service, feature-state-manager, pipeline-orchestrator, plan-approval-service, recovery-service, spec-parser, typed-event-bus, worktree-resolver); updated model tests to reflect Opus 4.6.
UI Component Updates
apps/ui/eslint.config.mjs, apps/ui/src/components/dialogs/sandbox-risk-dialog.tsx, apps/ui/src/components/ui/task-progress-panel.tsx, apps/ui/src/components/usage-popover.tsx, apps/ui/src/components/views/board-view*
Added multi-worktree task aggregation; introduced isActivelyRunning flag to handle waiting_approval state; added drag-and-drop improvements (handle listeners/attributes); enhanced pace visualization in usage popover; updated feature creation flows with branch context; improved error handling and logging.
Documentation Updates
apps/ui/docs/AGENT_ARCHITECTURE.md
Updated example model configuration to reference Claude Opus 4.6.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related issues

Possibly related PRs

Suggested labels

Refactor, Tests, Architecture

Poem

🐰 From one great beast, we built a garden small,
With services precise, each standing tall,
Concurrency dances with worktrees anew,
Plans approve, agents execute true,
The rabbit refactored the whole stall! 🌿

🚥 Pre-merge checks | ✅ 2 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 48.65% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Merge Conflict Detection ⚠️ Warning ❌ Merge conflicts detected (88 files):

⚔️ .gitignore (content)
⚔️ CLAUDE.md (content)
⚔️ Dockerfile (content)
⚔️ README.md (content)
⚔️ apps/server/package.json (content)
⚔️ apps/server/src/index.ts (content)
⚔️ apps/server/src/lib/sdk-options.ts (content)
⚔️ apps/server/src/providers/claude-provider.ts (content)
⚔️ apps/server/src/providers/codex-models.ts (content)
⚔️ apps/server/src/providers/provider-factory.ts (content)
⚔️ apps/server/src/routes/auto-mode/index.ts (content)
⚔️ apps/server/src/routes/auto-mode/routes/analyze-project.ts (content)
⚔️ apps/server/src/routes/auto-mode/routes/approve-plan.ts (content)
⚔️ apps/server/src/routes/auto-mode/routes/commit-feature.ts (content)
⚔️ apps/server/src/routes/auto-mode/routes/context-exists.ts (content)
⚔️ apps/server/src/routes/auto-mode/routes/follow-up-feature.ts (content)
⚔️ apps/server/src/routes/auto-mode/routes/resume-feature.ts (content)
⚔️ apps/server/src/routes/auto-mode/routes/resume-interrupted.ts (content)
⚔️ apps/server/src/routes/auto-mode/routes/run-feature.ts (content)
⚔️ apps/server/src/routes/auto-mode/routes/start.ts (content)
⚔️ apps/server/src/routes/auto-mode/routes/status.ts (content)
⚔️ apps/server/src/routes/auto-mode/routes/stop-feature.ts (content)
⚔️ apps/server/src/routes/auto-mode/routes/stop.ts (content)
⚔️ apps/server/src/routes/auto-mode/routes/verify-feature.ts (content)
⚔️ apps/server/src/routes/enhance-prompt/routes/enhance.ts (content)
⚔️ apps/server/src/routes/features/index.ts (content)
⚔️ apps/server/src/routes/features/routes/list.ts (content)
⚔️ apps/server/src/routes/github/routes/validate-issue.ts (content)
⚔️ apps/server/src/routes/projects/index.ts (content)
⚔️ apps/server/src/routes/projects/routes/overview.ts (content)
⚔️ apps/server/src/routes/running-agents/index.ts (content)
⚔️ apps/server/src/routes/running-agents/routes/index.ts (content)
⚔️ apps/server/src/routes/setup/routes/verify-claude-auth.ts (content)
⚔️ apps/server/src/services/agent-service.ts (content)
⚔️ apps/server/src/services/claude-usage-service.ts (content)
⚔️ apps/server/src/services/ideation-service.ts (content)
⚔️ apps/server/tests/unit/lib/model-resolver.test.ts (content)
⚔️ apps/server/tests/unit/lib/sdk-options.test.ts (content)
⚔️ apps/server/tests/unit/providers/claude-provider.test.ts (content)
⚔️ apps/server/tests/unit/providers/provider-factory.test.ts (content)
⚔️ apps/server/tests/unit/services/claude-usage-service.test.ts (content)
⚔️ apps/ui/docs/AGENT_ARCHITECTURE.md (content)
⚔️ apps/ui/eslint.config.mjs (content)
⚔️ apps/ui/src/components/dialogs/sandbox-risk-dialog.tsx (content)
⚔️ apps/ui/src/components/ui/task-progress-panel.tsx (content)
⚔️ apps/ui/src/components/usage-popover.tsx (content)
⚔️ apps/ui/src/components/views/board-view.tsx (content)
⚔️ apps/ui/src/components/views/board-view/components/kanban-card/agent-info-panel.tsx (content)
⚔️ apps/ui/src/components/views/board-view/components/kanban-card/card-actions.tsx (content)
⚔️ apps/ui/src/components/views/board-view/components/kanban-card/card-header.tsx (content)
⚔️ apps/ui/src/components/views/board-view/components/kanban-card/kanban-card.tsx (content)
⚔️ apps/ui/src/components/views/board-view/components/list-view/list-row.tsx (content)
⚔️ apps/ui/src/components/views/board-view/dialogs/add-feature-dialog.tsx (content)
⚔️ apps/ui/src/components/views/board-view/hooks/use-board-actions.ts (content)
⚔️ apps/ui/src/components/views/board-view/hooks/use-board-drag-drop.ts (content)
⚔️ apps/ui/src/components/views/board-view/hooks/use-board-persistence.ts (content)
⚔️ apps/ui/src/components/views/board-view/mobile-usage-bar.tsx (content)
⚔️ apps/ui/src/components/views/board-view/shared/model-constants.ts (content)
⚔️ apps/ui/src/components/views/board-view/shared/thinking-level-selector.tsx (content)
⚔️ apps/ui/src/components/views/board-view/worktree-panel/worktree-panel.tsx (content)
⚔️ apps/ui/src/components/views/graph-view-page.tsx (content)
⚔️ apps/ui/src/components/views/settings-view/model-defaults/phase-model-selector.tsx (content)
⚔️ apps/ui/src/components/views/settings-view/providers/codex-model-configuration.tsx (content)
⚔️ apps/ui/src/components/views/setup-view/steps/claude-setup-step.tsx (content)
⚔️ apps/ui/src/hooks/use-auto-mode.ts (content)
⚔️ apps/ui/src/hooks/use-query-invalidation.ts (content)
⚔️ apps/ui/src/lib/agent-context-parser.ts (content)
⚔️ apps/ui/src/lib/electron.ts (content)
⚔️ apps/ui/src/lib/http-api-client.ts (content)
⚔️ apps/ui/src/store/app-store.ts (content)
⚔️ apps/ui/src/store/types/state-types.ts (content)
⚔️ apps/ui/src/store/utils/usage-utils.ts (content)
⚔️ docker-compose.override.yml.example (content)
⚔️ docs/llm-shared-packages.md (content)
⚔️ docs/server/providers.md (content)
⚔️ docs/server/utilities.md (content)
⚔️ libs/model-resolver/README.md (content)
⚔️ libs/model-resolver/tests/resolver.test.ts (content)
⚔️ libs/platform/src/index.ts (content)
⚔️ libs/platform/src/system-paths.ts (content)
⚔️ libs/types/src/codex-models.ts (content)
⚔️ libs/types/src/event.ts (content)
⚔️ libs/types/src/index.ts (content)
⚔️ libs/types/src/model-display.ts (content)
⚔️ libs/types/src/model.ts (content)
⚔️ libs/types/src/settings.ts (content)
⚔️ package-lock.json (content)
⚔️ package.json (content)

These conflicts must be resolved before merging into main.
Resolve conflicts locally and push changes to this branch.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately reflects the core issue addressed in the PR: features being marked completed prematurely. The changeset confirms this is the main focus.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
⚔️ Resolve merge conflicts (beta)
  • Auto-commit resolved conflicts to branch fix/features-completed-too-soon
  • Post resolved changes as copyable diffs in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @gsxdsm, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the core auto-mode functionality, breaking down a large service into modular components to improve maintainability and debuggability. The changes enhance the reliability of automated feature implementation by refining task completion logic, introducing smarter failure tracking with automatic pausing, and enabling agent-driven test retries. Additionally, it updates AI model integrations and improves the Docker environment for automated testing, alongside minor UI adjustments for better user experience and error handling.

Highlights

  • Major Auto-Mode Refactoring: The monolithic auto-mode-service.ts file has been refactored into several smaller, focused services including AgentExecutor, AutoLoopCoordinator, ConcurrencyManager, ExecutionService, FeatureStateManager, MergeService, PipelineOrchestrator, PlanApprovalService, RecoveryService, TypedEventBus, and WorktreeResolver. A compatibility layer (AutoModeServiceCompat) was introduced to ensure existing API routes continue to function during this transition.
  • Improved Feature Completion Logic: The logic for marking features as complete has been enhanced. When a feature reaches 'waiting_approval' or 'verified' status, any 'in_progress' tasks within its plan are now automatically marked as 'completed', and the 'currentTaskId' is cleared. This prevents features from appearing to have active tasks when they are awaiting review or are fully done.
  • Enhanced Auto-Mode Loop Robustness: The auto-mode loop now includes more robust failure tracking. It pauses automatically after a configurable number of consecutive failures and immediately pauses for quota exhaustion or rate limit errors, preventing excessive API calls during problematic periods.
  • Automated Test Retries with Agent Fixes: Pipeline steps now support automatic retries for failed tests. If tests fail, the agent will attempt to fix the issues up to a maximum of 5 attempts, improving the reliability of automated feature implementation.
  • Updated AI Model SDKs and Capabilities: The Claude SDK has been updated to 0.2.32 and the Codex SDK to 0.98.0. The Claude Opus model is now 4.6, which supports adaptive thinking, and the UI has been updated to reflect this new capability and normalize thinking levels when switching between adaptive and non-adaptive models.
  • Docker Image Enhancements: The Docker image now includes Playwright Chromium pre-installed, enabling AI agents to perform verification tests out-of-the-box without additional setup. Troubleshooting tips for Docker users have also been added to the Sandbox Risk Dialog.
  • Improved UI for Feature Creation and Running Status: The UI now includes better error handling for feature creation, preventing features from being added to the board if server-side creation fails. Visual cues for running auto-mode features have been refined, ensuring that features in 'waiting_approval' status do not display as 'actively running'.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • .gitignore
    • Added new entries for local planning documents and a new mcp.json file.
  • .planning/PROJECT.md
    • Added a new planning document outlining the refactoring of AutoModeService.
  • .planning/codebase/ARCHITECTURE.md
    • Added a new architecture document detailing the overall system design.
  • .planning/codebase/CONCERNS.md
    • Added a new document outlining codebase concerns such as tech debt, known bugs, and performance bottlenecks.
  • .planning/codebase/CONVENTIONS.md
    • Added a new document detailing coding conventions, including naming patterns, code style, and error handling.
  • .planning/codebase/INTEGRATIONS.md
    • Added a new document describing external integrations and APIs used in the project.
  • .planning/codebase/STACK.md
    • Added a new document detailing the technology stack used in the project.
  • .planning/codebase/STRUCTURE.md
    • Added a new document outlining the codebase directory layout and purposes.
  • .planning/codebase/TESTING.md
    • Added a new document describing testing patterns and strategies.
  • CLAUDE.md
    • Updated the Claude Opus model version from 4.5-20251101 to 4.6.
  • Dockerfile
    • Installed Playwright Chromium browser for AI agent verification tests and added related comments.
  • README.md
    • Updated the Docker section to include details about Playwright Chromium installation and persistence options.
  • apps/server/package.json
    • Updated @anthropic-ai/claude-agent-sdk to 0.2.32 and @openai/codex-sdk to 0.98.0.
    • Added @playwright/test as a dev dependency.
  • apps/server/src/index.ts
    • Updated Claude authentication checks to include CLAUDE_CODE_OAUTH_TOKEN environment variable.
    • Refactored AutoModeService to AutoModeServiceCompat for improved modularity.
    • Enhanced warning messages for unconfigured Claude authentication to include paths checked.
  • apps/server/src/lib/sdk-options.ts
    • Modified buildThinkingOptions to support adaptive thinking for Opus 4.6 models, omitting maxThinkingTokens when adaptive thinking is enabled.
  • apps/server/src/providers/claude-provider.ts
    • Updated Claude Opus model to 4.6 with adaptive thinking support and increased maxOutputTokens to 128000.
    • Adjusted thinking level logic to handle adaptive thinking for Opus 4.6.
  • apps/server/src/providers/codex-models.ts
    • Added GPT-5.3-Codex model and updated descriptions for GPT-5.2-Codex, GPT-5.1-Codex-Max, GPT-5.1-Codex-Mini, and GPT-5.2 models.
  • apps/server/src/providers/provider-factory.ts
    • Updated JSDoc for getProviderForModel to reflect the new Claude Opus 4.6 model ID.
  • apps/server/src/routes/auto-mode/index.ts
    • Updated the import type for AutoModeService to AutoModeServiceCompat and adjusted JSDoc.
  • apps/server/src/routes/auto-mode/routes/analyze-project.ts
    • Updated the import type for AutoModeService to AutoModeServiceCompat.
  • apps/server/src/routes/auto-mode/routes/approve-plan.ts
    • Updated the import type for AutoModeService to AutoModeServiceCompat.
    • Added validation for projectPath and adjusted the order of parameters in resolvePlanApproval call.
  • apps/server/src/routes/auto-mode/routes/commit-feature.ts
    • Updated the import type for AutoModeService to AutoModeServiceCompat.
  • apps/server/src/routes/auto-mode/routes/context-exists.ts
    • Updated the import type for AutoModeService to AutoModeServiceCompat.
  • apps/server/src/routes/auto-mode/routes/follow-up-feature.ts
    • Updated the import type for AutoModeService to AutoModeServiceCompat.
    • Removed finally block for releasing starting slot, as this is now handled by the new architecture.
  • apps/server/src/routes/auto-mode/routes/resume-feature.ts
    • Updated the import type for AutoModeService to AutoModeServiceCompat.
  • apps/server/src/routes/auto-mode/routes/resume-interrupted.ts
    • Updated the import type for AutoModeService to AutoModeServiceCompat.
  • apps/server/src/routes/auto-mode/routes/run-feature.ts
    • Updated the import type for AutoModeService to AutoModeServiceCompat.
    • Removed finally block for releasing starting slot, as this is now handled by the new architecture.
  • apps/server/src/routes/auto-mode/routes/start.ts
    • Updated the import type for AutoModeService to AutoModeServiceCompat.
  • apps/server/src/routes/auto-mode/routes/status.ts
    • Updated the import type for AutoModeService to AutoModeServiceCompat.
    • Modified getStatusForProject to be an async call.
  • apps/server/src/routes/auto-mode/routes/stop-feature.ts
    • Updated the import type for AutoModeService to AutoModeServiceCompat.
  • apps/server/src/routes/auto-mode/routes/stop.ts
    • Updated the import type for AutoModeService to AutoModeServiceCompat.
  • apps/server/src/routes/auto-mode/routes/verify-feature.ts
    • Updated the import type for AutoModeService to AutoModeServiceCompat.
  • apps/server/src/routes/enhance-prompt/routes/enhance.ts
    • Adjusted model resolution logic to correctly pass custom provider model IDs to the API.
  • apps/server/src/routes/features/index.ts
    • Updated the import type for AutoModeService to AutoModeServiceCompat.
  • apps/server/src/routes/features/routes/list.ts
    • Updated the import type for AutoModeService to AutoModeServiceCompat.
    • Added error handling for detectOrphanedFeatures to catch and log failures.
  • apps/server/src/routes/github/routes/validate-issue.ts
    • Added resolveModelString import.
    • Adjusted model resolution logic to correctly pass custom provider model IDs to the API.
  • apps/server/src/routes/projects/index.ts
    • Updated the import type for AutoModeService to AutoModeServiceCompat.
  • apps/server/src/routes/projects/routes/overview.ts
    • Updated import types for AutoModeServiceCompat, RunningAgentInfo, and ProjectAutoModeStatus.
    • Modified getStatusForProject to be an async call.
  • apps/server/src/routes/running-agents/index.ts
    • Updated the import type for AutoModeService to AutoModeServiceCompat.
  • apps/server/src/routes/running-agents/routes/index.ts
    • Updated the import type for AutoModeService to AutoModeServiceCompat.
    • Removed unnecessary spread operator when calling getRunningAgents.
  • apps/server/src/routes/setup/routes/verify-claude-auth.ts
    • Added getClaudeAuthIndicators import.
    • Refined Claude authentication type determination for success messages.
  • apps/server/src/services/agent-executor-types.ts
    • Added new type definitions for agent execution options, results, and callbacks.
  • apps/server/src/services/agent-executor.ts
    • Added a new service responsible for core agent execution, streaming, and task management.
  • apps/server/src/services/agent-service.ts
    • Adjusted model resolution logic to correctly pass custom provider model IDs to the API.
  • apps/server/src/services/auto-loop-coordinator.ts
    • Added a new service to manage the auto-mode loop lifecycle, failure tracking, and concurrency for projects and worktrees.
  • apps/server/src/services/auto-mode/compat.ts
    • Added a new compatibility layer service that wraps GlobalAutoModeService and AutoModeServiceFacade to provide the old AutoModeService interface.
  • apps/server/src/services/auto-mode/facade.ts
    • Added a new facade service providing a clean, per-project interface for auto-mode functionality, delegating to underlying services.
  • apps/server/src/services/auto-mode/global-service.ts
    • Added a new service to manage global auto-mode operations across all projects.
  • apps/server/src/services/auto-mode/index.ts
    • Added a new index file to export auto-mode services and related types.
  • apps/server/src/services/auto-mode/types.ts
    • Added new type definitions for auto-mode services, including facade options, status, and error interfaces.
  • apps/server/src/services/claude-usage-service.ts
    • Improved ANSI code stripping logic to correctly handle cursor movement sequences and preserve word boundaries.
    • Enhanced trust prompt detection to include additional variants.
  • apps/server/src/services/concurrency-manager.ts
    • Added a new service to manage running feature slots with lease-based reference counting, supporting nested calls and worktree-specific tracking.
  • apps/server/src/services/execution-service.ts
    • Added a new service to coordinate the feature execution lifecycle, including prompt building, agent execution, and pipeline integration.
  • apps/server/src/services/execution-types.ts
    • Added new type definitions for ExecutionService callbacks and related services.
  • apps/server/src/services/feature-state-manager.ts
    • Added a new service to manage feature status updates, persistence, and recovery from stuck states.
  • apps/server/src/services/ideation-service.ts
    • Adjusted model resolution logic to correctly pass custom provider model IDs to the API.
  • apps/server/src/services/merge-service.ts
    • Added a new service to perform git merge operations directly, including conflict detection and branch validation.
  • apps/server/src/services/pipeline-orchestrator.ts
    • Added a new service to orchestrate pipeline step execution, including agent interaction, test retries, and merge attempts.
  • apps/server/src/services/pipeline-types.ts
    • Added new type definitions for PipelineOrchestrator context, status, and results.
  • apps/server/src/services/plan-approval-service.ts
    • Added a new service to manage the plan approval workflow, including timeouts and recovery mechanisms.
  • apps/server/src/services/recovery-service.ts
    • Added a new service for crash recovery and feature resumption, handling interrupted features and execution state persistence.
  • apps/server/src/services/spec-parser.ts
    • Added a new service with pure functions for parsing spec content, detecting task/phase markers, and extracting summaries.
  • apps/server/src/services/typed-event-bus.ts
    • Added a new type-safe event bus wrapper for EventEmitter to ensure consistent auto-mode event emission.
  • apps/server/src/services/worktree-resolver.ts
    • Added a new service for git worktree discovery and path resolution.
  • apps/server/tests/integration/services/auto-mode-service.integration.test.ts
    • Removed integration tests due to the refactoring of auto-mode-service.ts.
  • apps/server/tests/unit/lib/model-resolver.test.ts
    • Updated expected Claude Opus model version to 4.6 in unit tests.
  • apps/server/tests/unit/lib/sdk-options.test.ts
    • Added unit tests for buildThinkingOptions to verify adaptive thinking logic for Opus 4.6.
  • apps/server/tests/unit/providers/claude-provider.test.ts
    • Updated expected Claude Opus model version to 4.6 in unit tests.
  • apps/server/tests/unit/providers/provider-factory.test.ts
    • Updated expected Claude Opus model version to 4.6 in unit tests.
  • apps/server/tests/unit/services/agent-executor.test.ts
    • Added new unit tests for AgentExecutor service.
  • apps/server/tests/unit/services/auto-loop-coordinator.test.ts
    • Added new unit tests for AutoLoopCoordinator service.
  • apps/server/tests/unit/services/auto-mode-service-planning.test.ts
    • Removed unit tests due to the refactoring of auto-mode-service.ts.
  • apps/server/tests/unit/services/auto-mode-service.test.ts
    • Removed unit tests due to the refactoring of auto-mode-service.ts.
  • apps/server/tests/unit/services/claude-usage-service.test.ts
    • Added unit tests for stripAnsiCodes to verify correct handling of cursor movement sequences in Claude CLI output.
  • apps/server/tests/unit/services/concurrency-manager.test.ts
    • Added new unit tests for ConcurrencyManager service.
  • apps/server/tests/unit/services/execution-service.test.ts
    • Added new unit tests for ExecutionService service.
  • apps/server/tests/unit/services/feature-state-manager.test.ts
    • Added new unit tests for FeatureStateManager service.
  • apps/server/tests/unit/services/pipeline-orchestrator.test.ts
    • Added new unit tests for PipelineOrchestrator service.
  • apps/server/tests/unit/services/plan-approval-service.test.ts
    • Added new unit tests for PlanApprovalService service.
  • apps/server/tests/unit/services/recovery-service.test.ts
    • Added new unit tests for RecoveryService service.
  • apps/server/tests/unit/services/spec-parser.test.ts
    • Added new unit tests for SpecParser service.
  • apps/server/tests/unit/services/typed-event-bus.test.ts
    • Added new unit tests for TypedEventBus service.
  • apps/server/tests/unit/services/worktree-resolver.test.ts
    • Added new unit tests for WorktreeResolver service.
  • apps/ui/docs/AGENT_ARCHITECTURE.md
    • Updated the Claude Opus model version to 4.6 in the agent architecture documentation.
  • apps/ui/eslint.config.mjs
    • Added queueMicrotask to the list of globals.
  • apps/ui/src/components/dialogs/sandbox-risk-dialog.tsx
    • Added Docker troubleshooting tips to the sandbox risk dialog.
  • apps/ui/src/components/ui/task-progress-panel.tsx
    • Updated task status initialization to rely on persisted task status rather than calculated completion count.
    • Modified auto_mode_task_started event handling to only update the started task's status, not previous tasks.
    • Added auto_mode_task_status event handling for real-time task status updates.
  • apps/ui/src/components/usage-popover.tsx
    • Added pace indicator to usage cards to show expected usage over time.
    • Imported getExpectedWeeklyPacePercentage and getPaceStatusLabel for usage calculation.
  • apps/ui/src/components/views/board-view.tsx
    • Adjusted runningAutoTasks to use runningAutoTasksAllWorktrees for broader scope.
    • Added error handling for feature creation functions (handleAddFeature, createPrCommentsFeature, createResolveConflictsFeature, createMergeConflictResolutionFeature) to display toasts and prevent UI inconsistencies.
    • Updated the title for merge conflict resolution features to include branch names.
    • Added toast notifications for auto-mode start/stop failures.
  • apps/ui/src/components/views/board-view/components/kanban-card/agent-info-panel.tsx
    • Added 'Adaptive' label for thinking level.
    • Implemented a defensive UI-side check to mark all tasks as completed if the feature status is 'waiting_approval' or 'verified', preventing stale 'in_progress' spinners.
  • apps/ui/src/components/views/board-view/components/kanban-card/card-actions.tsx
    • Extended the visibility of edit and make buttons to features with 'interrupted' and 'ready' statuses.
  • apps/ui/src/components/views/board-view/components/kanban-card/card-header.tsx
    • Extended the visibility of backlog header actions (spawn sub-task, delete) to features with 'interrupted' and 'ready' statuses.
    • Moved drag handle listeners and attributes to the drag handle element itself, rather than the entire card.
  • apps/ui/src/components/views/board-view/components/kanban-card/kanban-card.tsx
    • Adjusted isActivelyRunning logic to ensure features in 'waiting_approval' status do not display as actively running.
    • Updated draggable conditions to include 'interrupted' and 'ready' statuses.
    • Removed drag listeners from the main card wrapper, delegating them to the drag handle within CardHeaderSection.
  • apps/ui/src/components/views/board-view/components/list-view/list-row.tsx
    • Adjusted isActivelyRunning logic to ensure features in 'waiting_approval' status do not display as actively running.
  • apps/ui/src/components/views/board-view/dialogs/add-feature-dialog.tsx
    • Added logic to normalize the thinking level when switching between adaptive and non-adaptive models, automatically setting 'adaptive' for Opus 4.6 and 'high' for other models when appropriate.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a major and well-executed refactoring of the auto-mode-service, breaking it down into smaller, more focused services. This significantly improves the architecture and maintainability of the auto-mode functionality. The changes also include several UI/UX improvements, such as more accurate status indicators for running features, better error handling when creating features, and refined drag-and-drop behavior on the Kanban board. Additionally, model definitions have been updated, and support for adaptive thinking in newer Claude models has been added. Overall, this is a high-quality pull request that addresses technical debt while also improving the user experience. I have one minor suggestion for cleanup in the .gitignore file.

# GSD planning docs (local-only)
.planning/
.mcp.json
.planning
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This entry is a duplicate of the one on line 100 and can be removed to keep the .gitignore file clean.

@gsxdsm gsxdsm closed this Feb 17, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 18

Note

Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (7)
apps/ui/src/components/ui/task-progress-panel.tsx (1)

134-144: ⚠️ Potential issue | 🟡 Minor

Inconsistent currentTaskId clearing between auto_mode_task_complete and auto_mode_task_status.

Line 142 unconditionally sets currentTaskId to null, but the auto_mode_task_status handler at line 159 correctly uses a conditional update: it only clears currentTaskId when the completed task matches the current one. If a completion event arrives for a task that isn't the currently active one (e.g., out-of-order events or overlapping tasks), line 142 will incorrectly clear the active task indicator.

Proposed fix: use the same conditional pattern
         case 'auto_mode_task_complete':
           if ('taskId' in event) {
             const taskEvent = event as Extract<AutoModeEvent, { type: 'auto_mode_task_complete' }>;
             setTasks((prev) =>
               prev.map((t) =>
                 t.id === taskEvent.taskId ? { ...t, status: 'completed' as const } : t
               )
             );
-            setCurrentTaskId(null);
+            setCurrentTaskId((current) => (current === taskEvent.taskId ? null : current));
           }
           break;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/ui/src/components/ui/task-progress-panel.tsx` around lines 134 - 144,
The auto_mode_task_complete handler unconditionally clears currentTaskId; change
it to the same conditional pattern used in the auto_mode_task_status handler so
we only clear the active task when the completed task matches it: when handling
the Extract<AutoModeEvent, { type: 'auto_mode_task_complete' }>
(taskEvent.taskId), update tasks as you already do and then call
setCurrentTaskId(prev => prev === taskEvent.taskId ? null : prev) (or an
equivalent conditional functional update) so currentTaskId is only nulled if it
refers to the completed task.
apps/server/src/services/claude-usage-service.ts (1)

22-22: ⚠️ Potential issue | 🟡 Minor

Unused timeout property — the actual timeout is a hardcoded 45 s.

this.timeout (30 000 ms) declared on line 22 is never referenced; the setTimeout on line 323 uses a literal 45000. Either use the property or remove it to avoid confusion.

Proposed fix
- private timeout = 30000; // 30 second timeout
+ private timeout = 45000; // 45 second timeout

Then on line 323:

-    }, 45000); // 45 second timeout
+    }, this.timeout);

Also applies to: 323-323

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/claude-usage-service.ts` at line 22, The declared
private property timeout (value 30000) is unused while a literal 45000ms is
passed to the setTimeout call; update the code to use the timeout property
instead of the hardcoded literal (replace the 45000 argument in the setTimeout
invocation with this.timeout) so the class-level timeout configuration is
honored (or alternatively remove the unused timeout property if you prefer the
45s behavior) — refer to the timeout property and the setTimeout call in
claude-usage-service.ts to make the change.
apps/ui/src/components/views/board-view/components/list-view/list-row.tsx (1)

244-244: ⚠️ Potential issue | 🟡 Minor

Same hasError observation as in kanban-card.tsx.

Line 244 uses the original isCurrentAutoTask for hasError, while Line 356 passes isActivelyRunning to RowActions. If you decide to switch hasError to use isActivelyRunning in kanban-card.tsx, apply the same change here for consistency.

Suggested fix (only if applying the matching change in kanban-card.tsx)
-  const hasError = feature.error && !isCurrentAutoTask;
+  const hasError = feature.error && !isActivelyRunning;

Also applies to: 356-356

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/ui/src/components/views/board-view/components/list-view/list-row.tsx` at
line 244, Change the hasError computation to mirror kanban-card.tsx by using the
isActivelyRunning flag instead of isCurrentAutoTask: replace "const hasError =
feature.error && !isCurrentAutoTask;" with a check that uses isActivelyRunning
(e.g., "const hasError = feature.error && !isActivelyRunning;") so the value
passed to RowActions and the rest of the component is consistent with
isActivelyRunning.
apps/ui/src/components/views/board-view/components/kanban-card/kanban-card.tsx (1)

183-183: ⚠️ Potential issue | 🟡 Minor

Intentional divergence on isCurrentAutoTask vs isActivelyRunning — verify hasError semantics.

CardHeaderSection (Line 251) and CardActions (Line 277) receive isActivelyRunning, while AgentInfoPanel (Line 271) keeps the original isCurrentAutoTask — this looks intentional so agent metadata remains visible during waiting_approval.

However, hasError on Line 183 still uses isCurrentAutoTask:

const hasError = feature.error && !isCurrentAutoTask;

This means a feature in waiting_approval that also has an error won't display error styling if it's still tracked as a running auto task. If error indication should be visible once a task reaches waiting_approval, consider switching to !isActivelyRunning here.

Suggested fix if error should show for waiting_approval
-  const hasError = feature.error && !isCurrentAutoTask;
+  const hasError = feature.error && !isActivelyRunning;

Also applies to: 251-251, 271-271, 277-277

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/ui/src/components/views/board-view/components/kanban-card/kanban-card.tsx`
at line 183, The hasError calculation uses isCurrentAutoTask which hides error
styling for features in waiting_approval; update the logic to use
isActivelyRunning instead so errors are shown once a task leaves active running
state: change the hasError assignment (symbol: hasError) to check feature.error
&& !isActivelyRunning, and propagate the same semantic to components that rely
on it (CardHeaderSection, CardActions) while leaving AgentInfoPanel using
isCurrentAutoTask if you want agent metadata visible during waiting_approval;
ensure all usages are consistent with this intent.
apps/ui/src/components/views/board-view.tsx (1)

1382-1389: ⚠️ Potential issue | 🟡 Minor

Restarting auto mode on concurrency change may lose in-flight work — consider a dedicated API instead.

The stop-then-start sequence to apply a new maxConcurrency value has a window where auto mode is fully stopped. If the backend doesn't gracefully handle this (e.g., preserving in-flight tasks), it could lead to interrupted features. Additionally, there's no error handling on the stop() call — if it rejects, start() will still fire.

🛡️ Suggested improvement: guard the restart chain
             if (autoMode.isRunning) {
-              autoMode.stop().then(() => {
-                autoMode.start().catch((error) => {
-                  logger.error('[AutoMode] Failed to restart with new concurrency:', error);
-                });
+              autoMode.stop().then(
+                () => {
+                  autoMode.start().catch((error) => {
+                    logger.error('[AutoMode] Failed to restart with new concurrency:', error);
+                    toast.error('Failed to restart auto mode with new concurrency');
+                  });
+                },
+                (error) => {
+                  logger.error('[AutoMode] Failed to stop for concurrency change:', error);
+                  toast.error('Failed to update concurrency');
+                }
+              );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/ui/src/components/views/board-view.tsx` around lines 1382 - 1389, The
current stop-then-start restart in the concurrency change handler (checking
autoMode.isRunning and calling autoMode.stop().then(() => autoMode.start())) can
leave a downtime window and doesn't handle stop() rejection; change this to
either call a dedicated API on the backend to update maxConcurrency (preferred)
or, if client-side restart is required, chain promises safely: call
autoMode.stop(), handle rejection (log and abort restart), and only call
autoMode.start() after a successful stop; ensure you reference and update usages
of autoMode.stop, autoMode.start, autoMode.isRunning, and the maxConcurrency
change path, and add error handling via logger.error where stop() may reject so
start() is never invoked on a failed stop.
apps/server/src/routes/github/routes/validate-issue.ts (1)

362-375: ⚠️ Potential issue | 🟡 Minor

Custom provider models are rejected by validation before provider lookup can occur.

The isValidModel check on lines 363–367 only accepts Claude, Cursor, Codex, and OpenCode models. Custom provider models like "GLM-4.7" or "MiniMax-M2.1" would fail this check and return a 400 error, never reaching the getProviderByModelId call on lines 178–189 that could actually resolve them.

This is inconsistent with how other services handle the same scenario: IdeationService and AgentService both call getProviderByModelId directly without this restrictive validation, allowing custom providers to work seamlessly.

If custom provider models should be supported here (as the comment on line 191 suggests), either:

  1. Remove the isValidModel check and defer validation to getProviderByModelId
  2. Extend the validation to include custom provider models by checking if a provider exists for the given model ID
  3. Document that custom providers are intentionally not supported for issue validation
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/routes/github/routes/validate-issue.ts` around lines 362 -
375, The current isValidModel check (using isClaudeModel, isCursorModel,
isCodexModel, isOpencodeModel) incorrectly rejects custom provider models before
provider resolution; replace that check with a provider lookup via
getProviderByModelId (or first call getProviderByModelId when model isn't one of
the known built-ins) so custom model IDs (e.g., "GLM-4.7") are resolved by
getProviderByModelId and only return 400 if no provider is found or provider
validation fails; update the branch that currently returns the 400 to instead
attempt getProviderByModelId for model IDs not matched by the built-in
predicates and only error after lookup fails.
apps/server/src/routes/auto-mode/routes/run-feature.ts (1)

29-55: ⚠️ Potential issue | 🔴 Critical

Real TOCTOU vulnerability exists: route-level capacity check has no atomic enforcement in the service layer.

Between checkWorktreeCapacity() at line 30 and executeFeature() at line 50, concurrent requests can both pass the check. More critically, concurrencyManager.acquire() (called within executeFeature) performs no capacity enforcement—it only tracks already-running features. Two simultaneous requests will both pass the route check and both successfully call acquire(), exceeding the worktree limit.

The route-level check is advisory only. Move capacity enforcement into acquire() to make it atomic, or implement a reservation mechanism that holds the slot between the check and execution.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/routes/auto-mode/routes/run-feature.ts` around lines 29 - 55,
The route-level capacity check (checkWorktreeCapacity) is advisory and
vulnerable to TOCTOU because executeFeature calls concurrencyManager.acquire()
later without enforcing capacity; modify concurrencyManager.acquire (or the
acquire code path inside autoModeService.executeFeature) to perform atomic
capacity enforcement/reservation so a slot is claimed as part of acquire(), or
implement a short-lived reservation API on autoModeService that atomically
reserves capacity and returns a token that executeFeature uses to actually
start; update executeFeature to rely on that atomic acquire/reserve mechanism
instead of trusting the pre-check.
🟡 Minor comments (18)
apps/ui/src/components/usage-popover.tsx-238-250 (1)

238-250: ⚠️ Potential issue | 🟡 Minor

Color can contradict the label when the difference rounds to zero.

getPaceStatusLabel uses Math.round(diff) — so when actual is, say, 50.3% and pace is 50.0%, the label reads "On pace" but the color here evaluates 50.3 > 50.0 → orange, creating a mismatch.

Align the color check with the same rounding the label uses:

Proposed fix
-              safePercentage > (pacePercentage ?? 0) ? 'text-orange-500' : 'text-green-500'
+              Math.round(safePercentage - (pacePercentage ?? 0)) > 0
+                ? 'text-orange-500'
+                : 'text-green-500'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/ui/src/components/usage-popover.tsx` around lines 238 - 250, The color
decision for the pace label can contradict the text because the label generation
(getPaceStatusLabel) uses Math.round(diff) while the color check compares raw
safePercentage and pacePercentage; update the color logic in the component that
renders paceLabel (references: paceLabel, safePercentage, pacePercentage) to use
the same rounding as the label — e.g., compute rounded actual and rounded pace
(or rounded diff) with Math.round and then use that rounded comparison to pick
'text-orange-500' vs 'text-green-500' so text and color always agree.
apps/server/src/services/claude-usage-service.ts-400-408 (1)

400-408: ⚠️ Potential issue | 🟡 Minor

'safety check' is a fairly generic match — verify it won't false-positive on model output.

Since cleanOutput accumulates all PTY data (including any text the CLI might print before the REPL), a substring as common as "safety check" could theoretically trigger an unintended trust-approval. In the /usage flow the risk is low, but if this code path is ever reused for interactive sessions it could auto-approve unexpectedly. Consider matching a more specific phrase (e.g., "Quick safety check" or the full prompt sentence) to tighten the guard.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/claude-usage-service.ts` around lines 400 - 408, The
current conditional uses a very generic substring ("safety check") against
cleanOutput which can false-positive; in the approval logic (the conditional
referencing hasApprovedTrust and cleanOutput) tighten the match to a more
specific phrase such as "Quick safety check" or the full prompt sentence used by
the CLI (e.g., "Quick safety check — do you want to work in this folder?")
instead of "safety check", and update the conditional that checks cleanOutput
(in the same block that references hasApprovedTrust) so it only approves when
the stricter phrase is found.
apps/ui/src/components/views/board-view/components/kanban-card/card-actions.tsx-296-348 (1)

296-348: ⚠️ Potential issue | 🟡 Minor

Add interrupted and ready status support to list view backlog actions for UI parity.

The kanban view's backlog action block now checks for backlog || interrupted || ready statuses, but the list view's RowActions (line 387) only checks feature.status === 'backlog'. Features in interrupted or ready status will display Edit/View Plan/Make buttons in the kanban view but lack corresponding actions in the list view dropdown menu.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/ui/src/components/views/board-view/components/kanban-card/card-actions.tsx`
around lines 296 - 348, Update the status check in the list view RowActions so
it matches the kanban backlog block by allowing feature.status to be 'backlog'
OR 'interrupted' OR 'ready' (replace the single equality check feature.status
=== 'backlog' with a compound check), so the same Edit/View Plan/Make actions
(onEdit, onViewPlan, onImplement) and their data-testid values are rendered for
interrupted and ready features as they are in the Kanban CardActions.
.planning/codebase/CONVENTIONS.md-8-10 (1)

8-10: ⚠️ Potential issue | 🟡 Minor

Documentation error: file naming is kebab-case, not PascalCase.

The bullet says "PascalCase for class/service files" but all examples (auto-mode-service.ts, feature-loader.ts, claude-provider.ts) are kebab-case. This should read "kebab-case for class/service files" to match the actual examples and codebase conventions.

📝 Suggested fix
-- PascalCase for class/service files: `auto-mode-service.ts`, `feature-loader.ts`, `claude-provider.ts`
+- kebab-case for class/service files: `auto-mode-service.ts`, `feature-loader.ts`, `claude-provider.ts`
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.planning/codebase/CONVENTIONS.md around lines 8 - 10, Update the wording in
the CONVENTIONS.md entry that currently reads "PascalCase for class/service
files" to correctly state "kebab-case for class/service files" so it matches the
examples `auto-mode-service.ts`, `feature-loader.ts`, and `claude-provider.ts`;
make the text consistent with the route/handler kebab-case guidance to avoid
confusion.
.planning/codebase/STRUCTURE.md-233-233 (1)

233-233: ⚠️ Potential issue | 🟡 Minor

Documentation contradiction: PascalCase claim vs kebab-case examples.

The text says "Components: PascalCase.tsx" but the examples (board-view.tsx, session-manager.tsx) are kebab-case. This will confuse contributors about the actual convention.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.planning/codebase/STRUCTURE.md at line 233, The docs claim "Components:
PascalCase.tsx" but show kebab-case examples (`board-view.tsx`,
`session-manager.tsx`); update the content in STRUCTURE.md to be consistent by
either changing the header to "Components: kebab-case.tsx" or replacing the
examples with PascalCase filenames (e.g., `BoardView.tsx`, `SessionManager.tsx`)
so the rule and examples match; ensure the term "PascalCase.tsx" or the example
filenames are the unique symbols you adjust (PascalCase.tsx, board-view.tsx,
session-manager.tsx).
.planning/codebase/STRUCTURE.md-238-238 (1)

238-238: ⚠️ Potential issue | 🟡 Minor

Markdown rendering issue: underscores interpreted as emphasis markers.

The _ characters in _.test.ts or _.spec.ts are being parsed as emphasis markers. Escape them or use backticks to render correctly.

Proposed fix
-- **Tests:** _.test.ts or _.spec.ts (co-located with source)
+- **Tests:** `*.test.ts` or `*.spec.ts` (co-located with source)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.planning/codebase/STRUCTURE.md at line 238, The markdown line containing
"_.test.ts or _.spec.ts" is rendering underscores as emphasis; update that line
to prevent parsing by either escaping the underscores (e.g., \_.test.ts and
\_.spec.ts) or wrapping the filenames in inline code/backticks (`_.test.ts` or
`_.spec.ts`) so the underscores render literally; locate the occurrence of the
string "_.test.ts or _.spec.ts" in STRUCTURE.md and apply one of these fixes.
apps/server/src/services/merge-service.ts-139-143 (1)

139-143: ⚠️ Potential issue | 🟡 Minor

Squash commit can throw if there are no changes to commit.

If the merge results in no changes (already up-to-date), git commit will fail with a non-zero exit code. This isn't caught, so it will propagate as an unhandled exception, inconsistent with the MergeServiceResult return contract.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/merge-service.ts` around lines 139 - 143, The squash
commit can throw when there are no staged changes—wrap the commit step in a
check and/or try-catch so the service returns a proper MergeServiceResult
instead of propagating an exception. Specifically, before calling
execGitCommand(['commit', '-m', squashMessage], projectPath) (in the squash
branch where options?.squash and squashMessage are used), run a lightweight
check for staged changes (e.g., execGitCommand(['diff', '--cached', '--quiet'],
projectPath) or execGitCommand(['status', '--porcelain'], projectPath)) and only
call commit if changes exist; alternatively catch errors from execGitCommand and
if the error message/exit indicates "nothing to commit" treat it as a successful
no-op and return the appropriate MergeServiceResult rather than throwing. Ensure
handling uses the same MergeServiceResult code paths as successful merges.
apps/server/src/services/recovery-service.ts-177-195 (1)

177-195: ⚠️ Potential issue | 🟡 Minor

No null guard for prompts.taskExecution.resumeFeatureTemplate.

Line 187 accesses prompts.taskExecution.resumeFeatureTemplate and immediately calls .replace() on it. If getPromptCustomization returns a taskExecution object without resumeFeatureTemplate (e.g., due to a misconfigured settings service), this will throw a TypeError: Cannot read properties of undefined.

Suggested guard
     const prompts = await getPromptCustomization(this.settingsService, '[RecoveryService]');
+    const resumeTemplate = prompts.taskExecution?.resumeFeatureTemplate;
+    if (!resumeTemplate) {
+      logger.warn(`No resumeFeatureTemplate found, falling back to direct execution`);
+      return this.executeFeatureFn(projectPath, featureId, useWorktrees, false, undefined, {
+        _calledInternally: true,
+      });
+    }
     const featurePrompt = `## Feature Implementation Task\n\n**Feature ID:** ${feature.id}\n**Title:** ${feature.title || 'Untitled Feature'}\n**Description:** ${feature.description}\n`;
-    let prompt = prompts.taskExecution.resumeFeatureTemplate;
+    let prompt = resumeTemplate;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/recovery-service.ts` around lines 177 - 195, The
code in executeFeatureWithContext uses
prompts.taskExecution.resumeFeatureTemplate without guarding for null/undefined;
update executeFeatureWithContext to validate that prompts.taskExecution and
prompts.taskExecution.resumeFeatureTemplate exist (from getPromptCustomization)
and either supply a safe default template string or throw a clear error before
calling .replace(); then pass the resulting prompt into executeFeatureFn as
before. Reference symbols: executeFeatureWithContext, getPromptCustomization,
prompts.taskExecution.resumeFeatureTemplate, and executeFeatureFn.
apps/server/src/services/plan-approval-service.ts-220-226 (1)

220-226: ⚠️ Potential issue | 🟡 Minor

Log level mismatch: logger.info used for an error condition.

Line 220 logs "ERROR: No pending approval found..." at info level. This should be logger.warn or logger.error to match the severity and be filterable by log level.

Proposed fix
-      logger.info(
+      logger.warn(
         `ERROR: No pending approval found for feature ${featureId} and recovery not possible`
       );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/plan-approval-service.ts` around lines 220 - 226,
The log call uses logger.info for an error condition ("ERROR: No pending
approval found for feature ${featureId}") which misrepresents severity; change
the logger invocation to a higher level (logger.warn or logger.error) in the
plan-approval-service code path that returns the failure object (the block
referencing logger and featureId) so the message is filterable by severity and
consistent with the returned { success: false, error: ... } response.
apps/server/src/services/pipeline-orchestrator.ts-582-583 (1)

582-583: ⚠️ Potential issue | 🟡 Minor

Duplicate condition — both sides of || are identical.

if (trimmed.includes('AssertionError') || trimmed.includes('AssertionError')) {

Both operands are the same string, making the || redundant. One of these was likely intended to catch a different variant (e.g., a common misspelling, or a different assertion class like AssertError).

Proposed fix (example — adjust the second variant as appropriate)
-      if (trimmed.includes('AssertionError') || trimmed.includes('AssertionError')) {
+      if (trimmed.includes('AssertionError') || trimmed.includes('AssertError')) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/pipeline-orchestrator.ts` around lines 582 - 583,
The if condition currently checks trimmed.includes('AssertionError') twice
(redundant); locate the check around the failedTests push in
pipeline-orchestrator.ts (the variables trimmed and failedTests) and replace the
duplicate operand with the intended variant (e.g.,
trimmed.includes('AssertError') or trimmed.includes('AssertionError:')) or
remove the redundant || entirely so the condition correctly detects the intended
assertion failure string before pushing into failedTests. Ensure only one
meaningful includes() is evaluated.
apps/server/src/services/auto-loop-coordinator.ts-283-303 (1)

283-303: ⚠️ Potential issue | 🟡 Minor

Non-null assertion errorInfo! at Line 302 is unsafe.

When branchNameOrError is null (valid new-signature call for the main worktree), the code falls to the else branch and asserts errorInfo!. But if a caller passes (projectPath, null) without the third argument, errorInfo is undefined and the assertion silently lies — actualErrorInfo will be undefined, causing a crash at Line 309 when accessing .message.

The same pattern exists in signalShouldPauseForProject at Line 337.

Proposed fix (add a runtime guard)
     } else {
       // New signature: (projectPath, branchName, errorInfo)
       branchName = branchNameOrError;
-      actualErrorInfo = errorInfo!;
+      if (!errorInfo) {
+        throw new Error('errorInfo is required when using (projectPath, branchName, errorInfo) signature');
+      }
+      actualErrorInfo = errorInfo;
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/auto-loop-coordinator.ts` around lines 283 - 303, In
trackFailureAndCheckPauseForProject validate the third-argument case instead of
using the unsafe non-null assertion: when branchNameOrError is not the
old-signature object, check that errorInfo is defined and has the expected shape
before assigning to actualErrorInfo (otherwise throw or return/fail gracefully
with a clear message); apply the same runtime guard to
signalShouldPauseForProject where errorInfo! is used, referencing the
branchNameOrError and actualErrorInfo variables to locate and replace the
non-null assertions with explicit checks and handling.
apps/server/src/services/auto-mode/types.ts-96-106 (1)

96-106: ⚠️ Potential issue | 🟡 Minor

branchName is string | undefined here but string | null elsewhere.

RunningAgentInfo.branchName is typed as ?: string (optional / undefined), while related interfaces throughout the PR (ProjectAutoModeStatus, AutoModeConfig, RunningFeature, PipelineContext, etc.) consistently use string | null. This mismatch can cause subtle bugs when checking for the main worktree — branchName === null won't match undefined.

Consider aligning to branchName: string | null (or ?: string | null if the field can be absent).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/auto-mode/types.ts` around lines 96 - 106,
RunningAgentInfo.branchName is defined as an optional string (undefined) while
other types use string | null; change the RunningAgentInfo interface (the
branchName member) to use the same nullable shape as the rest of the codebase
(branchName: string | null or optionally ?: string | null) so checks like
branchName === null behave consistently with
ProjectAutoModeStatus/AutoModeConfig/RunningFeature/PipelineContext; update the
RunningAgentInfo declaration accordingly and run type checks to ensure no other
places expect undefined.
apps/server/src/services/agent-executor.ts-145-178 (1)

145-178: ⚠️ Potential issue | 🟡 Minor

Heartbeat logs misleading "Waiting for first model response" during the entire task-loop execution.

When existingApprovedPlanContent triggers the early executeTasksLoop path, receivedAnyStreamMessage is never set to true because the main stream loop is skipped. The heartbeat interval (Line 147) fires every 15 seconds logging "Waiting for first model response…" throughout the task loop, which is incorrect — the tasks are actively running via their own provider streams.

Move clearInterval(streamHeartbeat) (or set receivedAnyStreamMessage = true) before awaiting executeTasksLoop:

Proposed fix
     if (existingApprovedPlanContent && persistedTasks && persistedTasks.length > 0) {
+      clearInterval(streamHeartbeat);
       const result = await this.executeTasksLoop(
         options,
         persistedTasks,
         existingApprovedPlanContent,
         responseText,
         scheduleWrite,
         callbacks
       );
-      clearInterval(streamHeartbeat);
       if (writeTimeout) clearTimeout(writeTimeout);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/agent-executor.ts` around lines 145 - 178, The
heartbeat interval (streamHeartbeat) keeps logging "Waiting for first model
response…" during executeTasksLoop because receivedAnyStreamMessage is never set
when taking the early existingApprovedPlanContent path; to fix, stop the
heartbeat before awaiting executeTasksLoop by clearing streamHeartbeat (or
setting receivedAnyStreamMessage = true) immediately after the early-path check
and before calling this.executeTasksLoop, and also ensure any
writeTimeout/rawWriteTimeout are cleared prior to the await and writeToFile is
still awaited afterward so no stray logs occur while tasks run; update
references to streamHeartbeat, receivedAnyStreamMessage, executeTasksLoop, and
AgentExecutor.STREAM_HEARTBEAT_MS accordingly.
apps/server/tests/unit/services/agent-executor.test.ts-395-440 (1)

395-440: ⚠️ Potential issue | 🟡 Minor

Test name is misleading — aborted is never asserted.

The test is named "should return aborted=true when abort signal is already aborted" but Lines 438–439 only check that result and result.responseText are defined. It never asserts result.aborted === true. Either rename the test to reflect what it actually verifies, or add the missing assertion.

Proposed fix
       // When stream is empty and signal is aborted before stream starts,
       // the result depends on whether abort was checked
       expect(result).toBeDefined();
       expect(result.responseText).toBeDefined();
+      expect(result.aborted).toBe(true);
     });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/tests/unit/services/agent-executor.test.ts` around lines 395 -
440, Rename the test or add the missing assertion so the name matches behavior:
either change the it(...) string to say it "completes without error when abort
signal is already aborted" or add an assertion that checks result.aborted ===
true after calling AgentExecutor.execute(...). Locate the test using the
AgentExecutor instantiation and the execute call in the test case and update the
assertion section (currently checking result and result.responseText) to include
the result.aborted check if you choose to assert the aborted flag.
apps/server/src/services/concurrency-manager.ts-81-89 (1)

81-89: ⚠️ Potential issue | 🟡 Minor

Parameter abortController is silently ignored when allowReuse=true.

When reusing an existing entry (line 87), the provided params.abortController is discarded and the original controller is retained. While no current callers pass both allowReuse=true and abortController, accepting but ignoring the parameter creates a confusing API contract and invites future misuse.

Consider either removing abortController from the reuse path's accepted parameters or explicitly documenting that it is intentionally ignored during reuse.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/concurrency-manager.ts` around lines 81 - 89, The
acquire method currently ignores params.abortController when reusing an existing
RunningFeature; update acquire so that when an existing entry is found and
params.allowReuse is true but params.abortController is provided, the method
rejects explicitly (throw an Error like 'abortController must be undefined when
allowReuse=true') instead of silently dropping the controller; locate the logic
in acquire (references: AcquireParams, RunningFeature, runningFeatures.get,
allowReuse, abortController, leaseCount) and add the check before incrementing
leaseCount so callers cannot pass an abortController that will be ignored.
apps/server/src/services/auto-mode/facade.ts-647-649 (1)

647-649: ⚠️ Potential issue | 🟡 Minor

Use path.basename() instead of split('/').pop().

p.split('/').pop() won't extract the filename correctly on Windows paths (which use \). The path module is already imported — use path.basename(p) instead.

🐛 Proposed fix
         feature.imagePaths = imagePaths.map((p) => ({
           path: p,
-          filename: p.split('/').pop() || p,
+          filename: path.basename(p),
           mimeType: 'image/*',
         }));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/auto-mode/facade.ts` around lines 647 - 649, The
filename extraction using p.split('/').pop() in the mapping that sets
feature.imagePaths is platform-unsafe; replace that expression with
path.basename(p) so Windows backslashes are handled correctly—update the map
callback in the code that assigns feature.imagePaths (the imagePaths.map((p) =>
({ path: p, filename: ... })) block) to call path.basename(p) for the filename.
apps/server/src/services/auto-mode/facade.ts-683-737 (1)

683-737: ⚠️ Potential issue | 🟡 Minor

Missing facade-level error handling in verifyFeature.

Unlike other public methods, verifyFeature doesn't wrap its body in a try/catch that calls handleFacadeError(). If worktreeResolver.findWorktreeForBranch() or featureStateManager.loadFeature() throws, the error propagates without being logged or surfaced to the UI.

🛡️ Proposed wrapper
   async verifyFeature(featureId: string): Promise<boolean> {
+    try {
       const feature = await this.featureStateManager.loadFeature(this.projectPath, featureId);
       // ... existing body ...
       return allPassed;
+    } catch (error) {
+      this.handleFacadeError(error, 'verifyFeature', featureId);
+      throw error;
+    }
   }
apps/server/src/services/auto-mode/facade.ts-228-228 (1)

228-228: ⚠️ Potential issue | 🟡 Minor

Use resolveModelString() from @automaker/model-resolver instead of hardcoding fallback.

Lines 228 and 355 both hardcode 'claude-sonnet-4-20250514' as a fallback when no model is provided. Per the coding guidelines, model resolution must use resolveModelString() to handle aliases and maintain a single source of truth for defaults. Other services (pipeline-orchestrator, execution-service, ideation-service) correctly use resolveModelString(model, DEFAULT_MODELS.claude).

Add the import:

import { resolveModelString, DEFAULT_MODELS } from '@automaker/model-resolver';

Then replace both occurrences with:

const resolvedModel = resolveModelString(model, DEFAULT_MODELS.claude);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/auto-mode/facade.ts` at line 228, Replace hardcoded
fallback model strings in auto-mode facade with the shared resolver: add an
import for resolveModelString and DEFAULT_MODELS from `@automaker/model-resolver`,
then replace both occurrences where resolvedModel is set (the lines assigning
const resolvedModel = model || 'claude-sonnet-4-20250514') with a call to
resolveModelString(model, DEFAULT_MODELS.claude) so the code uses the canonical
model resolution logic (locations to change are the resolvedModel assignments in
facade.ts around the existing usages).

Comment on lines +263 to +274
if (!thinkingLevel || thinkingLevel === 'none') {
return {};
}

// Adaptive thinking (Opus 4.6): don't set maxThinkingTokens
// The model will use adaptive thinking by default
if (thinkingLevel === 'adaptive') {
logger.debug(
`buildThinkingOptions: thinkingLevel="adaptive" -> no maxThinkingTokens (model decides)`
);
return {};
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

Claude Agent SDK adaptive thinking maxThinkingTokens configuration

💡 Result:

In the Claude Agent SDK, “adaptive thinking” isn’t configured via a thinking: { type: "adaptive" } switch (that setting is for the Messages API, not the Agent SDK). Instead, the Agent SDK exposes a thinking budget cap:

  • TypeScript: set maxThinkingTokens in the SDK options (and you can change it mid-session via setMaxThinkingTokens). [1]
  • Python: set max_thinking_tokens in the SDK options. [2]
  • Note: thinking is disabled by default in the SDK, and if you explicitly set maxThinkingTokens/max_thinking_tokens, you won’t receive StreamEvent partial deltas—only complete messages after each turn. [3]

Examples

TypeScript:

import { query } from "@anthropic-ai/claude-agent-sdk";

for await (const msg of query({
  prompt: "Solve this step by step: ...",
  options: { maxThinkingTokens: 4000 }
})) { /* ... */ }

[1]

Python:

from claude_agent_sdk import query, ClaudeAgentOptions

async for msg in query(
  prompt="Solve this step by step: ...",
  options=ClaudeAgentOptions(max_thinking_tokens=4000),
):
  pass

[2]

If what you specifically want is Messages API adaptive thinking, configure:

  • thinking: { "type": "adaptive" } (currently documented for Claude Opus 4.6) and optionally guide it with output_config: { "effort": ... }. [4]

Sources: [1] Agent SDK TS reference, [2] Agent SDK Python reference, [3] Agent SDK streaming limitations, [4] Adaptive thinking (Messages API) docs.


🏁 Script executed:

# Check package.json for SDK version
fd package.json -x grep -l "@anthropic-ai/claude-agent-sdk" {}

# Also look for any SDK type definitions or documentation in the codebase
rg "adaptive.*thinking" --type ts --type tsx -A 2 -B 2

# Check if there are comments explaining thinking modes in the file
cat -n apps/server/src/lib/sdk-options.ts | sed -n '250,290p'

Repository: AutoMaker-Org/automaker

Length of output: 1880


Remove or correct the adaptive thinking logic—returning {} disables thinking in the Agent SDK, not enables it.

The Agent SDK disables thinking by default and only enables it when maxThinkingTokens is explicitly set. The comment claiming "The model will use adaptive thinking by default" is incorrect. Returning {} for thinkingLevel === 'adaptive' will disable thinking entirely, not activate adaptive reasoning.

"Adaptive thinking" is a Messages API feature (thinking: { type: "adaptive" }), not an Agent SDK feature. If you need 'adaptive' behavior in the Agent SDK, set maxThinkingTokens to a high value (e.g., 10000) instead of returning an empty object.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/lib/sdk-options.ts` around lines 263 - 274, The adaptive
branch incorrectly returns {} (disabling thinking); update the thinkingLevel ===
'adaptive' branch in buildThinkingOptions (or the function handling
thinkingLevel) to set an explicit high maxThinkingTokens (e.g., 10000) so the
Agent SDK enables adaptive-like thinking instead of disabling it, and replace
the misleading comment about model default behavior with a note that adaptive
behavior in the Agent SDK requires an explicit maxThinkingTokens value rather
than an empty object.

Comment on lines 31 to 39
// Start follow-up in background
// followUpFeature derives workDir from feature.branchName
// Default to false to match run-feature/resume-feature behavior.
// Worktrees should only be used when explicitly enabled by the user.
autoModeService
// Default to false to match run-feature/resume-feature behavior.
// Worktrees should only be used when explicitly enabled by the user.
.followUpFeature(projectPath, featureId, prompt, imagePaths, useWorktrees ?? false)
.catch((error) => {
logger.error(`[AutoMode] Follow up feature ${featureId} error:`, error);
})
.finally(() => {
// Release the starting slot when follow-up completes (success or error)
// Note: The feature should be in runningFeatures by this point
});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find the followUpFeature implementation to see if it manages capacity
rg -nP --type=ts 'async\s+followUpFeature' -g '!**/test/**' -g '!**/routes/**'

Repository: AutoMaker-Org/automaker

Length of output: 214


🏁 Script executed:

#!/bin/bash
# Also check run-feature.ts to see the checkWorktreeCapacity pattern
rg -nP --type=ts -B 5 -A 15 'checkWorktreeCapacity' -g '!**/test/**'

Repository: AutoMaker-Org/automaker

Length of output: 7334


🏁 Script executed:

#!/bin/bash
# Check if there's any capacity management in autoModeService or related files
fd 'auto-mode.*service' -type f --extension ts

Repository: AutoMaker-Org/automaker

Length of output: 237


🏁 Script executed:

#!/bin/bash
# Get the followUpFeature implementation in compat.ts
sed -n '166,210p' apps/server/src/services/auto-mode/compat.ts

Repository: AutoMaker-Org/automaker

Length of output: 1478


🏁 Script executed:

#!/bin/bash
# Get the followUpFeature implementation in facade.ts
sed -n '609,700p' apps/server/src/services/auto-mode/facade.ts

Repository: AutoMaker-Org/automaker

Length of output: 3353


🏁 Script executed:

#!/bin/bash
# Also check resume-feature.ts to see if it has capacity checks (mentioned in the review as comparison)
cat -n apps/server/src/routes/auto-mode/routes/resume-feature.ts | head -60

Repository: AutoMaker-Org/automaker

Length of output: 1748


🏁 Script executed:

#!/bin/bash
# Search for executeFeature implementation to see if it manages capacity internally
rg -nP --type=ts 'async\s+executeFeature' -A 30 -g '!**/test/**' -g '!**/routes/**' | head -80

Repository: AutoMaker-Org/automaker

Length of output: 6920


🏁 Script executed:

#!/bin/bash
# Check the full executeFeature implementation in facade.ts to verify if it has capacity checks
sed -n '750,850p' apps/server/src/services/auto-mode/facade.ts

Repository: AutoMaker-Org/automaker

Length of output: 3373


🏁 Script executed:

#!/antml:
# Search for any capacity checks within executeFeature flow
rg -nP 'checkWorktreeCapacity|hasCapacity' apps/server/src/services/auto-mode/facade.ts | head -20

Repository: AutoMaker-Org/automaker

Length of output: 202


Add capacity check before starting follow-up execution.

run-feature.ts validates worktree capacity before calling the service and returns HTTP 429 if exhausted. The follow-up-feature handler skips this check, yet it starts an agent via followUpFeatureexecuteFeature, which consumes a worktree slot just like run-feature. This creates inconsistent resource management across endpoints. Add the same checkWorktreeCapacity validation here to match run-feature behavior and prevent capacity exhaustion.

Note: resume-feature.ts has the same gap and should also be updated.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/routes/auto-mode/routes/follow-up-feature.ts` around lines 31
- 39, Before invoking autoModeService.followUpFeature, call the same
checkWorktreeCapacity(...) used by the run-feature handler and return HTTP 429
when capacity is exhausted; if checkWorktreeCapacity(projectPath) indicates no
capacity, send a 429 response and do not call followUpFeature (mirror
run-feature.ts behavior). Update the follow-up-feature request handler to
perform this validation (use projectPath and featureId to locate the invocation)
and ensure the error path logs/returns the 429 consistently; apply the identical
change to the resume-feature handler so both endpoints enforce worktree capacity
before calling executeFeature via autoModeService.

Comment on lines +396 to +407
taskOutput += msg.result || '';
responseText += msg.result || '';
}
}
if (!taskCompleteDetected)
await this.featureStateManager.updateTaskStatus(
projectPath,
featureId,
task.id,
'completed'
);
tasksCompleted = taskIndex + 1;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Tasks always marked "completed" even if the agent produced no [TASK_COMPLETE] marker.

Lines 400–406 unconditionally mark every task as completed when the stream ends without a [TASK_COMPLETE] marker. Given the PR objective ("features were often marked completed before all tasks finished"), this fallback may be counterproductive — a missing marker could indicate the agent didn't actually finish the task. Consider logging a warning or emitting an event when the fallback triggers, so premature completions are observable.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/agent-executor.ts` around lines 396 - 407, The
current fallback unconditionally calls featureStateManager.updateTaskStatus(...)
which marks tasks completed even when taskCompleteDetected is false; change this
so we only call this.featureStateManager.updateTaskStatus(projectPath,
featureId, task.id, 'completed') when taskCompleteDetected is true, and when
false do not mark completed—instead log a warning (use the local logger or
processLogger) including projectPath, featureId and task.id and emit an
observable event (e.g. this.eventEmitter.emit('task.incomplete' or similar) with
the same identifiers) so premature completions are visible; ensure you still
update tasksCompleted = taskIndex + 1 only if appropriate or leave it unchanged
when the task was not truly completed.

Comment on lines +41 to +44
export function getWorktreeAutoLoopKey(projectPath: string, branchName: string | null): string {
const normalizedBranch = branchName === 'main' ? null : branchName;
return `${projectPath}::${normalizedBranch ?? '__main__'}`;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the auto-loop-coordinator.ts file to see the exact implementation
cat -n apps/server/src/services/auto-loop-coordinator.ts | head -60

Repository: AutoMaker-Org/automaker

Length of output: 2395


🏁 Script executed:

# Also check line 394 in resolveMaxConcurrency
cat -n apps/server/src/services/auto-loop-coordinator.ts | sed -n '390,400p'

Repository: AutoMaker-Org/automaker

Length of output: 767


🏁 Script executed:

# Search for getCurrentBranch usage to understand how it works
rg -n "getCurrentBranch" apps/server/src/ --type ts --type tsx

Repository: AutoMaker-Org/automaker

Length of output: 94


🏁 Script executed:

# Check what git utilities are available in `@automaker/git-utils`
fd -t f "git-utils" --exec cat {} \; | head -100

Repository: AutoMaker-Org/automaker

Length of output: 49


🏁 Script executed:

# Search for how primary branch is resolved elsewhere
rg -n "getCurrentBranch\|getPrimaryBranch\|getDefaultBranch" apps/server/src/ --type ts --type tsx -A 2

Repository: AutoMaker-Org/automaker

Length of output: 94


🏁 Script executed:

# Search for getCurrentBranch without problematic file type flags
rg -n "getCurrentBranch" apps/server/src/ -A 2 -B 1

Repository: AutoMaker-Org/automaker

Length of output: 6054


🏁 Script executed:

# Search for where primary branch might be resolved
rg -n "primary.*branch\|default.*branch\|master\|getCurrentBranch" apps/server/src/ -i

Repository: AutoMaker-Org/automaker

Length of output: 49


🏁 Script executed:

# Check if `@automaker/git-utils` exists and what it exports
fd "git-utils" --type f

Repository: AutoMaker-Org/automaker

Length of output: 49


🏁 Script executed:

# Look for imports of git-utils in the codebase
rg -n "from.*git-utils\|import.*git-utils" apps/

Repository: AutoMaker-Org/automaker

Length of output: 49


🏁 Script executed:

# Check ConcurrencyManager to see how it uses getCurrentBranch
rg -n "class ConcurrencyManager\|getRunningCountForWorktree" apps/server/src/services/ -A 10

Repository: AutoMaker-Org/automaker

Length of output: 49


🏁 Script executed:

# Check the AutoLoopCoordinator constructor and dependencies
rg -n "class AutoLoopCoordinator\|constructor" apps/server/src/services/auto-loop-coordinator.ts -A 15

Repository: AutoMaker-Org/automaker

Length of output: 49


🏁 Script executed:

# Check where AutoLoopCoordinator is instantiated and what's passed to it
rg -n "new AutoLoopCoordinator\|AutoLoopCoordinator(" apps/ -B 2 -A 3

Repository: AutoMaker-Org/automaker

Length of output: 207


🏁 Script executed:

# Check the full resolveMaxConcurrency method and its context
cat -n apps/server/src/services/auto-loop-coordinator.ts | sed -n '370,410p'

Repository: AutoMaker-Org/automaker

Length of output: 1974


🏁 Script executed:

# Look at how facade.ts handles primary branch normalization (the correct way)
cat -n apps/server/src/services/auto-mode/facade.ts | sed -n '900,910p'

Repository: AutoMaker-Org/automaker

Length of output: 755


🏁 Script executed:

# Check the AutoLoopCoordinator constructor and full class definition
cat -n apps/server/src/services/auto-loop-coordinator.ts | sed -n '70,130p'

Repository: AutoMaker-Org/automaker

Length of output: 2676


🏁 Script executed:

# Check how AutoLoopCoordinator is used/instantiated
rg "AutoLoopCoordinator" apps/server/src/ -B 2 -A 2

Repository: AutoMaker-Org/automaker

Length of output: 8386


🏁 Script executed:

# Verify the issue: check if facade.ts or global-service.ts correctly uses worktreeResolver for branch normalization
cat -n apps/server/src/services/auto-mode/facade.ts | sed -n '1055,1070p'

Repository: AutoMaker-Org/automaker

Length of output: 671


🏁 Script executed:

# Check how startAutoLoopForProject is called and whether branchName is pre-normalized
rg -n "startAutoLoopForProject" apps/server/src/services/ -B 3 -A 1

Repository: AutoMaker-Org/automaker

Length of output: 1883


🏁 Script executed:

# Verify the complete context of how AutoLoopCoordinator is instantiated in facade
cat -n apps/server/src/services/auto-mode/facade.ts | sed -n '215,280p'

Repository: AutoMaker-Org/automaker

Length of output: 3505


🏁 Script executed:

# Check where startAutoLoop is called and if branchName is pre-normalized there
cat -n apps/server/src/services/auto-mode/facade.ts | sed -n '489,510p'

Repository: AutoMaker-Org/automaker

Length of output: 975


🏁 Script executed:

# Check if there's any normalization happening before calling AutoLoopCoordinator.startAutoLoopForProject
rg -n "startAutoLoopForProject\|startAutoLoop" apps/server/src/services/auto-mode/facade.ts -B 5 -A 5

Repository: AutoMaker-Org/automaker

Length of output: 49


🏁 Script executed:

# Check the routes that call startAutoLoop to see if branchName is normalized there
rg -n "startAutoLoop\|autoModeService\|autoLoopCoordinator" apps/server/src/routes/ -B 2 -A 2

Repository: AutoMaker-Org/automaker

Length of output: 49


🏁 Script executed:

# Check if there's any route or API handler that calls startAutoLoop
fd -type f "\.ts$" apps/server/src/routes | xargs rg "startAutoLoop" -B 2 -A 2

Repository: AutoMaker-Org/automaker

Length of output: 21086


🏁 Script executed:

# Check the compat.ts file to see how it delegates to startAutoLoop
cat -n apps/server/src/services/auto-mode/compat.ts | sed -n '114,135p'

Repository: AutoMaker-Org/automaker

Length of output: 823


🏁 Script executed:

# Verify whether ConcurrencyManager pattern (injecting getCurrentBranch) is available
cat -n apps/server/src/services/concurrency-manager.ts | sed -n '50,70p'

Repository: AutoMaker-Org/automaker

Length of output: 1100


🏁 Script executed:

# Check the start.ts route to see how normalizedBranchName is set
cat -n apps/server/src/routes/auto-mode/routes/start.ts | head -80

Repository: AutoMaker-Org/automaker

Length of output: 2690


🏁 Script executed:

# Check if branchName is normalized before being passed to startAutoLoopForProject
rg -n "normalizedBranch" apps/server/src/routes/auto-mode/routes/start.ts -B 5 -A 5

Repository: AutoMaker-Org/automaker

Length of output: 1525


Inject getCurrentBranch callback or normalize branch names before passing to AutoLoopCoordinator.

getWorktreeAutoLoopKey (lines 41-44) and resolveMaxConcurrency (lines 393-394) hardcode 'main' as the primary branch name. For projects using 'master' or other default branches, this creates a separate loop key instead of normalizing to null, potentially causing duplicate auto-loops.

Two solutions:

  1. Inject a getCurrentBranch callback into AutoLoopCoordinator (following the ConcurrencyManager pattern)
  2. Normalize the branch name in AutoModeServiceFacade.startAutoLoop() before passing it to the coordinator (matching the normalization already done in checkWorktreeCapacity at line 904)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/auto-loop-coordinator.ts` around lines 41 - 44, The
coordinator currently hardcodes 'main' in getWorktreeAutoLoopKey and
resolveMaxConcurrency causing mis-normalization for repos using 'master' or
other defaults; inject a getCurrentBranch(projectPath) callback into
AutoLoopCoordinator (mirroring ConcurrencyManager's pattern) and use its result
to determine the normalized branch (treat the repo's primary branch as null)
before generating the key in getWorktreeAutoLoopKey and before calling
resolveMaxConcurrency; alternatively, if you prefer the simpler change,
normalize the branch name inside AutoModeServiceFacade.startAutoLoop() (as done
in checkWorktreeCapacity) before passing it into AutoLoopCoordinator so the
coordinator functions always get a normalized branch value.

Comment on lines +149 to +159
async stopFeature(featureId: string): Promise<boolean> {
// Stop feature is tricky - we need to find which project the feature is running in
// The concurrency manager tracks this
const runningAgents = await this.getRunningAgents();
const agent = runningAgents.find((a) => a.featureId === featureId);
if (agent) {
const facade = this.createFacade(agent.projectPath);
return facade.stopFeature(featureId);
}
return false;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, locate the compat.ts file
find . -name "compat.ts" -path "*/server/*" 2>/dev/null

Repository: AutoMaker-Org/automaker

Length of output: 113


🏁 Script executed:

#!/bin/bash
# Search for the stopFeature method and getRunningAgents implementation
rg "stopFeature|getRunningAgents" apps/server/src --type ts -B 2 -A 10

Repository: AutoMaker-Org/automaker

Length of output: 12262


🏁 Script executed:

#!/bin/bash
# Look for ConcurrencyManager class and its methods
rg "class ConcurrencyManager" apps/server/src --type ts -A 30

Repository: AutoMaker-Org/automaker

Length of output: 2924


🏁 Script executed:

#!/bin/bash
# Search for any feature lookup methods in the codebase
rg "findRunningFeature|getRunningFeature|featureId.*lookup" apps/server/src --type ts

Repository: AutoMaker-Org/automaker

Length of output: 524


🏁 Script executed:

#!/bin/bash
# Find RunningFeature interface definition
rg "interface RunningFeature|type RunningFeature" apps/server/src --type ts -A 10

Repository: AutoMaker-Org/automaker

Length of output: 880


🏁 Script executed:

#!/bin/bash
# Check ConcurrencyManager.getRunningFeature implementation
rg "getRunningFeature\(" apps/server/src/services/concurrency-manager.ts -A 3

Repository: AutoMaker-Org/automaker

Length of output: 188


🏁 Script executed:

#!/bin/bash
# Check getAllRunning implementation
rg "getAllRunning\(" apps/server/src/services/concurrency-manager.ts -A 8

Repository: AutoMaker-Org/automaker

Length of output: 431


Replace getRunningAgents() with direct concurrency manager lookup to find feature's project.

The current implementation calls getRunningAgents(), which loads feature metadata for every running agent via featureLoader.get() just to locate the project for one feature. Since RunningFeature already contains projectPath, use this.concurrencyManager.getRunningFeature(featureId) to retrieve it directly in O(1) time:

async stopFeature(featureId: string): Promise<boolean> {
  const running = this.concurrencyManager.getRunningFeature(featureId);
  if (running) {
    const facade = this.createFacade(running.projectPath);
    return facade.stopFeature(featureId);
  }
  return false;
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/auto-mode/compat.ts` around lines 149 - 159, The
stopFeature method currently calls getRunningAgents() and scans results; replace
that with a direct lookup using
this.concurrencyManager.getRunningFeature(featureId) so you avoid loading
feature metadata for all agents—if getRunningFeature returns a RunningFeature,
call this.createFacade(running.projectPath) and return
facade.stopFeature(featureId), otherwise return false; update the async
stopFeature(featureId: string): Promise<boolean> implementation to use
getRunningFeature and keep the same return behavior.

Comment on lines +308 to +313
const runningEntry = this.concurrencyManager.acquire({
featureId,
projectPath,
isAutoMode: false,
allowReuse: true,
});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

isAutoMode: false is hardcoded in resumeFromStep, which may misrepresent auto-mode resumptions.

Line 311 always passes isAutoMode: false when acquiring a concurrency slot. If the auto-loop coordinator triggers a pipeline resume, this incorrectly marks the running feature as non-auto-mode, which could affect failure tracking, status reporting, and concurrency decisions that branch on isAutoMode.

Consider propagating the actual auto-mode flag (e.g., from the feature metadata or a method parameter).

#!/bin/bash
# Check how resumeFromStep and resumePipeline are called to see if isAutoMode context is available
rg -n "resumeFromStep\|resumePipeline" apps/server/src/services/ -B3 -A3
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/pipeline-orchestrator.ts` around lines 308 - 313,
resumeFromStep always passes a hardcoded isAutoMode: false into
concurrencyManager.acquire, which misrepresents auto-mode resumptions; update
resumeFromStep to accept or derive the real auto-mode flag (e.g., add an
isAutoMode parameter to resumeFromStep or read it from the feature metadata
accessible in resumeFromStep) and pass that boolean into
concurrencyManager.acquire instead of false; also update call sites (e.g.,
resumePipeline callers) to propagate the flag as needed so resumeFromStep and
concurrencyManager.acquire receive the correct isAutoMode context.

Comment on lines +54 to +61
export type ExecuteFeatureFn = (
projectPath: string,
featureId: string,
useWorktrees: boolean,
useScreenshots: boolean,
model?: string,
options?: { _calledInternally?: boolean }
) => Promise<void>;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find all definitions of ExecuteFeatureFn
rg -n "ExecuteFeatureFn" --type=ts -B2 -A10 | head -100

Repository: AutoMaker-Org/automaker

Length of output: 9641


🏁 Script executed:

#!/bin/bash
# Search for type/interface definitions of ExecuteFeatureFn
rg -n "^export type ExecuteFeatureFn|^export interface ExecuteFeatureFn" --type=ts -A8

Repository: AutoMaker-Org/automaker

Length of output: 2875


🏁 Script executed:

#!/bin/bash
# Find which files contain ExecuteFeatureFn
fd -e ts | xargs grep -l "ExecuteFeatureFn" | sort

Repository: AutoMaker-Org/automaker

Length of output: 422


🏁 Script executed:

#!/bin/bash
# Check pipeline-orchestrator.ts imports
head -30 apps/server/src/services/pipeline-orchestrator.ts | grep -E "import|ExecuteFeatureFn"

Repository: AutoMaker-Org/automaker

Length of output: 1016


🏁 Script executed:

#!/bin/bash
# Check where ExecuteFeatureFn is used in pipeline-orchestrator.ts
rg -n "ExecuteFeatureFn" apps/server/src/services/pipeline-orchestrator.ts -B3 -A3

Repository: AutoMaker-Org/automaker

Length of output: 655


🏁 Script executed:

#!/bin/bash
# Check if pipeline-orchestrator.ts actually calls executeFeatureFn and how
rg -n "executeFeatureFn\(" apps/server/src/services/pipeline-orchestrator.ts -B2 -A2

Repository: AutoMaker-Org/automaker

Length of output: 380


Multiple conflicting ExecuteFeatureFn type definitions exist with incompatible signatures.

This file exports ExecuteFeatureFn with parameters: (projectPath, featureId, useWorktrees, useScreenshots: boolean, model?: string, options?). However, execution-types.ts and recovery-service.ts export ExecuteFeatureFn with completely different semantics: (projectPath, featureId, useWorktrees, isAutoMode: boolean, providedWorktreePath?: string, options?). Additionally, auto-loop-coordinator.ts exports a third variant with only 4 parameters.

Since all parameters are boolean/string types, importing the wrong definition would pass TypeScript type checks but fail at runtime due to semantic incompatibility (e.g., passing useScreenshots where isAutoMode is expected).

Consolidate into a single canonical definition in a shared types module, or rename each to reflect its specific context (e.g., PipelineExecuteFeatureFn, ExecutionExecuteFeatureFn, AutoLoopExecuteFeatureFn) to prevent accidental cross-module imports.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/pipeline-types.ts` around lines 54 - 61, There are
multiple incompatible ExecuteFeatureFn definitions (e.g., ExecuteFeatureFn in
pipeline-types.ts vs those exported from execution-types.ts, recovery-service.ts
and auto-loop-coordinator.ts) that will type-check but break at runtime; fix by
consolidating into one canonical type or renaming per-context: either move a
single shared signature into a new shared types module and update all imports to
use that symbol (e.g., ExportedExecuteFeatureFn) or rename each distinct
signature to a context-specific name (e.g., PipelineExecuteFeatureFn,
ExecutionExecuteFeatureFn, AutoLoopExecuteFeatureFn) and update
declarations/uses in ExecuteFeatureFn, execution-types.ts, recovery-service.ts
and auto-loop-coordinator.ts so signatures match their actual runtime parameter
order and meaning. Ensure callers are updated to import the correct
renamed/central type to prevent cross-import confusion.

Comment on lines +122 to +146
async saveExecutionState(
projectPath: string,
autoLoopWasRunning = false,
maxConcurrency = DEFAULT_MAX_CONCURRENCY
): Promise<void> {
try {
await ensureAutomakerDir(projectPath);
const state: ExecutionState = {
version: 1,
autoLoopWasRunning,
maxConcurrency,
projectPath,
branchName: null,
runningFeatureIds: this.concurrencyManager.getAllRunning().map((rf) => rf.featureId),
savedAt: new Date().toISOString(),
};
await secureFs.writeFile(
getExecutionStatePath(projectPath),
JSON.stringify(state, null, 2),
'utf-8'
);
} catch {
/* ignore */
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

saveExecutionState captures running features from all projects, not just the target project.

Line 135 calls this.concurrencyManager.getAllRunning().map(rf => rf.featureId) without filtering by projectPath. Compare with saveExecutionStateForProject (Lines 99-102) which correctly filters: .filter(f => f.projectPath === projectPath). When multiple projects have running features, this method will save feature IDs belonging to other projects, which could cause incorrect resumption behavior on reload.

Proposed fix
       const state: ExecutionState = {
         version: 1,
         autoLoopWasRunning,
         maxConcurrency,
         projectPath,
         branchName: null,
-        runningFeatureIds: this.concurrencyManager.getAllRunning().map((rf) => rf.featureId),
+        runningFeatureIds: this.concurrencyManager
+          .getAllRunning()
+          .filter((f) => f.projectPath === projectPath)
+          .map((rf) => rf.featureId),
         savedAt: new Date().toISOString(),
       };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
async saveExecutionState(
projectPath: string,
autoLoopWasRunning = false,
maxConcurrency = DEFAULT_MAX_CONCURRENCY
): Promise<void> {
try {
await ensureAutomakerDir(projectPath);
const state: ExecutionState = {
version: 1,
autoLoopWasRunning,
maxConcurrency,
projectPath,
branchName: null,
runningFeatureIds: this.concurrencyManager.getAllRunning().map((rf) => rf.featureId),
savedAt: new Date().toISOString(),
};
await secureFs.writeFile(
getExecutionStatePath(projectPath),
JSON.stringify(state, null, 2),
'utf-8'
);
} catch {
/* ignore */
}
}
async saveExecutionState(
projectPath: string,
autoLoopWasRunning = false,
maxConcurrency = DEFAULT_MAX_CONCURRENCY
): Promise<void> {
try {
await ensureAutomakerDir(projectPath);
const state: ExecutionState = {
version: 1,
autoLoopWasRunning,
maxConcurrency,
projectPath,
branchName: null,
runningFeatureIds: this.concurrencyManager
.getAllRunning()
.filter((f) => f.projectPath === projectPath)
.map((rf) => rf.featureId),
savedAt: new Date().toISOString(),
};
await secureFs.writeFile(
getExecutionStatePath(projectPath),
JSON.stringify(state, null, 2),
'utf-8'
);
} catch {
/* ignore */
}
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/recovery-service.ts` around lines 122 - 146, The
saved execution state in saveExecutionState is collecting runningFeatureIds from
all projects via this.concurrencyManager.getAllRunning() causing cross-project
leakage; change the collection to filter running features by the given
projectPath (same approach as saveExecutionStateForProject) so runningFeatureIds
only includes featureId values where rf.projectPath === projectPath before
mapping, ensuring saveExecutionState and saveExecutionStateForProject are
consistent.

Comment on lines +75 to +90
for (const line of lines) {
if (line.startsWith('worktree ')) {
currentPath = line.slice(9);
} else if (line.startsWith('branch ')) {
currentBranch = line.slice(7).replace('refs/heads/', '');
} else if (line === '' && currentPath && currentBranch) {
// End of a worktree entry
if (currentBranch === branchName) {
// Resolve to absolute path - git may return relative paths
// On Windows, this is critical for cwd to work correctly
// On all platforms, absolute paths ensure consistent behavior
return this.resolvePath(projectPath, currentPath);
}
currentPath = null;
currentBranch = null;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

findWorktreeForBranch doesn't handle detached lines, unlike listWorktrees.

If a worktree entry has a detached line instead of a branch line, currentBranch retains the value from the previous entry (it's only reset on blank-line boundaries). This could cause a false match on the stale branch name.

🐛 Proposed fix
       } else if (line.startsWith('branch ')) {
         currentBranch = line.slice(7).replace('refs/heads/', '');
+      } else if (line.startsWith('detached')) {
+        currentBranch = null;
       } else if (line === '' && currentPath && currentBranch) {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
for (const line of lines) {
if (line.startsWith('worktree ')) {
currentPath = line.slice(9);
} else if (line.startsWith('branch ')) {
currentBranch = line.slice(7).replace('refs/heads/', '');
} else if (line === '' && currentPath && currentBranch) {
// End of a worktree entry
if (currentBranch === branchName) {
// Resolve to absolute path - git may return relative paths
// On Windows, this is critical for cwd to work correctly
// On all platforms, absolute paths ensure consistent behavior
return this.resolvePath(projectPath, currentPath);
}
currentPath = null;
currentBranch = null;
}
for (const line of lines) {
if (line.startsWith('worktree ')) {
currentPath = line.slice(9);
} else if (line.startsWith('branch ')) {
currentBranch = line.slice(7).replace('refs/heads/', '');
} else if (line.startsWith('detached')) {
currentBranch = null;
} else if (line === '' && currentPath && currentBranch) {
// End of a worktree entry
if (currentBranch === branchName) {
// Resolve to absolute path - git may return relative paths
// On Windows, this is critical for cwd to work correctly
// On all platforms, absolute paths ensure consistent behavior
return this.resolvePath(projectPath, currentPath);
}
currentPath = null;
currentBranch = null;
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/worktree-resolver.ts` around lines 75 - 90,
findWorktreeForBranch can carry over a stale currentBranch when an entry
contains a "detached" line; update the loop in findWorktreeForBranch to
explicitly handle lines starting with "detached" (similar to how listWorktrees
does) by clearing currentBranch (e.g., set currentBranch = null or '') so a
detached entry cannot match a prior branch value, ensuring only entries with an
explicit "branch " line can match branchName.

Comment on lines +360 to +379
it('should preserve pipeline_* statuses', async () => {
(secureFs.readFile as Mock).mockResolvedValue(
JSON.stringify({ ...mockFeature, status: 'pipeline_step_1' })
);

await manager.markFeatureInterrupted('/project', 'feature-123', 'server shutdown');

// Should NOT call atomicWriteJson because pipeline status is preserved
expect(atomicWriteJson).not.toHaveBeenCalled();
});

it('should preserve pipeline_complete status', async () => {
(secureFs.readFile as Mock).mockResolvedValue(
JSON.stringify({ ...mockFeature, status: 'pipeline_complete' })
);

await manager.markFeatureInterrupted('/project', 'feature-123');

expect(atomicWriteJson).not.toHaveBeenCalled();
});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Tests pass for the wrong reason — readJsonWithRecovery is not mocked, so loadFeature returns null.

In both "should preserve pipeline_* statuses" (Line 360) and "should preserve pipeline_complete status" (Line 371), only secureFs.readFile is mocked but readJsonWithRecovery is not set up for these test cases. The production markFeatureInterrupted calls this.loadFeature() which uses readJsonWithRecovery. Since readJsonWithRecovery isn't mocked here, loadFeature returns null, so currentStatus is undefined — the pipeline preservation branch is never entered. The assertion expect(atomicWriteJson).not.toHaveBeenCalled() passes only because the feature isn't found at all, not because the pipeline status is being preserved.

Compare with the working test at Line 343 which correctly mocks both secureFs.readFile and readJsonWithRecovery.

Proposed fix
     it('should preserve pipeline_* statuses', async () => {
-      (secureFs.readFile as Mock).mockResolvedValue(
-        JSON.stringify({ ...mockFeature, status: 'pipeline_step_1' })
-      );
+      (readJsonWithRecovery as Mock).mockResolvedValue({
+        data: { ...mockFeature, status: 'pipeline_step_1' },
+        recovered: false,
+        source: 'main',
+      });
 
       await manager.markFeatureInterrupted('/project', 'feature-123', 'server shutdown');
 
       // Should NOT call atomicWriteJson because pipeline status is preserved
       expect(atomicWriteJson).not.toHaveBeenCalled();
     });
 
     it('should preserve pipeline_complete status', async () => {
-      (secureFs.readFile as Mock).mockResolvedValue(
-        JSON.stringify({ ...mockFeature, status: 'pipeline_complete' })
-      );
+      (readJsonWithRecovery as Mock).mockResolvedValue({
+        data: { ...mockFeature, status: 'pipeline_complete' },
+        recovered: false,
+        source: 'main',
+      });
 
       await manager.markFeatureInterrupted('/project', 'feature-123');
 
       expect(atomicWriteJson).not.toHaveBeenCalled();
     });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/tests/unit/services/feature-state-manager.test.ts` around lines
360 - 379, The tests are passing for the wrong reason because
markFeatureInterrupted -> loadFeature uses readJsonWithRecovery, which isn't
mocked here; mock readJsonWithRecovery in both "should preserve pipeline_*
statuses" and "should preserve pipeline_complete status" to return the parsed
feature object (e.g. mockFeature with status 'pipeline_step_1' or
'pipeline_complete') so loadFeature returns the feature and the
pipeline-preservation branch in markFeatureInterrupted is exercised; keep the
existing secureFs.readFile mock but add (readJsonWithRecovery as
Mock).mockResolvedValue(...) (or mockResolvedValueOnce) returning the same
feature object used for secureFs.readFile, then assert atomicWriteJson is not
called.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants