Skip to content

refactor: AutoModeService decomposition (Phases 1-6) #733 + fixes#771

Merged
gsxdsm merged 68 commits intoAutoMaker-Org:v0.15.0rcfrom
gsxdsm:refactor/auto-mode-service-gsxdsm
Feb 16, 2026
Merged

refactor: AutoModeService decomposition (Phases 1-6) #733 + fixes#771
gsxdsm merged 68 commits intoAutoMaker-Org:v0.15.0rcfrom
gsxdsm:refactor/auto-mode-service-gsxdsm

Conversation

@gsxdsm
Copy link
Collaborator

@gsxdsm gsxdsm commented Feb 15, 2026

See: #733

Bug fixes and improvements from @Shironex awesome refactor of AutoModeService

Summary by CodeRabbit

  • New Features

    • Multi-worktree auto-mode: aggregated running tasks, per-worktree concurrency, auto-loop coordinator, execution/pipeline orchestration, plan-approval and recovery flows; UI surfaces branch context and adaptive thinking level.
  • Bug Fixes

    • Improved validation, capacity checks, error handling and user-facing toasts (feature create/start/stop/drag-drop/sync).
  • Documentation

    • Large set of planning and architecture docs added (architecture, conventions, integrations, testing, stack, structure, concerns).
  • Tests

    • Many new unit/integration test suites added; some legacy tests removed.
  • Chores

    • Relaxed dev tooling version constraint.

Shironex and others added 30 commits February 14, 2026 18:51
- STACK.md - Technologies and dependencies
- ARCHITECTURE.md - System design and patterns
- STRUCTURE.md - Directory layout
- CONVENTIONS.md - Code style and patterns
- TESTING.md - Test structure
- INTEGRATIONS.md - External services
- CONCERNS.md - Technical debt and issues
Refactoring auto-mode-service.ts (5k+ lines) into smaller, focused services with clear boundaries.
User preference: keep .planning/ local-only
- Test lease counting basics (acquire/release semantics)
- Test running count queries (project and worktree level)
- Test feature state queries (isRunning, getRunningFeature, getAllRunning)
- Test edge cases (multiple features, multiple worktrees)
- 36 test cases documenting expected behavior

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Lease-based reference counting for nested execution support
- acquire() creates entry with leaseCount: 1 or increments existing
- release() decrements leaseCount, deletes at 0 or with force:true
- Project and worktree-level running counts
- RunningFeature interface exported for type sharing

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- AutoModeService now delegates to ConcurrencyManager for all running feature tracking
- Constructor accepts optional ConcurrencyManager for dependency injection
- Remove local RunningFeature interface (imported from ConcurrencyManager)
- Migrate all this.runningFeatures usages to concurrencyManager methods
- Update tests to use concurrencyManager.acquire() instead of direct Map access
- ConcurrencyManager accepts getCurrentBranch function for testability

BREAKING: AutoModeService no longer exposes runningFeatures Map directly.
Tests must use concurrencyManager.acquire() to add running features.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add TypedEventBus as wrapper around EventEmitter
- Implement emitAutoModeEvent method for auto-mode event format
- Add emit, subscribe, getUnderlyingEmitter methods
- Create comprehensive test suite (20 tests)
- Verify exact event format for frontend compatibility
- Create WorktreeResolver class for git worktree discovery
- Extract getCurrentBranch, findWorktreeForBranch, listWorktrees methods
- Add WorktreeInfo interface for worktree metadata
- Always resolve paths to absolute for cross-platform compatibility
- Add 20 unit tests covering all worktree operations
- Import TypedEventBus into AutoModeService
- Add eventBus property initialized via constructor injection
- Remove private emitAutoModeEvent method (now in TypedEventBus)
- Update all 66 emitAutoModeEvent calls to use this.eventBus
- Constructor accepts optional TypedEventBus for testing
- Create FeatureStateManager class for feature status updates
- Extract updateFeatureStatus, markFeatureInterrupted, resetStuckFeatures
- Extract updateFeaturePlanSpec, saveFeatureSummary, updateTaskStatus
- Persist BEFORE emit pattern for data integrity (Pitfall 2)
- Handle corrupted JSON with readJsonWithRecovery backup support
- Preserve pipeline_* statuses in markFeatureInterrupted
- Fix bug: version increment now checks old content before applying updates
- Add 33 unit tests covering all state management operations
…utoModeService

- Add WorktreeResolver and FeatureStateManager as constructor parameters
- Remove top-level getCurrentBranch function (now in WorktreeResolver)
- Delegate loadFeature, updateFeatureStatus to FeatureStateManager
- Delegate markFeatureInterrupted, resetStuckFeatures to FeatureStateManager
- Delegate updateFeaturePlanSpec, saveFeatureSummary, updateTaskStatus
- Replace findExistingWorktreeForBranch calls with worktreeResolver
- Update tests to mock featureStateManager instead of internal methods
- All 89 tests passing across 3 service files
- Extract plan approval workflow from AutoModeService
- Timeout-wrapped Promise creation via waitForApproval()
- Resolution handling (approve/reject) with needsRecovery flag
- Cancellation support for stopped features
- Per-project configurable timeout (default 30 minutes)
- Event emission through TypedEventBus for plan_rejected
- 24 tests covering approval, rejection, timeout, cancellation, recovery
- Tests use Vitest fake timers for timeout testing
- Covers needsRecovery flag for server restart recovery
- Covers plan_rejected event emission
- Covers configurable timeout from project settings
- Add PlanApprovalService import and constructor parameter
- Delegate waitForPlanApproval, cancelPlanApproval, hasPendingApproval
- resolvePlanApproval checks needsRecovery flag and calls executeFeature
- Remove pendingApprovals Map (now in PlanApprovalService)
- Remove PendingApproval interface (moved to plan-approval-service.ts)
- Extract parseTasksFromSpec for parsing tasks from spec content
- Extract marker detection functions (task start/complete, phase complete)
- Extract detectSpecFallback for non-Claude model support
- Extract extractSummary with multi-format support and last-match behavior
- Add 65 unit tests covering all functions and edge cases
- Add import for all spec parsing functions from spec-parser.ts
- Remove 209 lines of function definitions (now imported)
- Functions extracted: parseTasksFromSpec, parseTaskLine, detectTaskStartMarker,
  detectTaskCompleteMarker, detectPhaseCompleteMarker, detectSpecFallback, extractSummary
- All server tests pass (1608 tests)
- Create AgentExecutor class with constructor injection for TypedEventBus,
  FeatureStateManager, PlanApprovalService, and SettingsService
- Extract streaming pipeline from AutoModeService.runAgent()
- Implement execute() with stream processing, marker detection, file output
- Support recovery path with executePersistedTasks()
- Handle spec generation and approval workflow
- Multi-agent task execution with progress events
- Single-agent continuation fallback
- Debounced file writes (500ms)
- Heartbeat logging for silent model calls
- Abort signal handling throughout execution

Key interfaces:
- AgentExecutionOptions: All execution parameters
- AgentExecutionResult: responseText, specDetected, tasksCompleted, aborted
- Callbacks: waitForApproval, saveFeatureSummary, updateFeatureSummary, buildTaskPrompt
- Test constructor injection with all dependencies
- Test interface exports (AgentExecutionOptions, AgentExecutionResult)
- Test callback type signatures (WaitForApprovalFn, SaveFeatureSummaryFn, etc.)
- Test dependency injection patterns with custom implementations
- Verify execute method signature

Note: Full integration tests for streaming/marker detection require
complex mocking of @automaker/utils module which has hoisting issues.
Integration testing covered in E2E and auto-mode-service tests.
- Add AgentExecutor import to auto-mode-service.ts
- Add agentExecutor as constructor parameter (optional, with default)
- Initialize AgentExecutor with TypedEventBus, FeatureStateManager,
  PlanApprovalService, and SettingsService dependencies

This enables constructor injection for testing and prepares for
incremental delegation of runAgent() logic to AgentExecutor.
The AgentExecutor contains the full execution pipeline;
runAgent() delegation will be done incrementally to ensure
stability.
- Replace stream processing loop with AgentExecutor.execute() delegation
- Build AgentExecutionOptions object from runAgent() parameters
- Create callbacks for waitForApproval, saveFeatureSummary, etc.
- Remove ~930 lines of duplicated stream processing code
- Progress events now flow through AgentExecutor

File: auto-mode-service.ts reduced from 5086 to 4157 lines
- Add 11 new test cases for execute() behavior
- Test callback invocation (progress events, tool events)
- Test error handling (API errors, auth failures)
- Test result structure and response accumulation
- Test abort signal propagation
- Test branchName propagation in event payloads

Test file: 388 -> 935 lines (+547 lines)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…-merge

- Extract pipeline orchestration logic from AutoModeService
- executePipeline: Sequential step execution with context continuity
- buildPipelineStepPrompt: Builds prompts with feature context and previous output
- detectPipelineStatus: Identifies pipeline status for resumption
- resumePipeline/resumeFromStep: Handle excluded steps and missing context
- executeTestStep: 5-attempt agent fix loop (REQ-F07)
- attemptMerge: Auto-merge with conflict detection (REQ-F05)
- buildTestFailureSummary: Concise test failure summary for agent

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Tests for executePipeline: step sequence, events, status updates
- Tests for buildPipelineStepPrompt: context inclusion, previous work
- Tests for detectPipelineStatus: pipeline status detection and parsing
- Tests for resumePipeline/resumeFromStep: excluded steps, slot management
- Tests for executeTestStep: 5-attempt fix loop, failure events
- Tests for attemptMerge: merge endpoint, conflict detection
- Tests for buildTestFailureSummary: output parsing

37 tests covering all core functionality

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add PipelineOrchestrator constructor parameter and property
- Initialize PipelineOrchestrator with all required dependencies and callbacks
- Delegate executePipelineSteps to pipelineOrchestrator.executePipeline()
- Delegate detectPipelineStatus to pipelineOrchestrator.detectPipelineStatus()
- Delegate resumePipelineFeature to pipelineOrchestrator.resumePipeline()
- Add AutoModeService integration tests for delegation verification
- Test executePipeline delegation with context fields
- Test detectPipelineStatus delegation for pipeline/non-pipeline status
- Test resumePipeline delegation with autoLoadClaudeMd and useWorktrees
- Add edge case tests for abort signals, missing context, deleted steps
- Delete executePipelineSteps method (~115 lines)
- Delete buildPipelineStepPrompt method (~38 lines)
- Delete resumePipelineFeature method (~88 lines)
- Delete resumeFromPipelineStep method (~195 lines)
- Delete detectPipelineStatus method (~104 lines)
- Remove unused PipelineStatusInfo interface (~18 lines)
- Update comments to reference PipelineOrchestrator

Total reduction: ~546 lines (4150 -> 3604 lines)
- Extract loop lifecycle from AutoModeService
- Export AutoModeConfig, ProjectAutoLoopState, getWorktreeAutoLoopKey
- Export callback types for AutoModeService integration
- Methods: start/stop/isRunning/getConfig for project/worktree
- Failure tracking with threshold and quota error detection
- Sleep helper interruptible by abort signal
- 41 tests covering loop lifecycle and failure tracking
- Tests for getWorktreeAutoLoopKey key generation
- Tests for start/stop/isRunning/getConfig methods
- Tests for runAutoLoopForProject loop behavior
- Tests for failure tracking threshold and quota errors
- Tests for multiple concurrent projects/worktrees
- Tests for edge cases (null settings, reset errors)
- Extract executeFeature, stopFeature, buildFeaturePrompt from AutoModeService
- Export callback types for test mocking and integration
- Implement persist-before-emit pattern for status updates
- Support approved plan continuation and context resumption
- Track failures and signal pause when threshold reached

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Shironex and others added 8 commits February 14, 2026 18:54
…ss-platform tests

- Extract merge logic from pipeline-orchestrator to merge-service.ts to avoid HTTP self-call
- Make agent-executor error handling provider-agnostic using shared isAuthenticationError utility
- Fix cross-platform path handling in tests using path.normalize/path.resolve helpers
- Add catch handlers in plan-approval-service tests to prevent unhandled promise rejection warnings

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Added caching for facades in AutoModeServiceCompat to persist auto loop state across API calls.
- Improved error handling in BoardView for starting and stopping auto mode, with user-friendly toast notifications.
- Updated WorktreePanel to manage auto mode state and concurrency limits more effectively.
- Enhanced useAutoMode hook to prevent state overwrites during transitions and synchronize UI with backend status.

This update optimizes performance and user experience in the auto mode feature.
… feature retrieval

- Updated getStatusForProject method in AutoModeServiceCompat and its facade to be asynchronous, allowing for better handling of feature status retrieval.
- Modified related status handlers in the server routes to await the updated method.
- Introduced a new method, getRunningFeaturesForWorktree, in ConcurrencyManager to improve feature ID retrieval based on branch normalization.
- Adjusted BoardView component to ensure consistent handling of running auto tasks across worktrees.

These changes improve the responsiveness and accuracy of the auto mode feature in the application.
…obal service

- Replaced synchronous feature retrieval with asynchronous logic in both AutoModeServiceFacade and GlobalAutoModeService.
- Updated filtering logic to resolve the primary branch name for main worktrees, improving accuracy in feature status checks.
- This change enhances the responsiveness and correctness of feature handling in auto mode operations.
- Added error handling for feature creation in BoardView component to log errors and display user-friendly messages.
- Updated persistFeatureCreate function to throw errors on failure, allowing for better state management.
- Introduced removal of features from state if server creation fails, improving user experience during conflicts.

Also added @playwright/test to devDependencies in package-lock.json for improved testing capabilities.
- Enhanced the Kanban card component to support additional feature statuses ('interrupted' and 'ready') in the backlog display logic.
- Updated relevant components to reflect these changes, ensuring consistent behavior across the UI.
@gsxdsm gsxdsm changed the base branch from main to v0.15.0rc February 15, 2026 18:40
@coderabbitai
Copy link

coderabbitai bot commented Feb 15, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Replaces the monolithic AutoModeService with a modular architecture (Global service, per-project Facade, Compat shim), adds orchestration (AutoLoopCoordinator, ExecutionService, AgentExecutor, PipelineOrchestrator), concurrency and worktree utilities, plan-approval/recovery/feature-state services, UI multi-worktree wiring, tests, and planning docs; adds .planning/ to .gitignore.

Changes

Cohort / File(s) Summary
Planning docs & .gitignore
\.gitignore, .planning/PROJECT.md, .planning/codebase/*
Adds .planning/ to .gitignore and a large set of planning/architecture/conventions/integrations/stack/structure/testing docs.
Auto-mode public surface & index
apps/server/src/services/auto-mode/index.ts, apps/server/src/services/auto-mode/types.ts
New central exports, factory, and public types for redesigned auto-mode surface.
Auto-mode core: global/facade/compat
apps/server/src/services/auto-mode/global-service.ts, .../facade.ts, .../compat.ts
Introduces GlobalAutoModeService, per-project AutoModeServiceFacade and AutoModeServiceCompat shim used by existing routes.
Auto-loop & concurrency
apps/server/src/services/auto-loop-coordinator.ts, apps/server/src/services/concurrency-manager.ts
Adds AutoLoopCoordinator (per-project/worktree loops) and ConcurrencyManager (lease-based running-feature tracking, branch-aware counts).
Execution & agent layer
apps/server/src/services/agent-executor.ts, .../agent-executor-types.ts, apps/server/src/services/execution-service.ts, .../execution-types.ts
New AgentExecutor (streaming/spec/plan handling), ExecutionService orchestration, and related execution types and callbacks.
Pipeline orchestration
apps/server/src/services/pipeline-orchestrator.ts, .../pipeline-types.ts
Adds PipelineOrchestrator for multi-step pipelines, test-step execution, and auto-merge handling plus pipeline types.
Feature/state/approval/recovery
apps/server/src/services/feature-state-manager.ts, plan-approval-service.ts, recovery-service.ts
Adds FeatureStateManager (persistence/events), PlanApprovalService (timeouts/recovery), and RecoveryService (execution-state persistence & resume).
Worktree, merge, parser, events
apps/server/src/services/worktree-resolver.ts, merge-service.ts, spec-parser.ts, typed-event-bus.ts
Adds WorktreeResolver, MergeService, spec-parser utilities, and TypedEventBus for typed auto-mode events.
Auto-mode routes & signatures
apps/server/src/routes/auto-mode/**, apps/server/src/routes/features/*, projects/*, running-agents/*
Routes updated to accept AutoModeServiceCompat; added input validation, worktree capacity checks (429), awaited status calls, minor argument reorder (approve-plan), and removed some finally/cleanup blocks.
Service tweaks: provider/model handling
apps/server/src/services/agent-service.ts, apps/server/src/services/ideation-service.ts
Clarifies provider vs SDK model handling: use provider model IDs for API calls when claudeCompatibleProvider is set and adjust SDK model selection.
UI: multi-worktree & robustness
apps/ui/src/components/views/board-view.tsx, worktree-panel/*, kanban-card/*, hooks/use-auto-mode.ts, hooks/use-board-*.ts, hooks/use-query-invalidation.ts
Switches to aggregated runningAutoTasksAllWorktrees, adds error handling around create/start/stop flows, drag/drop constraints for running features, new store selectors (setAutoModeRunning, getMaxConcurrencyForWorktree), and query invalidation updates.
New tests & removed legacy suites
apps/server/tests/...
Removes legacy AutoModeService tests and adds extensive unit tests for new services (AgentExecutor, AutoLoopCoordinator, ConcurrencyManager, ExecutionService, FeatureStateManager, PipelineOrchestrator, PlanApprovalService, RecoveryService, SpecParser, TypedEventBus, WorktreeResolver, etc.).
Types & settings
libs/types/src/index.ts, libs/types/src/settings.ts
Exports TerminalPromptTheme and updates Claude-compatible provider model mappings (GLM-5 adjustments).
Package
package.json
Relaxed devDependencies.lint-staged to ^16.2.7.

Sequence Diagram(s)

sequenceDiagram
    participant UI as Client/UI
    participant Router as API Route
    participant Compat as AutoModeServiceCompat
    participant Facade as AutoModeServiceFacade
    participant Exec as ExecutionService
    participant Agent as AgentExecutor
    participant Pipeline as PipelineOrchestrator
    participant State as FeatureStateManager
    rect rgba(240,248,255,0.5)
    UI->>Router: POST /run-feature (projectPath, featureId)
    Router->>Compat: executeFeature(projectPath, featureId, ...)
    Compat->>Facade: executeFeature(featureId, ...)
    Facade->>Exec: executeFeature(featureId, ...)
    end
    rect rgba(224,255,224,0.5)
    Exec->>State: updateFeatureStatus("in_progress")
    Exec->>Agent: execute(prompt, options)
    Agent->>Agent: stream provider responses
    Agent->>State: emit progress & save outputs
    Agent-->>Exec: execution result
    end
    rect rgba(255,250,205,0.5)
    Exec->>Pipeline: executePipeline(context)
    Pipeline->>Agent: runAgentForStep(...)
    Agent-->>Pipeline: step result
    Pipeline->>State: updateFeatureStatus("verified")
    Pipeline-->>Exec: pipeline complete
    end
    Facade-->>Compat: feature complete
    Compat-->>Router: { success: true }
    Router-->>UI: 200 OK
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~150 minutes

Possibly related PRs

Poem

🐰 I hopped through facades, loops, and queues,

Executors hummed while I gnawed on new clues.
Plans and pipelines in tidy rows,
Worktrees and merges — calm as a rose.
Tests snug like carrots — ready to grow.

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 47.22% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title accurately describes the main change: a comprehensive refactor decomposing AutoModeService into smaller, focused services across multiple phases.
Merge Conflict Detection ✅ Passed ✅ No merge conflicts detected when merging into v0.15.0rc

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@Shironex Shironex added Testers-Requested Request for others to test an enhancement or bug fix/etc. Refactor A complete logic rewrite is requested or being performed for an issue. Do Not Merge Use this label if something should not be merged. labels Feb 15, 2026
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @gsxdsm, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request delivers a significant architectural overhaul by decomposing the core AutoModeService into a suite of smaller, focused services. This refactoring addresses accumulated technical debt, improves modularity, and enhances testability across the backend. Concurrently, it integrates newer AI models, streamlines Docker-based automated testing, and introduces several user interface improvements to provide a more stable and responsive experience.

Highlights

  • AutoModeService Decomposition: The monolithic AutoModeService has been refactored into several specialized services, including AgentExecutor, AutoLoopCoordinator, ConcurrencyManager, ExecutionService, FeatureStateManager, PlanApprovalService, RecoveryService, PipelineOrchestrator, WorktreeResolver, TypedEventBus, SpecParser, and MergeService. A compatibility layer (AutoModeServiceCompat) and facade (AutoModeServiceFacade) have been introduced to ensure existing API routes continue to function during this phased transition.
  • Comprehensive Documentation: A new .planning/codebase directory has been added, containing detailed Markdown files that document the project's architecture, technical concerns, coding conventions, external integrations, technology stack, and testing patterns. This significantly enhances project understanding and onboarding.
  • AI Model Updates: The project now supports Claude Opus 4.6, which includes adaptive thinking capabilities, and integrates OpenAI's GPT-5.3 Codex model. Corresponding UI and backend logic have been updated to leverage these new models.
  • Dockerized Automated Testing: Playwright Chromium is now pre-installed within the Docker image, enabling AI agents to perform automated verification tests directly within the containerized environment without additional setup.
  • Improved Auto-Mode UI/UX: User experience enhancements include more robust error handling for feature creation, refined drag-and-drop behavior for in-progress features (allowing them to be moved to the backlog to stop the agent), and improved synchronization of auto-mode status with the backend for better optimistic UI updates.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • .gitignore
    • Added local planning and metadata files to ignore.
  • .planning/PROJECT.md
    • Added a new project planning document.
  • .planning/codebase/ARCHITECTURE.md
    • Added new architecture documentation.
  • .planning/codebase/CONCERNS.md
    • Added new codebase concerns documentation.
  • .planning/codebase/CONVENTIONS.md
    • Added new coding conventions documentation.
  • .planning/codebase/INTEGRATIONS.md
    • Added new external integrations documentation.
  • .planning/codebase/STACK.md
    • Added new technology stack documentation.
  • .planning/codebase/STRUCTURE.md
    • Added new codebase structure documentation.
  • .planning/codebase/TESTING.md
    • Added new testing patterns documentation.
  • CLAUDE.md
    • Updated Claude Opus model alias.
  • Dockerfile
    • Installed Playwright Chromium for automated testing.
  • README.md
    • Updated documentation for Playwright Chromium in Docker.
  • apps/server/package.json
    • Updated Claude and OpenAI SDKs, added Playwright dev dependency.
  • apps/server/src/index.ts
    • Updated Claude auth check, switched to AutoModeServiceCompat.
  • apps/server/src/lib/sdk-options.ts
    • Modified thinking options for adaptive models.
  • apps/server/src/providers/claude-provider.ts
    • Updated Claude Opus model and thinking logic.
  • apps/server/src/providers/codex-models.ts
    • Added GPT-5.3 Codex and updated descriptions.
  • apps/server/src/providers/provider-factory.ts
    • Updated JSDoc example for Claude Opus.
  • apps/server/src/routes/auto-mode/index.ts
    • Updated AutoModeService type to AutoModeServiceCompat.
  • apps/server/src/routes/auto-mode/routes/analyze-project.ts
    • Updated AutoModeService type to AutoModeServiceCompat.
  • apps/server/src/routes/auto-mode/routes/approve-plan.ts
    • Updated AutoModeService type and resolvePlanApproval parameters.
  • apps/server/src/routes/auto-mode/routes/commit-feature.ts
    • Updated AutoModeService type to AutoModeServiceCompat.
  • apps/server/src/routes/auto-mode/routes/context-exists.ts
    • Updated AutoModeService type to AutoModeServiceCompat.
  • apps/server/src/routes/auto-mode/routes/follow-up-feature.ts
    • Updated AutoModeService type and removed .finally() block.
  • apps/server/src/routes/auto-mode/routes/resume-feature.ts
    • Updated AutoModeService type to AutoModeServiceCompat.
  • apps/server/src/routes/auto-mode/routes/resume-interrupted.ts
    • Updated AutoModeService type and added await to resumeInterruptedFeatures.
  • apps/server/src/routes/auto-mode/routes/run-feature.ts
    • Updated AutoModeService type and removed .finally() block.
  • apps/server/src/routes/auto-mode/routes/start.ts
    • Updated AutoModeService type to AutoModeServiceCompat.
  • apps/server/src/routes/auto-mode/routes/status.ts
    • Updated AutoModeService type and made getStatusForProject asynchronous.
  • apps/server/src/routes/auto-mode/routes/stop-feature.ts
    • Updated AutoModeService type to AutoModeServiceCompat.
  • apps/server/src/routes/auto-mode/routes/stop.ts
    • Updated AutoModeService type to AutoModeServiceCompat.
  • apps/server/src/routes/auto-mode/routes/verify-feature.ts
    • Updated AutoModeService type to AutoModeServiceCompat.
  • apps/server/src/routes/enhance-prompt/routes/enhance.ts
    • Modified model resolution logic for custom providers.
  • apps/server/src/routes/features/index.ts
    • Updated AutoModeService type to AutoModeServiceCompat.
  • apps/server/src/routes/features/routes/list.ts
    • Updated AutoModeService type to AutoModeServiceCompat.
  • apps/server/src/routes/github/routes/validate-issue.ts
    • Modified model resolution logic for custom providers.
  • apps/server/src/routes/projects/index.ts
    • Updated AutoModeService type to AutoModeServiceCompat.
  • apps/server/src/routes/projects/routes/overview.ts
    • Updated AutoModeService type and method calls to be asynchronous.
  • apps/server/src/routes/running-agents/index.ts
    • Updated AutoModeService type to AutoModeServiceCompat.
  • apps/server/src/routes/running-agents/routes/index.ts
    • Updated AutoModeService type and method calls to be asynchronous.
  • apps/server/src/routes/setup/routes/verify-claude-auth.ts
    • Added authType to Claude auth verification response.
  • apps/server/src/services/agent-executor-types.ts
    • Added new types for agent execution.
  • apps/server/src/services/agent-executor.ts
    • Added new service for core agent execution.
  • apps/server/src/services/agent-service.ts
    • Modified model resolution logic for custom providers.
  • apps/server/src/services/auto-loop-coordinator.ts
    • Added new service for auto-mode loop management.
  • apps/server/src/services/auto-mode/compat.ts
    • Added new compatibility shim for AutoModeService.
  • apps/server/src/services/auto-mode/facade.ts
    • Added new per-project facade for auto-mode.
  • apps/server/src/services/auto-mode/global-service.ts
    • Added new service for global auto-mode operations.
  • apps/server/src/services/auto-mode/index.ts
    • Added new module entry point for auto-mode services.
  • apps/server/src/services/auto-mode/types.ts
    • Added new types for auto-mode services.
  • apps/server/src/services/concurrency-manager.ts
    • Added new service for managing running feature concurrency.
  • apps/server/src/services/execution-service.ts
    • Added new service for feature execution lifecycle.
  • apps/server/src/services/execution-types.ts
    • Added new types for execution service callbacks.
  • apps/server/src/services/feature-state-manager.ts
    • Added new service for managing feature status and persistence.
  • apps/server/src/services/ideation-service.ts
    • Modified model resolution logic for custom providers.
  • apps/server/src/services/merge-service.ts
    • Added new service for direct git merge operations.
  • apps/server/src/services/pipeline-orchestrator.ts
    • Added new service for pipeline step execution.
  • apps/server/src/services/pipeline-types.ts
    • Added new types for pipeline orchestrator.
  • apps/server/src/services/plan-approval-service.ts
    • Added new service for plan approval workflow.
  • apps/server/src/services/recovery-service.ts
    • Added new service for crash recovery and feature resumption.
  • apps/server/src/services/spec-parser.ts
    • Added new service for parsing spec content.
  • apps/server/src/services/typed-event-bus.ts
    • Added new type-safe event bus.
  • apps/server/src/services/worktree-resolver.ts
    • Added new service for git worktree resolution.
  • apps/server/tests/integration/services/auto-mode-service.integration.test.ts
    • Removed old integration tests.
  • apps/server/tests/unit/lib/model-resolver.test.ts
    • Updated Claude Opus model ID in tests.
  • apps/server/tests/unit/lib/sdk-options.test.ts
    • Added tests for adaptive thinking.
  • apps/server/tests/unit/providers/claude-provider.test.ts
    • Updated Claude Opus model ID in tests.
  • apps/server/tests/unit/providers/provider-factory.test.ts
    • Updated Claude Opus model ID in tests.
  • apps/server/tests/unit/services/agent-executor.test.ts
    • Added new unit tests for AgentExecutor.
  • apps/server/tests/unit/services/auto-loop-coordinator.test.ts
    • Added new unit tests for AutoLoopCoordinator.
  • apps/server/tests/unit/services/auto-mode-service-planning.test.ts
    • Removed old unit tests.
  • apps/server/tests/unit/services/auto-mode-service.test.ts
    • Removed old unit tests.
  • apps/server/tests/unit/services/execution-service.test.ts
    • Added new unit tests for ExecutionService.
  • apps/server/tests/unit/services/feature-state-manager.test.ts
    • Added new unit tests for FeatureStateManager.
  • apps/server/tests/unit/services/pipeline-orchestrator.test.ts
    • Added new unit tests for PipelineOrchestrator.
  • apps/server/tests/unit/services/plan-approval-service.test.ts
    • Added new unit tests for PlanApprovalService.
  • apps/server/tests/unit/services/recovery-service.test.ts
    • Added new unit tests for RecoveryService.
  • apps/server/tests/unit/services/spec-parser.test.ts
    • Added new unit tests for SpecParser.
  • apps/server/tests/unit/services/typed-event-bus.test.ts
    • Added new unit tests for TypedEventBus.
  • apps/server/tests/unit/services/worktree-resolver.test.ts
    • Added new unit tests for WorktreeResolver.
  • apps/ui/docs/AGENT_ARCHITECTURE.md
    • Updated Claude Opus model ID.
  • apps/ui/src/components/dialogs/sandbox-risk-dialog.tsx
    • Added Docker troubleshooting steps.
  • apps/ui/src/components/views/board-view.tsx
    • Updated runningAutoTasks prop, added error handling for feature creation, and improved merge conflict feature title.
  • apps/ui/src/components/views/board-view/components/kanban-card/card-actions.tsx
    • Enabled edit/make actions for interrupted and ready statuses.
  • apps/ui/src/components/views/board-view/components/kanban-card/card-header.tsx
    • Enabled backlog column actions for interrupted and ready statuses.
  • apps/ui/src/components/views/board-view/components/kanban-card/kanban-card.tsx
    • Enabled dragging for interrupted and ready statuses.
  • apps/ui/src/components/views/board-view/dialogs/add-feature-dialog.tsx
    • Added logic to normalize thinking levels based on model capabilities.
  • apps/ui/src/components/views/board-view/hooks/use-board-actions.ts
    • Added error handling for feature creation.
  • apps/ui/src/components/views/board-view/hooks/use-board-drag-drop.ts
    • Modified drag logic for running features to allow stopping via drag to backlog.
  • apps/ui/src/components/views/board-view/hooks/use-board-persistence.ts
    • Modified persistFeatureCreate to throw errors.
  • apps/ui/src/components/views/board-view/shared/model-constants.ts
    • Added 'adaptive' thinking level.
  • apps/ui/src/components/views/board-view/shared/thinking-level-selector.tsx
    • Updated to filter thinking levels by model and display adaptive thinking description.
  • apps/ui/src/components/views/board-view/worktree-panel/worktree-panel.tsx
    • Updated auto-mode toggle to use app store actions for state management.
  • apps/ui/src/components/views/settings-view/model-defaults/phase-model-selector.tsx
    • Updated thinking level selection logic for adaptive models.
  • apps/ui/src/components/views/settings-view/providers/codex-model-configuration.tsx
    • Added GPT-5.3 Codex model and updated descriptions.
  • apps/ui/src/components/views/setup-view/steps/claude-setup-step.tsx
    • Enhanced Claude auth verification UI to show OAuth vs CLI.
  • apps/ui/src/hooks/use-auto-mode.ts
    • Implemented optimistic UI updates and backend synchronization for auto-mode status.
  • apps/ui/src/hooks/use-query-invalidation.ts
    • Expanded event types that trigger feature list invalidation.
  • apps/ui/src/lib/agent-context-parser.ts
    • Updated default model and model name formatting for new Claude/Codex models.
  • apps/ui/src/lib/electron.ts
    • Updated verifyClaudeAuth return type.
  • apps/ui/src/lib/http-api-client.ts
    • Updated verifyClaudeAuth return type.
  • docker-compose.override.yml.example
    • Updated comments for Playwright cache.
  • docs/llm-shared-packages.md
    • Updated Claude Opus model ID.
  • docs/server/providers.md
    • Updated Claude Opus model ID.
  • docs/server/utilities.md
    • Updated Claude Opus model ID.
  • libs/model-resolver/README.md
    • Updated Claude Opus model ID.
  • libs/model-resolver/tests/resolver.test.ts
    • Updated Claude Opus model ID in tests.
  • libs/platform/src/index.ts
    • Exported new types FileCheckResult and DirectoryCheckResult.
Activity
  • The pull request title and body indicate this is a follow-up to a previous refactoring PR (refactor: AutoModeService decomposition (Phases 1-6) #733), continuing the decomposition work initiated by @Shironex.
  • The extensive addition of new documentation files under .planning/codebase suggests a thorough analysis and planning phase for this refactoring.
  • The removal of old monolithic service tests and the addition of numerous new unit tests for the decomposed services indicate a strong focus on testability and code quality.
  • The use of a compatibility layer (AutoModeServiceCompat) suggests a phased rollout strategy for the refactoring, minimizing immediate breaking changes for existing API consumers.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@Shironex Shironex added the Tests Adding / Updating / Removing tests across the project. label Feb 15, 2026
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This is an excellent and extensive refactoring of the monolithic AutoModeService. Breaking it down into smaller, focused services like ConcurrencyManager, ExecutionService, and RecoveryService is a huge step forward for maintainability and testability. The use of a compatibility layer to ensure a smooth transition is a great approach.

I'm also impressed with the number of bug fixes and UX improvements included, such as the correct handling of custom provider model IDs, more robust UI state syncing, and the ability to stop a running feature by dragging it to the backlog. The new planning and architecture documents are a fantastic addition.

I've found one very minor issue in the .gitignore file, but overall this is a high-quality pull request that significantly improves the codebase's architecture.

gsxdsm and others added 2 commits February 15, 2026 21:04
Add missing 'adaptive' thinking level to kanban card labels and export
TerminalPromptTheme type from @automaker/types package.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 16

Note

Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/server/src/routes/features/routes/list.ts (1)

35-48: ⚠️ Potential issue | 🟡 Minor

Missing .catch() on the fire-and-forget orphan detection promise.

The comment on Line 34 states detectOrphanedFeatures handles errors internally and always resolves, but if the compat layer or facade creation itself throws (e.g., initialization error), this .then() chain has no .catch(), which would result in an unhandled promise rejection.

Suggested safeguard
       if (autoModeService) {
         autoModeService.detectOrphanedFeatures(projectPath).then((orphanedFeatures) => {
           if (orphanedFeatures.length > 0) {
             logger.info(
               `[ProjectLoad] Detected ${orphanedFeatures.length} orphaned feature(s) in ${projectPath}`
             );
             for (const { feature, missingBranch } of orphanedFeatures) {
               logger.info(
                 `[ProjectLoad] Orphaned: ${feature.title || feature.id} - branch "${missingBranch}" no longer exists`
               );
             }
           }
-        });
+        }).catch((error) => {
+          logger.warn(`[ProjectLoad] Orphan detection failed for ${projectPath}:`, error);
+        });
       }
🤖 Fix all issues with AI agents
In `@apps/server/src/routes/auto-mode/routes/approve-plan.ts`:
- Around line 50-56: The handler currently passes projectPath unconditionally
(using projectPath || '') into autoModeService.resolvePlanApproval and the
request type marks projectPath optional, which lets invalid requests slip
through; update the request type so projectPath is required (projectPath:
string), add a validation at the start of the route handler that if
(!projectPath) { res.status(400).json({ success: false, error: 'projectPath is
required' }); return; }, and remove the fallback || '' when calling
resolvePlanApproval so the real projectPath is passed through; refer to
resolvePlanApproval and the handler's request type/parameters to locate the
changes.

In `@apps/server/src/routes/github/routes/validate-issue.ts`:
- Around line 191-196: The effectiveModel fallback can be an unresolved alias
(e.g. "opus"); change the fallback to run the model through resolveModelString
so API calls get a resolved provider/model combo. Specifically, in the block
that computes effectiveModel (where claudeCompatibleProvider,
providerResolvedModel and model are used), replace the raw fallback (model as
string) with the result of resolveModelString(model as string) (await if
resolveModelString is async), so effectiveModel is always a resolved model
string before calling streamingQuery/stripProviderPrefix; keep the existing
branch for claudeCompatibleProvider unchanged.

In `@apps/server/src/services/agent-executor.ts`:
- Around line 554-568: The progress event emitted during plan revision in the
provider.executeQuery loop omits branchName, preventing the UI from scoping
revision-phase updates; update the eventBus.emitAutoModeEvent call inside the
for-await loop (where revText is appended and b.text emitted) to include
branchName alongside featureId and content (mirror the payload shape used
elsewhere in this class), so the emitted object contains { featureId,
branchName, content: b.text } and maintain existing behavior for error/result
handling.
- Around line 633-648: The buildExecOpts function is missing propagation of the
user's thinkingLevel so tasks and plan revisions run without it; update
buildExecOpts (used to build execution options for task execution and plan
revision) to include thinkingLevel: o.thinkingLevel in the returned object
(alongside prompt, model, maxTurns, cwd, allowedTools, abortController,
mcpServers, credentials, claudeCompatibleProvider) so the thinkingLevel from the
original AgentExecutionOptions is passed through.

In `@apps/server/src/services/auto-loop-coordinator.ts`:
- Around line 401-413: The sleep function (private sleep(ms: number, signal?:
AbortSignal)) adds an 'abort' listener that is never removed on normal timeout,
causing listener leaks; update it to clean up the listener when the timer
resolves (either by attaching the listener with { once: true } or by capturing
the listener function and removing it on timeout resolution) and ensure the
timeout is cleared and the listener removed in all code paths so the AbortSignal
doesn't accumulate listeners over repeated loop iterations.
- Around line 34-36: getWorktreeAutoLoopKey is normalizing only the literal
'main' to null, causing mismatches with
ConcurrencyManager.getRunningCountForWorktree which dynamically resolves the
repository's primary branch via getCurrentBranch(projectPath); update
getWorktreeAutoLoopKey to align with that resolution by either making it async
and calling getCurrentBranch(projectPath) to treat the resolved primary branch
as the null/main equivalent, or change its signature to accept a
resolvedPrimaryBranch parameter and normalize that value the same way; ensure
calls that previously passed branchName are updated to provide the resolved
primary branch where needed so keys match ConcurrencyManager's logic (refer to
getWorktreeAutoLoopKey, ConcurrencyManager.getRunningCountForWorktree, and
getCurrentBranch).

In `@apps/server/src/services/auto-mode/facade.ts`:
- Around line 526-608: The facade's followUpFeature (and similarly
analyzeProject) currently emit an auto_mode_feature_start event then immediately
throw, causing endpoints via AutoModeServiceCompat to always error; change the
flow so the facade either delegates to the legacy AutoModeService implementation
(e.g., call AutoModeService.followUpFeature) when full implementation isn't
ready, or at minimum avoid emitting eventBus.emitAutoModeEvent(...) before
throwing and instead return a clear not-implemented error without side effects;
update AutoModeServiceCompat to route to the legacy service if the facade method
is not implemented (or ensure the facade forwards to AutoModeService), and
remove the premature event emission in followUpFeature and analyzeProject to
prevent false start events.
- Around line 683-717: commitFeature currently shells git commands with
user-derived text and manually inspects .worktrees; change it to call the Git
CLI via execFile (or an execAsync wrapper that uses execFile) and pass arguments
as an array (e.g., ['commit', '-m', commitMessage]) instead of interpolating
strings to eliminate command injection risk, and similarly use execFile for 'git
add' and 'git rev-parse' calls; also replace the legacy manual lookup of the
.worktrees directory in commitFeature with a call to
worktreeResolver.findWorktreeForBranch(featureId) (and apply the same execFile
fix to the other occurrence referenced in the review around lines 978–1000).
Ensure you keep using this.featureStateManager.loadFeature and the existing
commit message construction, but hand the message as an argument to execFile
rather than embedding it in a shell string.

In `@apps/server/src/services/execution-service.ts`:
- Around line 193-201: The current logic ignores a providedWorktreePath
parameter and always re-resolves or falls back to projectPath; update the block
to prefer a non-empty providedWorktreePath before calling
worktreeResolver.findWorktreeForBranch: if providedWorktreePath is set, assign
it to worktreePath and log it, otherwise proceed with the existing
branchName/useWorktrees resolution via worktreeResolver.findWorktreeForBranch.
Ensure you still compute workDir = worktreePath ? path.resolve(worktreePath) :
path.resolve(projectPath), call validateWorkingDirectory(workDir), and set
tempRunningFeature.worktreePath = worktreePath so callers that passed
providedWorktreePath run in the intended directory.
- Around line 273-296: executePipelineFn can set terminal statuses like
"merge_conflict" and return early, but the current flow unconditionally calls
updateFeatureStatusFn and overwrites those statuses; change the logic so you do
not overwrite pipeline-set statuses: either (A) make executePipelineFn return a
result/status flag (e.g., {status, aborted}) and only call updateFeatureStatusFn
when that result indicates no terminal status, or (B) after awaiting
executePipelineFn reload the feature/status from the canonical source and bail
(skip calling updateFeatureStatusFn and recordSuccessFn) if the status is
already a pipeline-set terminal status (e.g., "merge_conflict"); apply this
change around executePipelineFn, updateFeatureStatusFn, and recordSuccessFn to
prevent overwrites.

In `@apps/server/src/services/feature-state-manager.ts`:
- Around line 84-150: updateFeatureStatus currently updates and persists feature
state but does not emit events, causing UI to miss interrupted feature updates
(unlike saveFeatureSummary and updateTaskStatus). Modify updateFeatureStatus to
emit the "auto_mode_feature_status" event after successful persistence (and
after setting justFinishedAt) using the same event payload shape used by
saveFeatureSummary/updateTaskStatus so callers (including markFeatureInterrupted
and markAllRunningFeaturesInterrupted) no longer need to emit; ensure emission
happens inside updateFeatureStatus and only on successful write (i.e., after
atomicWriteJson and before syncFeatureToAppSpec).

In `@apps/server/src/services/merge-service.ts`:
- Around line 74-79: The input check at the top only ensures projectPath,
branchName, and worktreePath are present but does not validate branch names;
call isValidBranchName for both branchName and targetBranch at the function
entry (alongside the existing presence checks for
projectPath/branchName/worktreePath) and return a failure object if either is
invalid so git operations never run with untrusted branch names; reference the
existing isValidBranchName helper and the branchName/targetBranch variables and
keep the later deletion-time validation consistent or remove duplication after
adding this early check.
- Around line 7-12: Replace all insecure shell-string execAsync usage with the
array-based execGitCommand wrapper to prevent command injection: locate calls to
execAsync that validate branches and perform merges/commits (references:
execAsync, execGitCommand, branchName, options.message) and change them so git
subcommands and their args are passed as arrays to execGitCommand instead of
interpolating values into a shell string; ensure branchName and options.message
are passed as discrete arguments (no concatenation or quoting into a single
shell command) for the branch validation, merge execution, and commit steps.

In `@apps/server/src/services/plan-approval-service.ts`:
- Around line 85-126: waitForApproval currently overwrites this.pendingApprovals
entry for the same key, leaking the original promise and its timeout; fix by
making PendingApproval store its timeoutId and, at the top of waitForApproval,
check this.pendingApprovals.get(key) and if an existing entry exists call
existing.reject(new Error('approval superseded by new request')) and
clearTimeout(existing.timeoutId) before replacing it. Update the PendingApproval
shape to include timeoutId, set that field when creating the new entry, and keep
wrappedResolve/wrappedReject behavior (they should still clearTimeout(timeoutId)
and remove the map entry) so no timers or promises remain reachable after being
superseded or resolved.
- Around line 224-239: The updateFeaturePlanSpec call is overwriting plan
content when editedPlan is undefined and failing to notify the frontend for
rejections without feedback; change the payload in updateFeaturePlanSpec to set
content: editedPlan || feature.planSpec.content (or otherwise preserve existing
content) instead of content: editedPlan, and ensure you always call
this.eventBus.emitAutoModeEvent('plan_rejected', { featureId, projectPath,
feedback }) when approved is false (even if feedback is falsy) so rejections
without feedback still emit the event; locate these changes around the
updateFeaturePlanSpec invocation and the eventBus.emitAutoModeEvent call.

In `@apps/ui/src/components/views/board-view/hooks/use-board-persistence.ts`:
- Around line 78-97: The onAddAndStart call path can throw unhandled promise
rejections because AddFeatureDialog.handleAction invokes the async callback
synchronously; fix by ensuring the async chain is awaited and errors are caught:
either wrap the await handleAddFeature(...) inside handleAddAndStartFeature in a
try/catch and surface/log the error (referencing handleAddAndStartFeature and
handleAddFeature in graph-view-page.tsx), or modify
AddFeatureDialog.handleAction to await the result of the provided callback and
handle/rethrow errors properly so any exceptions from persistFeatureCreate
propagate to the caller in a controlled way; update the code paths accordingly
to prevent unhandled promise rejections.
🟡 Minor comments (19)
apps/ui/src/components/views/board-view/components/kanban-card/card-header.tsx-129-161 (1)

129-161: ⚠️ Potential issue | 🟡 Minor

Update test IDs to reflect actual feature status instead of hardcoding "backlog".

The condition now includes interrupted and ready statuses, but data-testid values remain hardcoded as spawn-backlog- and delete-backlog-. This creates misleading test identifiers when the feature status is interrupted or ready.

Use feature.status to derive the test ID:

Proposed fix
-              data-testid={`spawn-backlog-${feature.id}`}
+              data-testid={`spawn-${feature.status}-${feature.id}`}
               title="Spawn Sub-Task"
             >
               <GitFork className="w-4 h-4" />
             </Button>
             <Button
               variant="ghost"
               size="sm"
               className="h-6 w-6 p-0 hover:bg-white/10 text-muted-foreground hover:text-destructive"
               onClick={handleDeleteClick}
               onPointerDown={(e) => e.stopPropagation()}
-              data-testid={`delete-backlog-${feature.id}`}
+              data-testid={`delete-${feature.status}-${feature.id}`}
.planning/codebase/STACK.md-108-108 (1)

108-108: ⚠️ Potential issue | 🟡 Minor

Ambiguous zod version.

"zod 3.24.1 or 4.0.0" — these are very different major versions with breaking API changes (zod 4 has a different import structure). Clarify which version is actually in use.

.gitignore-98-100 (1)

98-100: ⚠️ Potential issue | 🟡 Minor

.planning/ is gitignored but also committed in this PR.

The PR adds .planning/ to .gitignore while simultaneously committing multiple files under .planning/ (STACK.md, CONVENTIONS.md, PROJECT.md, CONCERNS.md, etc.). Once tracked, existing files remain tracked despite the gitignore entry, but this creates confusion: future edits to these files will appear in git status, yet new files under .planning/ will be silently ignored. If the intent is to keep these docs in the repo, remove this gitignore entry. If they should be local-only, don't commit them.

.planning/codebase/CONCERNS.md-47-48 (1)

47-48: ⚠️ Potential issue | 🟡 Minor

Stale file references to auto-mode-service.ts.

Multiple sections reference specific lines in apps/server/src/services/auto-mode-service.ts (lines 47–48, 93, 129), but this PR deletes and decomposes that monolith. These references will be immediately stale. Update them to point to the new service files (e.g., recovery-service.ts, pipeline-orchestrator.ts, auto-loop-coordinator.ts).

Also applies to: 93-93, 129-129

.planning/codebase/CONVENTIONS.md-9-9 (1)

9-9: ⚠️ Potential issue | 🟡 Minor

Documentation error: examples contradict the stated convention.

Line states "PascalCase for class/service files" but all examples (auto-mode-service.ts, feature-loader.ts, claude-provider.ts) are kebab-case. The convention appears to be kebab-case for all files; fix the label.

Proposed fix
-- PascalCase for class/service files: `auto-mode-service.ts`, `feature-loader.ts`, `claude-provider.ts`
+- kebab-case for class/service files: `auto-mode-service.ts`, `feature-loader.ts`, `claude-provider.ts`
.planning/codebase/INTEGRATIONS.md-108-108 (1)

108-108: ⚠️ Potential issue | 🟡 Minor

Inconsistency with ARCHITECTURE.md on log storage.

Line 108 states "(no persistent logging to files)" but ARCHITECTURE.md line 204 states "Output: stdout (dev), files (production)". These docs are being added in the same PR — align them on the actual behavior.

.planning/codebase/STRUCTURE.md-233-233 (1)

233-233: ⚠️ Potential issue | 🟡 Minor

Contradictory naming convention: examples don't match the stated convention.

Line 233 states "Components: PascalCase.tsx" but the examples (board-view.tsx, session-manager.tsx) are kebab-case, not PascalCase. Either correct the convention label to "kebab-case" or update the examples.

.planning/codebase/STRUCTURE.md-238-238 (1)

238-238: ⚠️ Potential issue | 🟡 Minor

Markdown rendering issue with wildcard patterns.

_.test.ts or _.spec.ts — the asterisks are being interpreted as emphasis markers (flagged by MD037). Use backtick escaping:

Proposed fix
-- **Tests:** _.test.ts or _.spec.ts (co-located with source)
+- **Tests:** `*.test.ts` or `*.spec.ts` (co-located with source)
apps/server/src/services/recovery-service.ts-122-146 (1)

122-146: ⚠️ Potential issue | 🟡 Minor

saveExecutionState (legacy) captures running features across all projects.

Unlike saveExecutionStateForProject (Line 101) which filters by projectPath, the legacy saveExecutionState at Line 135 calls getAllRunning().map(...) without filtering. This means runningFeatureIds will include features from other projects.

If this is a known limitation of the legacy API that's being phased out, a brief comment would help. Otherwise, it should filter to match the new method's behavior.

apps/server/tests/unit/services/feature-state-manager.test.ts-266-285 (1)

266-285: ⚠️ Potential issue | 🟡 Minor

Tests for "preserve pipeline_ statuses" don't actually exercise the intended code path.*

These tests mock secureFs.readFile but markFeatureInterrupted calls this.loadFeature() which uses readJsonWithRecovery (not secureFs.readFile). Since readJsonWithRecovery is a bare vi.fn() mock (returns undefined), loadFeature will throw internally and return null. The currentStatus check at the pipeline_* guard never fires — instead, updateFeatureStatus is called but also silently returns because it also reads null from the same bare mock.

The assertion expect(atomicWriteJson).not.toHaveBeenCalled() passes, but for the wrong reason (feature not found), not because the pipeline status was preserved.

💚 Proposed fix — mock readJsonWithRecovery to return the pipeline feature
   it('should preserve pipeline_* statuses', async () => {
-    (secureFs.readFile as Mock).mockResolvedValue(
-      JSON.stringify({ ...mockFeature, status: 'pipeline_step_1' })
-    );
+    (readJsonWithRecovery as Mock).mockResolvedValue({
+      data: { ...mockFeature, status: 'pipeline_step_1' },
+      recovered: false,
+      source: 'main',
+    });

     await manager.markFeatureInterrupted('/project', 'feature-123', 'server shutdown');

     // Should NOT call atomicWriteJson because pipeline status is preserved
     expect(atomicWriteJson).not.toHaveBeenCalled();
   });

   it('should preserve pipeline_complete status', async () => {
-    (secureFs.readFile as Mock).mockResolvedValue(
-      JSON.stringify({ ...mockFeature, status: 'pipeline_complete' })
-    );
+    (readJsonWithRecovery as Mock).mockResolvedValue({
+      data: { ...mockFeature, status: 'pipeline_complete' },
+      recovered: false,
+      source: 'main',
+    });

     await manager.markFeatureInterrupted('/project', 'feature-123');

     expect(atomicWriteJson).not.toHaveBeenCalled();
   });
apps/server/src/services/execution-types.ts-19-37 (1)

19-37: ⚠️ Potential issue | 🟡 Minor

projectPath appears as both a required positional parameter and an optional field in options.

RunAgentFn accepts projectPath at position 5 (Line 24) and also as options.projectPath (Line 28). This duplication is confusing for callers — which one takes precedence? All current implementations pass the same value to both, but this redundancy creates ambiguity. Remove the optional projectPath from options to keep a single source of truth.

.planning/codebase/TESTING.md-48-69 (1)

48-69: ⚠️ Potential issue | 🟡 Minor

Add language identifier to fenced code block.

The directory tree structure block on line 48 lacks a language specifier. This triggers the markdownlint MD040 warning. Use text or plaintext for non-code content.

📝 Proposed fix
-```
+```text
 apps/server/
apps/ui/src/components/views/board-view/hooks/use-board-drag-drop.ts-37-37 (1)

37-37: ⚠️ Potential issue | 🟡 Minor

autoMode in dependency array defeats useCallback memoization.

useAutoMode() returns a new object reference on every render (the return object is not memoized, only the internal functions are). Including autoMode in the useCallback dependencies means handleDragEnd is recreated every render despite being wrapped in useCallback, which can cause unnecessary re-renders of downstream @dnd-kit components that depend on handler identity.

Since only stopFeature is used from autoMode in handleDragEnd, destructure it instead:

-  const autoMode = useAutoMode();
+  const { stopFeature } = useAutoMode();

Then update the call on line 233 from autoMode.stopFeature(featureId) to stopFeature(featureId), and replace autoMode with stopFeature in the dependency array (line 328). This preserves memoization since stopFeature is itself wrapped in useCallback inside the hook.

apps/ui/src/hooks/use-auto-mode.ts-614-626 (1)

614-626: ⚠️ Potential issue | 🟡 Minor

refreshStatus is not in the dependency array of start (and stop).

refreshStatus is captured by the queueMicrotask closure but isn't listed as a dependency of the start callback (line 626). If refreshStatus is recreated (e.g., because runningAutoTasks changes — now a dependency at line 197), the queued call could use a stale reference.

The impact is minor since the next poll cycle will correct it, but it's technically a stale-closure risk. Same applies to stop at line 674.

apps/server/tests/unit/services/execution-service.test.ts-364-394 (1)

364-394: ⚠️ Potential issue | 🟡 Minor

Test name says "throws" but the method doesn't throw — it emits an error event.

The test "throws if feature not found" (line 364) calls svc.executeFeature without await expect(...).rejects. The method handles the error internally and emits an auto_mode_error event. The test name should reflect this behavior.

♻️ Suggested rename
-   it('throws if feature not found', async () => {
+   it('emits error event when feature not found', async () => {
apps/ui/src/hooks/use-auto-mode.ts-15-19 (1)

15-19: ⚠️ Potential issue | 🟡 Minor

arraysEqual gives false positives when a contains duplicates.

The Set-based comparison only checks membership, not multiplicity. For example, arraysEqual(['a', 'a'], ['a', 'b']) returns true (both length 2, every element of a is in Set(['a','b'])).

In practice feature IDs are unique, so this is unlikely to cause real issues, but a simple sorted-comparison or symmetric set check would be safer.

♻️ Suggested fix
 function arraysEqual(a: string[], b: string[]): boolean {
   if (a.length !== b.length) return false;
-  const set = new Set(b);
-  return a.every((id) => set.has(id));
+  const sorted_a = [...a].sort();
+  const sorted_b = [...b].sort();
+  return sorted_a.every((id, i) => id === sorted_b[i]);
 }
apps/server/tests/unit/services/agent-executor.test.ts-395-440 (1)

395-440: ⚠️ Potential issue | 🟡 Minor

Test name is misleading — doesn't actually assert aborted=true.

The test is named "should return aborted=true when abort signal is already aborted" but only asserts result and result.responseText are defined. It never checks result.aborted === true. Either rename the test to match what it verifies, or add the missing assertion.

💚 Proposed fix
       // When stream is empty and signal is aborted before stream starts,
       // the result depends on whether abort was checked
       expect(result).toBeDefined();
       expect(result.responseText).toBeDefined();
+      expect(result.aborted).toBe(true);
     });
apps/server/src/services/auto-loop-coordinator.ts-275-310 (1)

275-310: ⚠️ Potential issue | 🟡 Minor

Non-null assertion errorInfo! is unsafe for certain call patterns.

At line 294, actualErrorInfo = errorInfo! assumes errorInfo is defined when branchNameOrError is a string or null. While TypeScript callers will generally provide it, this overloaded-via-runtime-check pattern is fragile. If a caller passes (projectPath, null) without the third argument, actualErrorInfo is undefined and line 301 throws at runtime.

Consider adding a defensive check or using proper function overloads.

🛡️ Suggested defensive check
    } else {
      // New signature: (projectPath, branchName, errorInfo)
      branchName = branchNameOrError;
-     actualErrorInfo = errorInfo!;
+     if (!errorInfo) {
+       logger.warn('trackFailureAndCheckPauseForProject called without errorInfo');
+       return false;
+     }
+     actualErrorInfo = errorInfo;
    }
apps/server/src/services/pipeline-orchestrator.ts-454-476 (1)

454-476: ⚠️ Potential issue | 🟡 Minor

Clear the timeout when the test run finishes.

When a session completes, the 10‑minute timeout still fires later. Clearing it prevents an unnecessary dangling timer.

🛠 Suggested fix
   private async waitForTestCompletion(
     sessionId: string
   ): Promise<{ status: TestRunStatus; exitCode: number | null; duration: number }> {
     return new Promise((resolve) => {
-      const checkInterval = setInterval(() => {
+      const timeout = setTimeout(() => {
+        clearInterval(checkInterval);
+        resolve({ status: 'failed', exitCode: null, duration: 600000 });
+      }, 600000);
+
+      const checkInterval = setInterval(() => {
         const session = this.testRunnerService.getSession(sessionId);
         if (session && session.status !== 'running' && session.status !== 'pending') {
           clearInterval(checkInterval);
+          clearTimeout(timeout);
           resolve({
             status: session.status,
             exitCode: session.exitCode,
             duration: session.finishedAt
               ? session.finishedAt.getTime() - session.startedAt.getTime()
               : 0,
           });
         }
       }, 1000);
-      setTimeout(() => {
-        clearInterval(checkInterval);
-        resolve({ status: 'failed', exitCode: null, duration: 600000 });
-      }, 600000);
     });
   }
🧹 Nitpick comments (29)
apps/ui/src/components/views/board-view/components/kanban-card/card-actions.tsx (1)

1-1: @ts-nocheck disables all type safety for this file.

The comment mentions "optional callback prop typing with feature status narrowing" — but CardActionsProps already types the optional callbacks with ?. The status narrowing could be handled with discriminated unions or simple type guards. Blanket @ts-nocheck masks any future type regressions introduced in this file.

Consider removing it and fixing the specific type issues, or at minimum narrowing to targeted @ts-expect-error comments on the offending lines.

apps/ui/src/components/views/board-view/components/kanban-card/card-header.tsx (1)

1-1: @ts-nocheck suppresses all type safety for this file.

This directive hides potential type errors introduced by these or future changes. Consider removing it and fixing any underlying type issues, or at minimum adding a tracking issue to address it later.

package.json (1)

75-75: Inconsistent version pinning for lint-staged.

Changed from exact "16.2.7" to caret "^16.2.7", while sibling devDependencies (husky, prettier, vitest) use exact versions. Consider keeping exact pinning for consistency and reproducible builds.

Proposed fix
-    "lint-staged": "^16.2.7",
+    "lint-staged": "16.2.7",
.planning/codebase/CONCERNS.md (1)

53-56: Capitalize "GitHub" in theme names.

Lines 53 and 55 reference "github" terminal theme — should be "GitHub" when referring to the platform name in prose (per static analysis hint).

.planning/codebase/ARCHITECTURE.md (1)

39-39: Consider updating service list to reflect the decomposition.

Line 39 lists AutoModeService as a single service, but this PR decomposes it into AutoModeServiceCompat, AutoModeServiceFacade, GlobalAutoModeService, and several sub-services. Consider updating this list to reflect the new architecture, since this doc is being added in the same PR.

apps/server/src/services/worktree-resolver.ts (1)

65-102: Duplicated porcelain parsing logic between findWorktreeForBranch and listWorktrees.

Both methods parse git worktree list --porcelain output with nearly identical line-by-line iteration. findWorktreeForBranch could delegate to listWorktrees and filter by branch, eliminating ~35 lines of duplication.

Suggested refactor
  async findWorktreeForBranch(projectPath: string, branchName: string): Promise<string | null> {
-   try {
-     const { stdout } = await execAsync('git worktree list --porcelain', {
-       cwd: projectPath,
-     });
-
-     const lines = stdout.split('\n');
-     let currentPath: string | null = null;
-     let currentBranch: string | null = null;
-
-     for (const line of lines) {
-       if (line.startsWith('worktree ')) {
-         currentPath = line.slice(9);
-       } else if (line.startsWith('branch ')) {
-         currentBranch = line.slice(7).replace('refs/heads/', '');
-       } else if (line === '' && currentPath && currentBranch) {
-         if (currentBranch === branchName) {
-           return this.resolvePath(projectPath, currentPath);
-         }
-         currentPath = null;
-         currentBranch = null;
-       }
-     }
-
-     if (currentPath && currentBranch && currentBranch === branchName) {
-       return this.resolvePath(projectPath, currentPath);
-     }
-
-     return null;
-   } catch {
-     return null;
-   }
+   const worktrees = await this.listWorktrees(projectPath);
+   const match = worktrees.find(wt => wt.branch === branchName);
+   return match?.path ?? null;
  }
apps/server/tests/unit/services/auto-loop-coordinator.test.ts (1)

2-13: Consider using path alias for consistency with other test files.

Other test files in this PR (e.g., recovery-service.test.ts) use the @/ path alias for server-internal imports. This file uses deep relative paths (../../../src/services/...). Consider aligning for consistency.

🔧 Suggested import alignment
-import {
-  AutoLoopCoordinator,
-  getWorktreeAutoLoopKey,
-  type AutoModeConfig,
-  type ProjectAutoLoopState,
-  type ExecuteFeatureFn,
-  type LoadPendingFeaturesFn,
-  type SaveExecutionStateFn,
-  type ClearExecutionStateFn,
-  type ResetStuckFeaturesFn,
-  type IsFeatureFinishedFn,
-} from '../../../src/services/auto-loop-coordinator.js';
-import type { TypedEventBus } from '../../../src/services/typed-event-bus.js';
-import type { ConcurrencyManager } from '../../../src/services/concurrency-manager.js';
-import type { SettingsService } from '../../../src/services/settings-service.js';
+import {
+  AutoLoopCoordinator,
+  getWorktreeAutoLoopKey,
+  type AutoModeConfig,
+  type ProjectAutoLoopState,
+  type ExecuteFeatureFn,
+  type LoadPendingFeaturesFn,
+  type SaveExecutionStateFn,
+  type ClearExecutionStateFn,
+  type ResetStuckFeaturesFn,
+  type IsFeatureFinishedFn,
+} from '@/services/auto-loop-coordinator.js';
+import type { TypedEventBus } from '@/services/typed-event-bus.js';
+import type { ConcurrencyManager } from '@/services/concurrency-manager.js';
+import type { SettingsService } from '@/services/settings-service.js';
apps/server/src/services/merge-service.ts (1)

36-56: execGitCommand and isValidBranchName are duplicated from routes/worktree/common.ts.

Both helpers exist in apps/server/src/routes/worktree/common.ts with nearly identical implementations (the only difference is isValidBranchName uses MAX_BRANCH_NAME_LENGTH there vs. 250 here). Consider importing from the shared location or extracting to a common utility to avoid drift.

apps/server/src/services/feature-state-manager.ts (1)

168-192: markFeatureInterrupted silently returns when feature is not found.

When loadFeature returns null (Line 174), currentStatus is undefined, the pipeline_* check passes through, and updateFeatureStatus is called which also returns silently on null feature (Line 100). This is defensive but consider logging a warning at the markFeatureInterrupted level when the feature can't be found, to aid debugging crash-recovery issues.

apps/server/tests/unit/services/feature-state-manager.test.ts (1)

248-264: Same mock issue in "should mark feature as interrupted" test.

Line 250 mocks secureFs.readFile but the actual code path goes through readJsonWithRecovery. Lines 253–257 correctly mock readJsonWithRecovery for this test, but the secureFs.readFile mock on Line 250 is dead code. It won't cause failures but adds confusion about which mock is actually exercised.

🧹 Remove dead mock
   it('should mark feature as interrupted', async () => {
-    (secureFs.readFile as Mock).mockResolvedValue(
-      JSON.stringify({ ...mockFeature, status: 'in_progress' })
-    );
     (readJsonWithRecovery as Mock).mockResolvedValue({
       data: { ...mockFeature, status: 'in_progress' },
       recovered: false,
       source: 'main',
     });
apps/server/src/services/spec-parser.ts (1)

176-227: Local truncate shadows a utility in @automaker/utils.

The local truncate helper (Line 178) has different semantics from the one in libs/utils/src/string-utils.ts — it extracts the first paragraph and then truncates, which is purpose-specific to summary extraction. Consider renaming it (e.g., truncateFirstParagraph) to avoid confusion with the shared utility.

apps/server/src/services/recovery-service.ts (2)

78-90: Consider a deps/config object for the 11-parameter constructor.

The constructor has 11 parameters which makes call-site readability and future extensibility challenging. A common pattern is to accept a single options/deps object.

♻️ Example deps-object pattern
interface RecoveryServiceDeps {
  eventBus: TypedEventBus;
  concurrencyManager: ConcurrencyManager;
  settingsService: SettingsService | null;
  executeFeatureFn: ExecuteFeatureFn;
  loadFeatureFn: LoadFeatureFn;
  detectPipelineStatusFn: DetectPipelineStatusFn;
  resumePipelineFn: ResumePipelineFn;
  isFeatureRunningFn: IsFeatureRunningFn;
  acquireRunningFeatureFn: AcquireRunningFeatureFn;
  releaseRunningFeatureFn: ReleaseRunningFeatureFn;
}

export class RecoveryService {
  constructor(private deps: RecoveryServiceDeps) {}
  // access as this.deps.eventBus, etc.
}

250-301: Sequential resumption of interrupted features may be slow with many features.

resumeInterruptedFeatures awaits each resumeFeature call sequentially (Line 293). If there are many interrupted features, this blocks progress on later features while earlier ones complete. Consider firing and tracking promises concurrently (up to maxConcurrency), or at minimum fire-and-forget with error logging, since each resumeFeature is self-contained.

apps/server/src/services/plan-approval-service.ts (1)

306-312: planApprovalTimeoutMs accessed via as any cast — consider adding it to the ProjectSettings type.

The eslint-disable + as any here silently bypasses type safety. If ProjectSettings is under your control, adding planApprovalTimeoutMs?: number to that interface would be cleaner and would ensure downstream consumers know this setting exists.

apps/server/tests/unit/services/plan-approval-service.test.ts (1)

42-43: Unused flushPromises helper.

flushPromises is defined but never called — all tests use vi.advanceTimersByTimeAsync(0) directly instead. Consider removing it to avoid confusion.

apps/server/src/services/execution-types.ts (1)

89-94: _calledInternally parameter with underscore prefix in a type definition.

The underscore prefix convention signals an unused parameter at the implementation site, but embedding that convention in the type definition leaks implementation detail. Consumers implementing this type see _calledInternally and may be confused about whether it's intentionally ignored. Consider naming it calledInternally in the type and letting implementers use the underscore if they choose.

Same applies to ExecuteFeatureFn at Line 143.

apps/server/src/services/ideation-service.ts (1)

233-235: Fix looks correct for custom provider model handling.

When a custom provider (e.g., GLM, MiniMax) is detected, using options.model directly avoids resolving to a Claude model name that the custom endpoint wouldn't recognize. The default path (no custom provider) still uses the resolved model from line 219.

Note: the coding guideline states to always use resolveModelString() before API calls. Here it's intentionally bypassed for custom providers — the CRITICAL comment documents why. Consider extracting this into a small helper (e.g., resolveModelForProvider) that encapsulates this conditional logic, so the intent is clear without relying solely on comments.

apps/server/tests/unit/services/typed-event-bus.test.ts (1)

134-164: Event type list is incomplete relative to AutoModeEventType definition.

The "all standard auto-mode event types" test on Lines 134-152 is missing several types defined in typed-event-bus.ts: auto_mode_task_status, auto_mode_phase_complete, auto_mode_summary, auto_mode_resuming_features, plan_auto_approved, plan_revision_requested, plan_revision_warning, pipeline_step_started, pipeline_step_complete, and auto_mode_phase. Not a bug (since the | string catch-all ensures they work), but the test name "should handle all standard auto-mode event types" is misleading. Consider either completing the list or renaming the test.

apps/server/src/services/typed-event-bus.ts (1)

19-45: The | string catch-all negates type safety from the literal union.

The explicit event type literals serve as documentation but provide no compile-time safety since | string widens the type to accept any string. If extensibility is needed, consider a branded/template-literal pattern or removing | string and adding new event types explicitly as they're introduced. This would catch typos like passing 'auto_mode_strated' instead of 'auto_mode_started'.

Also, 'auto_mode_phase' (used in the UI's SINGLE_FEATURE_INVALIDATION_EVENTS) is missing from the explicit list, though it's covered by | string.

apps/server/src/services/auto-mode/global-service.ts (3)

62-67: Good safety guard, but consider logging when it fires.

The throwing executeFeatureFn is a solid defensive measure. However, if this ever fires unexpectedly due to a wiring bug, the generic Error might be caught and logged far from here without clear context.

Consider adding a logger.error(...) before the throw so it's easy to trace in logs.


105-112: getSharedServices() exposes the monitoring-only autoLoopCoordinator.

The returned autoLoopCoordinator has a throwing executeFeatureFn. Consumers of SharedServices must know not to use it for execution. The facade code (in facade.ts) correctly creates its own coordinator, but the type system doesn't enforce this — any caller with SharedServices could accidentally call execution methods on it.

Consider either:

  1. Omitting autoLoopCoordinator from SharedServices (facades create their own anyway), or
  2. Adding a prominent JSDoc warning on the SharedServices.autoLoopCoordinator type.

195-207: Sequential markFeatureInterrupted calls during shutdown.

The for...of with await processes features one at a time. During graceful shutdown with many running features, this could be slow. Using Promise.allSettled would be faster and still resilient to individual failures.

♻️ Suggested refactor
  async markAllRunningFeaturesInterrupted(reason?: string): Promise<void> {
    const allRunning = this.concurrencyManager.getAllRunning();

-   for (const rf of allRunning) {
-     await this.featureStateManager.markFeatureInterrupted(rf.projectPath, rf.featureId, reason);
-   }
+   await Promise.allSettled(
+     allRunning.map((rf) =>
+       this.featureStateManager.markFeatureInterrupted(rf.projectPath, rf.featureId, reason)
+     )
+   );

    if (allRunning.length > 0) {
      logger.info(
        `Marked ${allRunning.length} running feature(s) as interrupted: ${reason || 'no reason provided'}`
      );
    }
  }
apps/server/tests/unit/services/agent-executor.test.ts (2)

268-281: Function.length is fragile for verifying method signatures.

executor.execute.length only counts parameters before the first one with a default value. If a default is added later, this test silently becomes wrong. Consider removing this or documenting the caveat.


90-119: Constructor dependency tests don't verify storage — effectively duplicates.

Tests at lines 90-119 ("should store eventBus dependency", "should store featureStateManager dependency", "should store planApprovalService dependency") all just assert executor is toBeDefined(), identical to the first constructor test. They don't verify the dependency is actually stored or accessible. Consider either removing them or testing via observable behavior (e.g., verifying eventBus is called during execute).

apps/server/src/services/auto-loop-coordinator.ts (1)

181-192: Fire-and-forget executeFeatureFn — intentional but worth noting.

The executeFeatureFn at line 181 is called without await, relying solely on .catch() for error handling. This is intentional for concurrency (the loop should proceed to start more features). The .catch() handler logs and checks for pause conditions.

However, if .catch() itself throws (unlikely but possible if trackFailureAndCheckPauseForProject or signalShouldPauseForProject throws), it becomes an unhandled rejection. Consider wrapping the .catch() body in a try-catch.

apps/server/src/services/agent-executor.ts (2)

336-336: || 100 treats maxTurns: 0 as falsy — use ?? for nullish coalescing.

Math.min(sdkOptions?.maxTurns || 100, 50) — if maxTurns is explicitly set to 0, || falls through to 100, then caps at 50. While 0 maxTurns is likely not a valid use case, ?? is more correct:

-      this.buildExecOpts(options, taskPrompt, Math.min(sdkOptions?.maxTurns || 100, 50))
+      this.buildExecOpts(options, taskPrompt, Math.min(sdkOptions?.maxTurns ?? 100, 50))

126-145: Raw output uses pretty-printed JSON in a JSONL file.

JSON.stringify({ ... }, null, 4) produces multi-line JSON. Since the file extension is .jsonl (JSON Lines), each entry should be a single line. This makes the file unparseable with standard JSONL tools.

Since this is debug-only output (gated by AUTOMAKER_DEBUG_RAW_OUTPUT), it's low impact, but worth fixing for correctness.

♻️ Suggested fix
        rawOutputLines.push(
-         JSON.stringify({ timestamp: new Date().toISOString(), event }, null, 4)
+         JSON.stringify({ timestamp: new Date().toISOString(), event })
        );
apps/server/tests/unit/services/execution-service.test.ts (1)

170-262: Consider a helper factory to reduce repeated ExecutionService construction.

Many tests (lines 366-385, 481-500, 525-544, 574-593, etc.) re-create ExecutionService with 16+ constructor arguments when only one mock differs. A factory helper with overridable defaults would significantly reduce boilerplate.

♻️ Example helper
function createService(overrides: Partial<{
  loadFeatureFn: LoadFeatureFn;
  runAgentFn: RunAgentFn;
  contextExistsFn: ContextExistsFn;
  trackFailureFn: TrackFailureFn;
  // ... other overridable fns
}> = {}) {
  return new ExecutionService(
    mockEventBus,
    mockConcurrencyManager,
    mockWorktreeResolver,
    mockSettingsService,
    overrides.runAgentFn ?? mockRunAgentFn,
    mockExecutePipelineFn,
    mockUpdateFeatureStatusFn,
    overrides.loadFeatureFn ?? mockLoadFeatureFn,
    // ... etc
  );
}
apps/server/src/services/concurrency-manager.ts (1)

64-66: Consider a safer default for getCurrentBranch.

Defaulting to 'main' can miscount worktree capacity in repos with master or custom primary branches when the dependency isn’t injected. A null default avoids misclassification.

🛠 Suggested tweak
-    this.getCurrentBranch = getCurrentBranch ?? (() => Promise.resolve('main'));
+    this.getCurrentBranch = getCurrentBranch ?? (() => Promise.resolve(null));

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

🤖 Fix all issues with AI agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.

In `@apps/server/src/services/agent-executor.ts`:
- Around line 335-337: The Math.min call uses an unreachable fallback; replace
Math.min(sdkOptions?.maxTurns || 100, 50) with a simpler cap such as
sdkOptions?.maxTurns ? Math.min(sdkOptions.maxTurns, 50) : 50 (or
Math.min(sdkOptions?.maxTurns ?? 50, 50)) when calling provider.executeQuery so
buildExecOpts receives a capped maxTurns; update the expression used in the
provider.executeQuery invocation in agent-executor.ts (referencing
provider.executeQuery, buildExecOpts, and sdkOptions?.maxTurns).

In `@apps/server/src/services/auto-mode/facade.ts`:
- Around line 225-235: The template replacement uses a redundant fallback
`task.description || task.description` in the buildTaskPrompt function; update
both occurrences to use a real fallback such as `task.description || task.name`
(or another appropriate field) so the template token `{{taskDescription}}` has a
sensible fallback when description is missing—locate the buildTaskPrompt
implementation and the second similar block and replace the no-op fallback with
`task.name`.
- Around line 637-683: verifyFeature currently builds worktreePath using
path.join(this.projectPath, '.worktrees', sanitizedFeatureId) which is
inconsistent with commitFeature's use of worktreeResolver; update verifyFeature
to use this.worktreeResolver.findWorktreeForBranch(...) to locate the worktree
(falling back to sanitizedFeatureId or feature.branchName as appropriate)
instead of constructing '.worktrees' manually, and use the returned path for
execAsync calls and secureFs.access checks so it matches commitFeature's
behavior.
- Around line 54-67: Remove the inline execGitCommand implementation and replace
calls to execGitCommand with the standardized git utilities from
`@automaker/git-utils` (or update `@automaker/git-utils` to export a git executor
and worktree management functions and then import those). Specifically, stop
using the local execGitCommand helper and its spawnProcess usage; instead call
the exported git execution and worktree APIs (e.g., gitExec, createWorktree,
removeWorktree — or whatever names are added) from `@automaker/git-utils` to
ensure isolated worktrees and consistent git behavior across feature execution
paths.
- Around line 167-238: The anonymous async runAgentFn bodies contain duplicated
provider resolution, credential lookup and agentExecutor.execute invocation
(including buildTaskPrompt logic); extract shared helpers inside create such as
a getResolvedProvider(model) helper that uses
ProviderFactory.getProviderForModel, stripProviderPrefix and
getProviderByModelId to return {provider, effectiveBareModel, credentials,
claudeCompatibleProvider}, and extract a buildTaskPrompt(task, allTasks,
taskIndex, template, feedback) helper to centralize the template replacements;
then replace both inline blocks (the async runAgentFn passed to
PipelineOrchestrator and the one passed to ExecutionService) to call these
helpers and invoke agentExecutor.execute with the unified parameter object and
callbacks.
- Around line 660-669: The loop in verifyFeature currently uses
execAsync(check.cmd) which is inconsistent with the spawnProcess/execGitCommand
pattern and is vulnerable to stdout/stderr buffering and package-manager
differences; replace the execAsync call with the existing spawn-based helper
(the same pattern used by execGitCommand/spawnProcess in this file) so you
stream and accumulate stdout/stderr into a single output string, handle timeouts
via the spawn helper, and push results.push({ check: check.name, passed:
true/false, output }) using the streamed output; additionally detect package
manager by checking for lockfiles (yarn.lock, pnpm-lock.yaml, package-lock.json)
and normalize/translate common hardcoded npm commands in verificationChecks to
the appropriate runner (yarn/pnpm/npm) before spawning so scripts work across
projects.

In `@apps/server/src/services/execution-service.ts`:
- Around line 339-364: The catch block currently handles aborts by emitting an
auto_mode_feature_complete event when errorInfo.isAbort is true but does not
update the feature state; call await this.updateFeatureStatusFn(projectPath,
featureId, 'interrupted') in the errorInfo.isAbort branch (near where
eventBus.emitAutoModeEvent('auto_mode_feature_complete', ... ) is invoked) so
the feature status reflects a user-initiated stop (using the 'interrupted'
status used by interruptFeature()/feature-state-manager) before
returning/finishing the handler.

In `@apps/server/src/services/feature-state-manager.ts`:
- Around line 212-282: In resetStuckFeatures, add counters (e.g.,
featuresScanned and featuresReset) while iterating entries in resetStuckFeatures
and increment featuresScanned for each directory and featuresReset whenever
needsUpdate is true; after the loop (before the catch) emit a logger.info
summary like `[resetStuckFeatures] Scanned X features, reset Y features for
${projectPath}` so you get a concise post-run summary to aid debugging.
- Around line 291-340: The updateFeaturePlanSpec function persists planSpec
changes but never emits an event; after the atomicWriteJson call add a
post-persist emission using createEventEmitter() to send a "plan_spec_updated"
event (include identifying payload such as featureId and the updated
feature.planSpec or entire feature) so the frontend is notified; follow the
existing persist-before-emit pattern used by saveFeatureSummary/updateTaskStatus
and ensure you call createEventEmitter().emit(...) immediately after
atomicWriteJson completes.

In `@apps/server/src/services/pipeline-orchestrator.ts`:
- Around line 362-374: After calling this.executePipeline(...) inside
resumeFromStep, re-fetch the feature's current status (e.g., loadFeature or
whichever method returns the latest feature record) and only call
updateFeatureStatusFn(projectPath, featureId, finalStatus) when the reloaded
feature is not already in a terminal/status that should be preserved
(specifically check for 'merge_conflict' and any other terminal statuses used
elsewhere); this prevents resumeFromStep from overwriting the merge_conflict
status set by executePipeline, mirroring the reload-and-check approach used in
execution-service.ts before proceeding.
- Around line 455-478: waitForTestCompletion currently polls for up to 600s
without honoring aborts; update its signature to accept an AbortSignal (e.g.,
add parameter signal: AbortSignal) and inside the interval callback and timeout
handler check signal.aborted, clearing both timers and resolving early (e.g., {
status: 'failed', exitCode: null, duration: ... } or propagate abort behavior)
when aborted; ensure you reference waitForTestCompletion, sessionId,
this.testRunnerService.getSession, clearInterval(checkInterval) and
clearTimeout(timeoutId) so both timers are cleared on abort to stop polling
immediately.
- Around line 486-496: Replace the hardcoded 'main' target in the performMerge
call with a dynamic primary branch lookup using the injected worktreeResolver:
call await this.worktreeResolver.getCurrentBranch(projectPath) and pass that
value instead of the literal 'main' (in the context of PipelineOrchestrator
where performMerge is invoked). Ensure the surrounding function remains async so
the await works and reuse the same projectPath argument when calling
getCurrentBranch.
- Around line 534-548: buildTestFailureSummary and extractFailedTestNames
duplicate the same line-parsing logic for detecting failures (regex for
FAIL/FAILED, leading '> ... .test|.spec.', and assertion keywords). Extract that
shared logic into a single helper function (e.g., parseTestLine or
collectFailedInfo) and replace the duplicated loops in buildTestFailureSummary
and extractFailedTestNames to call this helper; ensure the helper
returns/updates failCount, passCount, and failedTests (or returns extracted test
names and counts) and preserves the existing regexes and trimming behavior so
both callers remain in sync.

In `@apps/server/src/services/plan-approval-service.ts`:
- Around line 95-108: The timeout handler currently calls the raw reject from
the Promise constructor; move the creation of wrappedReject (and wrappedResolve
if present) to before the setTimeout call (or move the setTimeout after the
wrappers) and replace the raw reject in the timeout callback with wrappedReject
so the timeout path uses the same cleanup logic; update the timeout callback
that references setTimeout/timeoutId/pendingApprovals/featureId/timeoutMinutes
to call wrappedReject(...) and retain the delete(this.pendingApprovals) and
logger.warn behavior.

In `@apps/ui/src/components/views/graph-view-page.tsx`:
- Around line 316-328: The catch block after calling handleAddFeature and
handleStartImplementation currently only logs errors via logger.error; update it
to also show a user-facing toast by calling toast.error with a concise message
(e.g., "Failed to add and start feature") and include the error.message or
stringified error for context, so replace or augment the logger.error call in
that catch to call toast.error alongside logger.error; locate the block using
useAppStore.getState(), handleAddFeature, handleStartImplementation and
logger.error to make the change.
🧹 Nitpick comments (7)
🤖 Fix all nitpicks with AI agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.

In `@apps/server/src/services/auto-mode/facade.ts`:
- Around line 167-238: The anonymous async runAgentFn bodies contain duplicated
provider resolution, credential lookup and agentExecutor.execute invocation
(including buildTaskPrompt logic); extract shared helpers inside create such as
a getResolvedProvider(model) helper that uses
ProviderFactory.getProviderForModel, stripProviderPrefix and
getProviderByModelId to return {provider, effectiveBareModel, credentials,
claudeCompatibleProvider}, and extract a buildTaskPrompt(task, allTasks,
taskIndex, template, feedback) helper to centralize the template replacements;
then replace both inline blocks (the async runAgentFn passed to
PipelineOrchestrator and the one passed to ExecutionService) to call these
helpers and invoke agentExecutor.execute with the unified parameter object and
callbacks.
- Around line 660-669: The loop in verifyFeature currently uses
execAsync(check.cmd) which is inconsistent with the spawnProcess/execGitCommand
pattern and is vulnerable to stdout/stderr buffering and package-manager
differences; replace the execAsync call with the existing spawn-based helper
(the same pattern used by execGitCommand/spawnProcess in this file) so you
stream and accumulate stdout/stderr into a single output string, handle timeouts
via the spawn helper, and push results.push({ check: check.name, passed:
true/false, output }) using the streamed output; additionally detect package
manager by checking for lockfiles (yarn.lock, pnpm-lock.yaml, package-lock.json)
and normalize/translate common hardcoded npm commands in verificationChecks to
the appropriate runner (yarn/pnpm/npm) before spawning so scripts work across
projects.

In `@apps/server/src/services/feature-state-manager.ts`:
- Around line 212-282: In resetStuckFeatures, add counters (e.g.,
featuresScanned and featuresReset) while iterating entries in resetStuckFeatures
and increment featuresScanned for each directory and featuresReset whenever
needsUpdate is true; after the loop (before the catch) emit a logger.info
summary like `[resetStuckFeatures] Scanned X features, reset Y features for
${projectPath}` so you get a concise post-run summary to aid debugging.
- Around line 291-340: The updateFeaturePlanSpec function persists planSpec
changes but never emits an event; after the atomicWriteJson call add a
post-persist emission using createEventEmitter() to send a "plan_spec_updated"
event (include identifying payload such as featureId and the updated
feature.planSpec or entire feature) so the frontend is notified; follow the
existing persist-before-emit pattern used by saveFeatureSummary/updateTaskStatus
and ensure you call createEventEmitter().emit(...) immediately after
atomicWriteJson completes.

In `@apps/server/src/services/pipeline-orchestrator.ts`:
- Around line 534-548: buildTestFailureSummary and extractFailedTestNames
duplicate the same line-parsing logic for detecting failures (regex for
FAIL/FAILED, leading '> ... .test|.spec.', and assertion keywords). Extract that
shared logic into a single helper function (e.g., parseTestLine or
collectFailedInfo) and replace the duplicated loops in buildTestFailureSummary
and extractFailedTestNames to call this helper; ensure the helper
returns/updates failCount, passCount, and failedTests (or returns extracted test
names and counts) and preserves the existing regexes and trimming behavior so
both callers remain in sync.

In `@apps/server/src/services/plan-approval-service.ts`:
- Around line 95-108: The timeout handler currently calls the raw reject from
the Promise constructor; move the creation of wrappedReject (and wrappedResolve
if present) to before the setTimeout call (or move the setTimeout after the
wrappers) and replace the raw reject in the timeout callback with wrappedReject
so the timeout path uses the same cleanup logic; update the timeout callback
that references setTimeout/timeoutId/pendingApprovals/featureId/timeoutMinutes
to call wrappedReject(...) and retain the delete(this.pendingApprovals) and
logger.warn behavior.

In `@apps/ui/src/components/views/graph-view-page.tsx`:
- Around line 316-328: The catch block after calling handleAddFeature and
handleStartImplementation currently only logs errors via logger.error; update it
to also show a user-facing toast by calling toast.error with a concise message
(e.g., "Failed to add and start feature") and include the error.message or
stringified error for context, so replace or augment the logger.error call in
that catch to call toast.error alongside logger.error; locate the block using
useAppStore.getState(), handleAddFeature, handleStartImplementation and
logger.error to make the change.
apps/ui/src/components/views/graph-view-page.tsx (1)

316-328: Good addition of error handling and the new-feature guard.

The snapshot-and-diff approach to detect the newly added feature is sound, and the try/catch prevents unhandled rejections. One small gap: other error paths in this file surface failures to the user via toast.error (e.g., Line 175), but this catch block only logs silently. Consider adding a toast so the user knows the "add & start" action failed.

Suggested improvement
       } catch (error) {
         logger.error('Failed to add and start feature:', error);
+        toast.error('Failed to add and start feature');
       }
🤖 Prompt for AI Agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@apps/ui/src/components/views/graph-view-page.tsx` around lines 316 - 328, The
catch block after calling handleAddFeature and handleStartImplementation
currently only logs errors via logger.error; update it to also show a
user-facing toast by calling toast.error with a concise message (e.g., "Failed
to add and start feature") and include the error.message or stringified error
for context, so replace or augment the logger.error call in that catch to call
toast.error alongside logger.error; locate the block using
useAppStore.getState(), handleAddFeature, handleStartImplementation and
logger.error to make the change.
apps/server/src/services/pipeline-orchestrator.ts (1)

534-548: Duplicate FAIL-line extraction logic between buildTestFailureSummary and extractFailedTestNames.

Both methods iterate over scrollback lines with the same FAIL/FAILED regex matching. Consider extracting the shared logic into a single helper to keep them in sync.

🤖 Prompt for AI Agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@apps/server/src/services/pipeline-orchestrator.ts` around lines 534 - 548,
buildTestFailureSummary and extractFailedTestNames duplicate the same
line-parsing logic for detecting failures (regex for FAIL/FAILED, leading '> ...
.test|.spec.', and assertion keywords). Extract that shared logic into a single
helper function (e.g., parseTestLine or collectFailedInfo) and replace the
duplicated loops in buildTestFailureSummary and extractFailedTestNames to call
this helper; ensure the helper returns/updates failCount, passCount, and
failedTests (or returns extracted test names and counts) and preserves the
existing regexes and trimming behavior so both callers remain in sync.
apps/server/src/services/plan-approval-service.ts (1)

95-108: Timeout callback uses raw reject instead of wrappedReject.

The timeout handler (Line 102) calls the raw reject from the Promise constructor rather than wrappedReject. While this works (the timeout doesn't need to clear itself since it already fired), it's inconsistent with the stated design of "wrapped resolve/reject ensures cleanup." If the timeout handler used wrappedReject, it would be self-documenting and safe against future refactoring that might add cleanup logic to the wrapped functions.

✨ Optional: use wrappedReject in timeout

This requires reordering the declarations so wrappedReject is defined before the setTimeout call, or moving the timeout setup after the wrappers.

+     const wrappedReject = (error: Error) => {
+       clearTimeout(timeoutId);
+       reject(error);
+     };
+
      const timeoutId = setTimeout(() => {
        const pending = this.pendingApprovals.get(key);
        if (pending) {
          logger.warn(...);
          this.pendingApprovals.delete(key);
-         reject(
+         wrappedReject(
            new Error(...)
          );
        }
      }, timeoutMs);

      const wrappedResolve = (result: PlanApprovalResult) => {
        clearTimeout(timeoutId);
        resolve(result);
      };

-     const wrappedReject = (error: Error) => {
-       clearTimeout(timeoutId);
-       reject(error);
-     };
🤖 Prompt for AI Agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@apps/server/src/services/plan-approval-service.ts` around lines 95 - 108, The
timeout handler currently calls the raw reject from the Promise constructor;
move the creation of wrappedReject (and wrappedResolve if present) to before the
setTimeout call (or move the setTimeout after the wrappers) and replace the raw
reject in the timeout callback with wrappedReject so the timeout path uses the
same cleanup logic; update the timeout callback that references
setTimeout/timeoutId/pendingApprovals/featureId/timeoutMinutes to call
wrappedReject(...) and retain the delete(this.pendingApprovals) and logger.warn
behavior.
apps/server/src/services/auto-mode/facade.ts (2)

167-238: Significant code duplication: runAgentFn callback defined twice with near-identical logic.

The runAgentFn callback for PipelineOrchestrator (Lines 167-238) and ExecutionService (Lines 286-364) share nearly identical provider resolution, credential handling, and agentExecutor.execute() invocation. The same applies to buildTaskPrompt (Lines 225-235 vs 351-361). Consider extracting these into shared helper functions within the create method to avoid drift.

Also applies to: 286-364

🤖 Prompt for AI Agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@apps/server/src/services/auto-mode/facade.ts` around lines 167 - 238, The
anonymous async runAgentFn bodies contain duplicated provider resolution,
credential lookup and agentExecutor.execute invocation (including
buildTaskPrompt logic); extract shared helpers inside create such as a
getResolvedProvider(model) helper that uses ProviderFactory.getProviderForModel,
stripProviderPrefix and getProviderByModelId to return {provider,
effectiveBareModel, credentials, claudeCompatibleProvider}, and extract a
buildTaskPrompt(task, allTasks, taskIndex, template, feedback) helper to
centralize the template replacements; then replace both inline blocks (the async
runAgentFn passed to PipelineOrchestrator and the one passed to
ExecutionService) to call these helpers and invoke agentExecutor.execute with
the unified parameter object and callbacks.

660-669: verifyFeature uses shell-based execAsync for verification commands.

While the commands are hardcoded (not user-derived), this is inconsistent with the execGitCommand pattern using spawnProcess introduced elsewhere in this file. Also, exec doesn't handle stdout/stderr buffering limits well for large outputs.

Additionally, the hardcoded npm commands may fail for projects using yarn, pnpm, or differently-named scripts.

🤖 Prompt for AI Agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@apps/server/src/services/auto-mode/facade.ts` around lines 660 - 669, The
loop in verifyFeature currently uses execAsync(check.cmd) which is inconsistent
with the spawnProcess/execGitCommand pattern and is vulnerable to stdout/stderr
buffering and package-manager differences; replace the execAsync call with the
existing spawn-based helper (the same pattern used by
execGitCommand/spawnProcess in this file) so you stream and accumulate
stdout/stderr into a single output string, handle timeouts via the spawn helper,
and push results.push({ check: check.name, passed: true/false, output }) using
the streamed output; additionally detect package manager by checking for
lockfiles (yarn.lock, pnpm-lock.yaml, package-lock.json) and normalize/translate
common hardcoded npm commands in verificationChecks to the appropriate runner
(yarn/pnpm/npm) before spawning so scripts work across projects.
apps/server/src/services/feature-state-manager.ts (2)

212-282: resetStuckFeatures silently continues on per-feature read failures.

If readJsonWithRecovery returns null for a feature (Line 228), the loop skips it. This is fine. However, there's no summary log indicating how many features were reset. A brief summary log after the loop would aid debugging after server restarts.

✨ Optional: Add summary log
+      let resetCount = 0;
       for (const entry of entries) {
         ...
         if (needsUpdate) {
+          resetCount++;
           feature.updatedAt = new Date().toISOString();
           await atomicWriteJson(featurePath, feature, { backupCount: DEFAULT_BACKUP_COUNT });
         }
       }
+      if (resetCount > 0) {
+        logger.info(`[resetStuckFeatures] Reset ${resetCount} stuck feature(s) for ${projectPath}`);
+      }
🤖 Prompt for AI Agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@apps/server/src/services/feature-state-manager.ts` around lines 212 - 282, In
resetStuckFeatures, add counters (e.g., featuresScanned and featuresReset) while
iterating entries in resetStuckFeatures and increment featuresScanned for each
directory and featuresReset whenever needsUpdate is true; after the loop (before
the catch) emit a logger.info summary like `[resetStuckFeatures] Scanned X
features, reset Y features for ${projectPath}` so you get a concise post-run
summary to aid debugging.

291-340: updateFeaturePlanSpec does not emit an event after persistence.

Unlike saveFeatureSummary and updateTaskStatus, this method persists changes but never emits an event. Callers in AgentExecutor compensate by emitting their own events (e.g. plan_approval_required, plan_approved), but if a caller forgets, the UI won't be notified of planSpec changes.

Consider emitting a plan_spec_updated event here for consistency with the persist-before-emit pattern used elsewhere in this class. As per coding guidelines, "All server operations should emit events using createEventEmitter() from lib/events.ts that stream to the frontend via WebSocket."

🤖 Prompt for AI Agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@apps/server/src/services/feature-state-manager.ts` around lines 291 - 340,
The updateFeaturePlanSpec function persists planSpec changes but never emits an
event; after the atomicWriteJson call add a post-persist emission using
createEventEmitter() to send a "plan_spec_updated" event (include identifying
payload such as featureId and the updated feature.planSpec or entire feature) so
the frontend is notified; follow the existing persist-before-emit pattern used
by saveFeatureSummary/updateTaskStatus and ensure you call
createEventEmitter().emit(...) immediately after atomicWriteJson completes.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.

In `@apps/server/src/services/auto-loop-coordinator.ts`:
- Around line 283-318: The method trackFailureAndCheckPauseForProject (and
similarly signalShouldPauseForProject) dangerously uses errorInfo! for the new
signature; add a runtime guard when branchNameOrError is treated as the
branchName to ensure errorInfo is present before using it — either throw a clear
runtime error, return false, or supply a safe default error object if errorInfo
is undefined; alternatively implement proper TypeScript overloads or split into
two methods to eliminate the optional errorInfo ambiguity so actualErrorInfo is
never assigned from an undefined value.
- Around line 46-51: The local ExecuteFeatureFn type in auto-loop-coordinator.ts
diverges from the canonical signature; replace the local declaration with an
import of ExecuteFeatureFn from execution-types.ts (or update it to include the
missing parameters providedWorktreePath and options matching the canonical
definition) so the function signature (ExecuteFeatureFn) aligns with the
exported type and avoids drift.
- Around line 68-82: The AutoLoopCoordinator constructor currently accepts 10
positional dependencies (constructor in class AutoLoopCoordinator), which is
hard to read and extend; refactor it to accept a single options/config object
(e.g., AutoLoopCoordinatorOptions) containing named properties for eventBus,
concurrencyManager, settingsService, executeFeatureFn, loadPendingFeaturesFn,
saveExecutionStateFn, clearExecutionStateFn, resetStuckFeaturesFn,
isFeatureFinishedFn and isFeatureRunningFn, update the constructor to
destructure and assign those properties to the private fields, and update all
instantiation sites to pass a single options object (preserving original types
and nullability for settingsService) so call-sites remain readable and future
optional deps can be added without changing call signatures.
- Around line 184-202: The loop currently picks only one nextFeature per
iteration causing slow ramp-up; change the logic in the auto-loop to compute
available slots as (projectState.config.maxConcurrency || maxConcurrency) minus
the current runningCount, then select and launch up to that many features from
pendingFeatures that satisfy !this.isFeatureRunningFn(f.id) &&
!this.isFeatureFinishedFn(f). For each selected feature call
this.executeFeatureFn(projectPath, feature.id, projectState.config.useWorktrees,
true) and attach the same .catch handler that classifies the error
(classifyError), logs via logger.error, and calls
this.trackFailureAndCheckPauseForProject and this.signalShouldPauseForProject as
now; keep the final await this.sleep(2000, projectState.abortController.signal)
unchanged. Ensure runningCount/slot computation uses the same source of truth
used elsewhere (e.g., projectState or config) and that launches are started
without awaiting so they run concurrently.
- Around line 115-120: The config object in startAutoLoopForProject hardcodes
useWorktrees to true; change this by adding an optional useWorktrees?: boolean
parameter to startAutoLoopForProject (and propagate it from callers), update the
AutoModeConfig definition to make useWorktrees optional (or keep boolean), and
set useWorktrees: useWorktrees ?? true when building the config instead of the
literal true so existing behavior is preserved while allowing callers to toggle
it; ensure all call sites pass the new flag or rely on the default.
- Around line 203-206: The bare catch swallows all errors; change it to catch
(err) and, after checking projectState.abortController.signal.aborted, log a
warning containing the error and contextual info (e.g., which worktree/project
and the functions involved like
loadPendingFeaturesFn/getRunningCountForWorktree) using the existing logger
(e.g., this.logger.warn or projectState.logger) or console.warn, then await
this.sleep as before; ensure you do not log when the abort signal caused the
break.
🧹 Nitpick comments (5)
🤖 Fix all nitpicks with AI agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.

In `@apps/server/src/services/auto-loop-coordinator.ts`:
- Around line 46-51: The local ExecuteFeatureFn type in auto-loop-coordinator.ts
diverges from the canonical signature; replace the local declaration with an
import of ExecuteFeatureFn from execution-types.ts (or update it to include the
missing parameters providedWorktreePath and options matching the canonical
definition) so the function signature (ExecuteFeatureFn) aligns with the
exported type and avoids drift.
- Around line 68-82: The AutoLoopCoordinator constructor currently accepts 10
positional dependencies (constructor in class AutoLoopCoordinator), which is
hard to read and extend; refactor it to accept a single options/config object
(e.g., AutoLoopCoordinatorOptions) containing named properties for eventBus,
concurrencyManager, settingsService, executeFeatureFn, loadPendingFeaturesFn,
saveExecutionStateFn, clearExecutionStateFn, resetStuckFeaturesFn,
isFeatureFinishedFn and isFeatureRunningFn, update the constructor to
destructure and assign those properties to the private fields, and update all
instantiation sites to pass a single options object (preserving original types
and nullability for settingsService) so call-sites remain readable and future
optional deps can be added without changing call signatures.
- Around line 184-202: The loop currently picks only one nextFeature per
iteration causing slow ramp-up; change the logic in the auto-loop to compute
available slots as (projectState.config.maxConcurrency || maxConcurrency) minus
the current runningCount, then select and launch up to that many features from
pendingFeatures that satisfy !this.isFeatureRunningFn(f.id) &&
!this.isFeatureFinishedFn(f). For each selected feature call
this.executeFeatureFn(projectPath, feature.id, projectState.config.useWorktrees,
true) and attach the same .catch handler that classifies the error
(classifyError), logs via logger.error, and calls
this.trackFailureAndCheckPauseForProject and this.signalShouldPauseForProject as
now; keep the final await this.sleep(2000, projectState.abortController.signal)
unchanged. Ensure runningCount/slot computation uses the same source of truth
used elsewhere (e.g., projectState or config) and that launches are started
without awaiting so they run concurrently.
- Around line 115-120: The config object in startAutoLoopForProject hardcodes
useWorktrees to true; change this by adding an optional useWorktrees?: boolean
parameter to startAutoLoopForProject (and propagate it from callers), update the
AutoModeConfig definition to make useWorktrees optional (or keep boolean), and
set useWorktrees: useWorktrees ?? true when building the config instead of the
literal true so existing behavior is preserved while allowing callers to toggle
it; ensure all call sites pass the new flag or rely on the default.
- Around line 203-206: The bare catch swallows all errors; change it to catch
(err) and, after checking projectState.abortController.signal.aborted, log a
warning containing the error and contextual info (e.g., which worktree/project
and the functions involved like
loadPendingFeaturesFn/getRunningCountForWorktree) using the existing logger
(e.g., this.logger.warn or projectState.logger) or console.warn, then await
this.sleep as before; ensure you do not log when the abort signal caused the
break.
apps/server/src/services/auto-loop-coordinator.ts (5)

46-51: ExecuteFeatureFn type diverges from the canonical definition in execution-types.ts.

The local type declares 4 parameters, but execution-types.ts defines ExecuteFeatureFn with 6 parameters (including optional providedWorktreePath and options). Maintaining two separate type definitions for the same concept creates a drift risk — if the canonical signature changes, this local copy won't reflect it.

Consider importing the type from execution-types.ts instead of re-declaring it, or at minimum keeping the signatures aligned.

#!/bin/bash
# Check if ExecuteFeatureFn is exported from execution-types and used elsewhere
rg -n 'ExecuteFeatureFn' --type=ts -C2
🤖 Prompt for AI Agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@apps/server/src/services/auto-loop-coordinator.ts` around lines 46 - 51, The
local ExecuteFeatureFn type in auto-loop-coordinator.ts diverges from the
canonical signature; replace the local declaration with an import of
ExecuteFeatureFn from execution-types.ts (or update it to include the missing
parameters providedWorktreePath and options matching the canonical definition)
so the function signature (ExecuteFeatureFn) aligns with the exported type and
avoids drift.

68-82: Constructor takes 10 dependencies — consider a config/options object.

The constructor has 10 positional parameters, making call-site readability and future extensibility difficult. An options object pattern would be more maintainable.

🤖 Prompt for AI Agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@apps/server/src/services/auto-loop-coordinator.ts` around lines 68 - 82, The
AutoLoopCoordinator constructor currently accepts 10 positional dependencies
(constructor in class AutoLoopCoordinator), which is hard to read and extend;
refactor it to accept a single options/config object (e.g.,
AutoLoopCoordinatorOptions) containing named properties for eventBus,
concurrencyManager, settingsService, executeFeatureFn, loadPendingFeaturesFn,
saveExecutionStateFn, clearExecutionStateFn, resetStuckFeaturesFn,
isFeatureFinishedFn and isFeatureRunningFn, update the constructor to
destructure and assign those properties to the private fields, and update all
instantiation sites to pass a single options object (preserving original types
and nullability for settingsService) so call-sites remain readable and future
optional deps can be added without changing call signatures.

184-202: Loop launches at most one feature per iteration — slow ramp-up at higher concurrency.

With maxConcurrency set to, say, 5 and 0 features running, the loop picks only one feature per 2-second cycle (line 202). It would take ~10 seconds to fill all slots. Consider launching up to maxConcurrency - runningCount features per iteration for faster utilization.

♻️ Sketch of batched launch
-       const nextFeature = pendingFeatures.find(
-         (f) => !this.isFeatureRunningFn(f.id) && !this.isFeatureFinishedFn(f)
-       );
-       if (nextFeature) {
-         projectState.hasEmittedIdleEvent = false;
-         this.executeFeatureFn(
-           projectPath,
-           nextFeature.id,
-           projectState.config.useWorktrees,
-           true
-         ).catch((error) => {
-           const errorInfo = classifyError(error);
-           logger.error(`Auto-loop feature ${nextFeature.id} failed:`, errorInfo.message);
-           if (this.trackFailureAndCheckPauseForProject(projectPath, branchName, errorInfo)) {
-             this.signalShouldPauseForProject(projectPath, branchName, errorInfo);
-           }
-         });
-       }
+       const availableSlots = projectState.config.maxConcurrency - runningCount;
+       const eligibleFeatures = pendingFeatures.filter(
+         (f) => !this.isFeatureRunningFn(f.id) && !this.isFeatureFinishedFn(f)
+       );
+       const toLaunch = eligibleFeatures.slice(0, availableSlots);
+       if (toLaunch.length > 0) {
+         projectState.hasEmittedIdleEvent = false;
+         for (const feature of toLaunch) {
+           this.executeFeatureFn(
+             projectPath,
+             feature.id,
+             projectState.config.useWorktrees,
+             true
+           ).catch((error) => {
+             const errorInfo = classifyError(error);
+             logger.error(`Auto-loop feature ${feature.id} failed:`, errorInfo.message);
+             if (this.trackFailureAndCheckPauseForProject(projectPath, branchName, errorInfo)) {
+               this.signalShouldPauseForProject(projectPath, branchName, errorInfo);
+             }
+           });
+         }
+       }
🤖 Prompt for AI Agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@apps/server/src/services/auto-loop-coordinator.ts` around lines 184 - 202,
The loop currently picks only one nextFeature per iteration causing slow
ramp-up; change the logic in the auto-loop to compute available slots as
(projectState.config.maxConcurrency || maxConcurrency) minus the current
runningCount, then select and launch up to that many features from
pendingFeatures that satisfy !this.isFeatureRunningFn(f.id) &&
!this.isFeatureFinishedFn(f). For each selected feature call
this.executeFeatureFn(projectPath, feature.id, projectState.config.useWorktrees,
true) and attach the same .catch handler that classifies the error
(classifyError), logs via logger.error, and calls
this.trackFailureAndCheckPauseForProject and this.signalShouldPauseForProject as
now; keep the final await this.sleep(2000, projectState.abortController.signal)
unchanged. Ensure runningCount/slot computation uses the same source of truth
used elsewhere (e.g., projectState or config) and that launches are started
without awaiting so they run concurrently.

115-120: useWorktrees is hardcoded to true — consider making it configurable or removing from config.

The AutoModeConfig interface declares useWorktrees as a field, but startAutoLoopForProject always sets it to true (line 117). If worktree usage isn't meant to be toggled, remove it from the interface to avoid misleading consumers; otherwise, accept it as a parameter.

🤖 Prompt for AI Agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@apps/server/src/services/auto-loop-coordinator.ts` around lines 115 - 120,
The config object in startAutoLoopForProject hardcodes useWorktrees to true;
change this by adding an optional useWorktrees?: boolean parameter to
startAutoLoopForProject (and propagate it from callers), update the
AutoModeConfig definition to make useWorktrees optional (or keep boolean), and
set useWorktrees: useWorktrees ?? true when building the config instead of the
literal true so existing behavior is preserved while allowing callers to toggle
it; ensure all call sites pass the new flag or rely on the default.

203-206: Bare catch silently swallows all loop errors.

Any unexpected error from loadPendingFeaturesFn or getRunningCountForWorktree is caught at line 203 with no logging. Only the abort-signal check prevents a silent infinite retry. Consider logging a warning for non-abort errors to aid debugging.

♻️ Proposed fix
-     } catch {
+     } catch (error) {
        if (projectState.abortController.signal.aborted) break;
+       logger.warn(`Auto-loop iteration error for ${worktreeKey}:`, error);
        await this.sleep(5000, projectState.abortController.signal);
      }
🤖 Prompt for AI Agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@apps/server/src/services/auto-loop-coordinator.ts` around lines 203 - 206,
The bare catch swallows all errors; change it to catch (err) and, after checking
projectState.abortController.signal.aborted, log a warning containing the error
and contextual info (e.g., which worktree/project and the functions involved
like loadPendingFeaturesFn/getRunningCountForWorktree) using the existing logger
(e.g., this.logger.warn or projectState.logger) or console.warn, then await
this.sleep as before; ensure you do not log when the abort signal caused the
break.

Comment on lines +283 to +318
trackFailureAndCheckPauseForProject(
projectPath: string,
branchNameOrError: string | null | { type: string; message: string },
errorInfo?: { type: string; message: string }
): boolean {
// Support both old (projectPath, errorInfo) and new (projectPath, branchName, errorInfo) signatures
let branchName: string | null;
let actualErrorInfo: { type: string; message: string };
if (
typeof branchNameOrError === 'object' &&
branchNameOrError !== null &&
'type' in branchNameOrError
) {
// Old signature: (projectPath, errorInfo)
branchName = null;
actualErrorInfo = branchNameOrError;
} else {
// New signature: (projectPath, branchName, errorInfo)
branchName = branchNameOrError;
actualErrorInfo = errorInfo!;
}
const projectState = this.autoLoopsByProject.get(
getWorktreeAutoLoopKey(projectPath, branchName)
);
if (!projectState) return false;
const now = Date.now();
projectState.consecutiveFailures.push({ timestamp: now, error: actualErrorInfo.message });
projectState.consecutiveFailures = projectState.consecutiveFailures.filter(
(f) => now - f.timestamp < FAILURE_WINDOW_MS
);
return (
projectState.consecutiveFailures.length >= CONSECUTIVE_FAILURE_THRESHOLD ||
actualErrorInfo.type === 'quota_exhausted' ||
actualErrorInfo.type === 'rate_limit'
);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Unsafe errorInfo! non-null assertion when called with the "new" signature.

On line 302, actualErrorInfo = errorInfo! will be undefined at runtime if a caller passes (projectPath, branchName) without the third argument. The same issue exists in signalShouldPauseForProject at line 337. This ad-hoc overloading via union types is fragile — TypeScript won't catch misuse because errorInfo is typed as optional.

Consider using proper function overload signatures, or at minimum add a runtime guard:

🐛 Proposed fix (for both methods)
     } else {
       // New signature: (projectPath, branchName, errorInfo)
       branchName = branchNameOrError;
-      actualErrorInfo = errorInfo!;
+      if (!errorInfo) {
+        throw new Error('errorInfo is required when branchName is provided');
+      }
+      actualErrorInfo = errorInfo;
     }

Alternatively, split into two clearly-typed methods to eliminate the ambiguity entirely.

🤖 Prompt for AI Agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@apps/server/src/services/auto-loop-coordinator.ts` around lines 283 - 318,
The method trackFailureAndCheckPauseForProject (and similarly
signalShouldPauseForProject) dangerously uses errorInfo! for the new signature;
add a runtime guard when branchNameOrError is treated as the branchName to
ensure errorInfo is present before using it — either throw a clear runtime
error, return false, or supply a safe default error object if errorInfo is
undefined; alternatively implement proper TypeScript overloads or split into two
methods to eliminate the optional errorInfo ambiguity so actualErrorInfo is
never assigned from an undefined value.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Fix all issues with AI agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.

In `@apps/server/src/services/agent-executor.ts`:
- Around line 126-144: The appendRawEvent helper is writing pretty-printed
multi-line JSON (JSON.stringify with spacing) which breaks JSONL; change it to
emit a single-line JSON per event by removing the space/indent argument so each
push to rawOutputLines is JSON.stringify({ timestamp: ..., event }) (ensure
rawOutputLines, rawOutputPath, appendRawEvent and the debounce logic using
AgentExecutor.WRITE_DEBOUNCE_MS remain unchanged), and retain the existing
write/debounce/try-catch behavior so the file appends one JSON object per line
compatible with JSONL.
- Around line 554-555: In provider.executeQuery(...this.buildExecOpts(...,
sdkOptions?.maxTurns || 100)) replace the fallback operator with nullish
coalescing to match the earlier use; change the expression using
sdkOptions?.maxTurns || 100 to sdkOptions?.maxTurns ?? 100 so an explicit 0 is
honored (look for provider.executeQuery and buildExecOpts references and align
with the sdkOptions?.maxTurns ?? 50 pattern used elsewhere).

In `@apps/server/src/services/auto-mode/facade.ts`:
- Around line 544-631: The method followUpFeature() currently does heavy work
(concurrencyManager.acquire, featureStateManager.loadFeature,
worktreeResolver.findWorktreeForBranch, secureFs.readFile, prompt construction)
and then unconditionally throws, wasting I/O and blocking the concurrency slot;
change the stub to perform only minimal setup before the intentional throw: keep
validateWorkingDirectory(this.projectPath), acquire a concurrency slot via
concurrencyManager.acquire and capture abortController, then immediately throw
the not-implemented Error (wrapped in the existing try/catch that uses
classifyError and eventBus.emitAutoModeEvent) and ensure
concurrencyManager.release(featureId) runs in finally; remove or defer
featureStateManager.loadFeature, worktreeResolver.findWorktreeForBranch,
secureFs.readFile, getPromptCustomization, and prompt-building logic (or move
them into the real AutoModeService.followUpFeature implementation) so the facade
no longer performs costly I/O before throwing.
- Around line 854-873: The code in checkWorktreeCapacity currently normalizes
branchName by comparing rawBranchName to the hardcoded string 'main', which
fails for repos whose primary branch is not 'main'; replace that hardcoded check
by calling worktreeResolver.getCurrentBranch(this.projectPath) (or an existing
method that returns the repo's primary/default branch) and normalize branchName
to null when rawBranchName equals that resolved primary branch; keep the rest of
the logic calling autoLoopCoordinator.resolveMaxConcurrency and
concurrencyManager.getRunningCountForWorktree unchanged and use the resolved
branchName variable when passing to those methods.
- Around line 668-677: The verifyFeature loop currently calls
execAsync(check.cmd) which spawns a shell; replace this with a non-shell
invocation for consistency by using execFile or the existing
execGitCommand/spawnProcess helpers: split the hardcoded command string in each
verificationChecks entry into command + args (or update entries to provide cmd
and args), call execFile(cmd, args, { cwd: workDir, timeout: 120000 }) or call
execGitCommand/spawnProcess equivalent to run the command without a shell, then
keep the same success/error handling that pushes to results and flips allPassed
on catch. Ensure you reference the verificationChecks items, the verifyFeature
loop, and preserve output capture and the existing error.message usage.
- Around line 98-433: The create factory method contains duplicated runAgentFn
logic used by PipelineOrchestrator and ExecutionService (see
PipelineOrchestrator constructor call and ExecutionService constructor call)
which should be extracted into a single helper to avoid divergence; implement a
private/shared function (e.g., buildAgentRunner or createRunAgentFn) that
captures dependencies (agentExecutor, settingsService, getProviderByModelId,
stripProviderPrefix, ProviderFactory, planApprovalService, featureStateManager)
and returns the async runAgentFn signature expected by both PipelineOrchestrator
and ExecutionService, then replace the inline implementations in the
PipelineOrchestrator and ExecutionService calls with calls to that helper to
eliminate duplication and keep behavior consistent.
- Line 148: The non-null assertions on facadeInstance used inside callbacks
passed to PipelineOrchestrator, AutoLoopCoordinator, and ExecutionService are
fragile because facadeInstance is only assigned later; add a clear comment above
the declaration of facadeInstance describing the invariant ("callbacks passed to
PipelineOrchestrator / AutoLoopCoordinator / ExecutionService must not be
invoked synchronously and may reference facadeInstance after it's assigned") so
future maintainers know why facadeInstance! is safe, and optionally replace
direct facadeInstance! usage with a small local accessor function (e.g.,
getFacade() that throws a descriptive error if null) to make failures explicit;
reference the variable facadeInstance and the constructors PipelineOrchestrator,
AutoLoopCoordinator, ExecutionService when making the change.

In `@apps/server/src/services/execution-service.ts`:
- Around line 63-84: The ExecutionService constructor currently takes 18
positional params (notably many callback fns like runAgentFn, executePipelineFn,
updateFeatureStatusFn, loadFeatureFn, etc.), which is fragile; refactor by
introducing an ExecutionServiceCallbacks interface that groups all callback
function types (e.g., runAgent, executePipeline, updateFeatureStatus,
loadFeature, getPlanningPromptPrefix, saveFeatureSummary, recordLearnings,
contextExists, resumeFeature, trackFailure, signalPause, recordSuccess,
saveExecutionState, loadContextFiles), then change the ExecutionService
constructor signature to accept the four core deps (eventBus,
concurrencyManager, worktreeResolver, settingsService) plus a single callbacks
object; update usages/wiring (e.g., in facade.ts) to pass a callbacks object
rather than positional args and update internal references from runAgentFn ->
callbacks.runAgent, etc.

In `@apps/server/src/services/feature-state-manager.ts`:
- Around line 125-153: The notificationService.createNotification calls (for
statuses 'waiting_approval' and 'verified') can throw and currently can prevent
the later call to this.featureLoader.syncFeatureToAppSpec; wrap each
notificationService.createNotification invocation (or the whole notification
block) in its own try-catch, log any errors (e.g., using logger.warn/error) and
continue so that featureLoader.syncFeatureToAppSpec always runs for 'verified'
or 'completed' statuses; keep the existing try-catch around syncFeatureToAppSpec
as-is.
- Around line 330-339: The version bump logic incorrectly uses a falsy check (if
(updates.content && ...)) so setting updates.content to an empty string won't
trigger a version increment; change the condition in the update block that
currently reads against updates.content to explicitly check for presence (e.g.,
updates.content !== undefined) and compare it to oldContent before incrementing
feature.planSpec.version on feature.planSpec (ensure you reference the
oldContent captured from feature.planSpec.content and update
feature.planSpec.version accordingly).
- Around line 432-449: The code silently ignores cases where taskId doesn't
match any entry in feature.planSpec.tasks; add an else branch after the task
lookup in the method that logs a warning (including featureId, projectPath,
taskId and maybe available task IDs) so mismatches are visible; do this near the
block that updates task.status, feature.updatedAt, calls atomicWriteJson and
emitAutoModeEvent (referencing the same function scope where task is found) to
ensure a warning is emitted when task === undefined.

In `@apps/server/src/services/pipeline-orchestrator.ts`:
- Around line 364-371: The code calls this.featureLoader.getById but
PipelineOrchestrator was injected with featureStateManager, causing a runtime
crash; replace the use of featureLoader with the injected featureStateManager
when re-fetching the feature (i.e., change
this.featureLoader.getById(projectPath, featureId) to
this.featureStateManager.getById(projectPath, featureId) or, if
FeatureStateManager exposes a different accessor name, call the correct method
(e.g., get or getFeature) on this.featureStateManager and keep the rest of the
logic (reloadedFeature and updateFeatureStatusFn) unchanged.
- Around line 554-587: The parser in parseTestLines currently treats any line
containing "toBe" or "toEqual" (and the generic "AssertionError") as a failed
test which causes false positives; update the heuristic to only record such
lines as failures when they appear in a failure context (e.g., line begins with
"AssertionError" or contains "Error" or the word "FAIL"/"FAILED" nearby) or when
associated with a test file marker (the current /^>\s+.*\.(test|spec)\./ check),
and consolidate failure-detection logic so failedTests, failCount and passCount
are incremented consistently; specifically tighten matching for "toBe"/"toEqual"
and "AssertionError" (in parseTestLines) to require contextual cues rather than
matching any occurrence.
- Around line 565-584: The current loose heuristic pushes any line with
"toBe"/"toEqual" into failedTests; tighten it by replacing the trimmed.includes
checks with a stricter pattern match: only add trimmed to failedTests if it
matches assertion-call patterns like
/\.to(Be|Equal)\s*\(|/expect\(.+\)\.(toBe|toEqual)\s*\(/ or if the line contains
explicit failure context such as "AssertionError" or "Expected"/"Received" or
the immediately preceding line indicated a FAIL/FAILED; update the code inside
the for (const line of lines) loop (variables: trimmed, failedTests,
passCount/failCount) to use these regex checks instead of the current substring
checks so passing/comments containing "toBe"/"toEqual" are not misclassified.
🧹 Nitpick comments (10)
🤖 Fix all nitpicks with AI agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.

In `@apps/server/src/services/agent-executor.ts`:
- Around line 126-144: The appendRawEvent helper is writing pretty-printed
multi-line JSON (JSON.stringify with spacing) which breaks JSONL; change it to
emit a single-line JSON per event by removing the space/indent argument so each
push to rawOutputLines is JSON.stringify({ timestamp: ..., event }) (ensure
rawOutputLines, rawOutputPath, appendRawEvent and the debounce logic using
AgentExecutor.WRITE_DEBOUNCE_MS remain unchanged), and retain the existing
write/debounce/try-catch behavior so the file appends one JSON object per line
compatible with JSONL.
- Around line 554-555: In provider.executeQuery(...this.buildExecOpts(...,
sdkOptions?.maxTurns || 100)) replace the fallback operator with nullish
coalescing to match the earlier use; change the expression using
sdkOptions?.maxTurns || 100 to sdkOptions?.maxTurns ?? 100 so an explicit 0 is
honored (look for provider.executeQuery and buildExecOpts references and align
with the sdkOptions?.maxTurns ?? 50 pattern used elsewhere).

In `@apps/server/src/services/auto-mode/facade.ts`:
- Around line 668-677: The verifyFeature loop currently calls
execAsync(check.cmd) which spawns a shell; replace this with a non-shell
invocation for consistency by using execFile or the existing
execGitCommand/spawnProcess helpers: split the hardcoded command string in each
verificationChecks entry into command + args (or update entries to provide cmd
and args), call execFile(cmd, args, { cwd: workDir, timeout: 120000 }) or call
execGitCommand/spawnProcess equivalent to run the command without a shell, then
keep the same success/error handling that pushes to results and flips allPassed
on catch. Ensure you reference the verificationChecks items, the verifyFeature
loop, and preserve output capture and the existing error.message usage.
- Around line 98-433: The create factory method contains duplicated runAgentFn
logic used by PipelineOrchestrator and ExecutionService (see
PipelineOrchestrator constructor call and ExecutionService constructor call)
which should be extracted into a single helper to avoid divergence; implement a
private/shared function (e.g., buildAgentRunner or createRunAgentFn) that
captures dependencies (agentExecutor, settingsService, getProviderByModelId,
stripProviderPrefix, ProviderFactory, planApprovalService, featureStateManager)
and returns the async runAgentFn signature expected by both PipelineOrchestrator
and ExecutionService, then replace the inline implementations in the
PipelineOrchestrator and ExecutionService calls with calls to that helper to
eliminate duplication and keep behavior consistent.
- Line 148: The non-null assertions on facadeInstance used inside callbacks
passed to PipelineOrchestrator, AutoLoopCoordinator, and ExecutionService are
fragile because facadeInstance is only assigned later; add a clear comment above
the declaration of facadeInstance describing the invariant ("callbacks passed to
PipelineOrchestrator / AutoLoopCoordinator / ExecutionService must not be
invoked synchronously and may reference facadeInstance after it's assigned") so
future maintainers know why facadeInstance! is safe, and optionally replace
direct facadeInstance! usage with a small local accessor function (e.g.,
getFacade() that throws a descriptive error if null) to make failures explicit;
reference the variable facadeInstance and the constructors PipelineOrchestrator,
AutoLoopCoordinator, ExecutionService when making the change.

In `@apps/server/src/services/execution-service.ts`:
- Around line 63-84: The ExecutionService constructor currently takes 18
positional params (notably many callback fns like runAgentFn, executePipelineFn,
updateFeatureStatusFn, loadFeatureFn, etc.), which is fragile; refactor by
introducing an ExecutionServiceCallbacks interface that groups all callback
function types (e.g., runAgent, executePipeline, updateFeatureStatus,
loadFeature, getPlanningPromptPrefix, saveFeatureSummary, recordLearnings,
contextExists, resumeFeature, trackFailure, signalPause, recordSuccess,
saveExecutionState, loadContextFiles), then change the ExecutionService
constructor signature to accept the four core deps (eventBus,
concurrencyManager, worktreeResolver, settingsService) plus a single callbacks
object; update usages/wiring (e.g., in facade.ts) to pass a callbacks object
rather than positional args and update internal references from runAgentFn ->
callbacks.runAgent, etc.

In `@apps/server/src/services/feature-state-manager.ts`:
- Around line 330-339: The version bump logic incorrectly uses a falsy check (if
(updates.content && ...)) so setting updates.content to an empty string won't
trigger a version increment; change the condition in the update block that
currently reads against updates.content to explicitly check for presence (e.g.,
updates.content !== undefined) and compare it to oldContent before incrementing
feature.planSpec.version on feature.planSpec (ensure you reference the
oldContent captured from feature.planSpec.content and update
feature.planSpec.version accordingly).
- Around line 432-449: The code silently ignores cases where taskId doesn't
match any entry in feature.planSpec.tasks; add an else branch after the task
lookup in the method that logs a warning (including featureId, projectPath,
taskId and maybe available task IDs) so mismatches are visible; do this near the
block that updates task.status, feature.updatedAt, calls atomicWriteJson and
emitAutoModeEvent (referencing the same function scope where task is found) to
ensure a warning is emitted when task === undefined.

In `@apps/server/src/services/pipeline-orchestrator.ts`:
- Around line 554-587: The parser in parseTestLines currently treats any line
containing "toBe" or "toEqual" (and the generic "AssertionError") as a failed
test which causes false positives; update the heuristic to only record such
lines as failures when they appear in a failure context (e.g., line begins with
"AssertionError" or contains "Error" or the word "FAIL"/"FAILED" nearby) or when
associated with a test file marker (the current /^>\s+.*\.(test|spec)\./ check),
and consolidate failure-detection logic so failedTests, failCount and passCount
are incremented consistently; specifically tighten matching for "toBe"/"toEqual"
and "AssertionError" (in parseTestLines) to require contextual cues rather than
matching any occurrence.
- Around line 565-584: The current loose heuristic pushes any line with
"toBe"/"toEqual" into failedTests; tighten it by replacing the trimmed.includes
checks with a stricter pattern match: only add trimmed to failedTests if it
matches assertion-call patterns like
/\.to(Be|Equal)\s*\(|/expect\(.+\)\.(toBe|toEqual)\s*\(/ or if the line contains
explicit failure context such as "AssertionError" or "Expected"/"Received" or
the immediately preceding line indicated a FAIL/FAILED; update the code inside
the for (const line of lines) loop (variables: trimmed, failedTests,
passCount/failCount) to use these regex checks instead of the current substring
checks so passing/comments containing "toBe"/"toEqual" are not misclassified.
apps/server/src/services/feature-state-manager.ts (2)

330-339: Falsy check on updates.content silently skips version bump for empty-string content.

Line 337: if (updates.content && ...) — if someone sets content to "", the version won't increment even though content changed. Consider using updates.content !== undefined if the intent is to detect any deliberate content change.

This is a minor edge case — plan content is unlikely to be intentionally empty.

Stricter check (if needed)
-      if (updates.content && updates.content !== oldContent) {
+      if (updates.content !== undefined && updates.content !== oldContent) {
🤖 Prompt for AI Agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@apps/server/src/services/feature-state-manager.ts` around lines 330 - 339,
The version bump logic incorrectly uses a falsy check (if (updates.content &&
...)) so setting updates.content to an empty string won't trigger a version
increment; change the condition in the update block that currently reads against
updates.content to explicitly check for presence (e.g., updates.content !==
undefined) and compare it to oldContent before incrementing
feature.planSpec.version on feature.planSpec (ensure you reference the
oldContent captured from feature.planSpec.content and update
feature.planSpec.version accordingly).

432-449: Missing warning when task is not found by ID.

If the taskId doesn't match any task in planSpec.tasks, the method silently returns without logging. This makes debugging subtle task-ID mismatches difficult. Consider adding an else branch:

Add a warning for missing task
       const task = feature.planSpec.tasks.find((t) => t.id === taskId);
       if (task) {
         task.status = status;
         feature.updatedAt = new Date().toISOString();
 
         // PERSIST BEFORE EMIT
         await atomicWriteJson(featurePath, feature, { backupCount: DEFAULT_BACKUP_COUNT });
 
         // Emit event for UI update
         this.emitAutoModeEvent('auto_mode_task_status', {
           featureId,
           projectPath,
           taskId,
           status,
           tasks: feature.planSpec.tasks,
         });
+      } else {
+        logger.warn(`Task ${taskId} not found in feature ${featureId}`);
       }
🤖 Prompt for AI Agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@apps/server/src/services/feature-state-manager.ts` around lines 432 - 449,
The code silently ignores cases where taskId doesn't match any entry in
feature.planSpec.tasks; add an else branch after the task lookup in the method
that logs a warning (including featureId, projectPath, taskId and maybe
available task IDs) so mismatches are visible; do this near the block that
updates task.status, feature.updatedAt, calls atomicWriteJson and
emitAutoModeEvent (referencing the same function scope where task is found) to
ensure a warning is emitted when task === undefined.
apps/server/src/services/pipeline-orchestrator.ts (2)

554-587: Consider the typo 'AssertionError' — should this be 'AssertionError'?

Hmm, on closer inspection both spellings resolve to the same string. However, note that this heuristic parser is fairly fragile — it matches any line containing toBe/toEqual as a "failed test" even if it appears in passing test output or comments. This is a minor concern given the context (building a summary for agent consumption), so not blocking.

🤖 Prompt for AI Agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@apps/server/src/services/pipeline-orchestrator.ts` around lines 554 - 587,
The parser in parseTestLines currently treats any line containing "toBe" or
"toEqual" (and the generic "AssertionError") as a failed test which causes false
positives; update the heuristic to only record such lines as failures when they
appear in a failure context (e.g., line begins with "AssertionError" or contains
"Error" or the word "FAIL"/"FAILED" nearby) or when associated with a test file
marker (the current /^>\s+.*\.(test|spec)\./ check), and consolidate
failure-detection logic so failedTests, failCount and passCount are incremented
consistently; specifically tighten matching for "toBe"/"toEqual" and
"AssertionError" (in parseTestLines) to require contextual cues rather than
matching any occurrence.

565-584: Heuristic test parsing is loose — lines matching toBe/toEqual anywhere get added as "failed tests".

Lines 579-580 match any line containing toBe or toEqual, which could appear in passing assertions or comments, inflating the failure list. Since this feeds an agent prompt (not user-facing logic), it's a minor concern — the deduplication and cap at 10/20 entries mitigate the noise.

🤖 Prompt for AI Agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@apps/server/src/services/pipeline-orchestrator.ts` around lines 565 - 584,
The current loose heuristic pushes any line with "toBe"/"toEqual" into
failedTests; tighten it by replacing the trimmed.includes checks with a stricter
pattern match: only add trimmed to failedTests if it matches assertion-call
patterns like /\.to(Be|Equal)\s*\(|/expect\(.+\)\.(toBe|toEqual)\s*\(/ or if the
line contains explicit failure context such as "AssertionError" or
"Expected"/"Received" or the immediately preceding line indicated a FAIL/FAILED;
update the code inside the for (const line of lines) loop (variables: trimmed,
failedTests, passCount/failCount) to use these regex checks instead of the
current substring checks so passing/comments containing "toBe"/"toEqual" are not
misclassified.
apps/server/src/services/agent-executor.ts (2)

126-144: Raw JSONL output uses pretty-printed JSON — not valid JSONL.

Line 130 serializes with JSON.stringify(..., null, 4), producing multi-line entries. JSONL requires one JSON object per line. Since this is a debug-only feature gated by AUTOMAKER_DEBUG_RAW_OUTPUT, it's low impact, but tools expecting JSONL will choke on the output.

Suggested fix
-          JSON.stringify({ timestamp: new Date().toISOString(), event }, null, 4)
+          JSON.stringify({ timestamp: new Date().toISOString(), event })
🤖 Prompt for AI Agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@apps/server/src/services/agent-executor.ts` around lines 126 - 144, The
appendRawEvent helper is writing pretty-printed multi-line JSON (JSON.stringify
with spacing) which breaks JSONL; change it to emit a single-line JSON per event
by removing the space/indent argument so each push to rawOutputLines is
JSON.stringify({ timestamp: ..., event }) (ensure rawOutputLines, rawOutputPath,
appendRawEvent and the debounce logic using AgentExecutor.WRITE_DEBOUNCE_MS
remain unchanged), and retain the existing write/debounce/try-catch behavior so
the file appends one JSON object per line compatible with JSONL.

554-555: Inconsistent nullish handling: || here vs ?? on line 336.

Line 336 uses sdkOptions?.maxTurns ?? 50 (null-coalescing), but line 555 uses sdkOptions?.maxTurns || 100 (falsy-coalescing). If maxTurns were explicitly 0, || would silently substitute 100. Consider using ?? for consistency.

-            this.buildExecOpts(options, revPrompt, sdkOptions?.maxTurns || 100)
+            this.buildExecOpts(options, revPrompt, sdkOptions?.maxTurns ?? 100)
🤖 Prompt for AI Agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@apps/server/src/services/agent-executor.ts` around lines 554 - 555, In
provider.executeQuery(...this.buildExecOpts(..., sdkOptions?.maxTurns || 100))
replace the fallback operator with nullish coalescing to match the earlier use;
change the expression using sdkOptions?.maxTurns || 100 to sdkOptions?.maxTurns
?? 100 so an explicit 0 is honored (look for provider.executeQuery and
buildExecOpts references and align with the sdkOptions?.maxTurns ?? 50 pattern
used elsewhere).
apps/server/src/services/auto-mode/facade.ts (3)

668-677: verifyFeature uses execAsync (shell) for hardcoded commands — consider execFile for consistency.

The rest of the file migrated git operations to execGitCommand (which uses spawnProcess/array args). verifyFeature still uses execAsync which spawns a shell. Since the commands are hardcoded string literals, there's no injection risk here, but switching to execFile (or splitting command/args) would be more consistent with the rest of the file.

🤖 Prompt for AI Agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@apps/server/src/services/auto-mode/facade.ts` around lines 668 - 677, The
verifyFeature loop currently calls execAsync(check.cmd) which spawns a shell;
replace this with a non-shell invocation for consistency by using execFile or
the existing execGitCommand/spawnProcess helpers: split the hardcoded command
string in each verificationChecks entry into command + args (or update entries
to provide cmd and args), call execFile(cmd, args, { cwd: workDir, timeout:
120000 }) or call execGitCommand/spawnProcess equivalent to run the command
without a shell, then keep the same success/error handling that pushes to
results and flips allPassed on catch. Ensure you reference the
verificationChecks items, the verifyFeature loop, and preserve output capture
and the existing error.message usage.

98-433: The create factory method is ~340 lines of complex wiring — consider extracting.

This static factory wires 7+ services with interleaved callback definitions and duplicated runAgentFn implementations (lines 167–238 and lines 286–363 are near-identical). Extracting the runAgentFn builder into a shared helper would reduce duplication and the risk of them diverging.

🤖 Prompt for AI Agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@apps/server/src/services/auto-mode/facade.ts` around lines 98 - 433, The
create factory method contains duplicated runAgentFn logic used by
PipelineOrchestrator and ExecutionService (see PipelineOrchestrator constructor
call and ExecutionService constructor call) which should be extracted into a
single helper to avoid divergence; implement a private/shared function (e.g.,
buildAgentRunner or createRunAgentFn) that captures dependencies (agentExecutor,
settingsService, getProviderByModelId, stripProviderPrefix, ProviderFactory,
planApprovalService, featureStateManager) and returns the async runAgentFn
signature expected by both PipelineOrchestrator and ExecutionService, then
replace the inline implementations in the PipelineOrchestrator and
ExecutionService calls with calls to that helper to eliminate duplication and
keep behavior consistent.

148-148: facadeInstance! non-null assertions rely on deferred callback invocation.

The callbacks passed to PipelineOrchestrator, AutoLoopCoordinator, and ExecutionService (lines 165, 251, 269, etc.) use facadeInstance! which is only assigned at line 416. This is safe only if none of these constructors invoke the callbacks synchronously. Worth a brief comment to document this invariant for future maintainers.

🤖 Prompt for AI Agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@apps/server/src/services/auto-mode/facade.ts` at line 148, The non-null
assertions on facadeInstance used inside callbacks passed to
PipelineOrchestrator, AutoLoopCoordinator, and ExecutionService are fragile
because facadeInstance is only assigned later; add a clear comment above the
declaration of facadeInstance describing the invariant ("callbacks passed to
PipelineOrchestrator / AutoLoopCoordinator / ExecutionService must not be
invoked synchronously and may reference facadeInstance after it's assigned") so
future maintainers know why facadeInstance! is safe, and optionally replace
direct facadeInstance! usage with a small local accessor function (e.g.,
getFacade() that throws a descriptive error if null) to make failures explicit;
reference the variable facadeInstance and the constructors PipelineOrchestrator,
AutoLoopCoordinator, ExecutionService when making the change.
apps/server/src/services/execution-service.ts (1)

63-84: Constructor accepts 18 positional parameters — fragile and hard to maintain.

With 14 callback parameters all typed as function signatures, it's easy to swap arguments during wiring (as seen in facade.ts which has to carefully order them). Consider grouping the callbacks into an options/config object.

Sketch
interface ExecutionServiceCallbacks {
  runAgent: RunAgentFn;
  executePipeline: ExecutePipelineFn;
  updateFeatureStatus: UpdateFeatureStatusFn;
  loadFeature: LoadFeatureFn;
  // ...etc
}

constructor(
  private eventBus: TypedEventBus,
  private concurrencyManager: ConcurrencyManager,
  private worktreeResolver: WorktreeResolver,
  private settingsService: SettingsService | null,
  private callbacks: ExecutionServiceCallbacks
) {}
🤖 Prompt for AI Agents
Before applying any fix, first verify the finding against the current code and
decide whether a code change is actually needed. If the finding is not valid or
no change is required, do not modify code for that item and briefly explain why
it was skipped.
In `@apps/server/src/services/execution-service.ts` around lines 63 - 84, The
ExecutionService constructor currently takes 18 positional params (notably many
callback fns like runAgentFn, executePipelineFn, updateFeatureStatusFn,
loadFeatureFn, etc.), which is fragile; refactor by introducing an
ExecutionServiceCallbacks interface that groups all callback function types
(e.g., runAgent, executePipeline, updateFeatureStatus, loadFeature,
getPlanningPromptPrefix, saveFeatureSummary, recordLearnings, contextExists,
resumeFeature, trackFailure, signalPause, recordSuccess, saveExecutionState,
loadContextFiles), then change the ExecutionService constructor signature to
accept the four core deps (eventBus, concurrencyManager, worktreeResolver,
settingsService) plus a single callbacks object; update usages/wiring (e.g., in
facade.ts) to pass a callbacks object rather than positional args and update
internal references from runAgentFn -> callbacks.runAgent, etc.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

🤖 Fix all issues with AI agents
Verify each finding against the current code and only fix it if needed.


In `@apps/server/src/services/agent-executor.ts`:
- Around line 377-384: detectPhaseCompleteMarker is being run on the streaming
chunk variable text instead of the accumulated taskOutput (so markers split
across chunks can be missed) and there’s no dedup guard like for
taskStartDetected/taskCompleteDetected; change the phase detection to run on
taskOutput (use detectPhaseCompleteMarker(taskOutput)) and add a boolean guard
(e.g., phaseCompleteDetected) similar to taskStartDetected/taskCompleteDetected
to ensure eventBus.emitAutoModeEvent('auto_mode_phase_complete', {...,
phaseNumber: pn}) is emitted only once per phase.
- Around line 518-520: The variables hasFeedback and hasEdits are currently set
to `.length` (a number or undefined) which relies on truthiness; change them to
explicit booleans to match their names — e.g., set hasFeedback =
Boolean(approvalResult.feedback?.trim().length) or hasFeedback =
(approvalResult.feedback?.trim().length ?? 0) > 0 and likewise for hasEdits —
then keep the existing conditional that throws the Error('Plan cancelled by
user') when both are false; update the expressions near approvalResult.feedback
and approvalResult.editedPlan to use the chosen explicit boolean form.
- Around line 52-276: The execute method is too large and mixes concerns;
extract the file-write setup and the stream-processing loop into private helper
methods to reduce cyclomatic complexity. Move the writeToFile, scheduleWrite,
and appendRawEvent logic into a private method (e.g., setupFileWriters or
flushHelpers) that returns the helper functions and cleanup handles; extract the
for-await-of provider.executeQuery loop body (the streaming logic that handles
msg types, text/tool blocks, spec detection and calls to
handleSpecGenerated/executeTasksLoop) into a private method (e.g.,
processStreamMessages) that accepts the stream iterator, options, callbacks, and
mutable state (responseText, specDetected, tasksCompleted, aborted) and returns
the updated state; keep execute as the orchestration layer that calls
setupFileWriters, then processStreamMessages, handles cleanup (clearing timers,
final writes) and returns the result.

In `@apps/server/src/services/auto-mode/facade.ts`:
- Around line 178-249: Extract the duplicated provider-resolution +
agentExecutor.execute logic into a shared async helper (e.g.
resolveAndExecuteAgent) that takes the same parameters shown in the diff
(workDir, featureId, prompt, abortController, pPath, imagePaths?, model?, opts?)
and performs: resolveModel defaulting to 'claude-sonnet-4-20250514', call
ProviderFactory.getProviderForModel and stripProviderPrefix, call
getProviderByModelId to populate claudeCompatibleProvider and credentials, then
invoke agentExecutor.execute with the same options and callbacks (use
planApprovalService.waitForApproval and featureStateManager.saveFeatureSummary
for save/update, and keep the buildTaskPrompt logic). Replace the anonymous
async callbacks passed to both PipelineOrchestrator and ExecutionService with
calls to this new resolveAndExecuteAgent helper so all provider-resolution and
agent execution logic lives in one place.
- Line 188: The code hardcodes a model fallback ("claude-sonnet-4-20250514")
when computing resolvedModel; replace that by calling resolveModelString with
the alias fallback: use resolveModelString(model || 'sonnet') where
resolvedModel is assigned (and do the same replacement at the other occurrence
around line 315), ensuring you import or reference resolveModelString and remove
the hardcoded full model name so aliases are converted to full model IDs before
API calls.
- Around line 996-1008: These facade methods (markAllRunningFeaturesInterrupted,
getStatus, getRunningAgents) operate globally by calling
concurrencyManager.getAllRunning() and touching features without scoping to
this.projectPath; update each method to first filter the running feature list by
rf.projectPath === this.projectPath (e.g., const projectRunning =
this.concurrencyManager.getAllRunning().filter(rf => rf.projectPath ===
this.projectPath)) and then operate on projectRunning (call
this.featureStateManager.markFeatureInterrupted for
markAllRunningFeaturesInterrupted and use the filtered list for getStatus() and
getRunningAgents()) so only features belonging to this.projectPath are affected
and reported.
- Around line 625-634: The loop in verifyFeature uses shell-based execAsync for
running verification commands (see the loop referencing execAsync and
verificationChecks), which risks shell metacharacter interpretation of workDir;
replace execAsync with the non-shell spawnProcess (or the same helper used by
execGitCommand) to run commands without a shell, passing args and cwd
explicitly, wiring stdout/stderr streams to capture output (concatenate or
prefer stdout then stderr), enforce the same timeout behavior, and preserve the
existing result handling (push {check: check.name, passed: true/false, output:
...} and set allPassed = false on error and break). Ensure you invoke
spawnProcess with the command split into program + args (from check.cmd), handle
process exit vs thrown errors to populate (error).message or collected output,
and keep the break behavior identical.
- Around line 560-587: The stubbed followUpFeature currently calls
this.concurrencyManager.acquire (stored in runningEntry) and then always throws,
causing an unnecessary acquire/release cycle; remove the acquire and the
corresponding this.concurrencyManager.release call (and the runningEntry
variable) from the facade implementation so the method simply throws and
performs the existing error classification/event emission without reserving a
concurrency slot; if you want to keep a precondition check instead, explicitly
document and perform a non-mutating check (not acquire) using concurrencyManager
APIs, but do not call acquire/release in the stub.
- Around line 655-709: commitFeature currently calls
this.featureStateManager.loadFeature twice; hoist the feature load into a single
variable used throughout. Specifically, in commitFeature load the feature once
(e.g., const feature = await
this.featureStateManager.loadFeature(this.projectPath, featureId)) before the
worktree resolution logic, then use that feature for branchName lookup with
worktreeResolver.findWorktreeForBranch(...) and later for building the commit
message and event payload; remove the later redundant load so the function only
performs one feature load.
- Around line 14-48: The file imports exec and creates execAsync but those are
only used by verifyFeature; if you refactor verifyFeature to use spawnProcess
(or otherwise remove shell-based verification), delete the exec import and the
execAsync constant to avoid dead code; search for verifyFeature and replace its
use of execAsync with spawnProcess (or remove the verification block) and then
remove the now-unused exec/execAsync references from the top of facade.ts.
- Around line 146-159: Add a unit test to guard the deferred facade binding:
instantiate the module that calls create() (the factory that assigns
facadeInstance) and assert that after calling create() the closure-backed
accessor getFacade() returns a non-null AutoModeServiceFacade (and does not
throw). Target the public factory/function that performs the creation (the
create() call that sets facadeInstance) and exercise the callbacks passed to
PipelineOrchestrator/AutoLoopCoordinator/ExecutionService asynchronously to
ensure the invariant holds; fail the test if getFacade() throws or returns null.

In `@apps/server/src/services/feature-state-manager.ts`:
- Around line 335-344: The code currently copies all fields via
Object.assign(feature.planSpec, updates) and then always auto-increments
feature.planSpec.version when content changed, which causes a double-increment
if the caller supplied an explicit version; modify the logic in the update
routine (around oldContent, Object.assign(feature.planSpec, updates)) so that
you only auto-increment feature.planSpec.version when updates.content is
provided and different AND updates.version is undefined (i.e., do not increment
if the caller explicitly set updates.version); keep using the same symbols
(oldContent, Object.assign(feature.planSpec, updates), feature.planSpec.version)
to locate and change the conditional.

In `@apps/server/src/services/pipeline-orchestrator.ts`:
- Around line 582-583: The condition checking trimmed.includes('AssertionError')
is duplicated; update the second operand so it either removes the redundancy or
matches the intended alternate error string (e.g., replace the second
'AssertionError' with 'AssertionFailedError' or another correct variant),
keeping the logic that if either string is present you push trimmed onto
failedTests (refer to the trimmed variable and failedTests array in
pipeline-orchestrator.ts).
- Around line 373-380: The completion event is being emitted unconditionally
even when reloadedFeature.status === 'merge_conflict'; update the logic around
eventBus.emitAutoModeEvent('auto_mode_feature_complete', {...}) so it only emits
a successful completion when reloadedFeature.status indicates success (not
'merge_conflict'), and for the merge_conflict case either skip emission or emit
a distinct/adjusted event with passes: false and a clear message (e.g., 'Merge
conflict detected') referencing reloadedFeature.status, featureId,
feature.title/branchName and projectPath to construct the event payload; ensure
event emission happens after the status check that previously skipped the
update.
- Around line 504-552: The attemptMerge method calls
worktreeResolver.getCurrentBranch but the test/mock doesn't provide that method,
which can cause runtime failures; fix this by either (A) adding getCurrentBranch
to the worktreeResolver test mock to return a sensible default branch used as
the merge target, or (B) introduce a dedicated
WorktreeResolver.getDefaultMergeBranch() method (implement in WorktreeResolver)
and update attemptMerge to call getDefaultMergeBranch() (with a 'main' fallback)
so the intent is explicit; update the tests to mock the new method accordingly
and keep performMerge mocked as before.
- Around line 56-70: The PipelineOrchestrator constructor currently accepts 12
positional dependencies (TypedEventBus, FeatureStateManager, AgentExecutor,
TestRunnerService, WorktreeResolver, ConcurrencyManager, SettingsService|null,
UpdateFeatureStatusFn, loadContextFiles, BuildFeaturePromptFn, ExecuteFeatureFn,
RunAgentFn), which is unwieldy; refactor the constructor to accept a single
options/deps object (e.g. type PipelineOrchestratorOptions) that contains those
named properties, update the class to destructure that options object in the
constructor, and then update all call sites and tests that instantiate
PipelineOrchestrator to pass a named object instead of a long positional
argument list, preserving existing types and nullability for SettingsService and
keeping the same property names as in the original constructor.
🧹 Nitpick comments (8)
🤖 Fix all nitpicks with AI agents
Verify each finding against the current code and only fix it if needed.


In `@apps/server/src/services/agent-executor.ts`:
- Around line 518-520: The variables hasFeedback and hasEdits are currently set
to `.length` (a number or undefined) which relies on truthiness; change them to
explicit booleans to match their names — e.g., set hasFeedback =
Boolean(approvalResult.feedback?.trim().length) or hasFeedback =
(approvalResult.feedback?.trim().length ?? 0) > 0 and likewise for hasEdits —
then keep the existing conditional that throws the Error('Plan cancelled by
user') when both are false; update the expressions near approvalResult.feedback
and approvalResult.editedPlan to use the chosen explicit boolean form.
- Around line 52-276: The execute method is too large and mixes concerns;
extract the file-write setup and the stream-processing loop into private helper
methods to reduce cyclomatic complexity. Move the writeToFile, scheduleWrite,
and appendRawEvent logic into a private method (e.g., setupFileWriters or
flushHelpers) that returns the helper functions and cleanup handles; extract the
for-await-of provider.executeQuery loop body (the streaming logic that handles
msg types, text/tool blocks, spec detection and calls to
handleSpecGenerated/executeTasksLoop) into a private method (e.g.,
processStreamMessages) that accepts the stream iterator, options, callbacks, and
mutable state (responseText, specDetected, tasksCompleted, aborted) and returns
the updated state; keep execute as the orchestration layer that calls
setupFileWriters, then processStreamMessages, handles cleanup (clearing timers,
final writes) and returns the result.

In `@apps/server/src/services/auto-mode/facade.ts`:
- Around line 625-634: The loop in verifyFeature uses shell-based execAsync for
running verification commands (see the loop referencing execAsync and
verificationChecks), which risks shell metacharacter interpretation of workDir;
replace execAsync with the non-shell spawnProcess (or the same helper used by
execGitCommand) to run commands without a shell, passing args and cwd
explicitly, wiring stdout/stderr streams to capture output (concatenate or
prefer stdout then stderr), enforce the same timeout behavior, and preserve the
existing result handling (push {check: check.name, passed: true/false, output:
...} and set allPassed = false on error and break). Ensure you invoke
spawnProcess with the command split into program + args (from check.cmd), handle
process exit vs thrown errors to populate (error).message or collected output,
and keep the break behavior identical.
- Around line 560-587: The stubbed followUpFeature currently calls
this.concurrencyManager.acquire (stored in runningEntry) and then always throws,
causing an unnecessary acquire/release cycle; remove the acquire and the
corresponding this.concurrencyManager.release call (and the runningEntry
variable) from the facade implementation so the method simply throws and
performs the existing error classification/event emission without reserving a
concurrency slot; if you want to keep a precondition check instead, explicitly
document and perform a non-mutating check (not acquire) using concurrencyManager
APIs, but do not call acquire/release in the stub.
- Around line 655-709: commitFeature currently calls
this.featureStateManager.loadFeature twice; hoist the feature load into a single
variable used throughout. Specifically, in commitFeature load the feature once
(e.g., const feature = await
this.featureStateManager.loadFeature(this.projectPath, featureId)) before the
worktree resolution logic, then use that feature for branchName lookup with
worktreeResolver.findWorktreeForBranch(...) and later for building the commit
message and event payload; remove the later redundant load so the function only
performs one feature load.
- Around line 14-48: The file imports exec and creates execAsync but those are
only used by verifyFeature; if you refactor verifyFeature to use spawnProcess
(or otherwise remove shell-based verification), delete the exec import and the
execAsync constant to avoid dead code; search for verifyFeature and replace its
use of execAsync with spawnProcess (or remove the verification block) and then
remove the now-unused exec/execAsync references from the top of facade.ts.
- Around line 146-159: Add a unit test to guard the deferred facade binding:
instantiate the module that calls create() (the factory that assigns
facadeInstance) and assert that after calling create() the closure-backed
accessor getFacade() returns a non-null AutoModeServiceFacade (and does not
throw). Target the public factory/function that performs the creation (the
create() call that sets facadeInstance) and exercise the callbacks passed to
PipelineOrchestrator/AutoLoopCoordinator/ExecutionService asynchronously to
ensure the invariant holds; fail the test if getFacade() throws or returns null.

In `@apps/server/src/services/pipeline-orchestrator.ts`:
- Around line 56-70: The PipelineOrchestrator constructor currently accepts 12
positional dependencies (TypedEventBus, FeatureStateManager, AgentExecutor,
TestRunnerService, WorktreeResolver, ConcurrencyManager, SettingsService|null,
UpdateFeatureStatusFn, loadContextFiles, BuildFeaturePromptFn, ExecuteFeatureFn,
RunAgentFn), which is unwieldy; refactor the constructor to accept a single
options/deps object (e.g. type PipelineOrchestratorOptions) that contains those
named properties, update the class to destructure that options object in the
constructor, and then update all call sites and tests that instantiate
PipelineOrchestrator to pass a named object instead of a long positional
argument list, preserving existing types and nullability for SettingsService and
keeping the same property names as in the original constructor.
apps/server/src/services/auto-mode/facade.ts (5)

625-634: verifyFeature still uses shell-based execAsync for running verification commands.

While the commands themselves are hardcoded (npm run lint, etc.), execAsync spawns a shell, which means the cwd path is interpreted by the shell. Consider using spawnProcess (or a similar non-shell API) consistent with how execGitCommand works, to avoid any shell metacharacter interpretation of workDir.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/auto-mode/facade.ts` around lines 625 - 634, The
loop in verifyFeature uses shell-based execAsync for running verification
commands (see the loop referencing execAsync and verificationChecks), which
risks shell metacharacter interpretation of workDir; replace execAsync with the
non-shell spawnProcess (or the same helper used by execGitCommand) to run
commands without a shell, passing args and cwd explicitly, wiring stdout/stderr
streams to capture output (concatenate or prefer stdout then stderr), enforce
the same timeout behavior, and preserve the existing result handling (push
{check: check.name, passed: true/false, output: ...} and set allPassed = false
on error and break). Ensure you invoke spawnProcess with the command split into
program + args (from check.cmd), handle process exit vs thrown errors to
populate (error).message or collected output, and keep the break behavior
identical.

560-587: followUpFeature acquires a concurrency slot only to immediately release it.

The stub acquires a slot (line 560), then throws (line 569), which triggers the finally block to release (line 586). While this is safe, the acquire/release is wasted work. Since this is a known stub, the concurrency slot acquisition could be removed until the method is implemented — unless you specifically want to validate that the concurrency slot can be acquired as a precondition check.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/auto-mode/facade.ts` around lines 560 - 587, The
stubbed followUpFeature currently calls this.concurrencyManager.acquire (stored
in runningEntry) and then always throws, causing an unnecessary acquire/release
cycle; remove the acquire and the corresponding this.concurrencyManager.release
call (and the runningEntry variable) from the facade implementation so the
method simply throws and performs the existing error classification/event
emission without reserving a concurrency slot; if you want to keep a
precondition check instead, explicitly document and perform a non-mutating check
(not acquire) using concurrencyManager APIs, but do not call acquire/release in
the stub.

655-709: Feature is loaded twice in commitFeature — once for worktree resolution, once for commit message.

When providedWorktreePath is falsy, loadFeature is called at line 667, and then unconditionally again at line 686. Hoist the load to avoid the redundant I/O:

♻️ Proposed fix
   async commitFeature(featureId: string, providedWorktreePath?: string): Promise<string | null> {
     let workDir = this.projectPath;
+    const feature = await this.featureStateManager.loadFeature(this.projectPath, featureId);
 
     if (providedWorktreePath) {
       try {
         await secureFs.access(providedWorktreePath);
         workDir = providedWorktreePath;
       } catch {
         // Use project path
       }
     } else {
-      const feature = await this.featureStateManager.loadFeature(this.projectPath, featureId);
       const branchName = feature?.branchName;
       if (branchName) {
         ...
       }
     }
 
     try {
       ...
-      const feature = await this.featureStateManager.loadFeature(this.projectPath, featureId);
       const title =
         feature?.description?.split('\n')[0]?.substring(0, 60) || `Feature ${featureId}`;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/auto-mode/facade.ts` around lines 655 - 709,
commitFeature currently calls this.featureStateManager.loadFeature twice; hoist
the feature load into a single variable used throughout. Specifically, in
commitFeature load the feature once (e.g., const feature = await
this.featureStateManager.loadFeature(this.projectPath, featureId)) before the
worktree resolution logic, then use that feature for branchName lookup with
worktreeResolver.findWorktreeForBranch(...) and later for building the commit
message and event payload; remove the later redundant load so the function only
performs one feature load.

14-48: exec import and execAsync are only used by verifyFeature — consider removing if verification is refactored.

exec (line 15) and execAsync (line 48) exist solely for the shell-based verification commands. If verifyFeature is updated to use spawnProcess, these become dead imports. Minor cleanup to track.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/auto-mode/facade.ts` around lines 14 - 48, The file
imports exec and creates execAsync but those are only used by verifyFeature; if
you refactor verifyFeature to use spawnProcess (or otherwise remove shell-based
verification), delete the exec import and the execAsync constant to avoid dead
code; search for verifyFeature and replace its use of execAsync with
spawnProcess (or remove the verification block) and then remove the now-unused
exec/execAsync references from the top of facade.ts.

146-159: Deferred facade binding pattern is sound but fragile.

The getFacade() closure with a null-check guard is a reasonable approach. The comment documenting the invariant (callbacks are only invoked asynchronously, never during construction) is appreciated. However, if any downstream service changes to invoke a callback synchronously during construction, this will throw a cryptic error.

Consider adding a brief unit test that verifies facadeInstance is non-null after create() returns, to guard against regressions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/auto-mode/facade.ts` around lines 146 - 159, Add a
unit test to guard the deferred facade binding: instantiate the module that
calls create() (the factory that assigns facadeInstance) and assert that after
calling create() the closure-backed accessor getFacade() returns a non-null
AutoModeServiceFacade (and does not throw). Target the public factory/function
that performs the creation (the create() call that sets facadeInstance) and
exercise the callbacks passed to
PipelineOrchestrator/AutoLoopCoordinator/ExecutionService asynchronously to
ensure the invariant holds; fail the test if getFacade() throws or returns null.
apps/server/src/services/agent-executor.ts (2)

518-520: Potential type confusion: .length used as boolean without explicit comparison.

hasFeedback and hasEdits are assigned numeric values (from .length) or undefined, then used in boolean context. While this works in JS/TS due to truthiness, the variable names suggest boolean intent. This is minor but could confuse future readers.

Optional: make intent explicit
-          const hasFeedback = approvalResult.feedback?.trim().length,
-            hasEdits = approvalResult.editedPlan?.trim().length;
+          const hasFeedback = (approvalResult.feedback?.trim().length ?? 0) > 0;
+          const hasEdits = (approvalResult.editedPlan?.trim().length ?? 0) > 0;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/agent-executor.ts` around lines 518 - 520, The
variables hasFeedback and hasEdits are currently set to `.length` (a number or
undefined) which relies on truthiness; change them to explicit booleans to match
their names — e.g., set hasFeedback =
Boolean(approvalResult.feedback?.trim().length) or hasFeedback =
(approvalResult.feedback?.trim().length ?? 0) > 0 and likewise for hasEdits —
then keep the existing conditional that throws the Error('Plan cancelled by
user') when both are false; update the expressions near approvalResult.feedback
and approvalResult.editedPlan to use the chosen explicit boolean form.

52-276: The execute method is ~225 lines long with deeply nested logic.

This method handles streaming, spec detection, plan approval branching, file I/O scheduling, heartbeats, and raw output — all in one function. While functional, the high cyclomatic complexity makes it harder to test and maintain. Consider extracting the stream-processing loop (lines 182-260) and the file-write setup (lines 114-143) into focused private methods in a future pass.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/agent-executor.ts` around lines 52 - 276, The
execute method is too large and mixes concerns; extract the file-write setup and
the stream-processing loop into private helper methods to reduce cyclomatic
complexity. Move the writeToFile, scheduleWrite, and appendRawEvent logic into a
private method (e.g., setupFileWriters or flushHelpers) that returns the helper
functions and cleanup handles; extract the for-await-of provider.executeQuery
loop body (the streaming logic that handles msg types, text/tool blocks, spec
detection and calls to handleSpecGenerated/executeTasksLoop) into a private
method (e.g., processStreamMessages) that accepts the stream iterator, options,
callbacks, and mutable state (responseText, specDetected, tasksCompleted,
aborted) and returns the updated state; keep execute as the orchestration layer
that calls setupFileWriters, then processStreamMessages, handles cleanup
(clearing timers, final writes) and returns the result.
apps/server/src/services/pipeline-orchestrator.ts (1)

56-70: Constructor takes 12 parameters — consider a config/options object.

The constructor has grown to accept 12 dependencies. While this is common for orchestrators, it reduces readability and makes instantiation (especially in tests) cumbersome. A grouped options/deps object would improve ergonomics.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/pipeline-orchestrator.ts` around lines 56 - 70, The
PipelineOrchestrator constructor currently accepts 12 positional dependencies
(TypedEventBus, FeatureStateManager, AgentExecutor, TestRunnerService,
WorktreeResolver, ConcurrencyManager, SettingsService|null,
UpdateFeatureStatusFn, loadContextFiles, BuildFeaturePromptFn, ExecuteFeatureFn,
RunAgentFn), which is unwieldy; refactor the constructor to accept a single
options/deps object (e.g. type PipelineOrchestratorOptions) that contains those
named properties, update the class to destructure that options object in the
constructor, and then update all call sites and tests that instantiate
PipelineOrchestrator to pass a named object instead of a long positional
argument list, preserving existing types and nullability for SettingsService and
keeping the same property names as in the original constructor.

Comment on lines +377 to +384
const pn = detectPhaseCompleteMarker(text);
if (pn !== null)
this.eventBus.emitAutoModeEvent('auto_mode_phase_complete', {
featureId,
projectPath,
branchName,
phaseNumber: pn,
});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# Find the implementation of detectPhaseCompleteMarker and related marker detection functions
rg -n "detectPhaseCompleteMarker|detectTaskStartMarker|detectTaskCompleteMarker" --type=ts -C 3

Repository: AutoMaker-Org/automaker

Length of output: 14676


🏁 Script executed:

# Also search for where these functions are defined
rg -n "function detectPhaseCompleteMarker|export.*detectPhaseCompleteMarker" --type=ts -B 2 -A 10

Repository: AutoMaker-Org/automaker

Length of output: 1136


🏁 Script executed:

# Check the agent-executor.ts file around lines 354-384 to see the full context
cat -n apps/server/src/services/agent-executor.ts | sed -n '340,400p'

Repository: AutoMaker-Org/automaker

Length of output: 2740


detectPhaseCompleteMarker called on chunk text instead of accumulated taskOutput, inconsistent with other marker detections.

Lines 354 and 366 correctly detect task markers on the accumulated taskOutput, but line 377 detects the phase marker on the individual streaming chunk text. The unit tests (spec-parser.test.ts:267-276) confirm all three marker functions are designed to work on accumulated text. If a phase-complete marker is split across two chunks, it will be missed.

Additionally, unlike the task markers which have guards (if (!taskStartDetected) and if (!taskCompleteDetected)) to ensure single detection, the phase marker detection lacks such a guard, so changing to taskOutput would also need deduplication logic to prevent emitting the same phase-complete event multiple times as the accumulated text grows.

Proposed fix
-              const pn = detectPhaseCompleteMarker(text);
-              if (pn !== null)
+              if (!phaseCompleted && detectPhaseCompleteMarker(taskOutput) !== null) {
+                phaseCompleted = true;
+                const pn = detectPhaseCompleteMarker(taskOutput);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/agent-executor.ts` around lines 377 - 384,
detectPhaseCompleteMarker is being run on the streaming chunk variable text
instead of the accumulated taskOutput (so markers split across chunks can be
missed) and there’s no dedup guard like for
taskStartDetected/taskCompleteDetected; change the phase detection to run on
taskOutput (use detectPhaseCompleteMarker(taskOutput)) and add a boolean guard
(e.g., phaseCompleteDetected) similar to taskStartDetected/taskCompleteDetected
to ensure eventBus.emitAutoModeEvent('auto_mode_phase_complete', {...,
phaseNumber: pn}) is emitted only once per phase.

Comment on lines +178 to +249
async (
workDir: string,
featureId: string,
prompt: string,
abortController: AbortController,
pPath: string,
imagePaths?: string[],
model?: string,
opts?: Record<string, unknown>
) => {
const resolvedModel = model || 'claude-sonnet-4-20250514';
const provider = ProviderFactory.getProviderForModel(resolvedModel);
const effectiveBareModel = stripProviderPrefix(resolvedModel);

// Resolve custom provider (GLM, MiniMax, etc.) for baseUrl and credentials
let claudeCompatibleProvider:
| import('@automaker/types').ClaudeCompatibleProvider
| undefined;
let credentials: import('@automaker/types').Credentials | undefined;
if (resolvedModel && settingsService) {
const providerResult = await getProviderByModelId(
resolvedModel,
settingsService,
'[AutoModeFacade]'
);
if (providerResult.provider) {
claudeCompatibleProvider = providerResult.provider;
credentials = providerResult.credentials;
}
}

await agentExecutor.execute(
{
workDir,
featureId,
prompt,
projectPath: pPath,
abortController,
imagePaths,
model: resolvedModel,
planningMode: opts?.planningMode as PlanningMode | undefined,
requirePlanApproval: opts?.requirePlanApproval as boolean | undefined,
previousContent: opts?.previousContent as string | undefined,
systemPrompt: opts?.systemPrompt as string | undefined,
autoLoadClaudeMd: opts?.autoLoadClaudeMd as boolean | undefined,
thinkingLevel: opts?.thinkingLevel as ThinkingLevel | undefined,
branchName: opts?.branchName as string | null | undefined,
provider,
effectiveBareModel,
credentials,
claudeCompatibleProvider,
},
{
waitForApproval: (fId, projPath) => planApprovalService.waitForApproval(fId, projPath),
saveFeatureSummary: (projPath, fId, summary) =>
featureStateManager.saveFeatureSummary(projPath, fId, summary),
updateFeatureSummary: (projPath, fId, summary) =>
featureStateManager.saveFeatureSummary(projPath, fId, summary),
buildTaskPrompt: (task, allTasks, taskIndex, _planContent, template, feedback) => {
let taskPrompt = template
.replace(/\{\{taskName\}\}/g, task.description)
.replace(/\{\{taskIndex\}\}/g, String(taskIndex + 1))
.replace(/\{\{totalTasks\}\}/g, String(allTasks.length))
.replace(/\{\{taskDescription\}\}/g, task.description || `Task ${task.id}`);
if (feedback) {
taskPrompt = taskPrompt.replace(/\{\{userFeedback\}\}/g, feedback);
}
return taskPrompt;
},
}
);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Substantial duplication between the two runAgentFn callbacks.

The PipelineOrchestrator callback (lines 178–249) and ExecutionService callback (lines 297–374) contain nearly identical provider-resolution + agentExecutor.execute logic. This violates DRY and makes it easy for fixes to land in one copy but not the other (as evidenced by the task.description bug above).

Extract a shared helper, e.g.:

const resolveAndExecuteAgent = async (
  workDir: string, featureId: string, prompt: string,
  abortController: AbortController, pPath: string,
  imagePaths?: string[], model?: string,
  opts?: Record<string, unknown>
) => {
  const resolvedModel = model || 'claude-sonnet-4-20250514';
  const provider = ProviderFactory.getProviderForModel(resolvedModel);
  const effectiveBareModel = stripProviderPrefix(resolvedModel);
  // ... shared provider resolution ...
  await agentExecutor.execute(/* ... */);
};

Then pass resolveAndExecuteAgent to both PipelineOrchestrator and ExecutionService.

Also applies to: 291-400

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/auto-mode/facade.ts` around lines 178 - 249, Extract
the duplicated provider-resolution + agentExecutor.execute logic into a shared
async helper (e.g. resolveAndExecuteAgent) that takes the same parameters shown
in the diff (workDir, featureId, prompt, abortController, pPath, imagePaths?,
model?, opts?) and performs: resolveModel defaulting to
'claude-sonnet-4-20250514', call ProviderFactory.getProviderForModel and
stripProviderPrefix, call getProviderByModelId to populate
claudeCompatibleProvider and credentials, then invoke agentExecutor.execute with
the same options and callbacks (use planApprovalService.waitForApproval and
featureStateManager.saveFeatureSummary for save/update, and keep the
buildTaskPrompt logic). Replace the anonymous async callbacks passed to both
PipelineOrchestrator and ExecutionService with calls to this new
resolveAndExecuteAgent helper so all provider-resolution and agent execution
logic lives in one place.

model?: string,
opts?: Record<string, unknown>
) => {
const resolvedModel = model || 'claude-sonnet-4-20250514';
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n apps/server/src/services/auto-mode/facade.ts | sed -n '180,200p'

Repository: AutoMaker-Org/automaker

Length of output: 1103


🏁 Script executed:

cat -n apps/server/src/services/auto-mode/facade.ts | sed -n '310,320p'

Repository: AutoMaker-Org/automaker

Length of output: 621


🏁 Script executed:

rg -n "resolveModelString" apps/server/src/ --type ts -B 2 -A 2

Repository: AutoMaker-Org/automaker

Length of output: 7920


🏁 Script executed:

fd -e ts -e tsx apps/server/src/ -exec grep -l "resolveModelString" {} \;

Repository: AutoMaker-Org/automaker

Length of output: 320


Use resolveModelString() instead of hardcoding model fallback.

Lines 188 and 315 both use a hardcoded fallback to 'claude-sonnet-4-20250514' instead of resolving a model alias. Replace the hardcoded string with resolveModelString(model || 'sonnet') to align with the coding guidelines requirement to convert model aliases to full model names before making API calls.

See usage pattern elsewhere in codebase
// From execution-service.ts
const model = resolveModelString(feature.model, DEFAULT_MODELS.claude);

// From enhance.ts
resolveModelString(model, CLAUDE_MODEL_MAP.sonnet)

// From ideation-service.ts
let modelId = resolveModelString(options?.model ?? 'sonnet');
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/auto-mode/facade.ts` at line 188, The code hardcodes
a model fallback ("claude-sonnet-4-20250514") when computing resolvedModel;
replace that by calling resolveModelString with the alias fallback: use
resolveModelString(model || 'sonnet') where resolvedModel is assigned (and do
the same replacement at the other occurrence around line 315), ensuring you
import or reference resolveModelString and remove the hardcoded full model name
so aliases are converted to full model IDs before API calls.

Comment on lines +996 to +1008
async markAllRunningFeaturesInterrupted(reason?: string): Promise<void> {
const allRunning = this.concurrencyManager.getAllRunning();

for (const rf of allRunning) {
await this.featureStateManager.markFeatureInterrupted(rf.projectPath, rf.featureId, reason);
}

if (allRunning.length > 0) {
logger.info(
`Marked ${allRunning.length} running feature(s) as interrupted: ${reason || 'no reason provided'}`
);
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check how markAllRunningFeaturesInterrupted is called
rg -n "markAllRunningFeaturesInterrupted" apps/server/src/ --type ts -B 3 -A 3

Repository: AutoMaker-Org/automaker

Length of output: 3101


🏁 Script executed:

#!/bin/bash
# Examine the facade class structure
head -100 apps/server/src/services/auto-mode/facade.ts | cat -n

Repository: AutoMaker-Org/automaker

Length of output: 4985


🏁 Script executed:

#!/bin/bash
# Find concurrencyManager definition and getAllRunning implementation
rg -n "getAllRunning" apps/server/src/ --type ts -B 2 -A 5

Repository: AutoMaker-Org/automaker

Length of output: 8633


🏁 Script executed:

#!/bin/bash
# Check what autoModeService is at the call site in index.ts
sed -n '870,895p' apps/server/src/index.ts | cat -n

Repository: AutoMaker-Org/automaker

Length of output: 1135


🏁 Script executed:

#!/bin/bash
# Compare getStatus and getRunningAgents implementations in facade vs global-service
echo "=== FACADE getStatus ===" && \
sed -n '718,724p' apps/server/src/services/auto-mode/facade.ts && \
echo -e "\n=== GLOBAL getStatus ===" && \
sed -n '121,127p' apps/server/src/services/auto-mode/global-service.ts && \
echo -e "\n=== FACADE getRunningAgents ===" && \
sed -n '773,800p' apps/server/src/services/auto-mode/facade.ts && \
echo -e "\n=== GLOBAL getRunningAgents ===" && \
sed -n '151,180p' apps/server/src/services/auto-mode/global-service.ts

Repository: AutoMaker-Org/automaker

Length of output: 2549


🏁 Script executed:

#!/bin/bash
# Find where autoModeService is instantiated in index.ts
rg -n "autoModeService\s*=" apps/server/src/index.ts | head -5

Repository: AutoMaker-Org/automaker

Length of output: 161


🏁 Script executed:

#!/bin/bash
# Check routes to see how facade methods are called
rg -n "getStatus\|getRunningAgents\|markAllRunningFeaturesInterrupted" apps/server/src/routes/ --type ts -B 2 -A 1 | head -40

Repository: AutoMaker-Org/automaker

Length of output: 49


🏁 Script executed:

#!/bin/bash
# View the full compat.ts to understand the architecture
cat -n apps/server/src/services/auto-mode/compat.ts | head -100

Repository: AutoMaker-Org/automaker

Length of output: 4253


🏁 Script executed:

#!/bin/bash
# Search for direct facade method calls to see if these global-scope methods are actually used
rg -n "facade\.(getStatus|getRunningAgents|markAllRunningFeaturesInterrupted)" apps/server/src/ --type ts

Repository: AutoMaker-Org/automaker

Length of output: 261


🏁 Script executed:

#!/bin/bash
# Check how facade is used in routes - are these methods accessed through compat or directly?
rg -n "createFacade\|\.getStatus\(\)\|\.getRunningAgents\(\)" apps/server/src/routes/ --type ts -B 3 -A 2 | head -50

Repository: AutoMaker-Org/automaker

Length of output: 49


🏁 Script executed:

#!/bin/bash
# Check compat.ts for per-project status methods
grep -n "getStatusForProject\|getRunningFeaturesForProject" apps/server/src/services/auto-mode/compat.ts

Repository: AutoMaker-Org/automaker

Length of output: 153


Facade methods should be project-scoped, not global.

markAllRunningFeaturesInterrupted, getStatus(), and getRunningAgents() on the per-project facade operate on all running features globally instead of filtering by this.projectPath. This is inconsistent with the facade's design (per-project factory pattern) and could mark features from other projects as interrupted. Filter these methods by this.projectPath:

🐛 Proposed fix
   async markAllRunningFeaturesInterrupted(reason?: string): Promise<void> {
-    const allRunning = this.concurrencyManager.getAllRunning();
+    const allRunning = this.concurrencyManager.getAllRunning()
+      .filter(rf => rf.projectPath === this.projectPath);
 
     for (const rf of allRunning) {

Also apply to getStatus() (line 719) and getRunningAgents() (line 775).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/auto-mode/facade.ts` around lines 996 - 1008, These
facade methods (markAllRunningFeaturesInterrupted, getStatus, getRunningAgents)
operate globally by calling concurrencyManager.getAllRunning() and touching
features without scoping to this.projectPath; update each method to first filter
the running feature list by rf.projectPath === this.projectPath (e.g., const
projectRunning = this.concurrencyManager.getAllRunning().filter(rf =>
rf.projectPath === this.projectPath)) and then operate on projectRunning (call
this.featureStateManager.markFeatureInterrupted for
markAllRunningFeaturesInterrupted and use the filtered list for getStatus() and
getRunningAgents()) so only features belonging to this.projectPath are affected
and reported.

Comment on lines +335 to +344
// Capture old content BEFORE applying updates for version comparison
const oldContent = feature.planSpec.content;

// Apply updates
Object.assign(feature.planSpec, updates);

// If content is being updated and it's different from old content, increment version
if (updates.content !== undefined && updates.content !== oldContent) {
feature.planSpec.version = (feature.planSpec.version || 0) + 1;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Version double-increment when updates includes both content and version.

Object.assign(feature.planSpec, updates) at line 339 applies all updates (including any explicit version), then lines 342-343 unconditionally increment version again if content changed. If a caller passes { content: '...', version: 5 }, the version ends up as 6 instead of the intended 5.

Proposed fix — only auto-increment when version isn't explicitly provided
      // If content is being updated and it's different from old content, increment version
-     if (updates.content !== undefined && updates.content !== oldContent) {
+     if (updates.content !== undefined && updates.content !== oldContent && updates.version === undefined) {
        feature.planSpec.version = (feature.planSpec.version || 0) + 1;
      }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Capture old content BEFORE applying updates for version comparison
const oldContent = feature.planSpec.content;
// Apply updates
Object.assign(feature.planSpec, updates);
// If content is being updated and it's different from old content, increment version
if (updates.content !== undefined && updates.content !== oldContent) {
feature.planSpec.version = (feature.planSpec.version || 0) + 1;
}
// Capture old content BEFORE applying updates for version comparison
const oldContent = feature.planSpec.content;
// Apply updates
Object.assign(feature.planSpec, updates);
// If content is being updated and it's different from old content, increment version
if (updates.content !== undefined && updates.content !== oldContent && updates.version === undefined) {
feature.planSpec.version = (feature.planSpec.version || 0) + 1;
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/feature-state-manager.ts` around lines 335 - 344,
The code currently copies all fields via Object.assign(feature.planSpec,
updates) and then always auto-increments feature.planSpec.version when content
changed, which causes a double-increment if the caller supplied an explicit
version; modify the logic in the update routine (around oldContent,
Object.assign(feature.planSpec, updates)) so that you only auto-increment
feature.planSpec.version when updates.content is provided and different AND
updates.version is undefined (i.e., do not increment if the caller explicitly
set updates.version); keep using the same symbols (oldContent,
Object.assign(feature.planSpec, updates), feature.planSpec.version) to locate
and change the conditional.

Comment on lines +373 to +380
this.eventBus.emitAutoModeEvent('auto_mode_feature_complete', {
featureId,
featureName: feature.title,
branchName: feature.branchName ?? null,
passes: true,
message: 'Pipeline resumed successfully',
projectPath,
});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Completion event emitted even when pipeline ended in merge_conflict.

When reloadedFeature.status === 'merge_conflict' (line 369), the status update is correctly skipped. However, the auto_mode_feature_complete event at lines 373-380 still fires with passes: true and message: 'Pipeline resumed successfully', which is misleading for a merge-conflict outcome.

Proposed fix
      // Only update status if not already in a terminal state
      if (reloadedFeature && reloadedFeature.status !== 'merge_conflict') {
        await this.updateFeatureStatusFn(projectPath, featureId, finalStatus);
+       this.eventBus.emitAutoModeEvent('auto_mode_feature_complete', {
+         featureId,
+         featureName: feature.title,
+         branchName: feature.branchName ?? null,
+         passes: true,
+         message: 'Pipeline resumed successfully',
+         projectPath,
+       });
+     } else {
+       logger.info(`Pipeline resume for feature ${featureId} ended with merge conflict`);
      }
-     logger.info(`Pipeline resume completed for feature ${featureId}`);
-     this.eventBus.emitAutoModeEvent('auto_mode_feature_complete', {
-       featureId,
-       featureName: feature.title,
-       branchName: feature.branchName ?? null,
-       passes: true,
-       message: 'Pipeline resumed successfully',
-       projectPath,
-     });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/pipeline-orchestrator.ts` around lines 373 - 380,
The completion event is being emitted unconditionally even when
reloadedFeature.status === 'merge_conflict'; update the logic around
eventBus.emitAutoModeEvent('auto_mode_feature_complete', {...}) so it only emits
a successful completion when reloadedFeature.status indicates success (not
'merge_conflict'), and for the merge_conflict case either skip emission or emit
a distinct/adjusted event with passes: false and a clear message (e.g., 'Merge
conflict detected') referencing reloadedFeature.status, featureId,
feature.title/branchName and projectPath to construct the event payload; ensure
event emission happens after the status check that previously skipped the
update.

Comment on lines +504 to +552
/** Attempt to merge feature branch (REQ-F05) */
async attemptMerge(context: PipelineContext): Promise<MergeResult> {
const { projectPath, featureId, branchName, worktreePath, feature } = context;
if (!branchName) return { success: false, error: 'No branch name for merge' };

logger.info(`Attempting auto-merge for feature ${featureId} (branch: ${branchName})`);
try {
// Get the primary branch dynamically instead of hardcoding 'main'
const targetBranch = await this.worktreeResolver.getCurrentBranch(projectPath);

// Call merge service directly instead of HTTP fetch
const result = await performMerge(
projectPath,
branchName,
worktreePath || projectPath,
targetBranch || 'main',
{
deleteWorktreeAndBranch: false,
}
);

if (!result.success) {
if (result.hasConflicts) {
await this.updateFeatureStatusFn(projectPath, featureId, 'merge_conflict');
this.eventBus.emitAutoModeEvent('pipeline_merge_conflict', {
featureId,
branchName,
projectPath,
});
return { success: false, hasConflicts: true, needsAgentResolution: true };
}
return { success: false, error: result.error };
}

logger.info(`Auto-merge successful for feature ${featureId}`);
this.eventBus.emitAutoModeEvent('auto_mode_feature_complete', {
featureId,
featureName: feature.title,
branchName,
passes: true,
message: 'Pipeline completed and merged',
projectPath,
});
return { success: true };
} catch (error) {
logger.error(`Merge failed for ${featureId}:`, error);
return { success: false, error: (error as Error).message };
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Check test files that mock worktreeResolver and look for getCurrentBranch
rg -n "worktreeResolver" --type=ts -g '*test*' -A 10 | head -80

Repository: AutoMaker-Org/automaker

Length of output: 49


🏁 Script executed:

# Find the WorktreeResolver class definition and check its methods
fd -e ts -e tsx | xargs rg -l "class WorktreeResolver|export.*WorktreeResolver" | head -5

Repository: AutoMaker-Org/automaker

Length of output: 112


🏁 Script executed:

# Look for any definitions of getCurrentBranch in the codebase
rg -n "getCurrentBranch" --type=ts -B 3 -A 3 | head -60

Repository: AutoMaker-Org/automaker

Length of output: 4889


🏁 Script executed:

# Check for dedicated primary/default branch methods
rg -n "getPrimaryBranch|getDefaultBranch|getMainBranch|primaryBranch|defaultBranch" --type=ts | head -40

Repository: AutoMaker-Org/automaker

Length of output: 3623


🏁 Script executed:

# Check the actual attemptMerge implementation and surrounding context
cat -n apps/server/src/services/pipeline-orchestrator.ts | sed -n '500,560p'

Repository: AutoMaker-Org/automaker

Length of output: 2583


🏁 Script executed:

# Find pipeline-orchestrator test files
fd -e test.ts -e spec.ts | xargs rg -l "pipeline-orchestrator" 2>/dev/null | head -10

Repository: AutoMaker-Org/automaker

Length of output: 128


🏁 Script executed:

# Also search for any test files that instantiate or mock PipelineOrchestrator
rg -l "PipelineOrchestrator" --type=ts -g '*test*' -g '*spec*' 2>/dev/null

Repository: AutoMaker-Org/automaker

Length of output: 128


🏁 Script executed:

# Check if there's a test file for pipeline-orchestrator specifically
fd "pipeline-orchestrator" -e test.ts -e spec.ts 2>/dev/null

Repository: AutoMaker-Org/automaker

Length of output: 128


🏁 Script executed:

# Look at the worktreeResolver initialization in pipeline-orchestrator to understand how it's used
cat -n apps/server/src/services/pipeline-orchestrator.ts | head -100

Repository: AutoMaker-Org/automaker

Length of output: 4572


🏁 Script executed:

# Check the test file for pipeline-orchestrator to see how worktreeResolver is mocked
cat -n apps/server/tests/unit/services/pipeline-orchestrator.test.ts | head -150

Repository: AutoMaker-Org/automaker

Length of output: 5757


🏁 Script executed:

# Search for worktreeResolver mock setup in the test
rg -n "worktreeResolver" apps/server/tests/unit/services/pipeline-orchestrator.test.ts -B 3 -A 3

Repository: AutoMaker-Org/automaker

Length of output: 49


🏁 Script executed:

# Check if attemptMerge test exists and how it mocks worktreeResolver
rg -n "attemptMerge\|getCurrentBranch" apps/server/tests/unit/services/pipeline-orchestrator.test.ts -B 5 -A 5

Repository: AutoMaker-Org/automaker

Length of output: 49


🏁 Script executed:

# Continue reading the test file to see the mock setup for worktreeResolver
cat -n apps/server/tests/unit/services/pipeline-orchestrator.test.ts | sed -n '150,250p'

Repository: AutoMaker-Org/automaker

Length of output: 4058


🏁 Script executed:

# Search for any test that calls attemptMerge in the entire test file
rg -n "attemptMerge" apps/server/tests/unit/services/pipeline-orchestrator.test.ts

Repository: AutoMaker-Org/automaker

Length of output: 528


🏁 Script executed:

# Get the complete worktreeResolver mock setup
rg -n "mockWorktreeResolver\|worktreeResolver.*=" apps/server/tests/unit/services/pipeline-orchestrator.test.ts -A 5

Repository: AutoMaker-Org/automaker

Length of output: 49


🏁 Script executed:

# Get the attemptMerge test cases to see the details
cat -n apps/server/tests/unit/services/pipeline-orchestrator.test.ts | sed -n '464,570p'

Repository: AutoMaker-Org/automaker

Length of output: 4124


🏁 Script executed:

# Check if getCurrentBranch is mocked anywhere in the test file
rg -n "getCurrentBranch" apps/server/tests/unit/services/pipeline-orchestrator.test.ts

Repository: AutoMaker-Org/automaker

Length of output: 49


🏁 Script executed:

# Check if the test actually runs without mocking performMerge or if there's a path that exercises getCurrentBranch
rg -n "beforeEach\|afterEach" apps/server/tests/unit/services/pipeline-orchestrator.test.ts -A 3 | head -40

Repository: AutoMaker-Org/automaker

Length of output: 49


🏁 Script executed:

# Check the import of performMerge and see how it's used in tests
rg -n "performMerge\|vi.mocked" apps/server/tests/unit/services/pipeline-orchestrator.test.ts | head -20

Repository: AutoMaker-Org/automaker

Length of output: 49


🏁 Script executed:

# Look at WorktreeResolver to confirm getCurrentBranch is a real method
cat -n apps/server/src/services/worktree-resolver.ts | sed -n '40,60p'

Repository: AutoMaker-Org/automaker

Length of output: 811


🏁 Script executed:

# Check if there are alternative methods like getDefaultBranch or similar
rg -n "getDefault\|getPrimary" apps/server/src/services/worktree-resolver.ts

Repository: AutoMaker-Org/automaker

Length of output: 49


Mock needs getCurrentBranch method to prevent runtime failures.

The test mock for worktreeResolver (pipeline-orchestrator.test.ts line 166-168) only mocks findWorktreeForBranch and is missing getCurrentBranch. While the actual WorktreeResolver class has this method (worktree-resolver.ts line 48), the mock doesn't. Tests currently pass because performMerge is mocked, but getCurrentBranch is never validated. If attemptMerge runs with an incomplete mock or in production, it will fail at line 512.

Consider a dedicated method like getDefaultMergeBranch() in WorktreeResolver instead of using getCurrentBranch as a proxy. The current approach conflates the currently checked-out branch with the merge target branch, which may differ if someone manually checks out a feature branch in the main worktree.

🧰 Tools
🪛 GitHub Actions: Test Suite

[error] 512-512: Merge failed: TypeError: this.worktreeResolver.getCurrentBranch is not a function during attemptMerge

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/pipeline-orchestrator.ts` around lines 504 - 552,
The attemptMerge method calls worktreeResolver.getCurrentBranch but the
test/mock doesn't provide that method, which can cause runtime failures; fix
this by either (A) adding getCurrentBranch to the worktreeResolver test mock to
return a sensible default branch used as the merge target, or (B) introduce a
dedicated WorktreeResolver.getDefaultMergeBranch() method (implement in
WorktreeResolver) and update attemptMerge to call getDefaultMergeBranch() (with
a 'main' fallback) so the intent is explicit; update the tests to mock the new
method accordingly and keep performMerge mocked as before.

Comment on lines +582 to +583
if (trimmed.includes('AssertionError') || trimmed.includes('AssertionError')) {
failedTests.push(trimmed);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Duplicate identical condition — likely a copy-paste error.

Both sides of the || check for the exact same string 'AssertionError', making the second operand redundant. One was likely intended to catch a common misspelling (e.g., 'AssertionError' vs 'AssertionError').

Proposed fix
-      if (trimmed.includes('AssertionError') || trimmed.includes('AssertionError')) {
+      if (trimmed.includes('AssertionError')) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/server/src/services/pipeline-orchestrator.ts` around lines 582 - 583,
The condition checking trimmed.includes('AssertionError') is duplicated; update
the second operand so it either removes the redundancy or matches the intended
alternate error string (e.g., replace the second 'AssertionError' with
'AssertionFailedError' or another correct variant), keeping the logic that if
either string is present you push trimmed onto failedTests (refer to the trimmed
variable and failedTests array in pipeline-orchestrator.ts).

@gsxdsm gsxdsm merged commit 46dd219 into AutoMaker-Org:v0.15.0rc Feb 16, 2026
6 checks passed
@gsxdsm gsxdsm deleted the refactor/auto-mode-service-gsxdsm branch February 16, 2026 21:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Do Not Merge Use this label if something should not be merged. Refactor A complete logic rewrite is requested or being performed for an issue. Testers-Requested Request for others to test an enhancement or bug fix/etc. Tests Adding / Updating / Removing tests across the project.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants