Project Name: Claude Code Watchdog
Duration: 4 weeks (4 sprints)
Team Size: 1-2 developers
Methodology: Agile with 1-week sprints
- Sprint Duration: 1 week (5 working days)
- Sprint Planning: Monday morning (1 hour)
- Daily Standups: 15 minutes (async via logs acceptable)
- Sprint Review: Friday afternoon (1 hour)
- Sprint Retrospective: Friday afternoon (30 minutes)
Sprint 1: Foundation
Dates: Week 1
Goal: Build the foundational watchdog process with basic monitoring capabilities
Success Criteria: Watchdog can monitor a single Claude Code session and auto-continue on TODOs
Work Item: WI-1.1
Priority: P0 (Critical)
Estimated Effort: 2 hours
Assigned To: Developer
Dependencies: None
Description: Create the complete directory structure and placeholder files for the project
Acceptance Criteria:
- All directories created as per architecture
- All PowerShell files created with function signatures
- Module imports working
- Basic script execution verified
Tasks:
- Create src/ directory with all subdirectories
- Create config/, docs/, tests/, examples/ directories
- Create all .ps1 files with function signatures
- Add module imports and dot-sourcing
- Verify structure with test import
Work Item: WI-1.2
Priority: P0 (Critical)
Estimated Effort: 3 hours
Assigned To: Developer
Dependencies: WI-1.1
Description: Create wrapper functions for Windows MCP tools (State, Click, Type, Key)
Acceptance Criteria:
- State-Tool wrapper functional
- Click-Tool wrapper functional
- Type-Tool wrapper functional
- Key-Tool wrapper functional
- Error handling implemented
- Unit tests passing
Tasks:
- Implement `Invoke-WindowsMCPStateTool`
- Implement `Invoke-WindowsMCPClick`
- Implement `Invoke-WindowsMCPType`
- Implement `Invoke-WindowsMCPKey`
- Add retry logic with exponential backoff
- Write unit tests for each function
- Test with live Claude Code session
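The retry-with-backoff task above could look roughly like the sketch below. It is written in Python for brevity rather than the project's PowerShell, and `retry_with_backoff`, its parameters, and the default delays are illustrative assumptions, not part of the plan.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 3,        # assumed default
    base_delay: float = 1.0,      # assumed: seconds before the first retry
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`, retrying on any exception with doubling delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles each time: base, 2*base, 4*base, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps unit tests fast, since a test can record the requested delays instead of waiting for them.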
Work Item: WI-1.3
Priority: P0 (Critical)
Estimated Effort: 4 hours
Assigned To: Developer
Dependencies: WI-1.2
Description: Implement state detection logic to classify Claude Code session states
Acceptance Criteria:
- Detects all 6 states correctly (InProgress, WaitingForInput, HasTodos, PhaseComplete, Error, Idle)
- Parses TODOs with count and status
- Detects errors and warnings
- Calculates idle time accurately
- Identifies reply field coordinates
- 95%+ accuracy on test cases
Tasks:
- Implement `Get-ClaudeCodeState` main function
- Implement `Get-SessionStatus` classification logic
- Implement `Get-TodosFromUI` parser
- Implement `Get-ErrorsFromUI` parser
- Implement `Test-ProcessingIndicator`
- Create test fixtures with sample UI states
- Run validation tests
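One possible shape for the classification logic is an ordered series of checks over the captured UI text. The Python sketch below is only an illustration: the checkbox TODO format, the error keywords, the "phase complete" marker, and the 600-second idle threshold are all assumptions that would need to be replaced by patterns taken from real fixtures.

```python
import re
from enum import Enum


class SessionState(Enum):
    IN_PROGRESS = "InProgress"
    WAITING_FOR_INPUT = "WaitingForInput"
    HAS_TODOS = "HasTodos"
    PHASE_COMPLETE = "PhaseComplete"
    ERROR = "Error"
    IDLE = "Idle"


# Hypothetical markers; real UI strings must come from captured fixtures.
TODO_PATTERN = re.compile(r"^[ \t]*[-*] \[( |x)\] (.+)$", re.MULTILINE)
ERROR_PATTERN = re.compile(r"^(error|exception)\b", re.IGNORECASE | re.MULTILINE)


def parse_todos(ui_text):
    """Return (description, done) pairs for checkbox-style TODO lines."""
    return [(m.group(2), m.group(1) == "x") for m in TODO_PATTERN.finditer(ui_text)]


def classify(ui_text, is_processing, idle_seconds, idle_threshold=600):
    """Pick exactly one state; checks run from most to least urgent."""
    if ERROR_PATTERN.search(ui_text):
        return SessionState.ERROR
    if is_processing:
        return SessionState.IN_PROGRESS
    if "phase complete" in ui_text.lower():
        return SessionState.PHASE_COMPLETE
    if any(not done for _, done in parse_todos(ui_text)):
        return SessionState.HAS_TODOS
    if idle_seconds >= idle_threshold:
        return SessionState.IDLE
    return SessionState.WAITING_FOR_INPUT
```

Ordering the checks matters: an error on screen wins over open TODOs, so the watchdog never auto-continues past a failure.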
Work Item: WI-1.4
Priority: P0 (Critical)
Estimated Effort: 3 hours
Assigned To: Developer
Dependencies: WI-1.3
Description: Implement simple rule-based decision logic (no API yet)
Acceptance Criteria:
- Returns correct action for each state
- Reasoning is clear and actionable
- Confidence scores appropriate
- Handles edge cases gracefully
- Decision logging implemented
Tasks:
- Implement `Invoke-SimpleDecision` function
- Create rule set for each state
- Add confidence calculation logic
- Implement decision history tracking
- Add unit tests for all decision paths
- Test with simulated states
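A rule-based engine of this kind can be as small as a lookup table from state to action. The following Python sketch assumes one rule per state; the action names, reasoning strings, and confidence values are placeholders chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    action: str
    reasoning: str
    confidence: float


# Hypothetical rule table: one rule per state named in the acceptance criteria.
RULES = {
    "InProgress": Decision("wait", "Claude Code is still working", 0.95),
    "HasTodos": Decision("continue", "Unfinished TODOs remain", 0.90),
    "PhaseComplete": Decision("commit", "Phase finished; checkpoint the work", 0.85),
    "Error": Decision("notify", "Error detected; rules cannot resolve it", 0.70),
    "WaitingForInput": Decision("notify", "Session needs a human answer", 0.60),
    "Idle": Decision("check", "No activity; confirm the session is alive", 0.50),
}


def decide(state, history):
    """Look up the rule for `state`, record it in `history`, and return it."""
    # Unknown states fall back to a zero-confidence notification (edge case).
    decision = RULES.get(state, Decision("notify", f"Unknown state {state!r}", 0.0))
    history.append((state, decision))
    return decision
```

Keeping the rules in data rather than in branching code makes it straightforward to unit-test every decision path by iterating over the table.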
Work Item: WI-1.5
Priority: P0 (Critical)
Estimated Effort: 4 hours
Assigned To: Developer
Dependencies: WI-1.2
Description: Implement command sending to Claude Code with retry and verification
Acceptance Criteria:
- Commands sent successfully to Claude Code
- Retry logic works (3 attempts)
- Verification detects send failures
- Handles UI quirks (timing, focus)
- Logs all command attempts
Tasks:
- Implement `Send-ClaudeCodeCommand` function
- Add reply field detection logic
- Implement click → type → enter sequence
- Add verification logic
- Implement retry with exponential backoff
- Add comprehensive error handling
- Test with live Claude Code session
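The click, type, enter sequence with verification and retry might be structured as below. This is a Python sketch under stated assumptions: `ui` stands in for the Windows MCP wrappers, and its five method names (`find_reply_field`, `click`, `type_text`, `press_key`, `was_sent`) are hypothetical.

```python
import time


def send_command(text, ui, max_attempts=3, base_delay=1.0, sleep=time.sleep, log=print):
    """Click the reply field, type `text`, press Enter, then verify the send.

    `ui` is any object exposing find_reply_field(), click(x, y), type_text(s),
    press_key(name) and was_sent(s); all five names are assumptions.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            x, y = ui.find_reply_field()
            ui.click(x, y)            # give the reply field focus first
            ui.type_text(text)
            ui.press_key("enter")
            if ui.was_sent(text):     # verification step
                log(f"attempt {attempt}: sent {text!r}")
                return True
            log(f"attempt {attempt}: send not verified")
        except Exception as exc:
            log(f"attempt {attempt}: {exc}")
        if attempt < max_attempts:
            # Exponential backoff between attempts: base, 2*base, ...
            sleep(base_delay * 2 ** (attempt - 1))
    return False
```

Every attempt is logged whether it succeeds or not, which matches the criterion that all command attempts appear in the logs.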
Work Item: WI-1.6
Priority: P0 (Critical)
Estimated Effort: 3 hours
Assigned To: Developer
Dependencies: WI-1.1
Description: Build system to register and manage multiple projects
Acceptance Criteria:
- Can register new projects
- Validates project configurations
- Creates necessary state files
- Stores registry in ~/.claude-automation/
- Can list registered projects
- Can pause/resume projects
Tasks:
- Implement `Register-Project` function
- Implement `Test-ProjectConfiguration` validation
- Implement `Initialize-ProjectState` setup
- Create `Get-RegisteredProjects` function
- Create `Update-ProjectState` function
- Add JSON schema validation
- Test with sample project configs
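Registration and validation could follow the pattern sketched here in Python. The required keys (`name`, `path`), the `registry.json` file name, and the `status` field are assumptions; the plan only specifies that the registry lives under `~/.claude-automation/`.

```python
import json
from pathlib import Path

REQUIRED_KEYS = {"name", "path"}  # assumed minimal schema


def validate_config(config):
    """Return a list of problems; an empty list means the config is valid."""
    problems = [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - set(config))]
    if "name" in config and not str(config["name"]).strip():
        problems.append("name must not be empty")
    return problems


def register_project(config, registry_dir):
    """Validate `config`, then add it to registry.json under `registry_dir`."""
    problems = validate_config(config)
    if problems:
        raise ValueError("; ".join(problems))
    registry_dir = Path(registry_dir)
    registry_dir.mkdir(parents=True, exist_ok=True)
    registry_file = registry_dir / "registry.json"
    # Load the existing registry if there is one, otherwise start empty.
    registry = json.loads(registry_file.read_text()) if registry_file.exists() else {}
    registry[config["name"]] = {**config, "status": "active"}
    registry_file.write_text(json.dumps(registry, indent=2))
    return registry
```

Pausing or resuming a project would then amount to rewriting its `status` field in the same file.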
Work Item: WI-1.7
Priority: P0 (Critical)
Estimated Effort: 4 hours
Assigned To: Developer
Dependencies: WI-1.3, WI-1.4, WI-1.5, WI-1.6
Description: Implement the core polling loop that orchestrates all components
Acceptance Criteria:
- Loop runs continuously without crashing
- Processes all active projects
- Respects polling interval (2 min default)
- Handles errors without stopping
- Stops gracefully on Ctrl+C
- Updates heartbeat regularly
Tasks:
- Implement `Start-Watchdog` main function
- Implement `Process-Project` function
- Add project iteration logic
- Implement graceful shutdown handler
- Add heartbeat tracking
- Implement error isolation per project
- Add console output with colors
- Test 2+ hour continuous run
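The loop's two key properties, per-project error isolation and graceful shutdown on Ctrl+C, can be sketched as follows. This is Python for illustration only; the project dictionary shape and the `max_cycles` test hook are assumptions, while the 120-second interval mirrors the 2-minute default above.

```python
import time


def run_cycle(projects, process, log=print):
    """Process every active project once; one failure never stops the rest."""
    failed = []
    for project in projects:
        if project.get("status") != "active":
            continue  # paused projects are skipped
        try:
            process(project)
        except Exception as exc:  # isolate errors per project
            failed.append(project["name"])
            log(f"[{project['name']}] error: {exc}")
    return failed


def start_watchdog(projects, process, interval_seconds=120,
                   sleep=time.sleep, max_cycles=None):
    """Poll until Ctrl+C (KeyboardInterrupt) or until `max_cycles` is reached."""
    cycles = 0
    try:
        while max_cycles is None or cycles < max_cycles:
            run_cycle(projects, process)
            cycles += 1
            sleep(interval_seconds)
    except KeyboardInterrupt:
        pass  # graceful shutdown: leave the loop and report
    return cycles
```

A heartbeat update would fit naturally right after `run_cycle` returns, so a stale heartbeat always means a stuck cycle.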
Work Item: WI-1.8
Priority: P1 (High)
Estimated Effort: 2 hours
Assigned To: Developer
Dependencies: WI-1.1
Description: Create comprehensive logging and notification system
Acceptance Criteria:
- Logs to markdown files
- Logs to console with colors
- Decision log format correct
- Windows toast notifications work
- Log rotation implemented
- Notification rate limiting works
Tasks:
- Implement `Write-WatchdogLog` function
- Implement `Add-DecisionToLog` function
- Implement `Send-Notification` function
- Add BurntToast integration
- Create log file rotation logic
- Add timestamp formatting
- Test notification delivery
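Notification rate limiting is often implemented as a sliding window over recent send times. The Python sketch below shows that approach; the class name and the limits (3 notifications per 300 seconds) are assumptions, since the plan does not fix any numbers.

```python
import time


class NotificationLimiter:
    """Allow at most `max_per_window` notifications per sliding window."""

    def __init__(self, max_per_window=3, window_seconds=300.0, clock=time.monotonic):
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        self.clock = clock
        self._sent = []  # timestamps of notifications still inside the window

    def allow(self):
        """Return True and record the send if the limit has not been hit."""
        now = self.clock()
        # Forget sends that have aged out of the window.
        self._sent = [t for t in self._sent if now - t < self.window_seconds]
        if len(self._sent) >= self.max_per_window:
            return False
        self._sent.append(now)
        return True
```

The caller would check `allow()` before raising a toast and simply log the suppressed notification when it returns `False`.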
Work Item: WI-1.9
Priority: P1 (High)
Estimated Effort: 2 hours
Assigned To: Developer
Dependencies: WI-1.1
Description: Create installation wizard for easy setup
Acceptance Criteria:
- Checks prerequisites
- Creates directories
- Installs required modules
- Sets up scheduled task (optional)
- Provides clear error messages
- Runs on fresh Windows install
Tasks:
- Create `Install-Watchdog.ps1` script
- Add prerequisite checks
- Add module installation logic
- Create directory structure
- Add scheduled task creation (optional)
- Add validation steps
- Test on clean Windows VM
Work Item: WI-1.10
Priority: P1 (High)
Estimated Effort: 3 hours
Assigned To: Developer
Dependencies: WI-1.7
Description: End-to-end testing with real Claude Code session
Acceptance Criteria:
- Can monitor live session
- Detects states correctly
- Sends commands successfully
- Auto-continues on TODOs
- Logs all decisions
- Runs for 2+ hours without issues
Tasks:
- Create test project with config
- Register test project
- Start Claude Code session
- Start watchdog
- Monitor for 2+ hours
- Verify all states detected
- Verify all commands sent
- Review logs for accuracy
- Document any issues
- Fix critical bugs
Total Story Points: 30
Total Estimated Hours: 30
Key Deliverables:
- Working watchdog process
- Basic state detection
- Rule-based decisions
- Auto-continue functionality
- Project registration
- Logging system
Sprint 2: Intelligence
Dates: Week 2
Goal: Add Claude API integration, skill-based error resolution, and cost management
Success Criteria: Watchdog uses AI to make smart decisions and can invoke skills for errors
Work Item: WI-2.1
Priority: P0 (Critical)
Estimated Effort: 4 hours
Assigned To: Developer
Dependencies: Sprint 1 Complete
Description: Integrate Anthropic Claude API for intelligent decision-making
Acceptance Criteria:
- Can call Claude API successfully
- API key stored securely (Windows Credential Manager)
- Error handling for API failures
- Retries on transient failures
- Token usage tracked
- Response parsed correctly
Tasks:
- Implement `Invoke-AnthropicAPI` function
- Implement `Set-WatchdogAPIKey` for secure storage
- Implement `Get-SecureAPIKey` retrieval
- Add request/response logging
- Add retry logic with backoff
- Test with various prompts
- Validate JSON response parsing
Work Item: WI-2.2
Priority: P0 (Critical)
Estimated Effort: 5 hours
Assigned To: Developer
Dependencies: WI-2.1
Description: Build decision engine that uses Claude API with comprehensive context
Acceptance Criteria:
- Builds detailed decision prompts
- Includes project config in context
- Includes decision history
- Parses API responses to JSON
- Falls back to rules if API fails
- Confidence scores reflect API confidence
Tasks:
- Implement `Invoke-ClaudeAPIDecision` function
- Implement `Build-DecisionPrompt` function
- Add context aggregation logic
- Add response validation
- Implement fallback to rule-based
- Add decision comparison logging
- Test with various scenarios
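The response validation and fallback-to-rules behavior might be wired together as in this Python sketch. The JSON field names (`action`, `reasoning`, `confidence`) and the `source` and `fallback_reason` keys are assumed for illustration; `call_api` and `rule_decision` are injected callables rather than real API clients.

```python
import json

REQUIRED_FIELDS = ("action", "reasoning", "confidence")  # assumed response schema


def parse_decision(raw_text):
    """Parse and validate the model's JSON reply; raise ValueError if unusable."""
    data = json.loads(raw_text)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("decision must be a JSON object")
    missing = [f for f in REQUIRED_FIELDS if f not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    confidence = float(data["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return {"action": data["action"], "reasoning": data["reasoning"],
            "confidence": confidence, "source": "api"}


def decide(state, call_api, rule_decision):
    """Prefer the API decision; fall back to rules on any failure."""
    try:
        return parse_decision(call_api(state))
    except Exception as exc:
        fallback = dict(rule_decision(state))
        fallback["source"] = "rules"
        fallback["fallback_reason"] = str(exc)  # kept for decision comparison logs
        return fallback
```

Tagging each decision with its source makes the API-versus-rules comparison logging a matter of reading one field.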
Work Item: WI-2.3
Priority: P0 (Critical)
Estimated Effort: 4 hours
Assigned To: Developer
Dependencies: WI-2.2
Description: Enable watchdog to invoke Claude Skills for error resolution
Acceptance Criteria:
- Detects errors that match skills
- Constructs skill invocation commands
- Sends skill commands to Claude Code
- Tracks skill usage
- Logs skill results
- Handles skill failures
Tasks:
- Implement `Find-SkillForError` function
- Create error-to-skill mapping logic
- Implement skill command generation
- Add skill invocation tracking
- Test with sample skills
- Document skill integration patterns
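An error-to-skill mapping can start as an ordered list of patterns, as in the Python sketch below. Everything specific here is hypothetical: the skill names, the regular expressions, and the wording of the generated command are placeholders, and the actual way a skill is invoked in Claude Code should be confirmed before relying on this.

```python
import re

# Hypothetical mapping; skill names and patterns are placeholders.
SKILL_PATTERNS = [
    (re.compile(r"error CS\d{4}|compilation failed", re.IGNORECASE), "fix-compilation-error"),
    (re.compile(r"\d+ tests? failed|assertion ?error", re.IGNORECASE), "fix-failing-tests"),
    (re.compile(r"module not found|cannot find module", re.IGNORECASE), "resolve-dependency"),
]


def find_skill_for_error(error_text):
    """Return the first skill whose pattern matches, or None if none do."""
    for pattern, skill in SKILL_PATTERNS:
        if pattern.search(error_text):
            return skill
    return None


def build_skill_command(skill, error_text):
    """Compose the text the watchdog would type into the reply field."""
    lines = error_text.strip().splitlines()
    first_line = lines[0] if lines else ""
    return f"Use the {skill} skill to resolve this error: {first_line}"
```

Because the list is ordered, more specific patterns should come first so that a generic rule does not shadow them.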
Work Item: WI-2.4
Priority: P0 (Critical)
Estimated Effort: 3 hours
Assigned To: Developer
Dependencies: WI-2.1
Description: Track API costs and enforce budget limits
Acceptance Criteria:
- Tracks token usage per call
- Calculates costs based on pricing
- Aggregates costs per project
- Warns at 80% of daily limit
- Stops API calls at 100% of limit
- Generates cost reports
Tasks:
- Implement `Update-APICosts` function
- Implement `Get-APICosts` function
- Implement `Calculate-APICost` function
- Add cost threshold checks
- Add warning/limit enforcement
- Create cost report generator
- Test with simulated usage
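The 80% warning and 100% cutoff reduce to a ratio check after each recorded call. In this Python sketch the per-million-token prices are placeholders, not real pricing, and the class and status names are assumptions; the $10 default reflects the plan's daily cost target.

```python
from dataclasses import dataclass


@dataclass
class CostTracker:
    daily_limit_usd: float = 10.0        # the plan targets under $10/day
    input_usd_per_mtok: float = 3.0      # placeholder price, not real pricing
    output_usd_per_mtok: float = 15.0    # placeholder price, not real pricing
    spent_usd: float = 0.0

    def call_cost(self, input_tokens: int, output_tokens: int) -> float:
        """Cost of one call, with prices quoted per million tokens."""
        return (input_tokens * self.input_usd_per_mtok
                + output_tokens * self.output_usd_per_mtok) / 1_000_000

    def status(self) -> str:
        """'ok' below 80% of the limit, 'warning' from 80%, 'blocked' from 100%."""
        ratio = self.spent_usd / self.daily_limit_usd
        if ratio >= 1.0:
            return "blocked"
        if ratio >= 0.8:
            return "warning"
        return "ok"

    def record(self, input_tokens: int, output_tokens: int) -> str:
        """Add one call's cost to today's total and return the new status."""
        self.spent_usd += self.call_cost(input_tokens, output_tokens)
        return self.status()
```

When `status()` returns `"blocked"`, the decision engine would skip the API and use the rule-based path for the rest of the day.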
Work Item: WI-2.5
Priority: P1 (High)
Estimated Effort: 3 hours
Assigned To: Developer
Dependencies: WI-1.3 (Sprint 1)
Description: Improve state detection accuracy and add more states
Acceptance Criteria:
- Detects compilation errors specifically
- Detects test failures specifically
- Identifies skill invocations
- Parses error severity levels
- Handles multi-line errors
- 98%+ accuracy
Tasks:
- Add compilation error detection
- Add test failure detection
- Improve error severity classification
- Add multi-line error parsing
- Create additional test fixtures
- Validate accuracy improvements
Work Item: WI-2.6
Priority: P1 (High)
Estimated Effort: 2 hours
Assigned To: Developer
Dependencies: WI-2.2
Description: Enhance decision logs with API metadata and richer context
Acceptance Criteria:
- Logs include API tokens used
- Logs include estimated cost
- Logs include confidence scores
- Logs include skill invocations
- Logs formatted as markdown
- Logs easily readable
Tasks:
- Update `Add-DecisionToLog` function
- Add API metadata fields
- Add skill invocation details
- Improve markdown formatting
- Add decision comparison (API vs Rules)
- Test log readability
Work Item: WI-2.7
Priority: P2 (Medium)
Estimated Effort: 2 hours
Assigned To: Developer
Dependencies: WI-2.1
Description: Create configuration system for API settings
Acceptance Criteria:
- Configurable model selection
- Configurable max tokens
- Configurable temperature
- Configurable cost limits
- Settings persist across restarts
- Validation on config changes
Tasks:
- Add API settings to global config
- Implement `Set-APISettings` function
- Implement `Get-APISettings` function
- Add validation for settings
- Test configuration persistence
Work Item: WI-2.8
Priority: P1 (High)
Estimated Effort: 3 hours
Assigned To: Developer
Dependencies: WI-2.2, WI-2.3, WI-2.4
Description: Test AI-powered decision making end-to-end
Acceptance Criteria:
- API decisions more accurate than rules
- Skills invoked correctly
- Costs tracked accurately
- System stays under budget
- Fallback to rules works
- 4+ hour continuous operation
Tasks:
- Set up test project with API enabled
- Create scenarios for testing
- Monitor decision quality
- Verify skill invocations
- Check cost calculations
- Test fallback scenarios
- Document findings
Total Story Points: 26
Total Estimated Hours: 26
Key Deliverables:
- Claude API integration
- AI-powered decisions
- Skill-based error resolution
- Cost tracking and limits
- Enhanced decision logging
Sprint 3: Scale
Dates: Week 3
Goal: Enable concurrent project monitoring and automated Git operations
Success Criteria: Watchdog manages 3+ projects simultaneously with automatic commits and PRs
Work Item: WI-3.1
Priority: P0 (Critical)
Estimated Effort: 4 hours
Assigned To: Developer
Dependencies: Sprint 2 Complete
Description: Enable watchdog to identify and track multiple Claude Code sessions
Acceptance Criteria:
- Detects all open Claude Code tabs
- Maps sessions to registered projects
- Handles projects without active sessions
- Distinguishes between different projects
- Updates session mapping dynamically
Tasks:
- Implement `Find-ClaudeCodeSession` function
- Add window title parsing
- Add URL-based project identification
- Create session-to-project mapping
- Handle multiple browser windows
- Test with 3+ concurrent sessions
Work Item: WI-3.2
Priority: P0 (Critical)
Estimated Effort: 3 hours
Assigned To: Developer
Dependencies: WI-3.1
Description: Refactor main loop to process multiple projects efficiently
Acceptance Criteria:
- Processes all active projects each cycle
- Isolates errors per project
- Maintains separate state per project
- No interference between projects
- Resource usage acceptable (<5% CPU)
Tasks:
- Refactor `Process-Project` for parallel execution
- Add project isolation logic
- Implement error quarantine per project
- Add resource monitoring
- Test with 5 concurrent projects
- Optimize for performance
Work Item: WI-3.3
Priority: P0 (Critical)
Estimated Effort: 5 hours
Assigned To: Developer
Dependencies: None (can start early)
Description: Create Git wrapper functions for all operations
Acceptance Criteria:
- Can create branches
- Can commit changes
- Can push to remote
- Can detect commit completion
- Handles authentication
- Error handling for Git failures
Tasks:
- Implement `Invoke-GitBranch` function
- Implement `Invoke-GitCommit` function
- Implement `Invoke-GitPush` function
- Implement `Wait-ForGitCommit` function
- Add Git status checking
- Add authentication handling
- Test with test repository
Work Item: WI-3.4
Priority: P0 (Critical)
Estimated Effort: 4 hours
Assigned To: Developer
Dependencies: WI-3.3
Description: Implement phase-based workflow management
Acceptance Criteria:
- Detects phase completion
- Triggers commits at phase boundaries
- Advances to next phase automatically
- Sends notifications on transitions
- Logs phase transitions
- Handles final phase completion
Tasks:
- Implement `Invoke-PhaseTransition` function
- Add phase completion detection
- Implement commit triggering
- Add next phase initialization
- Implement project completion detection
- Add transition logging
- Test full phase progression
Work Item: WI-3.5
Priority: P0 (Critical)
Estimated Effort: 4 hours
Assigned To: Developer
Dependencies: WI-3.3
Description: Automate PR creation using GitHub API
Acceptance Criteria:
- Can create PRs via GitHub API
- Generates meaningful PR titles
- Includes phase summary in body
- Links to decision logs
- Handles API authentication
- Returns PR URL
Tasks:
- Implement `New-GitHubPullRequest` function
- Add GitHub API integration
- Implement PR title/body generation
- Add authentication handling
- Add error handling for API failures
- Test PR creation
- Verify PR formatting
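For PR creation, GitHub's REST API exposes `POST /repos/{owner}/{repo}/pulls`, which takes `title`, `head`, `base`, and `body` fields. The Python sketch below only builds the request so it can be inspected and tested offline; the title format, the body layout, and the default base branch `main` are assumptions.

```python
import json
import urllib.request


def build_pr_request(owner, repo, token, phase_name, head_branch,
                     base_branch="main", summary="", log_path=None):
    """Build (but do not send) a request to GitHub's create-pull-request endpoint."""
    body_lines = ["## Phase summary", summary or "(no summary)"]
    if log_path:
        body_lines += ["", f"Decision log: `{log_path}`"]
    payload = {
        "title": f"Phase complete: {phase_name}",  # assumed title format
        "head": head_branch,
        "base": base_branch,
        "body": "\n".join(body_lines),
    }
    return urllib.request.Request(
        url=f"https://api.github.com/repos/{owner}/{repo}/pulls",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Sending the request with `urllib.request.urlopen` and reading `html_url` from the JSON response would yield the PR URL that the acceptance criteria ask for.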
Work Item: WI-3.6
Priority: P1 (High)
Estimated Effort: 4 hours
Assigned To: Developer
Dependencies: WI-3.1
Description: Enable recovery from watchdog or browser crashes
Acceptance Criteria:
- Detects when sessions disappear
- Saves state before shutdown
- Resumes from saved state
- Notifies on recovery
- Handles corrupted state files
- Manual recovery option available
Tasks:
- Implement state persistence on shutdown
- Implement `Restore-ProjectState` function
- Add session loss detection
- Add automatic state recovery
- Implement manual recovery command
- Add recovery notifications
- Test crash scenarios
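State persistence that survives crashes usually combines an atomic write with defensive loading. The Python sketch below illustrates both; the `.tmp` and `.corrupt` suffixes and the `(state, recovered)` return shape are assumptions made for this example.

```python
import json
import os
from pathlib import Path


def save_state(path, state):
    """Write state atomically: write a temp file, then replace the target."""
    path = Path(path)
    tmp = path.with_suffix(path.suffix + ".tmp")
    tmp.write_text(json.dumps(state, indent=2), encoding="utf-8")
    os.replace(tmp, path)  # a crash mid-write never leaves a half-written file


def restore_state(path, default=None):
    """Return (state, recovered). Corrupted files are set aside, not deleted."""
    path = Path(path)
    fallback = dict(default or {})
    if not path.exists():
        return fallback, False
    try:
        state = json.loads(path.read_text(encoding="utf-8"))
        if not isinstance(state, dict):
            raise ValueError("state file must contain a JSON object")
        return state, True
    except ValueError:  # covers JSON and Unicode decoding errors
        # Keep the bad file for manual recovery and start from the default.
        os.replace(path, path.with_suffix(path.suffix + ".corrupt"))
        return fallback, False
```

Setting the corrupted file aside rather than deleting it leaves something to inspect when the manual recovery command is used.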
Work Item: WI-3.7
Priority: P2 (Medium)
Estimated Effort: 3 hours
Assigned To: Developer
Dependencies: WI-3.4
Description: Generate progress reports and summaries
Acceptance Criteria:
- Daily progress summaries
- Per-project status reports
- Phase completion reports
- Time tracking per phase
- Markdown-formatted reports
- Can export to CSV
Tasks:
- Implement `Generate-ProgressReport` function
- Implement `Generate-DailySummary` function
- Add time tracking logic
- Create report templates
- Add CSV export
- Schedule daily reports
Work Item: WI-3.8
Priority: P1 (High)
Estimated Effort: 4 hours
Assigned To: Developer
Dependencies: WI-3.2, WI-3.4, WI-3.5
Description: Test multi-project workflows with Git operations
Acceptance Criteria:
- 3+ projects monitored simultaneously
- Phase transitions work correctly
- Commits created at right times
- PRs created successfully
- Recovery works after interruption
- 8+ hour continuous operation
Tasks:
- Set up 3 test projects
- Start all projects in Claude Code
- Monitor phase progressions
- Verify all commits
- Verify all PRs
- Test recovery scenarios
- Review all logs
- Document issues
Total Story Points: 31
Total Estimated Hours: 31
Key Deliverables:
- Multi-project monitoring
- Git operations (commit, push, PR)
- Phase-based workflows
- Session recovery
- Progress reporting
Sprint 4: Polish
Dates: Week 4
Goal: Production-ready system with comprehensive testing and documentation
Success Criteria: System can be deployed by any user and runs reliably for days
Work Item: WI-4.1
Priority: P0 (Critical)
Estimated Effort: 4 hours
Assigned To: Developer
Dependencies: Sprint 3 Complete
Description: Add robust error handling across all modules
Acceptance Criteria:
- All functions have try/catch blocks
- Meaningful error messages
- Errors logged appropriately
- Graceful degradation on failures
- No unhandled exceptions
- Recovery attempts before failing
Tasks:
- Audit all functions for error handling
- Add try/catch to all external calls
- Improve error messages
- Add error recovery logic
- Test failure scenarios
- Document error behaviors
Work Item: WI-4.2
Priority: P0 (Critical)
Estimated Effort: 6 hours
Assigned To: Developer
Dependencies: None (can start early)
Description: Create comprehensive unit tests using Pester
Acceptance Criteria:
- 80%+ code coverage
- All core functions tested
- Mock Windows MCP calls
- Mock API calls
- Tests run in CI/CD
- All tests passing
Tasks:
- Set up Pester test framework
- Create test fixtures
- Write tests for state detection
- Write tests for decision engine
- Write tests for Git operations
- Write tests for logging
- Set up test runner
- Achieve 80% coverage
Work Item: WI-4.3
Priority: P1 (High)
Estimated Effort: 4 hours
Assigned To: Developer
Dependencies: WI-4.2
Description: Create end-to-end integration tests
Acceptance Criteria:
- Tests cover full workflows
- Tests use test repositories
- Tests verify file outputs
- Tests check Git operations
- Tests validate notifications
- All tests automated
Tasks:
- Create test project repository
- Write full workflow tests
- Write multi-project tests
- Write recovery tests
- Write Git operation tests
- Automate test execution
- Document test procedures
Work Item: WI-4.4
Priority: P1 (High)
Estimated Effort: 3 hours
Assigned To: Developer
Dependencies: Sprint 3 Complete
Description: Optimize for resource usage and responsiveness
Acceptance Criteria:
- CPU usage <5% when idle
- Memory usage <200MB for 5 projects
- State capture <2 seconds
- Decision latency <5 seconds
- No memory leaks
- Efficient polling
Tasks:
- Profile resource usage
- Optimize state detection
- Add caching where appropriate
- Optimize logging I/O
- Add resource monitoring
- Load test with 10 projects
- Document performance metrics
Work Item: WI-4.5
Priority: P0 (Critical)
Estimated Effort: 4 hours
Assigned To: Developer
Dependencies: All features complete
Description: Create comprehensive user-facing documentation
Acceptance Criteria:
- README.md complete and accurate
- Quick start guide works
- All commands documented
- Configuration fully explained
- Examples provided
- Screenshots included
Tasks:
- Update README.md
- Create QUICKSTART.md
- Create CONFIGURATION.md
- Add usage examples
- Add troubleshooting section
- Capture screenshots
- Review for clarity
Work Item: WI-4.6
Priority: P1 (High)
Estimated Effort: 3 hours
Assigned To: Developer
Dependencies: All features complete
Description: Document architecture and development guidelines
Acceptance Criteria:
- Architecture documented
- API references complete
- Development setup guide
- Contribution guidelines
- Code style guide
- Module interaction diagrams
Tasks:
- Finalize ARCHITECTURE.md
- Create API-REFERENCE.md
- Create DEVELOPMENT.md
- Create CONTRIBUTING.md
- Add code comments
- Generate module diagrams
Work Item: WI-4.7
Priority: P1 (High)
Estimated Effort: 2 hours
Assigned To: Developer
Dependencies: Testing complete
Description: Document common issues and solutions
Acceptance Criteria:
- All known issues documented
- Solutions provided
- Diagnostic commands included
- FAQ section complete
- Contact information provided
Tasks:
- Create TROUBLESHOOTING.md
- Document common issues
- Add diagnostic procedures
- Create FAQ section
- Add support contact info
- Test solutions work
Work Item: WI-4.8
Priority: P2 (Medium)
Estimated Effort: 3 hours
Assigned To: Developer
Dependencies: WI-4.5
Description: Improve installation script with better UX
Acceptance Criteria:
- Interactive prompts
- Prerequisite auto-installation
- Configuration wizard
- Validation steps
- Rollback on failure
- Success confirmation
Tasks:
- Add interactive prompts
- Add module auto-installation
- Add configuration wizard
- Add validation checks
- Add rollback logic
- Test on clean system
Work Item: WI-4.9
Priority: P0 (Critical)
Estimated Effort: 4 hours
Assigned To: Developer
Dependencies: WI-4.1 through WI-4.8
Description: Deploy and test in production-like environment
Acceptance Criteria:
- Deployed on clean Windows system
- All prerequisites met
- 3+ real projects configured
- 24+ hour continuous operation
- No critical bugs
- Performance acceptable
Tasks:
- Set up clean Windows VM
- Run installation wizard
- Configure 3 real projects
- Start watchdog
- Monitor for 24+ hours
- Collect metrics
- Review all logs
- Fix any critical issues
- Validate success criteria
Work Item: WI-4.10
Priority: P0 (Critical)
Estimated Effort: 2 hours
Assigned To: Developer
Dependencies: WI-4.9
Description: Prepare for v1.0 release
Acceptance Criteria:
- Version numbers updated
- CHANGELOG.md created
- Release notes written
- GitHub release created
- Installation package ready
- License file included
Tasks:
- Update version numbers
- Create CHANGELOG.md
- Write release notes
- Create GitHub release
- Package installation files
- Add LICENSE file
- Tag release in Git
Total Story Points: 35
Total Estimated Hours: 35
Key Deliverables:
- Comprehensive testing (unit + integration)
- Complete documentation
- Performance optimization
- Production-ready deployment
- v1.0 release
Project Totals:
- Total Story Points: 122
- Total Estimated Hours: 122 hours
- Sprints: 4
- Work Items: 36
High Risk Items:
- Windows MCP reliability - Mitigation: Extensive error handling and retry logic
- API cost overruns - Mitigation: Strict cost limits and fallback to rules
- Session detection accuracy - Mitigation: Comprehensive testing and refinement
- Multi-project interference - Mitigation: Strong isolation and separate state
Success Metrics:
- Can monitor 5+ projects simultaneously
- 95%+ state detection accuracy
- Auto-continues on TODOs with 90%+ success rate
- Stays under $10/day API costs
- Runs 24+ hours without crashes
- Complete documentation
- 80%+ code coverage
A work item is "Done" when:
- Code implemented and reviewed
- Unit tests written and passing
- Integration tested
- Documentation updated
- Acceptance criteria met
- No critical bugs
- Demo-able
| Sprint | Dates | Focus | Key Deliverable |
|---|---|---|---|
| Sprint 1 | Week 1 | Foundation | Basic watchdog working |
| Sprint 2 | Week 2 | Intelligence | AI-powered decisions |
| Sprint 3 | Week 3 | Scale | Multi-project + Git |
| Sprint 4 | Week 4 | Polish | Production-ready |
Daily Workflow:
- Review previous day's progress
- Update TODO list
- Work on highest priority item
- Test incrementally
- Commit frequently
- Update documentation
- End-of-day status update
Coding Standards:
- Follow PowerShell best practices
- Use approved verbs (Get-, Set-, Invoke-, etc.)
- Comment complex logic
- Write tests for all functions
- Keep functions focused (single responsibility)
- Handle errors gracefully
Communication:
- Daily standup notes in logs
- Blockers reported immediately
- Questions documented and answered
- Decisions logged with reasoning
Created: 2024-11-22
Last Updated: 2024-11-22
Version: 1.0