This document describes a proven methodology for coordinating multiple AI agents to solve complex software engineering problems through focused, iterative collaboration. The approach emphasizes empirical validation, controlled scope, and systematic knowledge transfer between specialized agents.
Each agent receives a narrow, well-defined scope with minimal context to prevent:
- Scope creep beyond intended functionality
- Over-engineering of solutions
- Analysis paralysis from too much information
- Conflicting approaches within single implementations
Example: Schema Design Agent receives only requirements and foundation context, not implementation details or testing frameworks.
Every optimization or change must be validated with real data and measurements before acceptance:
- Theoretical calculations can be completely wrong (in our case, a predicted 52% reduction became a measured 1,342% increase)
- Real-world testing reveals hidden overhead and complexity
- Performance improvements must be demonstrated, not assumed
- Failed approaches provide valuable learning for future iterations
Information flows between agents through standardized handoff files rather than shared memory:
- Prevents information overload
- Creates clear dependency chains
- Enables quality gates between phases
- Allows stopping/redirecting at any decision point
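The handoff-file protocol can be sketched in a few lines. This is an illustrative sketch, not part of the methodology's specification: the `write_handoff`/`read_handoff` helpers and the `## Section` layout are assumptions.

```python
from pathlib import Path

HANDOFF_DIR = Path("tmp")

def write_handoff(agent_id: str, sections: dict) -> Path:
    """Persist an agent's deliverable as a structured markdown handoff file."""
    HANDOFF_DIR.mkdir(exist_ok=True)
    path = HANDOFF_DIR / f"agent_handoff_{agent_id}.md"
    body = "\n\n".join(f"## {title}\n\n{text}" for title, text in sections.items())
    path.write_text(body, encoding="utf-8")
    return path

def read_handoff(agent_id: str) -> str:
    """The next agent in the chain reads only its predecessor's output file,
    never the predecessor's full working context."""
    return (HANDOFF_DIR / f"agent_handoff_{agent_id}.md").read_text(encoding="utf-8")
```

Because each handoff is a plain file, quality gates can inspect it, and a phase can be stopped or redirected by simply not consuming the file.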
Phase 1: Foundation
├── Agent 1A: Schema Design → tmp/agent_handoff_schema.md
├── Agent 1B: Recovery Logic → tmp/agent_handoff_recovery.md
└── Agent 1C: Verification → tmp/verification_foundation.md
Phase 2: Pilot Implementation
├── Agent 2A: Step Analysis → tmp/agent_handoff_step_analysis.md
├── Agent 2B: Conversion → tmp/agent_handoff_converted_procedure.md
├── Agent 2C: Testing → tmp/agent_handoff_pilot_test.md
└── Agent 2D: Verification → tmp/verification_pilot.md
Phase 3: Production Deployment
├── Agent 3A: Migration Creation
├── Agent 3B: Test Framework
└── Agent 3C: Final Review
- tmp/implementation_status.md: Overall progress and decisions
- tmp/current_migration_context.md: Technical requirements and constraints
- tmp/verification_checklist.md: Quality gates and success criteria
- Input Context: Previous agent outputs + focused technical requirements
- Output Specification: Structured deliverables for next agent
- Scope Boundaries: Clear limitations on what agent should/shouldn't do
- Success Criteria: Measurable outcomes required for phase completion
- Quality Assessment: Pass/fail evaluation of completed work
- Issue Identification: Specific problems with remediation paths
- Integration Analysis: Compatibility with existing systems
- Decision Framework: Go/no-go recommendations with evidence
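A verification agent's output can be modeled as a small report object. The field names below are hypothetical; this is a sketch of the structure, assuming the four responsibilities listed above map one-to-one onto fields.

```python
from dataclasses import dataclass, field

@dataclass
class VerificationReport:
    quality_pass: bool                          # Pass/fail evaluation of completed work
    issues: list = field(default_factory=list)  # Specific problems with remediation paths
    integration_ok: bool = True                 # Compatible with existing systems?
    evidence: str = ""                          # Data backing the recommendation

    @property
    def go(self) -> bool:
        # Go/no-go: proceed only when quality and integration both pass
        # and no unresolved issues remain.
        return self.quality_pass and self.integration_ok and not self.issues
```

A report with any open issue yields a no-go, which is exactly how the pilot's 1,342% regression would block deployment.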
AGENT TASK TEMPLATE:
**Mission**: Single focused objective
**Context**: Read only relevant tmp/handoff files
**Focused Task**: 3-4 specific deliverables
**Technical Constraints**: Hard boundaries and limitations
**Output**: Structured deliverable in tmp/agent_handoff_*.md
**Success Criteria**: Measurable outcomes
**Scope Boundaries**: What NOT to do
- Verification Agents: Dedicated agents that only assess quality, never create
- Empirical Testing: All performance claims validated with real data
- Integration Gates: Compatibility verification at each phase boundary
- Rollback Capability: Clear path to previous working state
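The AGENT TASK TEMPLATE above maps naturally onto a small task record. This sketch uses hypothetical field names; the only behavior shown is enforcing the template's "3-4 specific deliverables" rule.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentTask:
    mission: str              # Single focused objective
    context_files: tuple      # Only the relevant tmp/ handoff files
    deliverables: tuple       # 3-4 specific deliverables
    constraints: tuple        # Hard boundaries and limitations
    output_path: str          # tmp/agent_handoff_*.md target
    success_criteria: tuple   # Measurable outcomes
    out_of_scope: tuple       # What NOT to do

    def validated(self) -> "AgentTask":
        # Enforce the "3-4 specific deliverables" rule from the template.
        if not 3 <= len(self.deliverables) <= 4:
            raise ValueError("an agent task should list 3-4 deliverables")
        return self
```

Making the record frozen is a deliberate choice: a task's scope boundaries should not be mutated mid-run by the agent executing it.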
Agents access broader context through search rather than full consumption:
Agent Receives:
├── Direct Handoff: tmp/agent_handoff_previous.md (always read)
├── Global Context: tmp/implementation_status.md (always read)
├── Focused Context: tmp/current_migration_context.md (always read)
└── Search-Based Access: Query broader context when specific info needed
Search Patterns Used:
- Technical Details: "Find batch size settings" → discovers analysis_batch_size = 32768
- Implementation Patterns: "Find UPDATE operations" → locates specific SQL patterns
- Integration Points: "Find worker scheduling" → identifies admin.import_job_* functions
- Error Handling: "Find error propagation" → discovers existing error patterns
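Search-based access is essentially a scoped grep over the handoff files and codebase. A minimal sketch, with illustrative default roots; the point is that results carry file names and line numbers, so agents can cite exact sources rather than ingest whole files.

```python
import re
from pathlib import Path

def search_context(pattern: str, roots: tuple = ("tmp", "sql")) -> list:
    """Return (file, line_number, text) hits for a regex across the given roots."""
    rx = re.compile(pattern, re.IGNORECASE)
    hits = []
    for root in roots:
        for path in Path(root).rglob("*"):
            if not path.is_file():
                continue
            try:
                text = path.read_text(encoding="utf-8")
            except (UnicodeDecodeError, OSError):
                continue  # skip binaries and unreadable files
            for n, line in enumerate(text.splitlines(), 1):
                if rx.search(line):
                    hits.append((str(path), n, line.strip()))
    return hits
```

A query like `search_context(r"batch_size")` is how an agent would discover `analysis_batch_size = 32768` instead of assuming a value.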
Sequential Handoff: Agent B reads Agent A's output file and builds upon it:
Agent 1A: Schema Design
↓ (tmp/agent_handoff_schema.md)
Agent 1B: Recovery Logic (reads schema + searches for recovery patterns)
↓ (tmp/agent_handoff_recovery.md)
Agent 1C: Verification (reads both + searches for integration requirements)
Parallel Execution: Multiple agents work independently but can search shared context:
Agent 2A: Analysis ──→ tmp/agent_handoff_step_analysis.md
↓ (searches: "current procedure implementations")
Agent 2B: Conversion ←─ (searches: "existing API patterns")
Agent 2C: Testing ←── (searches: "current job data for testing")
↓
Agent 2D: Verification (reads all outputs + searches for production patterns)
Convergence Points: Verification agents integrate multiple streams:
Foundation Components → Verification Agent → Go/No-Go Decision
Implementation Assets → Verification Agent → Production Readiness
Test Results + Code → Verification Agent → Deployment Recommendation
A dedicated coordination agent periodically reviews the entire accumulated context:
Context Oversight Agent:
├── Reviews: All tmp/agent_handoff_*.md files
├── Reviews: All tmp/verification_*.md files
├── Reviews: tmp/implementation_status.md timeline
├── Searches: Codebase for consistency with agent outputs
└── Outputs: tmp/strategic_context_review.md with:
├── Consistency Assessment
├── Gap Identification
├── Strategic Recommendations
└── Course Corrections
Oversight Triggers:
- After each phase completion
- When contradictions are detected between agent outputs
- When performance targets are not being met
- When scope expands beyond the original goals
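One mechanical piece of the oversight review, detecting conflicting `key = value` claims across handoff files, can be sketched as follows. The `name = value` line convention is an assumption; a real oversight agent reads prose, not just settings.

```python
import re
from collections import defaultdict
from pathlib import Path

def find_contradictions(handoff_dir: str = "tmp") -> dict:
    """Flag settings that different agents recorded with different values,
    e.g. one handoff assuming batch_size = 1000 while another measured 32768."""
    claims = defaultdict(set)
    for path in Path(handoff_dir).glob("agent_handoff_*.md"):
        for m in re.finditer(r"(\w+)\s*=\s*(\S+)", path.read_text(encoding="utf-8")):
            claims[m.group(1)].add(m.group(2))
    # Keep only keys where agents disagree.
    return {key: values for key, values in claims.items() if len(values) > 1}
```

Any non-empty result is an oversight trigger: the coordination agent investigates which claim matches the actual codebase.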
- Theoretical analysis predicted a 52% performance improvement
- Empirical testing revealed a 1,342% performance degradation
- The verification agent provided a definitive no-go recommendation
- Saved the organization from a catastrophic production deployment
- Each agent delivered focused, high-quality output within scope
- No single agent was overwhelmed by the full complexity of the problem
- Clear decision points prevented continued investment in failed approaches
- Reusable components (testing framework, schema designs) created for future iterations
- Clear phase boundaries allowed stopping after pilot failure
- Knowledge accumulated in structured format enabled alternative approaches
- Testing infrastructure reusable for validating different optimization strategies
- Lessons learned documented for future optimization attempts
- Agents receive minimum viable context for their specific task
- Prevents analysis paralysis and scope creep
- Enables focused problem-solving within defined boundaries
- Reduces cognitive load on individual agents
- Agents can search and access broader context when needed, but don't read everything
- Each agent identifies and reads only relevant portions of accumulated knowledge
- Search-based context retrieval prevents information overload while maintaining access to necessary details
- Agents cite specific sources (file names, line numbers) when referencing broader context
Benefits Demonstrated:
- Agent 2A found actual batch sizes (32,768) by searching codebase, correcting Agent 1B's assumptions (1,000)
- Agent 2B located exact UPDATE patterns by searching migration files, enabling precise optimization
- Agent 2C found real job data (3,924 processing rows) for empirical testing rather than creating synthetic data
- Verification agents could cross-reference claims against actual codebase implementation
- All performance claims must be measured with real data
- Theoretical calculations validated against actual system behavior
- Failed optimizations caught before production deployment
- Testing infrastructure becomes reusable asset
- Clear go/no-go criteria at each phase
- Evidence-based recommendations from verification agents
- Ability to stop/redirect without losing accumulated work
- Quality gates prevent poor decisions from propagating
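The empirical gate reduces to measuring both variants with real data and comparing. A sketch, where the 10% minimum-gain threshold is an arbitrary example, not a value taken from this project:

```python
import time

def median_runtime(fn, repeats: int = 5) -> float:
    """Median wall-clock seconds over several runs; a single run is too noisy."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    samples.sort()
    return samples[len(samples) // 2]

def gate_passes(baseline_s: float, candidate_s: float, min_gain: float = 0.10) -> bool:
    """Go only if the candidate is measurably faster than the baseline.
    The predicted-52%-faster optimization described in this document measured
    1,342% slower and would have failed exactly this kind of gate."""
    return candidate_s <= baseline_s * (1.0 - min_gain)
```

Requiring a minimum gain (rather than any nonzero improvement) keeps noise-level wins from passing the gate.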
- Each phase builds on previous phase outputs
- Components can be reused across different approaches
- Failed implementations don't invalidate entire framework
- Knowledge accumulates in structured, transferable format
- Dedicated coordination agent periodically reviews entire context for consistency
- Identifies contradictions, gaps, or misalignments across agent outputs
- Ensures global coherence while maintaining individual agent focus
- Provides strategic redirection when accumulated knowledge suggests better approaches
- ❌ One agent trying to solve entire complex problem
- ✅ Multiple specialized agents with focused scopes
- ❌ Assuming performance improvements based on calculations
- ✅ Requiring empirical validation with real data
- ❌ Agents expanding beyond defined responsibilities
- ✅ Clear boundaries and handoff specifications
- ❌ Discovering incompatibilities at final deployment
- ✅ Verification agents checking integration at each phase
- ❌ Agents reading all accumulated context and getting overwhelmed
- ✅ Selective context access through search-based retrieval
- ❌ Agent outputs contradicting each other without detection
- ✅ Strategic oversight agent ensuring global coherence
Complex optimizations (UNLOGGED tables, additional tracking) can create more overhead than they eliminate. Simple approaches (batch size increases, existing hot-patches) often provide better results with lower risk.
Even failed optimizations can produce valuable testing frameworks and measurement tools that enable future successful optimization attempts.
The methodology successfully coordinated 8 specialized agents across 3 phases, with clear deliverables and decision points. This demonstrates scalability to larger, more complex problems.
Catching a failed optimization in pilot phase (rather than production) represents successful risk management and validates the methodology's effectiveness.
This methodology applies to complex software engineering problems that:
- Require multiple technical disciplines (schema design, performance optimization, testing, deployment)
- Have high consequences for failure (production systems, performance-critical applications)
- Benefit from iterative development with validation gates
- Need systematic knowledge transfer between solution phases
- Require empirical validation of theoretical improvements
The approach provides a structured framework for agent coordination that maintains solution quality while preventing common pitfalls of AI-assisted development.
The sql_saga project demonstrates a successful application of the AGENTIC methodology:
Phase 1 Completed: Fixture System Foundation
- Agent coordination solved O(n²) scaling crisis (9+ hours → 25 seconds for 1M entities)
- Empirical validation prevented production disaster from theoretical optimizations
- Modular fixture system enables rapid optimization iteration
Current Status: ETL Performance Analysis
- A production benchmark (1M entities) is actively running to identify temporal_merge bottlenecks
- Next phase will use AGENTIC coordination to optimize O(n²) vs O(n*log(n)) scaling patterns
- Focus on PostgreSQL 18 temporal constraints and client production requirements (1.1M+ entities)
Key Adaptations for Temporal Database Optimization:
- Empirical Focus: All performance claims validated with fixture-based benchmarks
- Search-Based Context: Query pg_stat_monitor data and codebase for technical details
- Production Scale: Must handle 1.1M+ entities efficiently
- Constraint Awareness: PostgreSQL 18 temporal constraint compatibility required
This ongoing application validates the methodology's effectiveness for database performance optimization and temporal data processing at production scale.