Agentic Development: Multi-Agent Coordination for Complex Software Engineering

Overview

This document describes a proven methodology for coordinating multiple AI agents to solve complex software engineering problems through focused, iterative collaboration. The approach emphasizes empirical validation, controlled scope, and systematic knowledge transfer between specialized agents.

Core Principles

1. Focused Agent Specialization

Each agent receives a narrow, well-defined scope with minimal context to prevent:

Scope creep beyond intended functionality
Over-engineering of solutions
Analysis paralysis from too much information
Conflicting approaches within single implementations

Example: The Schema Design Agent receives only the requirements and the foundation context, not implementation details or testing frameworks.

2. Empirical Validation Over Theory

Every optimization or change must be validated with real data and measurements before acceptance:

Theoretical calculations can be completely wrong: in our case the predicted change was a 52% reduction and the measured result was a 1,342% increase
Real-world testing reveals hidden overhead and complexity
Performance improvements must be demonstrated, not assumed
Failed approaches provide valuable learning for future iterations
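
The gap between a prediction and a measurement is easy to underestimate when both are quoted as percentages. The following is a minimal sketch of an acceptance gate, assuming the metric is elapsed time (lower is better) and that both figures are relative to the same baseline. The function names and the baseline of 100 time units are illustrative and are not part of the original tooling.

```python
def relative_change(baseline: float, measured: float) -> float:
    """Return the change from baseline as a percentage (negative means faster)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (measured - baseline) / baseline * 100.0


def accept_change(baseline: float, measured: float,
                  required_improvement_pct: float = 0.0) -> bool:
    """Accept only if the measured time improved by at least the required percentage."""
    return relative_change(baseline, measured) <= -required_improvement_pct


baseline = 100.0                    # hypothetical baseline, arbitrary time units
predicted = baseline * (1 - 0.52)   # a 52% reduction would give 48 units
measured = baseline * (1 + 13.42)   # a 1,342% increase gives 1,442 units

print(accept_change(baseline, predicted))  # True
print(accept_change(baseline, measured))   # False
print(round(measured / predicted, 1))      # 30.0
```

Under these assumptions the measured result is about 30 times the predicted one, which is why the gate compares measurements against the baseline rather than trusting the calculation.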

3. Structured Knowledge Transfer

Information flows between agents through standardized handoff files rather than shared memory:

Prevents information overload
Creates clear dependency chains
Enables quality gates between phases
Allows stopping/redirecting at any decision point

Architecture

Phase-Based Execution Model

Phase 1: Foundation
├── Agent 1A: Schema Design → tmp/agent_handoff_schema.md
├── Agent 1B: Recovery Logic → tmp/agent_handoff_recovery.md  
└── Agent 1C: Verification → tmp/verification_foundation.md

Phase 2: Pilot Implementation  
├── Agent 2A: Step Analysis → tmp/agent_handoff_step_analysis.md
├── Agent 2B: Conversion → tmp/agent_handoff_converted_procedure.md
├── Agent 2C: Testing → tmp/agent_handoff_pilot_test.md
└── Agent 2D: Verification → tmp/verification_pilot.md

Phase 3: Production Deployment
├── Agent 3A: Migration Creation
├── Agent 3B: Test Framework  
└── Agent 3C: Final Review
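
The phase model above can be expressed as plain data plus a small runner. This is a sketch only: the `Agent`, `Phase`, and `run_phases` names are hypothetical, and the `run_agent` and `gate` callables stand in for whatever actually launches an agent and evaluates a verification file. Phase 3 agents carry no output file because none is listed for them.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Agent:
    agent_id: str
    role: str
    output_file: Optional[str] = None  # None where no file is listed


@dataclass
class Phase:
    name: str
    agents: list[Agent] = field(default_factory=list)


PHASES = [
    Phase("Foundation", [
        Agent("1A", "Schema Design", "tmp/agent_handoff_schema.md"),
        Agent("1B", "Recovery Logic", "tmp/agent_handoff_recovery.md"),
        Agent("1C", "Verification", "tmp/verification_foundation.md"),
    ]),
    Phase("Pilot Implementation", [
        Agent("2A", "Step Analysis", "tmp/agent_handoff_step_analysis.md"),
        Agent("2B", "Conversion", "tmp/agent_handoff_converted_procedure.md"),
        Agent("2C", "Testing", "tmp/agent_handoff_pilot_test.md"),
        Agent("2D", "Verification", "tmp/verification_pilot.md"),
    ]),
    Phase("Production Deployment", [
        Agent("3A", "Migration Creation"),
        Agent("3B", "Test Framework"),
        Agent("3C", "Final Review"),
    ]),
]


def run_phases(phases, run_agent, gate):
    """Run phases in order and stop at the first phase whose gate rejects it."""
    completed = []
    for phase in phases:
        for agent in phase.agents:
            run_agent(agent)
        if not gate(phase):
            break
        completed.append(phase.name)
    return completed


# Simulate a pilot that fails verification: Phase 3 never starts.
ran = []
done = run_phases(PHASES, lambda a: ran.append(a.agent_id),
                  lambda p: p.name != "Pilot Implementation")
print(done)  # ['Foundation']
print(ran)   # ['1A', '1B', '1C', '2A', '2B', '2C', '2D']
```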

Context Management System

Global Context Files

tmp/implementation_status.md - Overall progress and decisions
tmp/current_migration_context.md - Technical requirements and constraints
tmp/verification_checklist.md - Quality gates and success criteria

Agent Handoff Files

Input Context: Previous agent outputs + focused technical requirements
Output Specification: Structured deliverables for next agent
Scope Boundaries: Clear limitations on what agent should/shouldn't do
Success Criteria: Measurable outcomes required for phase completion
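
One way to keep handoff files uniform is to generate them from a small record with the four sections listed above. The sketch below assumes each section is a list of short strings. The `Handoff` class and its methods are hypothetical, and the `##` headings in the rendered text are an assumed layout rather than a documented file format.

```python
from dataclasses import dataclass


@dataclass
class Handoff:
    input_context: list[str]
    output_specification: list[str]
    scope_boundaries: list[str]
    success_criteria: list[str]

    def missing_sections(self) -> list[str]:
        """Return the names of sections that were left empty."""
        return [name for name, items in vars(self).items() if not items]

    def to_markdown(self) -> str:
        """Render each section as a heading followed by one bullet per item."""
        lines = []
        for name, items in vars(self).items():
            lines.append(f"## {name.replace('_', ' ').title()}")
            lines.extend(f"- {item}" for item in items)
            lines.append("")
        return "\n".join(lines)


handoff = Handoff(
    input_context=["tmp/agent_handoff_schema.md"],
    output_specification=[],
    scope_boundaries=["do not design testing frameworks"],
    success_criteria=["schema reviewed by verification agent"],
)
print(handoff.missing_sections())  # ['output_specification']
```

Rejecting a handoff with a missing section before the next agent starts is a cheap quality gate.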

Verification Files

Quality Assessment: Pass/fail evaluation of completed work
Issue Identification: Specific problems with remediation paths
Integration Analysis: Compatibility with existing systems
Decision Framework: Go/no-go recommendations with evidence
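
A verification file can be reduced to a go/no-go decision mechanically once issues and evidence are recorded in a structured form. The following sketch assumes a simple rule, which is an illustration rather than the rule used in the project: a recommendation needs at least one piece of evidence, and any blocking issue forces a no-go. The class and field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Issue:
    description: str
    remediation: str
    blocking: bool = True


@dataclass
class VerificationReport:
    subject: str
    issues: list[Issue] = field(default_factory=list)
    evidence: list[str] = field(default_factory=list)

    def decision(self) -> str:
        """Return 'go' only when there is evidence and no blocking issue."""
        if not self.evidence:
            return "no-go"  # a recommendation without evidence is not accepted
        return "no-go" if any(i.blocking for i in self.issues) else "go"


report = VerificationReport(
    subject="pilot",
    issues=[Issue("measured slowdown instead of predicted speedup",
                  "revert to the previous approach")],
    evidence=["benchmark run on real job data"],
)
print(report.decision())  # no-go
```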

Implementation Methodology

Agent Task Design

Information Scoping Strategy

AGENT TASK TEMPLATE:

**Mission**: Single focused objective
**Context**: Read only relevant tmp/handoff files  
**Focused Task**: 3-4 specific deliverables
**Technical Constraints**: Hard boundaries and limitations
**Output**: Structured deliverable in tmp/agent_handoff_*.md
**Success Criteria**: Measurable outcomes
**Scope Boundaries**: What NOT to do
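
A task written from this template can be checked before it is given to an agent. The sketch below assumes the task is held as a dictionary with one key per template field. The key names and the `validate_task` function are hypothetical. The 3-4 deliverable range and the `tmp/agent_handoff_*.md` output pattern come from the template itself.

```python
from fnmatch import fnmatchcase

REQUIRED_KEYS = ("mission", "context", "deliverables", "constraints",
                 "output", "success_criteria", "scope_boundaries")


def validate_task(task: dict) -> list[str]:
    """Return a list of problems; an empty list means the task is well formed."""
    problems = [f"missing: {key}" for key in REQUIRED_KEYS if not task.get(key)]
    deliverables = task.get("deliverables") or []
    if deliverables and not 3 <= len(deliverables) <= 4:
        problems.append(f"expected 3-4 deliverables, got {len(deliverables)}")
    output = task.get("output") or ""
    if output and not fnmatchcase(output, "tmp/agent_handoff_*.md"):
        problems.append(f"output path outside handoff convention: {output}")
    return problems


task = {
    "mission": "Design the schema",
    "context": ["tmp/current_migration_context.md"],
    "deliverables": ["table definitions", "index plan"],
    "constraints": ["no changes to existing tables"],
    "output": "tmp/notes.md",
    "success_criteria": ["verification agent passes the schema"],
    "scope_boundaries": ["do not write tests"],
}
for problem in validate_task(task):
    print(problem)
# expected 3-4 deliverables, got 2
# output path outside handoff convention: tmp/notes.md
```

Verification agents write to `tmp/verification_*.md` instead, so a real check would need a second pattern for them.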

Quality Control Mechanisms

Verification Agents: Dedicated agents that only assess quality and do not create deliverables
Empirical Testing: All performance claims validated with real data
Integration Gates: Compatibility verification at each phase boundary
Rollback Capability: Clear path to previous working state

Context Management Patterns

Selective Context Access

Agents access broader context through search rather than full consumption:

Agent Receives:
├── Direct Handoff: tmp/agent_handoff_previous.md (always read)
├── Global Context: tmp/implementation_status.md (always read)  
├── Focused Context: tmp/current_migration_context.md (always read)
└── Search-Based Access: Query broader context when specific info needed

Search Patterns Used:

Technical Details: "Find batch size settings" → discovers analysis_batch_size = 32768
Implementation Patterns: "Find UPDATE operations" → locates specific SQL patterns
Integration Points: "Find worker scheduling" → identifies admin.import_job_* functions
Error Handling: "Find error propagation" → discovers existing error patterns
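
The search patterns above are phrased as natural-language queries. A simple approximation is a line-oriented regular-expression search that returns the file name and line number with each hit, so the agent can cite its source. This is a sketch under that assumption: `search_context` is a hypothetical helper, not the search mechanism the agents actually used.

```python
import re
from pathlib import Path


def search_context(root: str, pattern: str, glob: str = "**/*.md",
                   max_hits: int = 20) -> list[tuple[str, int, str]]:
    """Return (path, line_number, line) for lines matching a regex, capped at max_hits."""
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    for path in sorted(Path(root).glob(glob)):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((str(path), number, line.strip()))
                if len(hits) >= max_hits:
                    return hits
    return hits


if __name__ == "__main__":
    # "Find batch size settings", restricted to the tmp/ context files.
    for path, number, line in search_context("tmp", r"batch[_ ]size"):
        print(f"{path}:{number}: {line}")
```

Capping the number of hits keeps the retrieved context small, which is the point of searching instead of reading everything.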

Context Handoff Patterns

Sequential Handoff: Agent B reads Agent A's output file and builds upon it:

Agent 1A: Schema Design
↓ (tmp/agent_handoff_schema.md)
Agent 1B: Recovery Logic (reads schema + searches for recovery patterns)
↓ (tmp/agent_handoff_recovery.md)  
Agent 1C: Verification (reads both + searches for integration requirements)

Parallel Execution: Multiple agents work independently but can search shared context:

Agent 2A: Analysis ──→ tmp/agent_handoff_step_analysis.md
                    ↓ (searches: "current procedure implementations")
Agent 2B: Conversion ←─ (searches: "existing API patterns")
Agent 2C: Testing   ←── (searches: "current job data for testing")
                    ↓
Agent 2D: Verification (reads all outputs + searches for production patterns)

Convergence Points: Verification agents integrate multiple streams:

Foundation Components → Verification Agent → Go/No-Go Decision
Implementation Assets → Verification Agent → Production Readiness  
Test Results + Code → Verification Agent → Deployment Recommendation
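
Sequential handoff, parallel execution, and convergence are all instances of ordering agents by their dependencies. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (3.9+) to group agents into batches that can run in parallel. The dependency map is one reading of the Phase 2 diagram, in which 2B and 2C both build on 2A's analysis, and is an assumption rather than something the diagram states outright.

```python
from graphlib import TopologicalSorter

# Each agent maps to the agents whose outputs it reads (assumed dependencies).
DEPENDS_ON = {
    "2A": set(),
    "2B": {"2A"},
    "2C": {"2A"},
    "2D": {"2A", "2B", "2C"},
}


def execution_batches(depends_on: dict[str, set[str]]) -> list[list[str]]:
    """Group agents into batches; every agent in a batch has all dependencies met."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the dependencies are circular
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


print(execution_batches(DEPENDS_ON))  # [['2A'], ['2B', '2C'], ['2D']]
```

The verification agent 2D lands in the final batch on its own, which matches its role as the convergence point.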

Strategic Context Oversight

A dedicated coordination agent periodically reviews the entire accumulated context:

Context Oversight Agent:
├── Reviews: All tmp/agent_handoff_*.md files
├── Reviews: All tmp/verification_*.md files  
├── Reviews: tmp/implementation_status.md timeline
├── Searches: Codebase for consistency with agent outputs
└── Outputs: tmp/strategic_context_review.md with:
    ├── Consistency Assessment
    ├── Gap Identification
    ├── Strategic Recommendations
    └── Course Corrections

Oversight Triggers:

After each phase completes
When contradictions are detected between agent outputs
When performance targets are not being met
When scope expands beyond the original goals

Benefits Demonstrated

Prevented Production Disasters

Theoretical analysis predicted a 52% improvement
Empirical testing measured a 1,342% performance degradation
The verification agent issued a definitive no-go recommendation
The regression was stopped before it reached production

Maintained Solution Quality

Each agent delivered focused, high-quality output within scope
No single agent became overwhelmed by entire problem complexity
Clear decision points prevented continued investment in failed approaches
Reusable components (testing framework, schema designs) created for future iterations

Enabled Rapid Pivoting

Clear phase boundaries allowed stopping after pilot failure
Knowledge accumulated in structured format enabled alternative approaches
Testing infrastructure reusable for validating different optimization strategies
Lessons learned documented for future optimization attempts

Success Factors

Controlled Information Flow

Agents receive minimum viable context for their specific task
Prevents analysis paralysis and scope creep
Enables focused problem-solving within defined boundaries
Reduces cognitive load on individual agents

Selective Context Access

Agents can search and access broader context when needed, but don't read everything
Each agent identifies and reads only relevant portions of accumulated knowledge
Search-based context retrieval prevents information overload while maintaining access to necessary details
Agents cite specific sources (file names, line numbers) when referencing broader context

Benefits Demonstrated:

Agent 2A found actual batch sizes (32,768) by searching codebase, correcting Agent 1B's assumptions (1,000)
Agent 2B located exact UPDATE patterns by searching migration files, enabling precise optimization
Agent 2C found real job data (3,924 processing rows) for empirical testing rather than creating synthetic data
Verification agents could cross-reference claims against actual codebase implementation
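
Cross-referencing of this kind can be partly automated when agents record their key facts as name/value pairs. The following sketch flags every key on which two sources disagree, using the batch size example above. The `find_contradictions` function and the source labels are hypothetical, and values are assumed to be hashable.

```python
def find_contradictions(
    claims: dict[str, dict[str, object]],
) -> dict[str, dict[str, object]]:
    """Return each key on which sources disagree, mapped to every source's value."""
    by_key: dict[str, dict[str, object]] = {}
    for source, values in claims.items():
        for key, value in values.items():
            by_key.setdefault(key, {})[source] = value
    return {key: per_source for key, per_source in by_key.items()
            if len(set(per_source.values())) > 1}


claims = {
    "Agent 1B (assumed)": {"batch_size": 1000},
    "Agent 2A (found in codebase)": {"batch_size": 32768},
}
print(find_contradictions(claims))
# {'batch_size': {'Agent 1B (assumed)': 1000, 'Agent 2A (found in codebase)': 32768}}
```

The check only reports the disagreement. Deciding which value is right still requires going back to the cited source.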

Empirical Validation Requirement

All performance claims must be measured with real data
Theoretical calculations validated against actual system behavior
Failed optimizations caught before production deployment
Testing infrastructure becomes reusable asset

Structured Decision Points

Clear go/no-go criteria at each phase
Evidence-based recommendations from verification agents
Ability to stop/redirect without losing accumulated work
Quality gates prevent poor decisions from propagating

Modular Architecture

Each phase builds on previous phase outputs
Components can be reused across different approaches
Failed implementations don't invalidate entire framework
Knowledge accumulates in a structured, transferable format

Strategic Context Oversight

Dedicated coordination agent periodically reviews entire context for consistency
Identifies contradictions, gaps, or misalignments across agent outputs
Ensures global coherence while maintaining individual agent focus
Provides strategic redirection when accumulated knowledge suggests better approaches

Anti-Patterns Avoided

Single Agent Overwhelm

❌ One agent trying to solve entire complex problem
✅ Multiple specialized agents with focused scopes

Theoretical Optimization

❌ Assuming performance improvements based on calculations
✅ Requiring empirical validation with real data

Scope Creep

❌ Agents expanding beyond defined responsibilities
✅ Clear boundaries and handoff specifications

Integration Surprises

❌ Discovering incompatibilities at final deployment
✅ Verification agents checking integration at each phase

Context Information Overload

❌ Agents reading all accumulated context and getting overwhelmed
✅ Selective context access through search-based retrieval

Context Inconsistency

❌ Agent outputs contradicting each other without detection
✅ Strategic oversight agent ensuring global coherence

Lessons Learned

Simplicity Often Wins

Complex optimizations (UNLOGGED tables, additional tracking) can create more overhead than they eliminate. Simple approaches (batch size increases, existing hot-patches) often provide better results with lower risk.

Testing Infrastructure is Valuable

Even failed optimizations can produce valuable testing frameworks and measurement tools that enable future successful optimization attempts.

Agent Coordination Scales

The methodology successfully coordinated 8 specialized agents across 3 phases, with clear deliverables and decision points. This suggests the approach can scale to larger, more complex problems.

Early Failure is Success

Catching a failed optimization in pilot phase (rather than production) represents successful risk management and validates the methodology's effectiveness.

Applicability

This methodology applies to complex software engineering problems that:

Require multiple technical disciplines (schema design, performance optimization, testing, deployment)
Have high consequences for failure (production systems, performance-critical applications)
Benefit from iterative development with validation gates
Need systematic knowledge transfer between solution phases
Require empirical validation of theoretical improvements

The approach provides a structured framework for agent coordination that maintains solution quality while preventing common pitfalls of AI-assisted development.

sql_saga Project Application

Current Context: Production-Scale Temporal ETL Performance Optimization

The sql_saga project demonstrates successful AGENTIC methodology application:

Phase 1 Completed: Fixture System Foundation

Agent coordination solved O(n²) scaling crisis (9+ hours → 25 seconds for 1M entities)
Empirical validation prevented production disaster from theoretical optimizations
Modular fixture system enables rapid optimization iteration

Current Status: ETL Performance Analysis

Production benchmark (1M entities) actively running to identify temporal_merge bottlenecks
Next phase will use AGENTIC coordination to optimize O(n²) vs O(n*log(n)) scaling patterns
Focus on PostgreSQL 18 temporal constraints and client production requirements (1.1M+ entities)
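
Telling O(n²) from O(n*log(n)) behaviour in a benchmark comes down to estimating the exponent k in t ∝ n^k from timings at two sizes: k = log(t2/t1) / log(n2/n1). The sketch below demonstrates this on synthetic timings generated from the two cost models. These numbers are not measurements from sql_saga, and `scaling_exponent` is a hypothetical helper.

```python
import math


def scaling_exponent(n1: int, t1: float, n2: int, t2: float) -> float:
    """Estimate k in t ~ n**k from two (size, time) measurements."""
    return math.log(t2 / t1) / math.log(n2 / n1)


# Synthetic timings from known cost models, used only to show the calculation.
small, large = 10_000, 100_000
quadratic = (small ** 2, large ** 2)
linearithmic = (small * math.log2(small), large * math.log2(large))

print(round(scaling_exponent(small, quadratic[0], large, quadratic[1]), 2))        # 2.0
print(round(scaling_exponent(small, linearithmic[0], large, linearithmic[1]), 2))  # 1.1
```

An estimated exponent near 2 points to quadratic behaviour, while a value slightly above 1 is consistent with n*log(n). Real timings are noisy, so more than two sizes would be needed before drawing a conclusion.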

Key Adaptations for Temporal Database Optimization:

Empirical Focus: All performance claims validated with fixture-based benchmarks
Search-Based Context: Query pg_stat_monitor data and codebase for technical details
Production Scale: Must handle 1.1M+ entities efficiently
Constraint Awareness: PostgreSQL 18 temporal constraint compatibility required

This ongoing application validates the methodology's effectiveness for database performance optimization and temporal data processing at production scale.
