
Human-Driven, LLM-Navigated Software Development

Status: Draft
Author: James Wiesebron, james-in-a-box
Created: December 2025
Purpose: A philosophy and framework for software development where humans drive strategy while LLMs handle structural rigor and implementation
Guiding Value: Rigor


Part of: A Pragmatic Guide for Software Engineering in a Post-LLM World


This document articulates a new paradigm for software development: human-driven, LLM-navigated. The core insight is that humans and LLMs have complementary cognitive strengths, and optimal software development emerges when each focuses on what they do best.

Humans excel at: Creative problem-solving, strategic decision-making under uncertainty, interpersonal collaboration, intuitive judgment about what matters, and adapting to novel situations.

LLMs excel at: Maintaining structural consistency across large codebases, exhaustive enumeration of edge cases, applying established patterns with unwavering precision, synthesizing large amounts of context, incorporating external research and best practices, and tireless execution of well-defined tasks.

The goal: Free human cognitive capacity for creativity, strategic thinking, and healthy collaboration by offloading structural rigor and implementation details to LLMs. This isn't about replacing humans—it's about amplifying what makes humans uniquely valuable.

The guiding value—rigor: Establish precise roles and maintain them consistently. The captain/navigator metaphor isn't just a suggestion—it's a discipline that prevents the chaos of undefined collaboration. When roles are clear and consistently maintained, both humans and LLMs can operate with confidence.


Table of Contents

  • The Core Philosophy
  • Division of Cognitive Labor
  • The Workflow in Practice
  • Benefits for Humans
  • Benefits for Teams
  • Rigor Through Collaborative Planning
  • Workflow Flexibility: Supporting All Working Styles
  • Implementation Patterns
  • Anti-Patterns to Avoid
  • Success Criteria
  • Related Documents


The Core Philosophy

Captain and Navigator: Complementary Roles

Consider the analogy of a ship's voyage. Navigation is meaningless without a destination.

A navigator can chart the optimal course, track position, monitor weather, and identify hazards. But none of that matters if you don't know where you're going. The navigator cannot decide where to go—only how to get there once the destination is set.

The captain's essential role:

  • Sets the destination — Where are we going? Why this destination and not another?
  • Establishes priorities — Should we prioritize speed, safety, or fuel efficiency?
  • Makes command decisions — When to change course, when to wait out a storm, when the plan needs adjustment
  • Bears ultimate responsibility — Accountable for the success of the voyage

The navigator's essential role:

  • Tracks position — Where are we right now?
  • Charts the course — What's the optimal path given current conditions?
  • Monitors continuously — Weather, hazards, fuel, crew status
  • Provides critical information — Data the captain needs to make decisions

This isn't about hierarchy—both roles are essential. It's about different cognitive capabilities serving different purposes:

| Role | Responsibility | Cognitive Load |
| --- | --- | --- |
| Captain (Human) | Sets destination and priorities, makes command decisions, bears ultimate responsibility | Creative, strategic, social, accountable |
| Navigator (LLM) | Charts course, tracks position, monitors conditions, identifies hazards | Systematic, exhaustive, precise, tireless |

This is rigor in action: defining roles precisely based on who's suited for what, and maintaining the discipline to stay in your lane.

The Problem with Traditional Development

Traditional software development places an enormous cognitive burden on humans:

┌────────────────────────────────────────────────────────────────┐
│                    HUMAN COGNITIVE LOAD                        │
│                                                                │
│  Strategic Thinking        │  Implementation Details           │
│  ─────────────────         │  ──────────────────────           │
│  • What should we build?   │  • Did I handle null?             │
│  • Why does this matter?   │  • Is this pattern consistent?    │
│  • How does this fit?      │  • Did I update all call sites?   │
│                            │  • Are the tests comprehensive?   │
│                            │  • Is the documentation current?  │
│                            │  • Did I miss any edge cases?     │
│                            │                                   │
└────────────────────────────────────────────────────────────────┘

A significant portion of cognitive effort goes toward ensuring correctness, consistency, and completeness—tasks that require exhaustive attention to detail rather than creative insight.

The Human-Driven, LLM-Navigated Model

┌────────────────────────────────────────────────────────────────┐
│                    COGNITIVE LOAD REDISTRIBUTION               │
│                                                                │
│     HUMAN (Captain)        │      LLM (Navigator)              │
│     ───────────────        │      ─────────────────            │
│  • What should we build?   │  • Enumerate all edge cases       │
│  • Why does this matter?   │  • Ensure pattern consistency     │
│  • How does this fit?      │  • Update all call sites          │
│  • Is this the right       │  • Generate comprehensive tests   │
│    approach?               │  • Keep documentation current     │
│  • Should we proceed?      │  • Validate against standards     │
│  • What trade-offs are     │  • Track dependencies and         │
│    acceptable?             │    implications                   │
│                            │                                   │
│  Creative, Strategic       │  Systematic, Exhaustive           │
└────────────────────────────────────────────────────────────────┘

Bidirectional Feedback and Growth

A core principle of this model is that feedback flows both ways: human feedback improves LLM behavior, and LLM feedback improves human behavior. This isn't a one-way relationship where humans simply direct and LLMs execute—it's a collaborative loop where both parties learn and adapt.

Human feedback improves LLM behavior:

  • Course corrections help LLMs understand context and preferences
  • Reviews teach LLMs what "good" looks like in this specific codebase
  • Strategic decisions inform future navigation choices
  • Explicit feedback on quality shapes LLM outputs over time

LLM feedback improves human behavior:

  • Systematic enumeration of edge cases trains human thoroughness
  • Consistent application of patterns raises human quality standards
  • Questions during elicitation sharpen human thinking about requirements
  • Comprehensive analysis reveals blind spots in human reasoning

This bidirectional improvement creates a virtuous cycle: the more humans and LLMs work together, the better each becomes at their respective roles. The human becomes a better captain; the LLM becomes a better navigator. The partnership compounds over time.

Non-Attachment to Implementation

A key part of rigorous human-driven development is not being attached to a specific implementation during a development cycle. This non-attachment enables:

  • Rapid cycles of continuous improvement — When you're not invested in defending a particular approach, you can iterate quickly based on evidence
  • Evidence-based decision making — Let data, not ego, determine which approach works best
  • Systematic A/B testing — Compare implementations objectively rather than arguing about preferences
  • Hypothesis-driven development — Frame changes as experiments with measurable outcomes

The captain/navigator model supports this: humans set the strategic intent, not the specific implementation. When LLMs handle implementation details, humans naturally develop healthier distance from any particular solution—freeing them to evaluate approaches based on outcomes rather than authorship.

This also connects to external research: a team that actively incorporates lessons from academic literature, industry best practices, and prior art is inherently less attached to "not invented here" solutions. The LLM navigator can surface relevant research, but the human captain decides what evidence matters.


Division of Cognitive Labor

Human Responsibilities

| Domain | What Humans Do | Why Humans |
| --- | --- | --- |
| Vision | Define what success looks like | Requires understanding of user needs, business context, organizational goals |
| Strategy | Choose between competing approaches | Requires judgment under uncertainty, risk tolerance, stakeholder management |
| Review | Approve or reject proposed changes | Requires accountability, institutional knowledge, taste |
| Collaboration | Coordinate with other humans | Requires empathy, persuasion, relationship-building |
| Novelty | Handle unprecedented situations | Requires creative problem-solving, analogical reasoning |

LLM Responsibilities

| Domain | What LLMs Do | Why LLMs |
| --- | --- | --- |
| Completeness | Enumerate every consideration | Tireless attention, no cognitive fatigue |
| Consistency | Apply patterns uniformly | Perfect recall of established patterns |
| Precision | Get details exactly right | No typos, no oversights, no "I'll fix it later" |
| Documentation | Keep everything current | No resistance to "boring" work |
| Validation | Verify against standards | Instant access to reference materials |
| Research | Incorporate external research and best practices | Synthesize large volumes of literature and prior art |
| Implementation | Execute well-defined tasks | Efficient translation of spec to code |

The Handoff Points

        Human                              LLM
          │                                  │
          │  "I want to add OAuth2 to        │
          │   the API for third-party        │
          │   integrations"                  │
          │────────────────────────────────▶│
          │                                  │
          │                                  │ • Research OAuth2 best practices
          │                                  │ • Enumerate security considerations
          │                                  │ • Identify affected components
          │                                  │ • Draft implementation plan
          │                                  │
          │◀────────────────────────────────│
          │  "Here are 3 approaches with     │
          │   trade-offs. Approach A is      │
          │   simplest but limits future     │
          │   flexibility..."                │
          │                                  │
          │  [Human reviews, asks            │
          │   clarifying questions,          │
          │   makes strategic decision]      │
          │                                  │
          │  "Let's go with Approach B,      │
          │   but use PKCE instead of        │
          │   client secrets"                │
          │────────────────────────────────▶│
          │                                  │
          │                                  │ • Implement Approach B with PKCE
          │                                  │ • Write comprehensive tests
          │                                  │ • Update documentation
          │                                  │ • Ensure consistency with codebase
          │                                  │
          │◀────────────────────────────────│
          │  [PR ready for human review]     │
          │                                  │
          ▼                                  ▼

The Workflow in Practice

Phase 1: Human Initiates with Intent

The human expresses what they want to accomplish, not necessarily how:

"I want to add granular permission scopes to our API so partners can request only the access they need"

This is captaining: the human sets the destination based on business needs.

Phase 2: LLM Navigates the Solution Space

The LLM exhaustively explores the solution space:

  • What existing patterns does the codebase use for authorization?
  • What are the security implications of different scope hierarchies?
  • Which components need to be modified?
  • What edge cases must be handled (scope inheritance, partial access, etc.)?
  • What do industry best practices recommend?

This is navigating: systematic, thorough exploration of all paths.

Phase 3: Human Makes Strategic Decisions

The LLM presents options with trade-offs. The human decides:

  • "Keep scope hierarchies flat to reduce complexity"
  • "We'll accept some increased verbosity to support fine-grained permissions"
  • "Let's prioritize clarity over backward compatibility"

This is captaining: the human makes the command decisions.

Phase 4: LLM Executes with Precision

Given the strategic decisions, the LLM implements with unwavering consistency:

  • Applies the chosen pattern across all relevant components
  • Generates tests covering the specified edge cases
  • Updates documentation to reflect the new behavior
  • Ensures the implementation matches the codebase's conventions

This is navigating: precise execution of the charted course.

Phase 5: Human Reviews and Approves

The human reviews the implementation with fresh eyes:

  • Does this match my intent?
  • Are there any concerns I didn't anticipate?
  • Is this something I'm comfortable deploying?

This is captaining: the human has final authority.


Benefits for Humans

Cognitive Relief

When LLMs handle the exhaustive details, humans experience:

  • Reduced mental fatigue - No more tracking every edge case
  • Fewer context switches - Stay in strategic thinking mode
  • Less anxiety about oversights - Trust that the navigator is watching
  • More sustainable work patterns - Creative energy isn't depleted by routine tasks

Focus on High-Value Work

With cognitive load reduced, humans can focus on:

  • Innovation - Exploring new approaches and capabilities
  • Mentorship - Developing other team members
  • Architecture - Shaping the long-term direction
  • Stakeholder relationships - Understanding and serving user needs

Better Decision Quality

When humans aren't mentally exhausted by implementation details:

  • Clearer strategic thinking - Full cognitive capacity for important decisions
  • Better judgment - Not rushing to "just get it done"
  • More thoughtful review - Actually reviewing, not rubber-stamping
  • Healthier skepticism - Questioning assumptions and approaches

Benefits for Teams

Healthier Collaboration

When humans aren't cognitively overloaded:

  • More patience - Bandwidth for thoughtful discussion
  • Better communication - Energy for explaining and listening
  • Reduced conflict - Less stress-driven friction
  • Stronger relationships - Time for the human side of teamwork

Knowledge Democratization

When structural rigor is automated:

  • Lower barrier to contribution - Focus on ideas, not implementation details
  • Faster onboarding - New team members can contribute strategically sooner
  • Reduced bus factor - Knowledge is captured in systems, not just heads
  • More inclusive participation - Different cognitive styles can contribute

Sustainable Pace

When the cognitive burden is shared with LLMs:

  • Consistent velocity - Not dependent on heroic effort
  • Reduced burnout - Sustainable work patterns
  • Better work-life balance - Mental energy left at end of day
  • Long-term team health - Sustainable for years, not just sprints

Rigor Through Collaborative Planning

The most significant benefit of LLM-navigated development isn't faster code—it's enforced rigor that would be impractical for humans alone. When LLMs drive the planning process through structured dialogue, they introduce consistency and thoroughness that transforms how software gets built.

This section embodies the guiding value of this entire pillar: rigor as a practice, not just an aspiration.

The Collaborative Planning Framework

For complex changes, an LLM-powered planning framework ensures nothing is missed:

Note on Phase Terminology: CPF has two complementary phase sequences depending on context. This document describes the implementation workflow phases (ELICITATION → DESIGN → PLANNING → HANDOFF) for executing approved work. The Foundational Technical Requirements document uses the strategic planning phases (IDEATION → ASSESSMENT → REINFORCEMENT → PLANNING) for evaluating and shaping new initiatives. Both are part of CPF—one for how to build, the other for deciding what to build.

┌─────────────────────────────────────────────────────────────────┐
│         Collaborative Planning Framework (CPF)                  │
│              Implementation Workflow Phases                     │
│                                                                 │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │  PHASE 1: ELICITATION                                    │   │
│  │  Transform vague intent → validated requirements         │   │
│  │  • LLM asks clarifying questions                         │   │
│  │  • Human articulates what they actually want             │   │
│  │  • Ambiguities surfaced and resolved                     │   │
│  │  → Human checkpoint: Approve requirements                │   │
│  └──────────────────────────────────────────────────────────┘   │
│                            ↓                                    │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │  PHASE 2: DESIGN                                         │   │
│  │  Create comprehensive architecture before any code       │   │
│  │  • LLM explores solution space exhaustively              │   │
│  │  • Trade-offs enumerated with reasoning                  │   │
│  │  • Edge cases identified proactively                     │   │
│  │  → Human checkpoint: Choose approach                     │   │
│  └──────────────────────────────────────────────────────────┘   │
│                            ↓                                    │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │  PHASE 3: PLANNING                                       │   │
│  │  Break down into implementable tasks                     │   │
│  │  • Phased implementation plan                            │   │
│  │  • Detailed subtasks with dependencies                   │   │
│  │  • Risk identification and mitigation                    │   │
│  │  → Human checkpoint: Approve plan                        │   │
│  └──────────────────────────────────────────────────────────┘   │
│                            ↓                                    │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │  PHASE 4: HANDOFF                                        │   │
│  │  Package for autonomous execution                        │   │
│  │  • Machine-readable task specifications                  │   │
│  │  • Success criteria for each task                        │   │
│  │  • Documentation thorough enough for implementation      │   │
│  └──────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────┘
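The HANDOFF phase above calls for machine-readable task specifications with success criteria and dependencies. As a minimal sketch of what such a spec could look like (the field names and the readiness check are illustrative assumptions, not a fixed CPF schema):

```python
from dataclasses import dataclass, field

@dataclass
class TaskSpec:
    """One machine-readable task produced by the HANDOFF phase.
    Field names are illustrative, not a prescribed CPF schema."""
    task_id: str
    intent: str                                          # what, in the human's words
    constraints: list[str] = field(default_factory=list)
    success_criteria: list[str] = field(default_factory=list)
    depends_on: list[str] = field(default_factory=list)  # other task_ids

    def is_ready(self, completed: set[str]) -> bool:
        # A task can be handed to autonomous execution once every
        # dependency has completed.
        return all(dep in completed for dep in self.depends_on)


spec = TaskSpec(
    task_id="oauth2-pkce-01",
    intent="Implement OAuth2 authorization code flow with PKCE",
    constraints=["Must not break existing clients"],
    success_criteria=["Tests pass", "Documentation updated"],
    depends_on=["oauth2-design-00"],
)
```

Encoding success criteria and dependencies explicitly is what makes the handoff reviewable by the human and executable by the LLM without further clarification.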

Why This Rigor Matters

Without LLM-driven planning: Requirements remain vague until implementation reveals gaps. Design decisions happen implicitly during coding. Edge cases discovered in production. Documentation lags behind reality.

With LLM-driven planning: Requirements interrogated before any code is written. Design decisions made explicitly with documented trade-offs. Edge cases enumerated systematically during design. Documentation created as a byproduct of the planning process.

Human-in-the-Loop Checkpoints

The framework introduces structured moments where human judgment is required:

| Phase | What LLM Presents | What Human Decides |
| --- | --- | --- |
| Elicitation | Clarified requirements with assumptions stated | "Yes, that's what I want" or corrections |
| Design | 2-3 approaches with trade-offs | Which approach fits the situation |
| Planning | Phased implementation with dependencies | Scope, priority, timing |

These checkpoints prevent the LLM from going too far down the wrong path while keeping humans focused on strategic decisions rather than implementation details.

The Rigor Compounds

Each project that goes through this framework:

  • Codifies organizational knowledge - Decisions and rationale are captured
  • Raises the quality floor - Even routine work gets systematic treatment
  • Builds institutional memory - Past decisions inform future ones
  • Trains the team - Engineers internalize thorough planning habits

Making Implicit Knowledge Explicit

Engineering organizations often rely on implicit knowledge—"everyone knows we don't do it that way." The Collaborative Planning Framework surfaces these assumptions by requiring explicit specification during the Elicitation phase, which then becomes available to all team members and future LLM interactions.


Workflow Flexibility: Supporting All Working Styles

The human-driven, LLM-navigated model supports multiple working styles—from heavy up-front planning to iterative experimentation. The captain/navigator roles remain consistent, but how you move through the work can vary based on the task, your preferences, and what you learn along the way.

The Full-Planning Approach

Some developers and some tasks benefit from comprehensive planning before implementation:

Human: "I want to add OAuth2 support for third-party integrations"
   ↓
[Full CPF cycle: ELICITATION → DESIGN → PLANNING → HANDOFF]
   ↓
Human reviews and approves complete plan
   ↓
LLM executes with full context
   ↓
Implementation proceeds systematically

When this works well:

  • Complex features with many dependencies
  • High-risk changes requiring careful analysis
  • Novel problems where you need to explore the solution space
  • Team projects requiring coordination
  • Learning: when you want to understand the full scope before diving in

The value: Comprehensive up-front thinking reduces surprises, ensures alignment, and creates documentation as a byproduct.

The Iterative-Experimental Approach

Other developers and tasks benefit from rapid prototyping and learning by doing:

Human: "Let me try adding OAuth2—start with a basic implementation"
   ↓
[Quick implementation, light planning]
   ↓
Human: "Okay, this works but feels clunky. Let's refine the API"
   ↓
[Iterate on the design]
   ↓
Human: "Now let's properly specify what we learned and clean it up"
   ↓
[Formalize and document]

When this works well:

  • Exploratory work where you don't know what's best until you try it
  • Quick prototypes or proof-of-concepts
  • Refactoring where you need to feel the code to understand it
  • UI/UX work where you need to see and interact to evaluate
  • Individual work where coordination overhead isn't necessary

The value: Fast feedback, discovery through experimentation, and avoiding over-planning for simple changes.

The Hybrid Approach

Many developers blend approaches based on the situation:

Human: "I want to add OAuth2. Let me sketch a quick prototype first."
   ↓
[Rapid prototype—minimal LLM involvement]
   ↓
Human: "Okay, I see what this involves. Now let's do it properly."
   ↓
[Full CPF cycle based on prototype learnings]
   ↓
Implementation with LLM handling rigor and completeness

When this works well:

  • Validating feasibility before committing to an approach
  • De-risking unknowns through quick experiments
  • Building conviction before formal planning
  • Learning enough to specify requirements clearly

The value: Combines the discovery benefits of experimentation with the rigor benefits of formal planning.

The Common Thread: Human Commands, LLM Navigates

Regardless of working style, the core philosophy remains consistent:

| Working Style | Human Role (Captain) | LLM Role (Navigator) |
| --- | --- | --- |
| Full Planning | Defines intent, reviews comprehensive plan, makes strategic decisions, approves execution | Explores solution space, enumerates options, creates detailed plan, executes with precision |
| Iterative-Experimental | Tries approaches, evaluates what works, decides when to formalize, sets direction | Implements quickly, handles details, keeps code consistent, documents what works |
| Hybrid | Prototypes to learn, then specifies formally, approves final approach | Supports quick experiments, then brings rigor to final implementation |

The key insight: The captain/navigator model isn't about how much planning you do—it's about who does what. Whether you're planning comprehensively or iterating rapidly:

  • Humans provide strategic direction and make command decisions
  • LLMs handle structural rigor and implementation precision

CPF Phases as Optional Checkpoints

The Collaborative Planning Framework phases (ELICITATION → DESIGN → PLANNING → HANDOFF) can be viewed as optional checkpoints rather than mandatory sequential stages:

  • Light-touch work: Skip directly to implementation
  • Medium-complexity: Use ELICITATION to clarify, then implement
  • High-complexity: Use full cycle for comprehensive planning
  • Exploratory: Prototype first, then use CPF to formalize what you learned

The phases exist to provide structure when you need it—not to impose process when you don't.
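One way to make that checkpoint selection concrete is a small dispatch function. The complexity labels and phase choices below are illustrative assumptions about how a team might apply CPF, not part of the framework itself:

```python
# The four CPF implementation workflow phases, in order.
PHASES = ["ELICITATION", "DESIGN", "PLANNING", "HANDOFF"]

def phases_for(complexity: str, exploratory: bool = False) -> list[str]:
    """Pick which CPF phases to run for a task.

    Hypothetical mapping: light-touch work skips planning entirely,
    medium work clarifies requirements first, and high-complexity or
    exploratory work (after prototyping) uses the full cycle.
    """
    if exploratory:
        # Prototype first, then run the full cycle to formalize learnings.
        return PHASES
    return {
        "light": [],                # skip directly to implementation
        "medium": ["ELICITATION"],  # clarify intent, then implement
        "high": PHASES,             # comprehensive up-front planning
    }.get(complexity, PHASES)       # unknown complexity: default to rigor
```

Defaulting unknown cases to the full cycle reflects the pillar's guiding value: when in doubt, choose rigor.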

Validating All Approaches

Both heavy planning and iterative experimentation are legitimate and valuable working styles:

  • If you prefer comprehensive planning: The CPF supports you with structured dialogue and thorough exploration
  • If you prefer rapid iteration: The framework doesn't force planning—let the LLM handle consistency while you experiment
  • If you blend approaches: Use planning when it helps, skip it when it doesn't

The framework respects that different engineers, different tasks, and different contexts call for different approaches. What matters is that however you work, you maintain the human-driven, LLM-navigated discipline: you make the strategic decisions, the LLM handles the structural rigor.


Implementation Patterns

Pattern 1: Interactive Planning

For complex changes, use structured dialogue:

  1. Human states intent - What outcome is desired?
  2. LLM explores comprehensively - What are all the considerations?
  3. Human makes decisions - Which path forward?
  4. LLM implements precisely - Execute the chosen path
  5. Human reviews and approves - Final authority
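The five steps can be read as a simple control loop. In this sketch the callables stand in for human and LLM actions; the function and parameter names are purely illustrative:

```python
def interactive_planning(state_intent, explore, decide, implement, review):
    """One cycle of interactive planning.

    state_intent/decide/review model human (captain) actions;
    explore/implement model LLM (navigator) actions.
    """
    intent = state_intent()       # 1. human states intent
    options = explore(intent)     # 2. LLM explores comprehensively
    choice = decide(options)      # 3. human makes decisions
    change = implement(choice)    # 4. LLM implements precisely
    return review(change)         # 5. human reviews and approves
```

The point of the shape is the alternation: every LLM step is bracketed by a human step, so strategic authority never leaves the captain.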

Pattern 2: Structured Handoffs

For routine changes, use well-defined interfaces:

# Task handoff from human to LLM
intent: "Add rate limiting to the API"
constraints:
  - "Must not break existing clients"
  - "Prefer Redis for distributed rate limiting"
  - "Start with 100 requests/minute default"
success_criteria:
  - "Tests pass"
  - "Documentation updated"
  - "No performance regression"
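A handoff like the one above is only useful if it is complete enough to execute against. As one hedged sketch, the receiving side could validate it before starting work; the required keys mirror the example, and the specific checks are assumptions:

```python
# Keys the handoff example above always carries; illustrative, not a standard.
REQUIRED_KEYS = {"intent", "constraints", "success_criteria"}

def validate_handoff(handoff: dict) -> list[str]:
    """Return a list of problems; an empty list means the handoff is usable."""
    problems = [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - handoff.keys())]
    if not handoff.get("intent"):
        problems.append("intent must state the desired outcome, not the how")
    if not handoff.get("success_criteria"):
        problems.append("at least one success criterion is needed for review")
    return problems


handoff = {
    "intent": "Add rate limiting to the API",
    "constraints": ["Must not break existing clients"],
    "success_criteria": ["Tests pass", "No performance regression"],
}
```

Rejecting an underspecified handoff early keeps the ambiguity conversation in the planning phase, where it belongs, rather than mid-implementation.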

Pattern 3: Review Checkpoints

For autonomous work, define where human judgment is required:

| Checkpoint | Trigger | Human Decision |
| --- | --- | --- |
| Architecture | New component needed | Approve design |
| Security | Auth/authz changes | Verify approach |
| API | Breaking changes | Approve migration |
| Scope | Task growing | Continue or split |
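These triggers can be encoded so an autonomous workflow knows when to pause for human judgment. In this sketch the trigger predicates and the fields describing a change are hypothetical:

```python
# Checkpoint names follow the table above; the predicates and the
# change-description fields are illustrative assumptions.
CHECKPOINTS = {
    "architecture": lambda change: change.get("new_component", False),
    "security": lambda change: change.get("touches_auth", False),
    "api": lambda change: change.get("breaking_change", False),
    "scope": lambda change: change.get("files_changed", 0) > 20,
}

def required_checkpoints(change: dict) -> list[str]:
    """Return the checkpoints a proposed change must clear before merging."""
    return [name for name, trigger in CHECKPOINTS.items() if trigger(change)]
```

A change that trips no trigger can proceed autonomously; anything that does trip one waits for the captain's decision.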

Anti-Patterns to Avoid

Anti-Pattern 1: Abdication

Problem: Human fully delegates without reviewing

Human: "Just fix whatever you think needs fixing"
       [Never reviews the result]

Why it fails: LLMs can be confidently wrong. Human judgment is essential.

Solution: Humans review all changes, even if briefly.

Anti-Pattern 2: Micromanagement

Problem: Human dictates every implementation detail

Human: "Use a for loop, not map. Name the variable 'items', not 'data'.
        Put the function on line 42..."

Why it fails: Wastes human cognitive capacity on low-value decisions.

Solution: Specify intent and constraints; let LLM choose implementation details.

Anti-Pattern 3: Rubber Stamping

Problem: Human approves without genuine review

Human: "LGTM" [after 30 seconds on a 500-line change]

Why it fails: Defeats the purpose of human oversight.

Solution: Reviews should be substantive; if too rushed, the workflow needs adjustment.

Anti-Pattern 4: LLM as Oracle

Problem: Treating LLM output as authoritative truth

Human: "The LLM said this is the best approach, so it must be"

Why it fails: LLMs can hallucinate, be outdated, or miss context.

Solution: LLM provides options and information; human makes decisions.


Success Criteria

For Individuals

  • Humans report feeling less cognitively fatigued
  • More time spent on creative and strategic work
  • Fewer mistakes due to oversight or rushing
  • Better work-life balance

For Teams

  • Sustainable velocity without heroic effort
  • Improved collaboration and communication
  • Faster onboarding for new team members
  • Reduced knowledge silos

For Codebases

  • Higher consistency across components
  • More comprehensive test coverage
  • Current documentation
  • Fewer regressions from incomplete changes

For Organizations

  • Faster delivery of business value
  • Reduced burnout and turnover
  • More innovation and experimentation
  • Better developer experience

Related Documents

| Document | Description |
| --- | --- |
| A Pragmatic Guide for Software Engineering in a Post-LLM World | Strategic umbrella connecting all three pillars |
| LLM-First Code Reviews | Practical guide to LLM-first review practices |
| Radical Self-Improvement for LLMs | Framework for autonomous LLM self-improvement |

Authored-by: jib