Status: Draft
Author: James Wiesebron, james-in-a-box
Created: December 2025
Purpose: A philosophy and framework for software development where humans drive strategy while LLMs handle structural rigor and implementation
Guiding Value: Rigor
Part of: A Pragmatic Guide for Software Engineering in a Post-LLM World
This document articulates a new paradigm for software development: human-driven, LLM-navigated. The core insight is that humans and LLMs have complementary cognitive strengths, and optimal software development emerges when each focuses on what they do best.
Humans excel at: Creative problem-solving, strategic decision-making under uncertainty, interpersonal collaboration, intuitive judgment about what matters, and adapting to novel situations.
LLMs excel at: Maintaining structural consistency across large codebases, exhaustive enumeration of edge cases, applying established patterns with unwavering precision, synthesizing large amounts of context, incorporating external research and best practices, and tireless execution of well-defined tasks.
The goal: Free human cognitive capacity for creativity, strategic thinking, and healthy collaboration by offloading structural rigor and implementation details to LLMs. This isn't about replacing humans—it's about amplifying what makes humans uniquely valuable.
The guiding value—rigor: Establish precise roles and maintain them consistently. The captain/navigator metaphor isn't just a suggestion—it's a discipline that prevents the chaos of undefined collaboration. When roles are clear and consistently maintained, both humans and LLMs can operate with confidence.
- The Core Philosophy
- Division of Cognitive Labor
- The Workflow in Practice
- Benefits for Humans
- Benefits for Teams
- Rigor Through Collaborative Planning
- Workflow Flexibility: Supporting All Working Styles
- Implementation Patterns
- Anti-Patterns to Avoid
- Success Criteria
- Related Documents
Consider the analogy of a ship's voyage. Navigation is meaningless without a destination.
A navigator can chart the optimal course, track position, monitor weather, and identify hazards. But none of that matters if you don't know where you're going. The navigator cannot decide where to go—only how to get there once the destination is set.
The captain's essential role:
- Sets the destination — Where are we going? Why this destination and not another?
- Establishes priorities — Should we prioritize speed, safety, or fuel efficiency?
- Makes command decisions — When to change course, when to wait out a storm, when the plan needs adjustment
- Bears ultimate responsibility — Accountable for the success of the voyage
The navigator's essential role:
- Tracks position — Where are we right now?
- Charts the course — What's the optimal path given current conditions?
- Monitors continuously — Weather, hazards, fuel, crew status
- Provides critical information — Data the captain needs to make decisions
This isn't about hierarchy—both roles are essential. It's about different cognitive capabilities serving different purposes:
| Role | Responsibility | Cognitive Load |
|---|---|---|
| Captain (Human) | Sets destination and priorities, makes command decisions, bears ultimate responsibility | Creative, strategic, social, accountable |
| Navigator (LLM) | Charts course, tracks position, monitors conditions, identifies hazards | Systematic, exhaustive, precise, tireless |
This is rigor in action: defining roles precisely based on who's suited for what, and maintaining the discipline to stay in your lane.
Traditional software development places an enormous cognitive burden on humans:
┌────────────────────────────────────────────────────────────────┐
│ HUMAN COGNITIVE LOAD │
│ │
│ Strategic Thinking │ Implementation Details │
│ ───────────────── │ ────────────────────── │
│ • What should we build? │ • Did I handle null? │
│ • Why does this matter? │ • Is this pattern consistent? │
│ • How does this fit? │ • Did I update all call sites? │
│ │ • Are the tests comprehensive? │
│ │ • Is the documentation current? │
│ │ • Did I miss any edge cases? │
│ │ │
└────────────────────────────────────────────────────────────────┘
A significant portion of cognitive effort goes toward ensuring correctness, consistency, and completeness—tasks that require exhaustive attention to detail rather than creative insight.
┌────────────────────────────────────────────────────────────────┐
│ COGNITIVE LOAD REDISTRIBUTION │
│ │
│ HUMAN (Captain) │ LLM (Navigator) │
│ ─────────────── │ ───────────────── │
│ • What should we build? │ • Enumerate all edge cases │
│ • Why does this matter? │ • Ensure pattern consistency │
│ • How does this fit? │ • Update all call sites │
│ • Is this the right │ • Generate comprehensive tests │
│ approach? │ • Keep documentation current │
│ • Should we proceed? │ • Validate against standards │
│ • What trade-offs are │ • Track dependencies and │
│ acceptable? │ implications │
│ │ │
│ Creative, Strategic │ Systematic, Exhaustive │
└────────────────────────────────────────────────────────────────┘
A core principle of this model is that feedback flows both ways: human feedback improves LLM behavior, and LLM feedback improves human behavior. This isn't a one-way relationship where humans simply direct and LLMs execute—it's a collaborative loop where both parties learn and adapt.
Human feedback improves LLM behavior:
- Course corrections help LLMs understand context and preferences
- Reviews teach LLMs what "good" looks like in this specific codebase
- Strategic decisions inform future navigation choices
- Explicit feedback on quality shapes LLM outputs over time
LLM feedback improves human behavior:
- Systematic enumeration of edge cases trains human thoroughness
- Consistent application of patterns raises human quality standards
- Questions during elicitation sharpen human thinking about requirements
- Comprehensive analysis reveals blind spots in human reasoning
This bidirectional improvement creates a virtuous cycle: the more humans and LLMs work together, the better each becomes at their respective roles. The human becomes a better captain; the LLM becomes a better navigator. The partnership compounds over time.
A key part of rigorous human-driven development is not being attached to a specific implementation during a development cycle. This non-attachment enables:
- Rapid cycles of continuous improvement — When you're not invested in defending a particular approach, you can iterate quickly based on evidence
- Evidence-based decision making — Let data, not ego, determine which approach works best
- Systematic A/B testing — Compare implementations objectively rather than arguing about preferences
- Hypothesis-driven development — Frame changes as experiments with measurable outcomes
The captain/navigator model supports this: humans set the strategic intent, not the specific implementation. When LLMs handle implementation details, humans naturally develop healthier distance from any particular solution—freeing them to evaluate approaches based on outcomes rather than authorship.
This also connects to external research: a team that actively incorporates lessons from academic literature, industry best practices, and prior art is inherently less attached to "not invented here" solutions. The LLM navigator can surface relevant research, but the human captain decides what evidence matters.
| Domain | What Humans Do | Why Humans |
|---|---|---|
| Vision | Define what success looks like | Requires understanding of user needs, business context, organizational goals |
| Strategy | Choose between competing approaches | Requires judgment under uncertainty, risk tolerance, stakeholder management |
| Review | Approve or reject proposed changes | Requires accountability, institutional knowledge, taste |
| Collaboration | Coordinate with other humans | Requires empathy, persuasion, relationship-building |
| Novelty | Handle unprecedented situations | Requires creative problem-solving, analogical reasoning |
| Domain | What LLMs Do | Why LLMs |
|---|---|---|
| Completeness | Enumerate every consideration | Tireless attention, no cognitive fatigue |
| Consistency | Apply patterns uniformly | Perfect recall of established patterns |
| Precision | Get details exactly right | No typos, no oversights, no "I'll fix it later" |
| Documentation | Keep everything current | No resistance to "boring" work |
| Validation | Verify against standards | Instant access to reference materials |
| Research | Incorporate external research and best practices | Synthesize large volumes of literature and prior art |
| Implementation | Execute well-defined tasks | Efficient translation of spec to code |
Human LLM
│ │
│ "I want to add OAuth2 to │
│ the API for third-party │
│ integrations" │
│────────────────────────────────▶│
│ │
│ │ • Research OAuth2 best practices
│ │ • Enumerate security considerations
│ │ • Identify affected components
│ │ • Draft implementation plan
│ │
│◀────────────────────────────────│
│ "Here are 3 approaches with │
│ trade-offs. Approach A is │
│ simplest but limits future │
│ flexibility..." │
│ │
│ [Human reviews, asks │
│ clarifying questions, │
│ makes strategic decision] │
│ │
│ "Let's go with Approach B, │
│ but use PKCE instead of │
│ client secrets" │
│────────────────────────────────▶│
│ │
│ │ • Implement Approach B with PKCE
│ │ • Write comprehensive tests
│ │ • Update documentation
│ │ • Ensure consistency with codebase
│ │
│◀────────────────────────────────│
│ [PR ready for human review] │
│ │
▼ ▼
The human expresses what they want to accomplish, not necessarily how:
"I want to add granular permission scopes to our API so partners can request only the access they need"
This is captaining: the human sets the destination based on business needs.
The LLM exhaustively explores the solution space:
- What existing patterns does the codebase use for authorization?
- What are the security implications of different scope hierarchies?
- Which components need to be modified?
- What edge cases must be handled (scope inheritance, partial access, etc.)?
- What do industry best practices recommend?
This is navigating: systematic, thorough exploration of all paths.
The LLM presents options with trade-offs. The human decides:
- "Keep scope hierarchies flat to reduce complexity"
- "We'll accept some increased verbosity to support fine-grained permissions"
- "Let's prioritize clarity over backward compatibility"
This is captaining: the human makes the command decisions.
Given the strategic decisions, the LLM implements with unwavering consistency:
- Applies the chosen pattern across all relevant components
- Generates tests covering the specified edge cases
- Updates documentation to reflect the new behavior
- Ensures the implementation matches the codebase's conventions
This is navigating: precise execution of the charted course.
The human reviews the implementation with fresh eyes:
- Does this match my intent?
- Are there any concerns I didn't anticipate?
- Is this something I'm comfortable deploying?
This is captaining: the human has final authority.
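The PKCE decision in the walkthrough above ("use PKCE instead of client secrets") is concrete enough to sketch. Below is a minimal illustration of the verifier/challenge pair such an implementation would need, using the S256 method from RFC 7636. The function names are illustrative assumptions, not from any existing codebase.

```python
import base64
import hashlib
import secrets

def make_code_verifier(n_bytes: int = 32) -> str:
    """Generate a high-entropy code verifier (RFC 7636 permits 43-128 chars)."""
    return base64.urlsafe_b64encode(secrets.token_bytes(n_bytes)).rstrip(b"=").decode()

def make_code_challenge(verifier: str) -> str:
    """Derive the S256 code challenge: BASE64URL(SHA256(verifier)), no padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode()

# The client keeps the verifier secret and sends only the challenge
# in the authorization request; the verifier is revealed at token exchange.
verifier = make_code_verifier()
challenge = make_code_challenge(verifier)
```

This is exactly the kind of detail the navigator gets right without captain involvement: the encoding, the padding stripping, and the entropy requirements are specified in the RFC, not matters of strategic judgment.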
When LLMs handle the exhaustive details, humans experience:
- Reduced mental fatigue - No more tracking every edge case
- Fewer context switches - Stay in strategic thinking mode
- Less anxiety about oversights - Trust that the navigator is watching
- More sustainable work patterns - Creative energy isn't depleted by routine tasks
With cognitive load reduced, humans can focus on:
- Innovation - Exploring new approaches and capabilities
- Mentorship - Developing other team members
- Architecture - Shaping the long-term direction
- Stakeholder relationships - Understanding and serving user needs
When humans aren't mentally exhausted by implementation details:
- Clearer strategic thinking - Full cognitive capacity for important decisions
- Better judgment - Not rushing to "just get it done"
- More thoughtful review - Actually reviewing, not rubber-stamping
- Healthier skepticism - Questioning assumptions and approaches
When humans aren't cognitively overloaded:
- More patience - Bandwidth for thoughtful discussion
- Better communication - Energy for explaining and listening
- Reduced conflict - Less stress-driven friction
- Stronger relationships - Time for the human side of teamwork
When structural rigor is automated:
- Lower barrier to contribution - Focus on ideas, not implementation details
- Faster onboarding - New team members can contribute strategically sooner
- Reduced bus factor - Knowledge is captured in systems, not just heads
- More inclusive participation - Different cognitive styles can contribute
When the cognitive burden is shared with LLMs:
- Consistent velocity - Not dependent on heroic effort
- Reduced burnout - Sustainable work patterns
- Better work-life balance - Mental energy left at end of day
- Long-term team health - Sustainable for years, not just sprints
The most significant benefit of LLM-navigated development isn't faster code—it's enforced rigor that would be impractical for humans alone. When LLMs drive the planning process through structured dialogue, they introduce consistency and thoroughness that transforms how software gets built.
This section embodies the guiding value of this entire pillar: rigor as a practice, not just an aspiration.
For complex changes, an LLM-powered planning framework ensures nothing is missed:
Note on Phase Terminology: CPF has two complementary phase sequences depending on context. This document describes the implementation workflow phases (ELICITATION → DESIGN → PLANNING → HANDOFF) for executing approved work. The Foundational Technical Requirements document uses the strategic planning phases (IDEATION → ASSESSMENT → REINFORCEMENT → PLANNING) for evaluating and shaping new initiatives. Both are part of CPF—one for how to build, the other for deciding what to build.
┌─────────────────────────────────────────────────────────────────┐
│ Collaborative Planning Framework (CPF) │
│ Implementation Workflow Phases │
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ PHASE 1: ELICITATION │ │
│ │ Transform vague intent → validated requirements │ │
│ │ • LLM asks clarifying questions │ │
│ │ • Human articulates what they actually want │ │
│ │ • Ambiguities surfaced and resolved │ │
│ │ → Human checkpoint: Approve requirements │ │
│ └──────────────────────────────────────────────────────────┘ │
│ ↓ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ PHASE 2: DESIGN │ │
│ │ Create comprehensive architecture before any code │ │
│ │ • LLM explores solution space exhaustively │ │
│ │ • Trade-offs enumerated with reasoning │ │
│ │ • Edge cases identified proactively │ │
│ │ → Human checkpoint: Choose approach │ │
│ └──────────────────────────────────────────────────────────┘ │
│ ↓ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ PHASE 3: PLANNING │ │
│ │ Break down into implementable tasks │ │
│ │ • Phased implementation plan │ │
│ │ • Detailed subtasks with dependencies │ │
│ │ • Risk identification and mitigation │ │
│ │ → Human checkpoint: Approve plan │ │
│ └──────────────────────────────────────────────────────────┘ │
│ ↓ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ PHASE 4: HANDOFF │ │
│ │ Package for autonomous execution │ │
│ │ • Machine-readable task specifications │ │
│ │ • Success criteria for each task │ │
│ │ • Documentation thorough enough for implementation │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
Without LLM-driven planning: Requirements remain vague until implementation reveals gaps. Design decisions happen implicitly during coding. Edge cases discovered in production. Documentation lags behind reality.
With LLM-driven planning: Requirements interrogated before any code is written. Design decisions made explicitly with documented trade-offs. Edge cases enumerated systematically during design. Documentation created as a byproduct of the planning process.
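The "machine-readable task specifications" produced in the HANDOFF phase can take many shapes. As one hedged sketch, here is a minimal Python structure for such a spec; the class and field names are assumptions for illustration, not a prescribed schema.

```python
from dataclasses import dataclass, field

@dataclass
class TaskSpec:
    """Illustrative shape for a Phase 4 (HANDOFF) task package."""
    intent: str                                        # what the captain wants accomplished
    constraints: list[str] = field(default_factory=list)
    success_criteria: list[str] = field(default_factory=list)
    depends_on: list[str] = field(default_factory=list)

    def is_executable(self) -> bool:
        # A task is ready for autonomous execution only when both the
        # destination and the definition of done are explicit.
        return bool(self.intent.strip()) and bool(self.success_criteria)

task = TaskSpec(
    intent="Add rate limiting to the API",
    constraints=["Must not break existing clients"],
    success_criteria=["Tests pass", "Documentation updated"],
)
assert task.is_executable()
```

The check in `is_executable` encodes the framework's core discipline in miniature: a handoff without explicit success criteria is a vague order, not a charted course.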
The framework introduces structured moments where human judgment is required:
| Phase | What LLM Presents | What Human Decides |
|---|---|---|
| Elicitation | Clarified requirements with assumptions stated | "Yes, that's what I want" or corrections |
| Design | 2-3 approaches with trade-offs | Which approach fits the situation |
| Planning | Phased implementation with dependencies | Scope, priority, timing |
These checkpoints prevent the LLM from going too far down the wrong path while keeping humans focused on strategic decisions rather than implementation details.
Each project that goes through this framework:
- Codifies organizational knowledge - Decisions and rationale are captured
- Raises the quality floor - Even routine work gets systematic treatment
- Builds institutional memory - Past decisions inform future ones
- Trains the team - Engineers internalize thorough planning habits
Engineering organizations often rely on implicit knowledge—"everyone knows we don't do it that way." The Collaborative Planning Framework surfaces these assumptions by requiring explicit specification during the Elicitation phase, which then becomes available to all team members and future LLM interactions.
The human-driven, LLM-navigated model supports multiple working styles—from heavy up-front planning to iterative experimentation. The captain/navigator roles remain consistent, but how you move through the work can vary based on the task, your preferences, and what you learn along the way.
Some developers and some tasks benefit from comprehensive planning before implementation:
Human: "I want to add OAuth2 support for third-party integrations"
↓
[Full CPF cycle: ELICITATION → DESIGN → PLANNING → HANDOFF]
↓
Human reviews and approves complete plan
↓
LLM executes with full context
↓
Implementation proceeds systematically
When this works well:
- Complex features with many dependencies
- High-risk changes requiring careful analysis
- Novel problems where you need to explore the solution space
- Team projects requiring coordination
- Learning: when you want to understand the full scope before diving in
The value: Comprehensive up-front thinking reduces surprises, ensures alignment, and creates documentation as a byproduct.
Other developers and tasks benefit from rapid prototyping and learning by doing:
Human: "Let me try adding OAuth2—start with a basic implementation"
↓
[Quick implementation, light planning]
↓
Human: "Okay, this works but feels clunky. Let's refine the API"
↓
[Iterate on the design]
↓
Human: "Now let's properly specify what we learned and clean it up"
↓
[Formalize and document]
When this works well:
- Exploratory work where you don't know what's best until you try it
- Quick prototypes or proof-of-concepts
- Refactoring where you need to feel the code to understand it
- UI/UX work where you need to see and interact to evaluate
- Individual work where coordination overhead isn't necessary
The value: Fast feedback, discovery through experimentation, and avoiding over-planning for simple changes.
Many developers blend approaches based on the situation:
Human: "I want to add OAuth2. Let me sketch a quick prototype first."
↓
[Rapid prototype—minimal LLM involvement]
↓
Human: "Okay, I see what this involves. Now let's do it properly."
↓
[Full CPF cycle based on prototype learnings]
↓
Implementation with LLM handling rigor and completeness
When this works well:
- Validating feasibility before committing to an approach
- De-risking unknowns through quick experiments
- Building conviction before formal planning
- Learning enough to specify requirements clearly
The value: Combines the discovery benefits of experimentation with the rigor benefits of formal planning.
Regardless of working style, the core philosophy remains consistent:
| Working Style | Human Role (Captain) | LLM Role (Navigator) |
|---|---|---|
| Full Planning | Defines intent, reviews comprehensive plan, makes strategic decisions, approves execution | Explores solution space, enumerates options, creates detailed plan, executes with precision |
| Iterative-Experimental | Tries approaches, evaluates what works, decides when to formalize, sets direction | Implements quickly, handles details, keeps code consistent, documents what works |
| Hybrid | Prototypes to learn, then specifies formally, approves final approach | Supports quick experiments, then brings rigor to final implementation |
The key insight: The captain/navigator model isn't about how much planning you do—it's about who does what. Whether you're planning comprehensively or iterating rapidly:
- Humans provide strategic direction and make command decisions
- LLMs handle structural rigor and implementation precision
The Collaborative Planning Framework phases (ELICITATION → DESIGN → PLANNING → HANDOFF) can be viewed as optional checkpoints rather than mandatory sequential stages:
- Light-touch work: Skip directly to implementation
- Medium-complexity: Use ELICITATION to clarify, then implement
- High-complexity: Use full cycle for comprehensive planning
- Exploratory: Prototype first, then use CPF to formalize what you learned
The phases exist to provide structure when you need it—not to impose process when you don't.
Both heavy planning and iterative experimentation are legitimate and valuable working styles:
- If you prefer comprehensive planning: The CPF supports you with structured dialogue and thorough exploration
- If you prefer rapid iteration: The framework doesn't force planning—let the LLM handle consistency while you experiment
- If you blend approaches: Use planning when it helps, skip it when it doesn't
The framework respects that different engineers, different tasks, and different contexts call for different approaches. What matters is that however you work, you maintain the human-driven, LLM-navigated discipline: you make the strategic decisions, the LLM handles the structural rigor.
For complex changes, use structured dialogue:
- Human states intent - What outcome is desired?
- LLM explores comprehensively - What are all the considerations?
- Human makes decisions - Which path forward?
- LLM implements precisely - Execute the chosen path
- Human reviews and approves - Final authority
For routine changes, use well-defined interfaces:
# Task handoff from human to LLM
intent: "Add rate limiting to the API"
constraints:
- "Must not break existing clients"
- "Prefer Redis for distributed rate limiting"
- "Start with 100 requests/minute default"
success_criteria:
- "Tests pass"
- "Documentation updated"
  - "No performance regression"

For autonomous work, define where human judgment is required:
| Checkpoint | Trigger | Human Decision |
|---|---|---|
| Architecture | New component needed | Approve design |
| Security | Auth/authz changes | Verify approach |
| API | Breaking changes | Approve migration |
| Scope | Task growing | Continue or split |
Problem: Human fully delegates without reviewing
Human: "Just fix whatever you think needs fixing"
[Never reviews the result]
Why it fails: LLMs can be confidently wrong. Human judgment is essential.
Solution: Humans review all changes, even if briefly.
Problem: Human dictates every implementation detail
Human: "Use a for loop, not map. Name the variable 'items', not 'data'.
Put the function on line 42..."
Why it fails: Wastes human cognitive capacity on low-value decisions.
Solution: Specify intent and constraints; let LLM choose implementation details.
Problem: Human approves without genuine review
Human: "LGTM" [after 30 seconds on a 500-line change]
Why it fails: Defeats the purpose of human oversight.
Solution: Reviews should be substantive; if too rushed, the workflow needs adjustment.
Problem: Treating LLM output as authoritative truth
Human: "The LLM said this is the best approach, so it must be"
Why it fails: LLMs can hallucinate, be outdated, or miss context.
Solution: LLM provides options and information; human makes decisions.
- Humans report feeling less cognitively fatigued
- More time spent on creative and strategic work
- Fewer mistakes due to oversight or rushing
- Better work-life balance
- Sustainable velocity without heroic effort
- Improved collaboration and communication
- Faster onboarding for new team members
- Reduced knowledge silos
- Higher consistency across components
- More comprehensive test coverage
- Current documentation
- Fewer regressions from incomplete changes
- Faster delivery of business value
- Reduced burnout and turnover
- More innovation and experimentation
- Better developer experience
| Document | Description |
|---|---|
| A Pragmatic Guide for Software Engineering in a Post-LLM World | Strategic umbrella connecting all three pillars |
| LLM-First Code Reviews | Practical guide to LLM-first review practices |
| Radical Self-Improvement for LLMs | Framework for autonomous LLM self-improvement |
Authored-by: jib