Comprehensive reference of terms, concepts, and acronyms used throughout ProjectKeystone
Hierarchical Multi-Agent System - A distributed computing architecture where agents are organized in multiple hierarchical layers, communicating via message passing. ProjectKeystone implements a 4-layer HMAS with strategic coordination at the top (Level 0) down to task execution at the bottom (Level 3).
Keystone Interchange Message - The standard message format for inter-agent communication in ProjectKeystone. Each KIM contains:
- msg_id: Unique message identifier
- sender_id: ID of the sending agent
- receiver_id: ID of the receiving agent
- command: The action to perform
- payload: Optional JSON data attached to the message
- timestamp: When the message was created
See docs/NETWORK_PROTOCOL.md for complete KIM specification.
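As a rough illustration, the fields above could map onto a plain C++ struct. The struct below is a hypothetical mirror of the list, not the project's actual type; docs/NETWORK_PROTOCOL.md remains the authoritative specification.

```cpp
#include <cstdint>
#include <string>

// Hypothetical C++ mirror of the KIM fields listed above.
struct KeystoneInterchangeMessage {
    std::string   msg_id;       // unique message identifier
    std::string   sender_id;    // ID of the sending agent
    std::string   receiver_id;  // ID of the receiving agent
    std::string   command;      // the action to perform
    std::string   payload;      // optional JSON data, serialized as a string
    std::uint64_t timestamp;    // when the message was created (e.g., Unix time)
};
```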
Central Message Routing Hub - A decoupled communication infrastructure that handles routing KIM messages between agents without direct agent-to-agent dependencies. The MessageBus:
- Registers and discovers agents dynamically
- Routes messages to their intended recipients
- Enables agent decoupling (agents communicate through IDs, not pointers)
- Maintains thread-safe agent registry with mutex protection
- Supports both synchronous (Phase 1) and asynchronous (Phase 2+) delivery
See docs/plan/adr/ADR-001-message-bus-architecture.md for architecture decisions.
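The registry-plus-routing idea can be sketched in a few lines. The class name, handler signature, and the minimal KIM stand-in below are illustrative assumptions, not the project's actual API; the sketch shows only the synchronous Phase-1 style of delivery.

```cpp
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Minimal KIM stand-in carrying just the fields routing needs.
struct Kim {
    std::string msg_id, sender_id, receiver_id, command, payload;
};

class MessageBus {
public:
    using Handler = std::function<void(const Kim&)>;

    // Dynamic registration: agents are addressed by ID, not by pointer.
    void register_agent(const std::string& id, Handler h) {
        std::lock_guard<std::mutex> lock(mutex_);
        registry_[id] = std::move(h);
    }

    // Synchronous delivery; returns false if the receiver is unknown.
    bool route(const Kim& msg) {
        Handler h;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            auto it = registry_.find(msg.receiver_id);
            if (it == registry_.end()) return false;
            h = it->second;
        }
        h(msg);  // invoke outside the lock so handlers may send messages re-entrantly
        return true;
    }

private:
    std::mutex mutex_;  // mutex-protected registry, as described above
    std::map<std::string, Handler> registry_;
};
```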
Strategic Orchestration Layer - The top level of the HMAS hierarchy, containing the ChiefArchitectAgent. L0 is responsible for:
- Strategic system-wide decisions
- Component selection and coordination
- Overall goal decomposition
- High-level task scheduling
Analogy: CEO/CTO of the system
Component Coordination Layer - Coordinates multiple modules within a component. Implemented by ComponentLeadAgent. Responsibilities:
- Module coordination and sequencing
- Component-level resource management
- Cross-module dependency resolution
- Result aggregation from modules
Analogy: VP of Engineering for a component
Module Coordination Layer - Decomposes module-level goals into concrete tasks. Implemented by ModuleLeadAgent. Responsibilities:
- Task decomposition and planning
- Task scheduling and monitoring
- Result synthesis from task agents
- Module-level error handling
Analogy: Tech Lead managing a module
Task Execution Layer - The lowest level where concrete work happens. Implemented by TaskAgent. Responsibilities:
- Execute individual tasks
- Command execution (bash, system calls, etc.)
- Result reporting back to ModuleLeads
- Task-level error handling
Analogy: Individual Contributor implementing features
Strategic Orchestrator Agent - The Level 0 agent that orchestrates the entire HMAS system. It:
- Receives high-level goals or commands
- Decomposes goals into components
- Delegates to ComponentLeadAgents (Phase 3+) or ModuleLeadAgents (Phase 2)
- Aggregates results from lower levels
- Makes strategic architectural decisions
Component Coordinator Agent - The Level 1 agent that coordinates multiple modules within a single component. It:
- Receives module planning requests from ChiefArchitect
- Decomposes component goals into module tasks
- Coordinates 2+ ModuleLeadAgents
- Monitors module progress
- Aggregates module results for the ChiefArchitect
Introduced in: Phase 3 (Full 4-Layer Hierarchy)
Module Coordinator Agent - The Level 2 agent that breaks down module goals into executable tasks. It:
- Receives task planning goals
- Decomposes goals into concrete executable tasks
- Coordinates 2+ TaskAgents
- Monitors task progress and results
- Synthesizes task results into module output
- Manages state transitions (IDLE → PLANNING → WAITING → SYNTHESIZING)
Introduced in: Phase 2 (3-Layer Hierarchy)
Task Executor Agent - The Level 3 agent that executes concrete work. It:
- Receives executable commands
- Executes commands (bash, system operations, etc.)
- Reports results back to ModuleLeads
- Handles task-level errors
- Manages task state (IDLE → EXECUTING → COMPLETED)
Introduced in: Phase 1 (2-Layer Hierarchy)
Lock-Free Task Distribution System - A high-performance task scheduling mechanism that distributes work across worker threads without using locks or mutex primitives. Key characteristics:
- Each worker thread has a local work queue
- When a thread runs out of work, it "steals" tasks from other threads' queues
- Uses atomic operations for lock-free synchronization
- Scales well with many threads
- Minimizes contention and context switching
- Used in Phase 7+ for distributed task execution
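The pull-or-steal pattern can be sketched as below. For readability this version guards each deque with its own mutex; the actual Phase 7+ scheduler is lock-free (atomic operations), so treat this purely as an illustration of the stealing behavior, with every name invented here.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <vector>

// One local queue per worker; each guarded by a mutex in this simplified sketch.
struct WorkerQueue {
    std::deque<std::function<void()>> jobs;
    std::mutex m;
};

class StealingPool {
public:
    explicit StealingPool(std::size_t n) : queues_(n) {}

    void push(std::size_t worker, std::function<void()> job) {
        std::lock_guard<std::mutex> l(queues_[worker].m);
        queues_[worker].jobs.push_back(std::move(job));
    }

    // Pop from own queue; if empty, steal from the front of another queue.
    bool try_run(std::size_t worker) {
        if (auto job = pop_back(worker)) { job(); return true; }
        for (std::size_t v = 0; v < queues_.size(); ++v) {
            if (v == worker) continue;
            if (auto job = pop_front(v)) { job(); return true; }  // the "steal"
        }
        return false;
    }

private:
    std::function<void()> pop_back(std::size_t i) {
        std::lock_guard<std::mutex> l(queues_[i].m);
        if (queues_[i].jobs.empty()) return {};
        auto job = std::move(queues_[i].jobs.back());
        queues_[i].jobs.pop_back();
        return job;
    }
    std::function<void()> pop_front(std::size_t i) {
        std::lock_guard<std::mutex> l(queues_[i].m);
        if (queues_[i].jobs.empty()) return {};
        auto job = std::move(queues_[i].jobs.front());
        queues_[i].jobs.pop_front();
        return job;
    }
    std::vector<WorkerQueue> queues_;
};

// Single-threaded demo: worker 1 has no local work, so everything it runs
// is stolen from worker 0's queue.
int stealing_demo() {
    StealingPool pool(2);
    int sum = 0;
    for (int i = 1; i <= 4; ++i)
        pool.push(0, [&sum, i] { sum += i; });
    while (pool.try_run(1)) {}
    return sum;  // 1 + 2 + 3 + 4
}
```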
Fixed-Size Thread Executor - A pool of worker threads that execute tasks concurrently. ProjectKeystone uses ThreadPool for:
- Parallel agent execution
- Concurrent message processing
- Load distribution across cores
- Resource-controlled parallelism
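A minimal fixed-size pool looks roughly like this; it is a generic sketch under invented names, not the ProjectKeystone ThreadPool itself.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

class ThreadPool {
public:
    explicit ThreadPool(std::size_t n) {
        for (std::size_t i = 0; i < n; ++i)
            workers_.emplace_back([this] { worker_loop(); });
    }
    ~ThreadPool() {
        { std::lock_guard<std::mutex> l(m_); stop_ = true; }
        cv_.notify_all();
        for (auto& t : workers_) t.join();  // workers drain the queue first
    }
    void submit(std::function<void()> job) {
        { std::lock_guard<std::mutex> l(m_); jobs_.push(std::move(job)); }
        cv_.notify_one();
    }

private:
    void worker_loop() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> l(m_);
                cv_.wait(l, [this] { return stop_ || !jobs_.empty(); });
                if (stop_ && jobs_.empty()) return;
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // run outside the lock
        }
    }
    std::vector<std::thread> workers_;
    std::queue<std::function<void()>> jobs_;
    std::mutex m_;
    std::condition_variable cv_;
    bool stop_ = false;
};

// Demo: 100 increments spread over a 4-thread pool, drained on destruction.
int run_pool_demo() {
    std::atomic<int> count{0};
    {
        ThreadPool pool(4);
        for (int i = 0; i < 100; ++i)
            pool.submit([&count] { count.fetch_add(1); });
    }
    return count.load();
}
```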
C++20 Asynchronous Function - A function that can be suspended and resumed using C++20 coroutine primitives (`co_await`, `co_return`). ProjectKeystone uses coroutines for:
- Non-blocking message processing
- Asynchronous task coordination
- Efficient async/await semantics
Custom Awaitable Type - A C++20 awaitable type that wraps coroutine execution. Features:
- Implements `operator co_await()` for use with `co_await`
- Returns type `T` when completed
- Supports exception propagation
- Used extensively in async message processing
Practices for Correct Coroutine Usage - Guidelines for using C++20 coroutines safely in ProjectKeystone:
- Lifetime Management: Always use the `Task<T>` wrapper; never manually manage `std::coroutine_handle`
- Suspension Points: Only use `co_await` inside coroutine functions
- Exception Safety: Capture exceptions in the promise or use try/catch inside the coroutine
- Thread Safety: Never hold locks across `co_await` points; use the scheduler for safe execution
- Data Validity: Capture by value across suspension points; function-local variables are safe
- Symmetric Transfer: Used in `final_suspend()` to avoid stack growth in chained coroutines
- Pitfalls to Avoid: Forgetting `co_return`, extracting `get_handle()`, creating a Task but never awaiting it
See ADR-013: Coroutine Safety Patterns for detailed patterns and anti-patterns.
Coroutine Suspension Operator - A keyword that suspends the current coroutine and awaits completion of another asynchronous operation. Usage:
- Can only be used inside coroutine functions
- Automatically integrates with the `Task<T>` type
- Stores the awaiting coroutine as a continuation
- Resumed when the awaited coroutine completes
- Example: `auto result = co_await someTask();`
Coroutine Return Statement - A keyword that returns a value from a coroutine and completes it. Usage:
- Must be used in any function returning `Task<T>`
- Stores the value in the promise and triggers `final_suspend()`
- Can return `void` in `Task<void>` functions
- Example: `co_return 42;`
Coroutine Protocol Handler - A nested type inside coroutine return types (like `Task<T>`) that defines:
- `get_return_object()`: Creates the Task from the promise
- `initial_suspend()`: Controls start behavior
- `final_suspend()`: Handles completion and continuation
- `return_value()` or `return_void()`: Stores the result
- `unhandled_exception()`: Captures exceptions thrown in the coroutine
See `Task<T>::promise_type` in `include/concurrency/task.hpp` for the implementation.
Protocol for co_await Operations - Any type usable with `co_await` must provide three methods:
- `bool await_ready()`: Returns true if the result is immediately available
- `std::coroutine_handle<> await_suspend(std::coroutine_handle<> h)`: Handles suspension
- `T await_resume()`: Returns the result when resumed
Both `Task<T>` and `PullOrSteal` implement this protocol. See ADR-013 Pattern 6 for the symmetric-transfer optimization.
Test-Driven Development - A software development methodology where tests are written BEFORE implementation:
- RED: Write failing E2E test
- GREEN: Implement minimal code to pass test
- REFACTOR: Improve code while keeping tests green
- COMMIT: Submit changes when tests pass
ProjectKeystone strictly follows TDD throughout all phases.
End-to-End Integration Tests - Comprehensive tests that verify system behavior across multiple layers. ProjectKeystone E2E tests:
- Test full message flows from L0 down to L3
- Verify agent coordination and delegation
- Use Google Test (GTest) framework
- Use descriptive test names (e.g., `TEST(E2E_Phase1, BasicDelegation)`)
- Drive development priorities (TDD approach)
Examples:
- Phase 1: ChiefArchitect → TaskAgent delegation
- Phase 2: ModuleLead coordinates multiple TaskAgents
- Phase 3: ComponentLead coordinates multiple ModuleLeads
- Phase 8: Distributed gRPC communication
Isolated Component Tests - Tests for individual components in isolation:
- MessageBus routing and discovery
- MessageQueue operations
- ThreadPool task execution
- Coroutine helpers
- Message serialization/deserialization
Fault Injection Testing Methodology - Deliberately introducing failures to test system resilience:
- Random message delays
- Agent process crashes
- Network partition simulation
- Resource exhaustion scenarios
- Used in Phase 5+ for robustness validation
Typical chaos tests:
- 20 random agent failures tolerated
- Message delivery guaranteed despite chaos
- System recovers within reasonable bounds
Architecture Decision Record - A document that captures an important architectural decision, including:
- Context: The issue or problem being addressed
- Decision: What was decided
- Consequences: Positive and negative outcomes
- Status: PROPOSED, ACCEPTED, SUPERSEDED, etc.
ProjectKeystone ADRs are stored in docs/plan/adr/ with naming convention ADR-###-title.md
Examples:
- ADR-001 - Message Bus Architecture
- ADR-002 - Coroutine Task Design
- ADR-003 - Work-Stealing Scheduler
Development Iteration - ProjectKeystone is built through phases following TDD principles:
- Phase 1 (Weeks 1-3): L0 ↔ L3 basic delegation (2 agents)
- Phase 2 (Weeks 4-6): L0 ↔ L2 ↔ L3 module coordination (3 layers)
- Phase 3 (Weeks 7-9): L0 ↔ L1 ↔ L2 ↔ L3 full hierarchy (4 layers)
- Phase 4-5: Multi-component and scale testing
- Phase 6-7: Performance and distributed features
- Phase 8 (Optional): gRPC-based distributed multi-node communication
Each phase adds new agent levels or features while maintaining all previous functionality.
Modern C++ Standard - ProjectKeystone is implemented exclusively in C++20, utilizing:
- Coroutines (`std::coroutine_handle`)
- Concepts (template constraints)
- Structured bindings
- Smart pointers (`std::unique_ptr`, `std::shared_ptr`)
- Modules (when compiler support is available)
- Ranges and views
Build Configuration System - Used to:
- Configure the build with options (`-DENABLE_GRPC=ON`, etc.)
- Manage dependencies (Google Test, nlohmann/json, etc.)
- Generate build artifacts for multiple targets
- Support sanitizers (ASan, UBSan, TSan, MSan)
C++ Unit and E2E Testing Framework - Features used:
- Test cases with `TEST()` and test fixtures with `TEST_F()`
- Assertions (`ASSERT_*`, `EXPECT_*`)
- Test parameterization
- Death tests for crash scenarios
- Comprehensive mocking framework (GMock)
Container Platform - Used for:
- Consistent build and test environments
- Multi-stage builds (builder, runtime, development)
- Docker Compose for local orchestration
- Kubernetes-ready images
RPC Framework for Distributed Communication - Enables:
- Multi-node agent communication
- Protocol Buffers message serialization
- Service definitions for HMAS coordinator
- Load balancing strategies
- Heartbeat monitoring between nodes
The pattern where a central bus routes messages between decoupled agents:
- Agents don't need to know physical addresses of other agents
- Dynamic agent registration and discovery
- Synchronous routing in Phase 1, async in Phase 2+
The pattern where higher-level agents break down goals and delegate work to lower levels:
- ChiefArchitect delegates to ComponentLead (L1)
- ComponentLead delegates to ModuleLead (L2)
- ModuleLead delegates to TaskAgent (L3)
The pattern where a parent agent aggregates results from child agents:
- ModuleLead waits for all TaskAgent results
- ComponentLead waits for all ModuleLead results
- ChiefArchitect waits for all ComponentLead results
Single Instruction, Multiple Data - Vector operations used in performance-critical sections (when applicable to agent operations).
Synchronization without Mutex - Used in:
- Work-stealing scheduler queues
- Message queue implementation
- Atomic counters for result tracking
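As one concrete case, a result-tracking counter needs no mutex at all: one atomic fetch_add per reporting child suffices. The function below is illustrative, not project code.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical result tracker: each child agent reports completion by bumping
// a lock-free counter; the parent reads the total after joining.
int count_results(int children) {
    std::atomic<int> completed{0};
    std::vector<std::thread> agents;
    for (int i = 0; i < children; ++i)
        agents.emplace_back([&completed] {
            completed.fetch_add(1, std::memory_order_relaxed);  // no lock needed
        });
    for (auto& t : agents) t.join();
    return completed.load();
}
```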
Data Passing without Duplication - Achieved through:
- Move semantics (available since C++11)
- Message serialization with minimal overhead
- Shared ownership via `std::shared_ptr` when necessary
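A minimal illustration of the move-semantics point: transferring a `std::unique_ptr` hands over the pointer itself, so no payload bytes are copied.

```cpp
#include <memory>
#include <utility>

// Ownership of the payload moves; the source is left null, and the
// pointed-to data is never duplicated.
std::unique_ptr<int> make_payload() { return std::make_unique<int>(7); }

bool demo_move() {
    std::unique_ptr<int> src = make_payload();
    std::unique_ptr<int> dst = std::move(src);  // pointer handoff, zero copy
    return src == nullptr && dst != nullptr && *dst == 7;
}
```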
ProjectKeystone agents use finite state machines for coordinated workflows:
ModuleLeadAgent States:
- `IDLE`: Waiting for commands
- `PLANNING`: Decomposing goals into tasks
- `WAITING`: Monitoring task execution
- `SYNTHESIZING`: Aggregating task results
- Return to `IDLE`
ComponentLeadAgent States:
- `IDLE`: Waiting for commands
- `PLANNING`: Decomposing component goals into modules
- `WAITING_FOR_MODULES`: Monitoring module execution
- `AGGREGATING`: Collecting module results
- Return to `IDLE`
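The ModuleLead cycle can be written as a tiny transition function; the enum and function below are illustrative, not the project's actual definitions.

```cpp
#include <stdexcept>

// Illustrative ModuleLead FSM: IDLE -> PLANNING -> WAITING -> SYNTHESIZING -> IDLE.
enum class ModuleLeadState { IDLE, PLANNING, WAITING, SYNTHESIZING };

ModuleLeadState next(ModuleLeadState s) {
    switch (s) {
        case ModuleLeadState::IDLE:         return ModuleLeadState::PLANNING;
        case ModuleLeadState::PLANNING:     return ModuleLeadState::WAITING;
        case ModuleLeadState::WAITING:      return ModuleLeadState::SYNTHESIZING;
        case ModuleLeadState::SYNTHESIZING: return ModuleLeadState::IDLE;  // cycle closes
    }
    throw std::logic_error("unreachable");
}
```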
Container Orchestration Platform - ProjectKeystone supports K8s deployment with:
- Deployment manifests with health checks
- Service definitions (ClusterIP, Headless)
- ConfigMaps for configuration
- Metrics endpoints for monitoring
Multi-Container Orchestration (Local) - Used for:
- Local development with mounted source
- Multi-node simulation
- Phase 8 distributed testing
- CI/CD pipelines
Pattern for Handling Degradation - Prevents cascading failures by:
- Monitoring for error rates
- Opening circuit if threshold exceeded
- Failing fast instead of retrying indefinitely
- Optional feature in Phase 5+
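The mechanism can be sketched as follows; the class and its policy (open after N consecutive failures, close again on success) are illustrative assumptions, not the Phase 5+ implementation.

```cpp
// Minimal circuit-breaker sketch.
class CircuitBreaker {
public:
    explicit CircuitBreaker(int threshold) : threshold_(threshold) {}

    bool allow() const { return failures_ < threshold_; }  // open => fail fast
    void record_failure() { ++failures_; }
    void record_success() { failures_ = 0; }  // naive close-on-success policy

private:
    int threshold_;
    int failures_ = 0;
};

// Demo: two failures trip a threshold-2 breaker; one success closes it again.
bool breaker_demo() {
    CircuitBreaker cb(2);
    bool closed_initially = cb.allow();
    cb.record_failure();
    cb.record_failure();
    bool open_after_failures = !cb.allow();
    cb.record_success();
    return closed_initially && open_after_failures && cb.allow();
}
```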
Configurable Retry Strategy - Handles transient failures:
- Exponential backoff
- Maximum retry attempts
- Optional jitter for distributed systems
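The delay schedule might be computed like this; the base, cap, and clamp values are assumptions, and jitter (a small random offset) is omitted to keep the function deterministic.

```cpp
#include <algorithm>
#include <chrono>
#include <cstdint>

// Exponential backoff: delay = base * 2^attempt, capped. A real distributed
// deployment would add random jitter on top of the computed delay.
std::chrono::milliseconds backoff_delay(int attempt) {
    const std::chrono::milliseconds base{100};   // assumed base delay
    const std::chrono::milliseconds cap{5000};   // assumed maximum delay
    attempt = std::min(attempt, 20);             // avoid shift overflow
    auto delay = base * (std::int64_t{1} << attempt);
    return std::min<std::chrono::milliseconds>(delay, cap);
}
```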
Periodic Health Check - Used in Phase 8:
- 1-second interval for distributed nodes
- 3-second timeout threshold
- Detects agent and node failures
- Triggers failover or recovery
- CLAUDE.md - Project configuration and overview
- TDD_FOUR_LAYER_ROADMAP.md - Phase-by-phase development plan
- ARCHITECTURE.md - Detailed architecture documentation
- NETWORK_PROTOCOL.md - Message protocol specifications
- KUBERNETES_DEPLOYMENT.md - Kubernetes deployment guide
Last Updated: 2025-11-26
Version: 1.1 (Added coroutine safety terminology)
Status: ACTIVE