DevSquad — Multi-Role AI Task Orchestrator

📑 Table of Contents

Features & Architecture
Quick Start
Installation
Key Features
Cybernetics Modules
Integration Architecture
Role System
Module Reference
CLI Usage
Python API
Running Tests
Version History

One task → Multi-role AI collaboration → One conclusion
Production Ready | V3.6.1

🚀 V3.6.1: Cybernetics Enhancement Release

DevSquad V3.6.1 adds 5 new cybernetics modules: FeedbackControlLoop for closed-loop feedback control, ExecutionGuard for safe execution with rollback, PerformanceFingerprint for performance baseline tracking, SimilarTaskRecommender for TF-IDF-based task similarity search, and AdaptiveRoleSelector for intelligent role selection based on task characteristics — making multi-agent collaboration more adaptive, self-optimizing, and resilient.

🔄 V3.6.1 Cybernetics Module Details

1️⃣ FeedbackControlLoop (Feedback Controller)

Chinese Name: 反馈闭环控制器 (Feedback Closed-Loop Controller) Core Capabilities:

Closed-loop feedback control with automatic iteration until quality threshold met
Configurable quality gate (quality_gate) and maximum iterations
Lightweight quality assessment (no LLM calls), supports dry-run mode

from scripts.collaboration.feedback_control_loop import FeedbackControlLoop
from scripts.collaboration.dispatcher import MultiAgentDispatcher

dispatcher = MultiAgentDispatcher()
loop = FeedbackControlLoop(dispatcher, quality_gate=0.7, max_iterations=3)
result = loop.run("Design secure auth system", roles=["architect", "security"])
print(f"Iterations: {loop.iteration_count}")
print(f"Best quality: {loop.best_quality:.2f}")
# Automatically iterates until quality gate met or max iterations reached

2️⃣ ExecutionGuard (Execution Guardian)

Chinese Name: 执行守护者 (Execution Guardian) Core Capabilities:

Real-time execution monitoring with 4 abort conditions: timeout, output size, token count, critical keywords
Lightweight checks (<1ms), zero external dependencies
Dynamically configurable thresholds (max_duration_sec, max_output_tokens, etc.)

from scripts.collaboration.execution_guard import ExecutionGuard

guard = ExecutionGuard(max_duration_sec=300.0, max_output_tokens=8000)
should_abort, reason = guard.check_abort(
    worker_output="Generating code...",
    elapsed_time=120.5,
    token_count=5000
)
if should_abort:
    print(f"Aborting: {reason}")
    # Example: "Timeout exceeded: 120.5s > 300.0s"
# Also detect warning keywords (without triggering abort)
warnings = guard.check_warnings("WARNING: High memory usage")
print(f"Warnings: {warnings}")  # ['WARNING']

3️⃣ PerformanceFingerprint (Performance Fingerprint)

Chinese Name: 性能指纹系统 (Performance Fingerprint System) Core Capabilities:

Unified execution fingerprint recording (fuses 4 data sources: invocation counts, latency, state snapshots, retrospective deviations)
Pure Python TF-IDF implementation (no sklearn/numpy), supports English/Chinese mixed content
JSON persistence to .devsquad_data/fingerprints/, graceful cold-start degradation

from scripts.collaboration.performance_fingerprint import PerformanceFingerprint

fingerprint = PerformanceFingerprint()
fid = fingerprint.record_execution(
    task="Implement user authentication",
    result=dispatch_result,
    timing={"total": 12.5, "planning": 2.0, "coding": 8.0, "review": 2.5},
    roles_used=["architect", "coder", "tester"],
)
print(f"Fingerprint ID: {fid}")  # fp_20260518_143052_a1b2c3d4

# Find similar historical tasks using TF-IDF
similar = fingerprint.find_similar("Add login page", top_k=3)
for case in similar:
    print(f"Task: {case['task']}")
    print(f"Similarity: {case['similarity']:.2%}")
    print(f"Roles used: {case['roles_used']}")
    print(f"Success: {case['success']}")

# Get overall statistics
stats = fingerprint.get_stats()
print(f"Total executions: {stats['total']}")
print(f"Success rate: {stats['success_rate']:.1%}")

4️⃣ SimilarTaskRecommender (Similar Task Recommender)

Chinese Name: 相似任务推荐器 (Similar Task Recommender) Core Capabilities:

TF-IDF-based task similarity search with historical success configuration recommendations
Intelligent role combination recommendation, intent prediction, execution time estimation
Confidence scoring (high/medium/low), graceful cold-start degradation

from scripts.collaboration.similar_task_recommender import SimilarTaskRecommender

recommender = SimilarTaskRecommender()
result = recommender.recommend("Design user authentication system")
print(f"Recommended roles: {result['recommended_roles']}")
# Output: ['architect', 'coder', 'tester', 'security']
print(f"Confidence: {result['confidence']}")  # high/medium/low
print(f"Estimated duration: {result['estimated_duration_s']:.1f}s")

# View similar case details
for case in result['similar_cases']:
    print(f"Task: {case['task']}")
    print(f"Similarity: {case['similarity']:.2%}")
    print(f"Historical roles: {case['roles']}")
    print(f"Success: {case['success']}")

# Quick method: get role suggestions only
roles = recommender.get_role_suggestion("Implement payment API")
print(f"Suggested roles: {roles}")

5️⃣ AdaptiveRoleSelector (Adaptive Role Selector)

Chinese Name: 自适应角色选择器 (Adaptive Role Selector) Core Capabilities:

Three-tier selection strategy based on historical success rates (similar tasks → intent match → fallback to default)
Configurable minimum success rate and maximum role count
Supports manual statistics updates and comprehensive role effectiveness reporting

from scripts.collaboration.adaptive_role_selector import AdaptiveRoleSelector

selector = AdaptiveRoleSelector()
roles = selector.select_roles(
    task="Build high-concurrency microservices architecture",
    intent="feature_implementation",
    min_success_rate=0.5,
    max_roles=5,
)
print(f"Recommended roles: {roles}")
# Output: ['architect', 'devops', 'security', 'tester']
# Or: [] (returns empty when no historical data, caller falls back to default RoleMatcher)

# Manually update statistics (for external system integration)
selector.update_stats(["architect", "coder"], success=True, duration_s=12.5)

# Generate role effectiveness report
report = selector.get_role_report()
for role_name, metrics in report.items():
    print(f"{role_name}: success_rate={metrics['success_rate']:.1%}, "
          f"avg_duration={metrics['avg_duration']:.1f}s")

🔗 Integration Architecture

The 5 cybernetic modules are designed as non-invasive wrappers — they work independently or together without modifying existing core logic:

User Task
    ↓
[SimilarTaskRecommender] ← Optional: suggest roles from history
    ↓
[AdaptiveRoleSelector]   ← Optional: optimize role selection
    ↓
[MultiAgentDispatcher]
    ↓
[FeedbackControlLoop]     ← Wrap dispatcher for auto-iteration
    ↓ [each worker step]
[ExecutionGuard]          ← Guard each worker execution
    ↓
[PerformanceFingerprint]  ← Record after dispatch completes

Recommended usage (progressive adoption):

from scripts.collaboration import (
    MultiAgentDispatcher, FeedbackControlLoop,
    ExecutionGuard, PerformanceFingerprint
)

dispatcher = MultiAgentDispatcher()
guard = ExecutionGuard()
fingerprint = PerformanceFingerprint()

# Option 1: Full cybernetics stack
loop = FeedbackControlLoop(dispatcher, quality_gate=0.7)
result = loop.run("Your task here")

# Option 2: Guard only (minimal adoption)
result = dispatcher.dispatch("Your task")
for w in result.worker_results:
    abort, reason = guard.check_abort(w.output, w.duration)
    if abort:
        print(f"Aborted: {reason}")

# Option 3: Learning only
fingerprint.record_execution("task", result, result.timing, result.matched_roles)
similar = fingerprint.find_similar("new task", top_k=3)

All modules are optional switches — DevSquad works perfectly without them.

🎯 Quick Start (4 Ways to Use DevSquad)

0️⃣ First Time? Start Here!

# Interactive setup wizard (1-2 minutes)
python scripts/cli.py init

# Then start collaborating!
devsquad dispatch -t "your task description"

1️⃣ Interactive Web Dashboard (Recommended)

# Start Streamlit dashboard with authentication
streamlit run scripts/dashboard.py

# Open http://localhost:8501
# ⚠️ Security: Default credentials are for initial setup only.
#    Login with default account, then change password immediately.
#    Username: admin   Password: <your-secure-password>
#    Or set via environment variables: $DASHBOARD_USER / $DASHBOARD_PASS

2️⃣ REST API Server

# Install dependencies
pip install fastapi uvicorn

# Start API server
uvicorn scripts.api_server:app --host 0.0.0.0 --port 8000 --reload

# Access Swagger UI: http://localhost:8000/docs
# Access ReDoc:      http://localhost:8000/redoc

3️⃣ Command Line Interface

# Standard CLI usage
python scripts/cli.py lifecycle build

# Enhanced visual output
python scripts/cli.py lifecycle build --visual --verbose

🏗️ Architecture Overview

┌─────────────────────────────────────────────────────────────┐
│                    User Access Layer                         │
│  ┌──────────────┐ ┌──────────────┐ ┌──────────────┐        │
│  │ Streamlit    │ │ FastAPI REST │ │ CLI/Notebook │        │
│  │ Dashboard    │ │ API Server   │ │ (Existing)   │        │
│  │ (Auth+HTTPS) │ │ (Swagger)    │ │              │        │
│  └──────┬───────┘ └──────┬───────┘ └──────────────┘        │
└─────────┼───────────────┼───────────────────────────────────┘
          │               │
          ▼               ▼
┌─────────────────────────────────────────────────────────────┐
│                   Business Logic Layer                      │
│  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐           │
│  │AuthManager  │ │AlertManager │ │HistoryMgr   │           │
│  │(RBAC Auth)  │ │(Multi-Chnl) │ │(SQLite TSDB)│           │
│  └─────────────┘ └─────────────┘ └─────────────┘           │
│  ┌─────────────────────────────────────────────┐            │
│  │     LifecycleProtocol (11-Phase Engine)       │            │
│  │     UnifiedGateEngine + CheckpointManager     │            │
│  └─────────────────────────────────────────────┘            │
└─────────────────────────┬───────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────┐
│                    Data Persistence Layer                    │
│  ┌────────────┐ ┌────────────┐ ┌────────────────────────┐  │
│  │ SQLite DB  │ │ YAML Config│ │ Checkpoint Files       │  │
│  │ (History)  │ │ (Deploy)   │ │ (Lifecycle State)      │  │
│  └────────────┘ └────────────┘ └────────────────────────┘  │
└─────────────────────────────────────────────────────────────┘

✨ Key Features (V3.6.1)

⚓ AnchorChecker (NEW)

Milestone anchor verification that ensures critical checkpoints are properly validated before proceeding:

Anchor Point Definition — Define mandatory validation anchors at key lifecycle milestones
Cross-Phase Verification — Verify consistency between phase outputs and anchor criteria
Drift Detection — Detect when project execution drifts from defined anchor points
Auto-Recovery — Suggest corrective actions when anchor checks fail

🔄 RetrospectiveEngine (NEW)

Independent retrospective mechanism for continuous improvement after each dispatch cycle:

Post-Dispatch Review — Automatically analyze what went well and what could improve
Pattern Extraction — Extract reusable patterns from successful collaborations
Anti-Pattern Detection — Identify recurring issues and suggest process improvements
Metric Trend Analysis — Track quality metrics across dispatches to spot degradation

📊 FeatureUsageTracker (NEW)

Thread-safe feature invocation counter for data-driven feature optimization:

Invocation Tracking — Count every feature call (dispatch, anchor_check, retrospective, consensus, etc.)
Usage Reports — Top features, unused features, low-usage features with markdown export
Auto-Persist — Periodic JSON persistence every 100 ticks
30 Known Features — Pre-registered feature set covering all DevSquad capabilities

🎯 StructuredGoal (NEW)

Structured goal management that decomposes high-level objectives into trackable, verifiable sub-goals:

Goal Decomposition — Break complex objectives into hierarchical sub-goals with clear criteria
Progress Tracking — Real-time progress measurement against defined goal structure
Dependency Mapping — Visualize and manage dependencies between sub-goals
Completion Verification — Automated verification that goals meet their success criteria

🔀 FallbackBackend (NEW)

Automatic backend failover that ensures LLM availability even when primary backends are down:

Health Monitoring — Continuous health checks for all configured LLM backends
Automatic Failover — Seamlessly switch to backup backend when primary fails
Priority-Based Routing — Configure backend priority order (e.g., OpenAI → Anthropic → Mock)
Recovery Detection — Automatically restore primary backend when it recovers

🔍 VerificationGate — Evidence-Based Quality

Prove-It Pattern: Every completion claim must include verifiable evidence (test output, diff, benchmark)
7 Red Flags: no_test | tests_pass_first_run | no_regression_test | no_security_scan | no_perf_baseline | vague_description | evidence_missing
Auto-active: Integrated into TaskCompletionChecker — zero config required

🔐 Authentication & Authorization

Multi-user support with role-based access control (RBAC)
Three roles: Admin (full access), Operator (execute), Viewer (read-only)
Secure password hashing with SHA-256
Session management for Streamlit dashboard
OAuth2 support (optional, for enterprise deployments)

🌐 REST API (FastAPI)

10+ endpoints for complete lifecycle management
Automatic OpenAPI/Swagger documentation at /docs
CORS middleware for cross-origin requests
Request timing and comprehensive logging
Standardized error responses

Key Endpoints:

Lifecycle:
  GET    /api/v1/lifecycle/phases          → List all 11 phases
  POST   /api/v1/lifecycle/actions         → Execute phase actions
  GET    /api/v1/lifecycle/status          → Current status

Metrics:
  GET    /api/v1/metrics/current          → Real-time metrics
  GET    /api/v1/metrics/history          → Historical data

Gates:
  GET    /api/v1/gates/status             → All gate statuses
  POST   /api/v1/gates/check              → Check specific gate

System:
  GET    /api/v1/health                   → Health check

🔔 Alert Notification System

4 severity levels: INFO, WARNING, ERROR, CRITICAL
Multiple channels: Console, Slack, Email, Webhook
Rate limiting to prevent alert spam (configurable)
Deduplication within time window
Alert history tracking and statistics

📊 Historical Data Storage (SQLite)

Metrics snapshots with time-range queries
Alert history with acknowledgment tracking
API request logs with performance metrics
Lifecycle events audit trail
Automatic cleanup with configurable retention

📈 Visualization & Monitoring

Streamlit Dashboard: Real-time monitoring with authentication
CLI Visual Module: Rich terminal output with colors and icons
Jupyter Notebook: Interactive 10-section tutorial
Benchmark Reports: HTML/JSON performance reports

🧩 Layered Sub-Skill Architecture (V3.6.1)

DevSquad provides 6 atomic sub-skills that can be used independently or together. Each sub-skill is a thin wrapper (~50 lines) importing existing core modules — no duplicated logic.

skills/
├── dispatch/       → DispatchSkill — MultiAgentDispatcher (7-role orchestration)
├── intent/         → IntentSkill   — IntentWorkflowMapper (6 intents × 3 languages)
├── review/         → ReviewSkill   — FiveAxisConsensusEngine (5-axis code review)
├── security/       → SecuritySkill — InputValidator + OperationClassifier + PermissionGuard
├── test/           → TestSkill     — TestQualityGuard + test strategy generation
└── retrospective/  → RetroSkill    — RetrospectiveEngine + pattern extraction

Sub-Skill Quick Reference

Skill	Core Method	Wraps	Mock Mode
`dispatch`	`run(task, roles, mode)`	MultiAgentDispatcher	✅
`intent`	`detect(text, lang)`	IntentWorkflowMapper	✅
`review`	`review(code)`	FiveAxisConsensusEngine	✅
`security`	`scan_input(text)`	InputValidator + OpClassifier	✅
`test`	`generate_strategy(module)`	TestQualityGuard	✅
`retrospective`	`run_retrospective(results)`	RetrospectiveEngine	✅

Usage Examples

# Direct import (recommended for single skill)
from skills.dispatch.handler import DispatchSkill
result = DispatchSkill().run("Fix login bug", roles=["coder", "tester"])

# Via registry (dynamic discovery)
from skills import get_skill, list_skills
print(list_skills())  # ['dispatch', 'intent', 'review', 'security', 'test', 'retrospective']
skill = get_skill("security")
result = skill.scan_input("DROP TABLE users; --")

All sub-skills work without any API key in Mock mode.

📋 Plan C Architecture (Core Engine)

Unified Lifecycle Architecture - Resolves CLI 6 commands vs 11-phase lifecycle:

CLI View Layer (6 commands)          Core Engine (11 phases)
┌─────────────────────┐            ┌──────────────────────────┐
│ spec → P1, P2       │───View ──→│ P1: Requirements         │
│ plan → P7           │   Mapping │ P2: Architecture         │
│ build → P8          │            │ P3: Technical Design     │
│ test → P9           │            │ ...                      │
│ review → P8,P6      │            │ P10: Deployment          │
│ ship → P10          │            │ P11: Operations          │
└─────────────────────┘            └──────────────────────────┘
        ↓                                    ↓
  UnifiedGateEngine                   CheckpointManager
  (Phase + Worker gates)              (Lifecycle state persistence)

Core Components:

✅ LifecycleProtocol - Abstract interface for unified lifecycle management
✅ UnifiedGateEngine - Integrates VerificationGate + Phase transition gates
✅ FullLifecycleAdapter - Complete 11-phase lifecycle with dependency resolution
✅ Enhanced CheckpointManager - Auto save/restore lifecycle state across sessions

What is DevSquad?

DevSquad transforms a single AI task into a multi-role AI collaboration. It automatically dispatches your task to the right combination of expert roles — architect, product manager, coder, tester, security reviewer, DevOps — orchestrates their parallel collaboration through a shared workspace, resolves conflicts via weighted consensus voting, and delivers a unified structured report.

You: "Design a microservices e-commerce backend"
         │
         ▼
┌─────────────────┐
│  InputValidator   ──→ Security check (XSS, SQL injection, prompt injection)
└────────┬────────┘
         ▼
┌─────────────────┐
│  RoleMatcher     ──→ Auto-match: architect + devops + security
└────────┬────────┘
         ▼
┌──────────┬──────────┬──────────┐
│ Architect │  DevOps   │ Security │   ← ThreadPoolExecutor parallel execution
│(Design)   │(Infra)   │(Threat)  │
└────┬──────┴────┬─────┴────┬────┘
     └────────────┼───────────┘
                  ▼
      ┌──────────────────┐
      │    Scratchpad     │ ← Shared blackboard (real-time sync)
      └────────┬─────────┘
               ▼
      ┌──────────────────┐
      │ Consensus Engine  │ ← Weighted vote + veto + escalation
      └────────┬─────────┘
               ▼
      ┌──────────────────┐
      │ Structured Report │ ← Findings + Action Items (H/M/L)
      └──────────────────┘

📦 Installation

Prerequisites

Python 3.9+ (3.9, 3.10, 3.11, 3.12 supported)
pip or pipenv for package management

Option A: PyPI Install (Recommended)

# Install from PyPI — zero setup, ready to use
pip install devsquad

# With optional dependencies
pip install "devsquad[api]"    # FastAPI + Streamlit dashboard
pip install "devsquad[all]"    # All optional features

Option B: Core Installation (CLI + Dashboard)

git clone https://github.com/your-org/DevSquad.git
cd DevSquad

# Install core package (minimal dependencies)
pip install -e .

# Ready to use!
devsquad dispatch -t "Design user authentication system"

Option C: Full Production Stack (All Features)

# Clone and install with all production features
git clone https://github.com/your-org/DevSquad.git
cd DevSquad

# Install with API server dependencies
pip install -e ".[api]"

# Or install all optional features
pip install -e ".[all]"

Optional Feature Groups:

# API Server (FastAPI + Uvicorn)
pip install -e ".[api]"

# Visualization (Streamlit + Jupyter)
pip install -e ".[visualization]"

# Alerting (Slack SDK)
pip install -e ".[alerts]"

# Development & Testing
pip install -e ".[dev]"

# Everything combined
pip install -e ".[all]"

Verify Installation

# Check version
devsquad --version
# Expected: devsquad 3.6.1

# Run tests
pytest tests/ -v --tb=short
# Expected: 1500+ passed

3 Ways to Use

1. CLI (Recommended)

# Mock mode (default) — no API key needed
python3 scripts/cli.py dispatch -t "Design user authentication system"

# Real AI output — set environment variables first
export OPENAI_API_KEY="sk-..."
export OPENAI_BASE_URL="https://api.openai.com/v1"   # optional
export OPENAI_MODEL="gpt-4"                            # optional
python3 scripts/cli.py dispatch -t "Design auth system" --backend openai

# Specify roles (short IDs: arch/pm/test/coder/ui/infra/sec)
python3 scripts/cli.py dispatch -t "Design auth system" -r arch sec --backend openai

# Stream output in real-time
python3 scripts/cli.py dispatch -t "Design auth system" -r arch --backend openai --stream

# Other commands
python3 scripts/cli.py status          # System status
python3 scripts/cli.py roles           # List available roles
python3 scripts/cli.py --version       # Show version (3.6.1)

2. Python API

from scripts.collaboration.dispatcher import MultiAgentDispatcher

# Mock mode (default)
disp = MultiAgentDispatcher()
result = disp.dispatch("Design REST API for user management")
print(result.to_markdown())
disp.shutdown()

# With LLM backend
from scripts.collaboration.llm_backend import create_backend
backend = create_backend("openai", api_key="sk-...", base_url="https://api.openai.com/v1")
disp = MultiAgentDispatcher(llm_backend=backend)
result = disp.dispatch("Design auth system", roles=["architect", "security"])
print(result.summary)
disp.shutdown()

4. Sub-Skills (Lightweight Independent)

# Each sub-skill works independently — no Dispatcher needed
from skills.security.handler import SecuritySkill
risk = SecuritySkill().scan_input("malicious input")

from skills.review.handler import ReviewScore
verdict = ReviewSkill().review(code_snippet)

from skills.intent.handler import IntentSkill
intent = IntentSkill().detect("修复登录漏洞", lang="zh")

3. MCP Server (for Cursor / any MCP client)

pip install mcp
python3 scripts/mcp_server.py              # stdio mode
python3 scripts/mcp_server.py --port 8080  # SSE mode

Exposes 6 tools: multiagent_dispatch, multiagent_quick, multiagent_roles, multiagent_status, multiagent_analyze, multiagent_shutdown.

7 Core Roles

Role	CLI ID	Aliases	Weight	Best For
Architect	`arch`	`architect`	1.5	System design, tech stack, performance/security architecture
Product Manager	`pm`	`product-manager`	1.2	Requirements, user stories, acceptance criteria
Security Expert	`sec`	`security`	1.1	Threat modeling, vulnerability audit, compliance
Tester	`test`	`tester`, `qa`	1.0	Test strategy, quality assurance, edge cases
Coder	`coder`	`solo-coder`, `dev`	1.0	Implementation, code review, performance optimization
DevOps	`infra`	`devops`	1.0	CI/CD, containerization, monitoring, infrastructure
UI Designer	`ui`	`ui-designer`	0.9	UX flow, interaction design, accessibility

Auto-match: If no roles specified, the dispatcher automatically matches based on task keywords.

Architecture Overview (60+ Core Modules)

DevSquad is built on a layered architecture with clear separation of concerns:

┌─────────────────────────────────────────────────┐
│                    CLI / MCP / API               │  Entry Points
├─────────────────────────────────────────────────┤
│              MultiAgentDispatcher                │  Orchestration
│  ┌────────────┬──────────────┬────────────────┐ │
│  │RoleMatcher │ReportFormatter│InputValidator  │ │  Extracted Components
│  └────────────┴──────────────┴────────────────┘ │
│  ┌────────────────────────────────────────────┐ │
│  │ RuleCollector (NL Rule Intercept)          │ │  Rule Collection
│  └────────────────────────────────────────────┘ │
├─────────────────────────────────────────────────┤
│                 Coordinator                      │  Task Planning
│  ┌──────────┬───────────┬────────────────────┐  │
│  │ Scratchpad│ Consensus │  BatchScheduler    │  │  Collaboration
│  └──────────┴───────────┴────────────────────┘  │
├─────────────────────────────────────────────────┤
│              Worker (per role)                   │  Execution
│  ┌────────────────────────────────────────────┐ │
│  │ PromptAssembler → LLMBackend → Output      │ │
│  └────────────────────────────────────────────┘ │
├─────────────────────────────────────────────────┤
│  LLMBackend: Mock | OpenAI | Anthropic          │  LLM Layer
├─────────────────────────────────────────────────┤
│  CheckpointManager | WorkflowEngine | ...       │  Infrastructure
└─────────────────────────────────────────────────┘

What's New in V3.6.1 🆕

AnchorChecker System

Milestone anchor verification that ensures critical checkpoints are validated before proceeding:

from scripts.collaboration.anchor_checker import AnchorChecker

checker = AnchorChecker()
checker.define_anchor("architecture_complete", criteria=["API spec defined", "tech stack selected"])
result = checker.check_anchor("architecture_complete", phase_output)
print(f"Anchor passed: {result.passed}")
print(f"Drift detected: {result.drift_score}")

Features:

Cross-phase consistency verification
Drift detection with severity scoring
Auto-recovery suggestions
Anchor point persistence

RetrospectiveEngine

Independent retrospective mechanism for continuous improvement:

from scripts.collaboration.retrospective_engine import RetrospectiveEngine

engine = RetrospectiveEngine()
report = engine.run_retrospective(dispatch_result)
print(f"Patterns found: {len(report.patterns)}")
print(f"Anti-patterns: {len(report.anti_patterns)}")
print(f"Improvement suggestions: {report.suggestions}")

Features:

Post-dispatch quality analysis
Pattern and anti-pattern extraction
Metric trend tracking
Actionable improvement suggestions

StructuredGoal

Structured goal management with hierarchical decomposition:

from scripts.collaboration.structured_goal import StructuredGoal

goal = StructuredGoal("Build e-commerce platform")
goal.add_sub_goal("User auth", criteria=["OAuth2 support", "2FA ready"])
goal.add_sub_goal("Product catalog", criteria=["Search", "Filter", "Pagination"])
progress = goal.get_progress()
print(f"Overall: {progress.completion_pct}%")

Features:

Hierarchical goal decomposition
Dependency mapping between sub-goals
Real-time progress tracking
Automated completion verification

FallbackBackend

Automatic LLM backend failover for high availability:

from scripts.collaboration.llm_backend import FallbackBackend

backend = FallbackBackend(
    primary="openai",
    fallbacks=["anthropic", "mock"],
    health_check_interval=30,
)
result = backend.generate("Design auth system")
# Automatically fails over if primary is down

Features:

Continuous backend health monitoring
Seamless automatic failover
Priority-based routing configuration
Automatic primary recovery detection

Natural Language Rule Collection

Automatically detect and store user rules from natural language input:

# User says: "记住规则：写代码时必须加注释"
# DevSquad automatically:
# 1. Detects rule-storing intent
# 2. Extracts: trigger="写代码时", action="必须加注释", type="always"
# 3. Sanitizes content (removes dangerous patterns)
# 4. Stores via CarryMem or local JSON fallback

# List stored rules
# User says: "列出规则" → Returns all stored rules

# Delete a rule
# User says: "删除规则 RULE-LOCAL-abc123"

Pipeline: User Input → IntentDetector → RuleExtractor → RuleSanitizer → RuleStorage (CarryMem + local JSON)

Features:

11 intent patterns (Chinese + English)
4 rule types: always / avoid / prefer / forbid
Prompt injection protection in rule content
CarryMem primary + local JSON fallback storage
Automatic rule injection into Worker prompts

See Integration Guide for detailed usage.

Key Features

Security

InputValidator: XSS, SQL injection, command injection, HTML injection detection
Prompt Injection Protection: 21+ patterns (ignore previous instructions, jailbreak, DAN mode, system prompt extraction, etc.)
API Key Safety: Environment variables only, never CLI arguments or logs
PermissionGuard: 4-level safety gate (PLAN → DEFAULT → AUTO → BYPASS)

Performance

ThreadPoolExecutor: Real parallel execution for multi-role dispatch
LLM Cache: TTL-based LRU cache with disk persistence (60-80% cost reduction)
LLM Retry: Exponential backoff + circuit breaker + multi-backend fallback
Streaming Output: Real-time chunk-by-chunk LLM output via --stream

Reliability

CheckpointManager: SHA256 integrity, handoff documents, auto-cleanup
WorkflowEngine: Task-to-workflow auto-split, step execution, resume from checkpoint, 11-phase lifecycle templates (full/backend/frontend/internal_tool/minimal), requirement change management
TaskCompletionChecker: DispatchResult/ScheduleResult completion tracking
ConsensusEngine: Weighted voting with veto power and human escalation

Project Lifecycle (11-Phase Model)

DevSquad V3.6.1 defines an 11-phase (4 optional) project lifecycle with clear roles, dependencies, and gate conditions:

P1 → P2 ──┬──→ P3 ──→ P6 ──→ P7 ──→ P8 ──→ P9 ──→ P10 ──→ P11
           ├──→ P4(∥P3) ──↗
           └──→ P5(dep P1+P3) ──↗

Template	Phases	Use Case
`full`	P1-P11	Complete project
`backend`	No P5	Backend services
`frontend`	No P4,P6	Frontend applications
`internal_tool`	No P4,P5,P6,P11	Internal tools
`minimal`	P1,P3,P7,P8,P9	Minimum set

See GUIDE.md §4 for full lifecycle details with gate conditions and requirement change process.

Developer Experience

Configuration File: .devsquad.yaml in project root with env var overrides
Quality Control Injection: Auto-inject QC rules (hallucination prevention, overconfidence check, security guard, RACI protocol) into Worker prompts based on .devsquad.yaml config
Docker Support: docker build -t devsquad . && docker run devsquad dispatch -t "task"
GitHub Actions CI: Python 3.9-3.12 matrix testing
pip installable: pip install -e . with optional dependencies

Module Reference (60+ Modules)

💡 Table too wide? View modules online or use devsquad --help modules for a compact list.

Module	File	Purpose
MultiAgentDispatcher	`dispatcher.py`	Unified entry point
Coordinator	`coordinator.py`	Global orchestration: plan → assign → execute → collect
Worker	`worker.py`	Role executor with LLM backend integration
EnhancedWorker	`enhanced_worker.py`	Worker with auto QA (briefing + confidence + retry + memory rules)
Scratchpad	`scratchpad.py`	Shared blackboard for inter-worker communication
ConsensusEngine	`consensus.py`	Weighted voting + veto + escalation
RoleMatcher	`role_matcher.py`	Keyword-based role matching with alias resolution
ReportFormatter	`report_formatter.py`	Structured/compact/detailed report generation
InputValidator	`input_validator.py`	Security validation + prompt injection detection
AISemanticMatcher	`ai_semantic_matcher.py`	LLM-powered semantic role matching
CheckpointManager	`checkpoint_manager.py`	State persistence + handoff documents
WorkflowEngine	`workflow_engine.py`	Task-to-workflow auto-split + 11-phase lifecycle templates + requirement change
TaskCompletionChecker	`task_completion_checker.py`	Completion tracking + progress reporting
CodeMapGenerator	`code_map_generator.py`	Python AST-based code structure analysis
DualLayerContextManager	`dual_layer_context.py`	Project-level + task-level context management
SkillRegistry	`skill_registry.py`	Reusable skill registration + discovery
IntentWorkflowMapper	`intent_workflow_mapper.py`	User intent → workflow chain mapping (6 intents × 3 languages)
OperationClassifier	`operation_classifier.py`	Three-tier operation classification (ALWAYS_SAFE/NEEDS_REVIEW/FORBIDDEN)
FiveAxisConsensusEngine	`five_axis_consensus.py`	Five-axis review consensus with weighted voting
FeatureUsageTracker	`feature_usage_tracker.py`	Feature usage tracking + reporting + auto-persistence
LLMBackend	`llm_backend.py`	Mock/OpenAI/Anthropic with streaming support
LLMCache	`llm_cache.py`	TTL-based LRU cache with disk persistence
LLMRetry	`llm_retry.py`	Exponential backoff + circuit breaker
ConfigManager	`config_loader.py`	YAML config + env var overrides
PromptAssembler	`prompt_assembler.py`	Dynamic prompt assembly + QC rule injection
AgentBriefing	`agent_briefing.py`	Context-aware task briefing with priority filtering
ConfidenceScorer	`confidence_score.py`	5-factor response quality assessment
PerformanceMonitor	`performance_monitor.py`	P95/P99 tracking + CPU/memory monitoring
MCEAdapter	`mce_adapter.py`	CarryMem integration adapter (optional dependency, supports match_rules + format_rules_as_prompt + add_rule)
Protocols	`protocols.py`	Interface definitions (CacheProvider, MemoryProvider, etc.)
NullProviders	`null_providers.py`	Graceful degradation providers
PermissionGuard	`permission_guard.py`	4-level safety gate
MemoryBridge	`memory_bridge.py`	Cross-session memory
BatchScheduler	`batch_scheduler.py`	Batch task scheduling
ContextCompressor	`context_compressor.py`	Context compression for long tasks
RoleTemplateMarket	`role_template_market.py`	Role template sharing marketplace
Skillifier	`skillifier.py`	Auto skill learning from tasks
UsageTracker	`usage_tracker.py`	Token/cost tracking
WarmupManager	`warmup_manager.py`	Startup warmup optimization
TestQualityGuard	`test_quality_guard.py`	Test quality enforcement
PromptVariantGenerator	`prompt_variant_generator.py`	A/B prompt testing
ConfigManager (YAML)	`config_manager.py`	Project-level YAML config
WorkBuddyClawSource	`memory_bridge.py`	WorkBuddy read-only bridge
Models	`models.py`	Shared data models and type definitions
LLMCacheAsync	`llm_cache_async.py`	Async LLM cache for concurrent workloads
LLMRetryAsync	`llm_retry_async.py`	Async LLM retry with backoff
IntegrationExample	`integration_example.py`	DevSquad integration example code
AsyncIntegrationExample	`async_integration_example.py`	Async DevSquad integration example
AnchorChecker	`anchor_checker.py`	Milestone anchor verification + drift detection + auto-recovery
RetrospectiveEngine	`retrospective.py`	Independent post-dispatch retrospective + pattern extraction + anti-pattern detection
FeatureUsageTracker	`feature_usage_tracker.py`	Feature invocation counter + usage reports + auto-persist
FallbackBackend	`llm_backend.py`	Automatic LLM backend failover with health monitoring

Configuration

Create .devsquad.yaml in your project root:

quality_control:
  enabled: true
  strict_mode: true
  min_quality_score: 85
  ai_quality_control:
    enabled: true
    hallucination_check:
      enabled: true
      require_traceable_references: true
    overconfidence_check:
      enabled: true
      require_alternatives_min: 2
  ai_security_guard:
    enabled: true
    permission_level: "DEFAULT"
  ai_team_collaboration:
    enabled: true
    raci:
      mode: "strict"

llm:
  backend: openai
  base_url: ""  # Set via LLM_BASE_URL env var
  model: ""     # Set via LLM_MODEL env var
  timeout: 120
  log_level: WARNING

Or use environment variables (higher priority):

export DEVSQUAD_LLM_BACKEND=openai
export DEVSQUAD_BASE_URL=https://api.openai.com/v1
export DEVSQUAD_MODEL=gpt-4
export OPENAI_API_KEY=sk-...

Environment Variables

Variable	Purpose	Default
`OPENAI_API_KEY`	OpenAI API key	None (required for OpenAI backend)
`OPENAI_BASE_URL`	OpenAI-compatible base URL	None
`OPENAI_MODEL`	Model name	`gpt-4`
`ANTHROPIC_API_KEY`	Anthropic API key	None (required for Anthropic backend)
`ANTHROPIC_MODEL`	Model name	`claude-sonnet-4-20250514`
`DEVSQUAD_LLM_BACKEND`	Default backend type	`mock`
`DEVSQUAD_LOG_LEVEL`	Logging level	`WARNING`

Running Tests

# Core tests (748+ tests all passing)
python3 -m pytest scripts/collaboration/core_test.py \
  scripts/collaboration/role_mapping_test.py \
  scripts/collaboration/upstream_test.py \
  scripts/collaboration/mce_adapter_test.py \
  tests/ test_v35_integration.py \
  tests/test_anti_rationalization.py \
  tests/test_verification_gate.py \
  tests/test_intent_workflow_mapper.py \
  tests/test_cli_lifecycle.py -v

# Quick smoke test
python3 scripts/cli.py --version    # 3.6.1
python3 scripts/cli.py status       # System ready
python3 scripts/cli.py roles        # List 7 roles

# Lifecycle commands (NEW in v3.4.1)
python3 scripts/cli.py spec -t "User authentication system"
python3 scripts/cli.py build -t "Implement login API"
python3 scripts/cli.py test -t "Run all unit tests"
python3 scripts/cli.py review -t "Check PR #123"
python3 scripts/cli.py ship -t "Deploy to production"

🔄 Upgrade Smoke Test

After upgrading DevSquad, run these commands to verify your environment:

# Quick health check (should complete in < 30s)
python3 scripts/cli.py --version       # Expected: DevSquad 3.6.1
python3 scripts/cli.py status          # Expected: System ready
python3 scripts/cli.py roles           # Expected: 7 core roles listed

# Full test suite
python3 -m pytest tests/ -q --tb=line # Expected: 1662 passed

With Coverage Report

# Install coverage tool first: pip install pytest-cov
python3 -m pytest tests/ --cov=scripts --cov-report=term-missing --cov-fail-under=80
# Expected: coverage ≥ 80%, detailed missing-line report

Test Layering Strategy

DevSquad uses a priority-based test layering strategy:

Priority	Scope	Examples	Count
P0	Quality Framework Core	AntiRationalization (39), VerificationGate (42), IntentWorkflowMapper (58), AuthManager (35)	~200
P1	Enhancement Modules	FiveAxisConsensus (29), OperationClassifier (27), OutputSlicer (26), CIFeedbackAdapter (22)	~150
P1+	Cybernetics (V3.6.1)	FeedbackControlLoop (19), ExecutionGuard (40), PerformanceFingerprint (13), SimilarTaskRecommender (17), AdaptiveRoleSelector (21)	110
P2	Integration & E2E	Full lifecycle dispatch, cross-module integration	~200
P3	Unit per Module	Core dispatcher, RoleMapping, MCEAdapter, LLM backends	~400+

Total: 1662 tests

Run by priority:

# P0 only (critical path, < 10s)
python3 -m pytest tests/ -k "anti_ratif or verification or intent_workflow or auth" -q

# P0 + P1 (quality + enhancement, < 30s)
python3 -m pytest tests/ -k "anti_ratif or verification or intent or auth or five_axis or operation" -q

# Full suite
python3 -m pytest tests/ -q --tb=line

Documentation

Document	Description
QUICK_START_EN.md	Quick start guide (English, 5 minutes)
REFERENCE_GUIDE_EN.md	Complete reference guide (English)
QUICK_START_JP.md	クイックスタートガイド (日本語, 5分)
REFERENCE_GUIDE_JP.md	完全リファレンスガイド (日本語)
GUIDE.md	Complete user guide (Chinese)
GUIDE_EN.md	~~Complete user guide (English)~~ → See QUICK_START + REFERENCE_GUIDE
GUIDE_JP.md	~~完全なユーザーガイド (日本語)~~ → クイックスタート＋リファレンスを参照
INSTALL.md	Installation guide (Unix + Windows)
EXAMPLES.md	Real-world usage examples
SKILL.md	Skill manual (EN/CN/JP)
CLAUDE.md	Claude Code project instructions
CHANGELOG.md	Version history
README-CN.md	中文说明
README-JP.md	日本語説明

🆕 Quick Start (Recommended for New Users)

New to DevSquad? Start here:

# 1. Run the interactive demo (3 scenarios, < 15 seconds)
python examples/quick_demo.py

# 2. Read the quick start guide
# English: docs/i18n/QUICK_START_EN.md
# Japanese: docs/i18n/QUICK_START_JP.md

# 3. Your first dispatch
python3 scripts/cli.py dispatch -t "Design user authentication system"

☸️ Kubernetes Deployment

# Deploy with Helm
helm install devsquad ./helm/devsquad

# Port forward
kubectl port-forward svc/devsquad-api 8000:8000

See helm/devsquad/README.md for full documentation.

Cross-Platform Compatibility

DevSquad is not TRAE-exclusive. It supports 6 integration methods:

Platform	Integration	Setup Difficulty	Key Features Available
TRAE IDE	Native Skill (`skill-manifest.yaml`)	Zero config	Full: Dispatcher + Dashboard + CLI
Claude Code	MCP Server / Python import	Low	6 MCP tools or direct API
Cursor	MCP Server (`stdio` mode)	Low	Same as Claude Code
OpenClaw / WorkBuddy Claw	`WorkBuddyClawSource` bridge	Auto	Read-only memory bridge
Any MCP Client	stdio / SSE dual mode	Low	6 tools, configurable port
Pure Python	`pip install -e .`	Low	CLI + API + Skills + REST
Docker	`docker build & run`	Low	Isolated container with all features

Quick Start per Platform

# === TRAE IDE ===
# Just use it — zero configuration

# === Claude Code / Cursor (MCP) ===
# Add to .claude/mcp.json or .cursor/mcp.json:
# {"mcpServers": {"devsquad": {"command": "python", "args": ["/path/to/mcp_server.py"]}}}

# === Pure Python ===
pip install -e "/path/to/DevSquad[all]"
devsquad dispatch -t "task description"

# === REST API ===
uvicorn scripts.api_server:app --port 8000   # → http://localhost:8000/docs

# === Docker ===
docker build -t devsquad . && docker run -it devsquad dispatch -t "test"

Version History

Date	Version	Highlights
2026-05-17	V3.6.1	Cybernetics Enhancement — 5 new modules (FeedbackControlLoop/ExecutionGuard/PerformanceFingerprint/SimilarTaskRecommender/AdaptiveRoleSelector) with feedback loops, execution guards, TF-IDF similarity search, and adaptive role selection. Inspired by upstream TraeMultiAgentSkill v2.5's cybernetics architecture.
2026-05-16	V3.6.0	Layered Sub-Skill Architecture + Core Modules — 6 atomic sub-skills (dispatch/intent/review/security/test/retrospective) with lazy-loading registry via importlib, each ~50 lines wrapping existing core modules. Plus: AnchorChecker (milestone anchor verification + drift detection), RetrospectiveEngine (independent retrospective + pattern extraction), StructuredGoal (structured goal decomposition + progress tracking), FallbackBackend (automatic LLM failover + health monitoring), FeatureUsageTracker (feature usage tracking + reporting + auto-persistence), 7 module integrations (IntentWorkflowMapper/AISemanticMatcher/DualLayerContextManager/OperationClassifier/SkillRegistry/FiveAxisConsensusEngine/NullProviders), 1662+ tests, 48 core modules. Cross-platform compatibility: Claude Code/Cursor/OpenClaw/Pure Python/Docker/MCP.
2026-05-05	V3.5.0	Enhancement Sprint — Code walkthrough enhancement, documentation consistency checks, Karpathy principles, project understanding (AgentBriefing), CLI lifecycle commands, structured output, 748+ tests
2026-05-03	V3.4.1	Agent Skills Quality Framework (P0) — AntiRationalizationEngine + VerificationGate + IntentWorkflowMapper + CLI Lifecycle Commands (spec/plan/build/test/review/ship) + 167 new tests + Google Agent Skills integration + 49 core modules
2026-05-02	V3.4.0	Foundation Release — Real LLM backend (OpenAI/Anthropic/Mock), ThreadPoolExecutor parallel execution, InputValidator + prompt injection protection, CheckpointManager, WorkflowEngine with 11-phase lifecycle templates (full/backend/frontend/internal_tool/minimal), TaskCompletionChecker, AISemanticMatcher, streaming output, Docker, GitHub Actions CI, config file, CodeMapGenerator, DualLayerContext, SkillRegistry, CarryMem integration, AgentBriefing, ConfidenceScore, EnhancedWorker with auto QA, Protocol interface system, 234+ unit tests, requirement change management with gate mechanism and gap reporting
2026-04-17	V3.2	E2E Demo, MCE Adapter, Dispatcher UX
2026-04-16	V3.0	Complete redesign — Coordinator/Worker/Scratchpad architecture

License

MIT License — see LICENSE for details.

Links

Link	URL
GitHub (This Repo)	https://github.com/lulin70/DevSquad
Original / Upstream	https://github.com/weiransoft/TraeMultiAgentSkill

Name		Name	Last commit message	Last commit date
Latest commit History 185 Commits
.github/workflows		.github/workflows
config		config
docs		docs
examples		examples
helm/devsquad		helm/devsquad
scripts		scripts
skills		skills
templates/concerns		templates/concerns
tests		tests
.devsquad.yaml		.devsquad.yaml
.dockerignore		.dockerignore
.env.example		.env.example
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CLAUDE.md		CLAUDE.md
CONTRIBUTING.md		CONTRIBUTING.md
Dockerfile		Dockerfile
EXAMPLES.md		EXAMPLES.md
GUIDE.md		GUIDE.md
INSTALL.md		INSTALL.md
LICENSE		LICENSE
README-CN.md		README-CN.md
README-JP.md		README-JP.md
README.md		README.md
SKILL.md		SKILL.md
USER_READINESS_ASSESSMENT.md		USER_READINESS_ASSESSMENT.md
pyproject.toml		pyproject.toml
requirements-dev.txt		requirements-dev.txt
requirements.txt		requirements.txt
skill-manifest.yaml		skill-manifest.yaml

Folders and files

Latest commit

History

Repository files navigation

DevSquad — Multi-Role AI Task Orchestrator

🚀 V3.6.1: Cybernetics Enhancement Release

🔄 V3.6.1 Cybernetics Module Details

1️⃣ FeedbackControlLoop (Feedback Controller)

2️⃣ ExecutionGuard (Execution Guardian)

3️⃣ PerformanceFingerprint (Performance Fingerprint)

4️⃣ SimilarTaskRecommender (Similar Task Recommender)

5️⃣ AdaptiveRoleSelector (Adaptive Role Selector)

🔗 Integration Architecture

🎯 Quick Start (4 Ways to Use DevSquad)

0️⃣ First Time? Start Here!

1️⃣ Interactive Web Dashboard (Recommended)

2️⃣ REST API Server

3️⃣ Command Line Interface

🏗️ Architecture Overview

✨ Key Features (V3.6.1)

⚓ AnchorChecker (NEW)

🔄 RetrospectiveEngine (NEW)

📊 FeatureUsageTracker (NEW)

🎯 StructuredGoal (NEW)

🔀 FallbackBackend (NEW)

🔍 VerificationGate — Evidence-Based Quality

🔐 Authentication & Authorization

🌐 REST API (FastAPI)

🔔 Alert Notification System

📊 Historical Data Storage (SQLite)

📈 Visualization & Monitoring

🧩 Layered Sub-Skill Architecture (V3.6.1)

Sub-Skill Quick Reference

Usage Examples

📋 Plan C Architecture (Core Engine)

What is DevSquad?

📦 Installation

Prerequisites

Option A: PyPI Install (Recommended)

Option B: Core Installation (CLI + Dashboard)

Option C: Full Production Stack (All Features)

Verify Installation

3 Ways to Use

7 Core Roles

Architecture Overview (60+ Core Modules)

What's New in V3.6.1 🆕

AnchorChecker System

RetrospectiveEngine

StructuredGoal

FallbackBackend

Natural Language Rule Collection

Key Features

Security

Performance

Reliability

Project Lifecycle (11-Phase Model)

Developer Experience

Module Reference (60+ Modules)

Configuration

Environment Variables

Running Tests

🔄 Upgrade Smoke Test

With Coverage Report

Test Layering Strategy

Documentation

🆕 Quick Start (Recommended for New Users)

☸️ Kubernetes Deployment

Cross-Platform Compatibility

Quick Start per Platform

Version History

License

Links

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Packages