An adversarial decision framework for systematic fund managers. 45 AI agents that challenge every fund decision before you execute.
Crucible is not an assistant — it is an adversary. Every agent is built to find the reason your trade, strategy, or deployment should not proceed. The framework spans six layers covering every function of an institutional fund: governance, execution, research, intelligence, operations, and specialized roles. It is connected to live market data, persists every decision to a local database, and self-improves through postmortem analysis.
/run-pipeline is the primary entry point. One command executes the full agent stack in the correct order, passes outputs between agents automatically via the context bus, and produces a unified Pipeline Report with a final actionable instruction.
/run-pipeline Long 2% NAV ES futures, trend signal fires on 20-day breakout
/run-pipeline Adding EM FX carry basket, 3% NAV, 6 positions equally weighted
/run-pipeline Roll front-month CL to M+1, current position 1.5% NAV long
The context bus (orchestrator/context-bus.md) is the shared state between all agents. regime-classifier writes REGIME_STATE at Stage 1 — every downstream agent reads it. portfolio-optimizer writes target weights at Stage 5 — rebalancer and order-router consume them at Stage 6.
Stage 0 Bus init — loads portfolio-state.md + risk-limits.md + macro-state.md
Stage 1 regime-classifier → writes REGIME_STATE to bus
Stage 2 compliance (hard gate) → drawdown-monitor (hard gate)
Stage 3 risk-officer + macro-analyst + kalshi-reader [parallel]
Stage 4 signal-researcher [if signal-related submission]
Stage 5 portfolio-optimizer → writes target weights to bus
Stage 6 rebalancer + order-router [parallel]
Stage 7 audit-logger [final gate]
Stage 8 unified Pipeline Report → logged to db/crucible.db
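The hand-off between stages can be pictured with a small Python sketch. This is only an illustration of the idea: in the repository the bus is described in orchestrator/context-bus.md and the agents are persona files, not Python functions. The key names `TARGET_WEIGHTS`, `TRADES`, and `portfolio_state`, the regime label, and the sizing logic are all hypothetical placeholders.

```python
from typing import Any, Callable, Dict, List, Tuple

Bus = Dict[str, Any]
Agent = Callable[[Bus], None]

def regime_classifier(bus: Bus) -> None:
    # Stage 1: publish the regime so every downstream agent can read it.
    # The label is a placeholder, not a real classification.
    bus["REGIME_STATE"] = {"label": "risk-on"}

def portfolio_optimizer(bus: Bus) -> None:
    # Stage 5: read REGIME_STATE and write target weights (hypothetical rule).
    scale = 1.0 if bus["REGIME_STATE"]["label"] == "risk-on" else 0.5
    bus["TARGET_WEIGHTS"] = {"ES": 0.02 * scale}

def rebalancer(bus: Bus) -> None:
    # Stage 6: consume target weights and diff them against current positions.
    current = bus["portfolio_state"]
    target = bus["TARGET_WEIGHTS"]
    bus["TRADES"] = {k: target[k] - current.get(k, 0.0) for k in target}

# Only three of the stages are modelled here, in their documented order.
STAGES: List[Tuple[int, List[Agent]]] = [
    (1, [regime_classifier]),
    (5, [portfolio_optimizer]),
    (6, [rebalancer]),
]

def run_pipeline(initial: Bus) -> Bus:
    bus = dict(initial)  # Stage 0: bus initialised from the context files
    for _stage, agents in STAGES:
        for agent in agents:
            agent(bus)
    return bus

bus = run_pipeline({"portfolio_state": {"ES": 0.0}})
print(bus["TRADES"])  # {'ES': 0.02}
```

The point of the shared dictionary is that no agent calls another directly: each one reads what earlier stages wrote and adds its own keys.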
- HARD HALT — compliance VIOLATION or drawdown HALT: pipeline stops, no execution instructions issued
- SOFT HALT — any other block or warning: pipeline pauses, issue surfaced, override auto-logged
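A minimal sketch of how the two halt levels could be told apart, written in Python for illustration only (the actual rules live in orchestrator/halt-protocol.md). The agent keys follow the stage list above; the set of verdicts treated as passing is an assumption, as is the `classify_halt` helper itself.

```python
from enum import Enum

class Halt(Enum):
    NONE = "none"
    SOFT = "soft"
    HARD = "hard"

# Taken from the halt rules: only these two verdicts stop the pipeline outright.
HARD_TRIGGERS = {("compliance", "VIOLATION"), ("drawdown-monitor", "HALT")}

# Assumption: verdict strings that count as "no issue" for halt purposes.
PASSING_VERDICTS = {"CLEAR", "ALIGNED", "MONITOR", "COMPLETE"}

def classify_halt(agent: str, verdict: str) -> Halt:
    """Map one agent verdict to a halt level (hypothetical helper)."""
    if (agent, verdict) in HARD_TRIGGERS:
        return Halt.HARD
    if verdict in PASSING_VERDICTS:
        return Halt.NONE
    # Any other block or warning pauses the pipeline and surfaces the issue.
    return Halt.SOFT

results = [
    classify_halt("compliance", "VIOLATION"),
    classify_halt("risk-officer", "SOFT FLAG"),
    classify_halt("macro-analyst", "ALIGNED"),
]
print([r.name for r in results])  # ['HARD', 'SOFT', 'NONE']
```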
Every Pipeline Report is written to db/crucible.db (SQLite) on completion. /postmortem reads from it to identify what agents missed. /calibration-report tracks agent accuracy over time. /export-audit compiles 30-day history for LP and regulatory distribution.
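To make the persistence step concrete, here is a hedged sketch using Python's standard `sqlite3` module. The table names `pipeline_runs` and `agent_verdicts` appear in the architecture diagram, but every column name, plus the `init_db`, `log_run`, and `runs_since` helpers, is an assumption for illustration; the real schema is whatever db/init.py creates.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

def init_db(conn: sqlite3.Connection) -> None:
    # Hypothetical schema; column names are not taken from the repository.
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS pipeline_runs (
            id INTEGER PRIMARY KEY,
            submitted_at TEXT NOT NULL,
            submission TEXT NOT NULL,
            final_verdict TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS agent_verdicts (
            run_id INTEGER NOT NULL REFERENCES pipeline_runs(id),
            agent TEXT NOT NULL,
            verdict TEXT NOT NULL
        );
    """)

def log_run(conn, submission, final_verdict, verdicts, when=None):
    """Store one Pipeline Report and its per-agent verdicts."""
    when = when or datetime.now(timezone.utc)
    cur = conn.execute(
        "INSERT INTO pipeline_runs (submitted_at, submission, final_verdict) "
        "VALUES (?, ?, ?)",
        (when.isoformat(), submission, final_verdict),
    )
    run_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO agent_verdicts (run_id, agent, verdict) VALUES (?, ?, ?)",
        [(run_id, agent, verdict) for agent, verdict in verdicts.items()],
    )
    conn.commit()
    return run_id

def runs_since(conn, days=30, now=None):
    """Return runs from the trailing window, oldest first."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=days)).isoformat()
    return conn.execute(
        "SELECT id, submission, final_verdict FROM pipeline_runs "
        "WHERE submitted_at >= ? ORDER BY submitted_at",
        (cutoff,),
    ).fetchall()

conn = sqlite3.connect(":memory:")  # the project itself uses db/crucible.db
init_db(conn)
now = datetime(2024, 6, 30, tzinfo=timezone.utc)
log_run(conn, "old trade", "GO", {"risk-officer": "CLEAR"},
        when=now - timedelta(days=45))
log_run(conn, "Long 2% NAV ES futures", "CONDITIONAL GO",
        {"risk-officer": "SOFT FLAG", "compliance": "CLEAR"}, when=now)
recent = runs_since(conn, 30, now=now)
print(recent)  # [(2, 'Long 2% NAV ES futures', 'CONDITIONAL GO')]
```

Storing timestamps as UTC ISO-8601 strings keeps the 30-day filter a plain string comparison, which is the kind of query an export such as /export-audit would need.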
scripts/update-context.py runs daily at 6:30am and writes real data to context files before each session:
- FRED: 20 core macro series → context/macro-state.md with auto-computed regime signal summary
- IBKR: live positions and account summary → context/portfolio-state.md
- Kalshi: prediction market probabilities → context/kalshi-state.md
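As a rough picture of the "fetch, then write a context file" step, the sketch below renders a dictionary of series values into a markdown table and writes it under a context/ directory. It does not call FRED, IBKR, or Kalshi. The file layout and the `render_macro_state` and `write_context` helpers are assumptions, and the numbers are placeholders rather than real observations. DGS10 and UNRATE are genuine FRED series IDs, though whether they are among the 20 series the script pulls is not stated here.

```python
import tempfile
from datetime import date
from pathlib import Path

def render_macro_state(series: dict, as_of: date) -> str:
    """Render latest series values as a small markdown table (hypothetical layout)."""
    lines = [
        f"# Macro state (as of {as_of.isoformat()})",
        "",
        "| Series | Latest value |",
        "|---|---|",
    ]
    for series_id in sorted(series):
        lines.append(f"| {series_id} | {series[series_id]:.2f} |")
    return "\n".join(lines) + "\n"

def write_context(root: Path, name: str, body: str) -> Path:
    """Write one context file under <root>/context/, creating the directory."""
    path = root / "context" / name
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(body, encoding="utf-8")
    return path

# A temporary directory stands in for the repository root.
with tempfile.TemporaryDirectory() as tmp:
    body = render_macro_state({"DGS10": 4.25, "UNRATE": 3.9}, date(2024, 6, 28))
    path = write_context(Path(tmp), "macro-state.md", body)
    written = path.read_text(encoding="utf-8")

print(written)
```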
┌─────────────────────────────────────────────────────────┐
│ LAYER 5 — GOVERNANCE │
│ risk-officer · signal-researcher · systems-architect │
│ compliance-officer · macro-analyst │
│ /crucible · /run-pipeline │
├─────────────────────────────────────────────────────────┤
│ LAYER 4 — EXECUTION │
│ order-router · slippage-monitor · position-reconciler │
│ rebalancer · roll-manager │
├─────────────────────────────────────────────────────────┤
│ LAYER 3 — RESEARCH │
│ signal-generator · backtest-designer · correlation-mapper│
│ capacity-estimator · decay-tracker · portfolio-optimizer│
├─────────────────────────────────────────────────────────┤
│ LAYER 2 — INTELLIGENCE │
│ macro-scanner · regime-classifier · flow-analyst │
│ sentiment-tracker · earnings-watcher · kalshi-reader │
├─────────────────────────────────────────────────────────┤
│ LAYER 1 — OPERATIONS │
│ drawdown-monitor · vendor-monitor · audit-logger │
│ cash-manager · nav-calculator · lp-reporter · tax-tracker│
├─────────────────────────────────────────────────────────┤
│ LAYER 0 — HUMAN ROLES (core) │
│ quant-researcher · infrastructure-auditor · fund-accountant│
│ chief-risk-officer · investor-relations · general-counsel│
│ head-of-trading │
├─────────────────────────────────────────────────────────┤
│ LAYER 0 — DEEP FUND FUNCTIONS │
│ alternative-data-analyst · derivatives-desk │
│ securities-lending · factor-attribution │
│ capital-allocator · counterparty-risk │
│ esg-analyst · business-development │
├─────────────────────────────────────────────────────────┤
│ ORCHESTRATION │
│ Context bus · Execution graph · Halt protocol │
├─────────────────────────────────────────────────────────┤
│ PERSISTENCE │
│ SQLite db/ · pipeline_runs · nav_snapshots · agent_verdicts│
├─────────────────────────────────────────────────────────┤
│ LIVE DATA PIPELINE │
│ FRED (20 series) · IBKR (positions) · Kalshi · Norgate │
└─────────────────────────────────────────────────────────┘
| Agent | Command | Function | Key Output |
|---|---|---|---|
| Risk Officer | /risk | Position sizing, VaR, drawdown, tail risk | HARD BLOCK / SOFT FLAG / CLEAR |
| Signal Researcher | /signal | Statistical validity, overfitting, look-ahead, regime decomposition | BLOCK / CONDITIONAL / CLEAR |
| Systems Architect | /systems | Execution infrastructure, data pipelines, deployment readiness | DEPLOY BLOCKER / PRE-DEPLOY REQUIRED / MONITOR |
| Compliance Officer | /compliance | Mandate, regulatory limits, LP alignment, audit trail | VIOLATION / WARNING / CLEAR |
| Macro Analyst | /macro | Regime, cross-asset consistency, thesis falsifiability, crowding | OPPOSED / CONDITIONAL / ALIGNED |
| Agent | Command | Function | Key Output |
|---|---|---|---|
| Order Router | /order-router | Venue, timing, order type, slippage budget | ROUTE / OUTSIZED ORDER |
| Slippage Monitor | /slippage-monitor | Fill quality vs. model, broker attribution | ACCEPTABLE / ELEVATED / INVESTIGATE |
| Position Reconciler | /position-reconciler | Broker vs. OMS vs. signal three-way reconciliation | CLEAN / BREAKS DETECTED |
| Rebalancer | /rebalancer | Optimal rebalance trades, transaction cost vs. Sharpe benefit | REBALANCE / PARTIAL / UNECONOMIC |
| Roll Manager | /roll-manager | Futures expiry calendar, roll cost, timing | ROLL SCHEDULED / URGENT ROLL |
| Agent | Command | Function | Key Output |
|---|---|---|---|
| Signal Generator | /signal-generator | Alpha hypothesis generation from regime context | HYPOTHESIS — NOT VALIDATED |
| Backtest Designer | /backtest-designer | Rigorous backtest specification before code is written | BACKTEST SPEC |
| Correlation Mapper | /correlation-mapper | Factor exposure of new signals vs. existing portfolio | ADDITIVE / REDUNDANT |
| Capacity Estimator | /capacity-estimator | AUM ceiling before signal decay | CAPACITY CEILING / CAPACITY CONSTRAINED |
| Decay Tracker | /decay-tracker | Live Sharpe vs. backtest baseline, decay curve fitting | HEALTHY / DEGRADING / FAILED |
| Portfolio Optimizer | /portfolio-optimizer | Risk parity / mean-variance sizing, binding constraint | REBALANCE TRIGGER / URGENT |
| Agent | Command | Function | Key Output |
|---|---|---|---|
| Macro Scanner | /macro-scanner | Daily FRED/Kalshi/news digest, regime shift detection | Daily brief |
| Regime Classifier | /regime-classifier | Four-dimension continuous regime state machine | REGIME_STATE {} block |
| Flow Analyst | /flow-analyst | COT positioning, crowding, squeeze scenarios | CROWDED / ELEVATED / NEUTRAL / CONTRARIAN |
| Sentiment Tracker | /sentiment-tracker | FinBERT news sentiment, price-sentiment divergence | NARRATIVE SHIFT / DISTRIBUTION / ACCUMULATION |
| Earnings Watcher | /earnings-watcher | Index constituent earnings risk, implied vs. historical move | HIGH / MODERATE / LOW |
| Kalshi Reader | /kalshi-reader | Prediction market signal extraction, consensus divergence | DIVERGENCE FLAG / signal strength ranking |
| Agent | Command | Function | Key Output |
|---|---|---|---|
| Drawdown Monitor | /drawdown-monitor | Circuit breaker, velocity-based HALT override | MONITOR / WARN / SUSPEND / HALT |
| Vendor Monitor | /vendor-monitor | Data feed health, silent failure detection | HEALTHY / DEGRADED / STALE / FAILED |
| Audit Logger | /audit-logger | Pre-trade rationale completeness enforcer | COMPLETE / INCOMPLETE |
| Cash Manager | /cash-manager | Margin utilization, cash drag, runway | WARNING 70% / CRITICAL 90% |
| NAV Calculator | /nav-calculator | Daily NAV with price source verification | VERIFIED / UNVERIFIED |
| LP Reporter | /lp-reporter | Monthly/quarterly LP letter drafting | DRAFT — REVIEW REQUIRED |
| Tax Tracker | /tax-tracker | Wash sale, harvest opportunities, after-tax return | HARVEST NOW / WASH SALE RISK / CLEAN |
| Agent | Command | Function | Key Output |
|---|---|---|---|
| Quant Researcher | /quant-researcher | Distributional assumptions, parameter stability, deflated Sharpe | VALIDATED / CONDITIONAL / INVALID |
| Infrastructure Auditor | /infrastructure-auditor | Race conditions, idempotency, dependency integrity | PRODUCTION READY / CONDITIONAL / NOT READY |
| Fund Accountant | /fund-accountant | P&L attribution, fee calculation, financial statements | AUDIT READY / REQUIRES REMEDIATION |
| Chief Risk Officer | /chief-risk-officer | Portfolio VaR, crisis stress tests, board risk report | RED / AMBER / GREEN |
| Investor Relations | /investor-relations | LP DDQ simulation, quarterly call prep, raise readiness | IR READY / CONDITIONAL / NOT READY |
| General Counsel | /general-counsel | Regulatory status, horizon scanning, trade legal risk | LEGAL CLEAR / REVIEW REQUIRED / LEGAL HOLD |
| Head of Trading | /head-of-trading | Broker scorecard, commission audit, prime broker fit | OPTIMIZED / REVIEW REQUIRED / RESTRUCTURE |
| Agent | Command | Function | Key Output |
|---|---|---|---|
| Alternative Data Analyst | /alternative-data-analyst | Data tier classification, half-life, legality, cost ROI | DIFFERENTIATED EDGE / COMMODITIZED / LEGAL REVIEW |
| Derivatives Desk | /derivatives-desk | Options overlay, hedge ratio, Greeks, expiration risk | HEDGE APPROVED / OVERPRICED / HARD BLOCK |
| Securities Lending | /securities-lending | Borrow cost, locate risk, recall, short squeeze scenario | SHORT APPROVED / UNECONOMIC / RECALL IMMINENT |
| Factor Attribution | /factor-attribution | Return decomposition, factor drift, replication test | ALPHA CONFIRMED / ALPHA ILLUSION / FACTOR DRIFT |
| Capital Allocator | /capital-allocator | Risk budget across strategies, pod P&L, capacity planning | OPTIMALLY ALLOCATED / MISALLOCATED / SHUTDOWN |
| Counterparty Risk | /counterparty-risk | Prime broker exposure, CVA, settlement, PB failure scenario | CLEAN / CONCENTRATION WARNING / HARD BLOCK |
| ESG Analyst | /esg-analyst | Exclusion screening, carbon footprint, LP ESG compatibility | ESG COMPLIANT / EXCLUSION BREACH / INCOMPATIBLE |
| Business Development | /business-development | LP pipeline, raise readiness, pitch audit, emerging programs | PIPELINE HEALTHY / RAISE READY / NOT READY |
| Command | What It Does |
|---|---|
| /setup | Interactive fund onboarding wizard — populates all context files, generates legal and broker checklists |
| /run-pipeline | Full 8-stage orchestrated review — primary entry point for all decisions |
| /crucible | Five-agent governance panel — GO / CONDITIONAL GO / NO-GO verdict |
| /stress-test | Portfolio through GFC 2008, COVID 2020, 2022 Rate Shock, 1994, LTCM 1998 simultaneously |
| /postmortem | Feed losing trade outcomes back through agents — identifies misses, improves thresholds |
| /calibration-report | Agent accuracy over time — false positive/negative rates, health scores, recalibration recommendations |
| /export-audit | 30-day Pipeline Report history compiled for LP and regulatory distribution |
| /debate | Bull and bear case for a trade — forces steelmanning before commitment, no verdict |
Use /run-pipeline — the orchestrator runs the full stack in the correct order automatically.
/macro-scanner # regime and macro digest
/regime-classifier # updates REGIME_STATE for the session
/event-calendar # upcoming risk events in the next 30 days
/alternative-data-analyst # is the data differentiated?
/signal-generator # hypothesis formation
/signal-researcher # statistical validation
/backtest-designer # rigorous backtest spec
/correlation-mapper # factor exposure vs. existing portfolio
/capacity-estimator # AUM ceiling
/drawdown-monitor # current severity and velocity
/chief-risk-officer # portfolio-wide risk assessment
/stress-test # how bad can it get across crisis scenarios
/factor-attribution # what factor exposure is driving the loss
/capital-allocator # should risk budget be reallocated
/quant-researcher # model validity
/infrastructure-auditor # code quality and deployment readiness
/run-pipeline # full pre-deployment review
/nav-calculator # verified NAV
/fund-accountant # P&L attribution and fee calculation
/tax-tracker # harvest opportunities, wash sale check
/lp-reporter # draft LP letter
/export-audit # 30-day audit trail for records
/general-counsel # any disclosure obligations this period
Option A — GitHub Codespace (recommended, no local install)
Click the badge at the top of this page. A pre-configured environment opens in your browser with all dependencies installed. Then:
cp .env.template .env # add your ANTHROPIC_API_KEY and FRED_API_KEY
claude # open Claude Code
/setup # configure your fund
Option B — Local
git clone https://github.com/arpjw/crucible-cio-team.git
cd crucible-cio-team
pip install -r requirements.txt
npm install -g @anthropic-ai/claude-code
cp .env.template .env # add API keys
python db/init.py # initialize SQLite database
python scripts/verify-fred.py # confirm live data connection
claude
/setup
After /setup, read PLAYBOOK.md for the day-zero to day-100 operational guide.
| Verdict | Condition |
|---|---|
| NO-GO | Any HARD BLOCK (Risk), VIOLATION (Compliance), DEPLOY BLOCKER (Systems), or EXCLUSION LIST BREACH (ESG) |
| CONDITIONAL GO | Any soft flags, warnings, or Macro OPPOSED without a hard block |
| GO | All agents clear — includes mandatory post-execution monitoring |
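The verdict table can be read as a simple precedence rule: a hard block wins, then any remaining non-clear verdict, then GO. Below is a minimal Python sketch of that reading. The `crucible_verdict` function and the agent keys are illustrative, the set of verdicts treated as clear is an assumption, and both "EXCLUSION LIST BREACH" and "EXCLUSION BREACH" are accepted because this README uses both spellings for the ESG outcome.

```python
# Verdicts that force NO-GO, per the table above.
HARD_BLOCKS = {
    "risk-officer": {"HARD BLOCK"},
    "compliance-officer": {"VIOLATION"},
    "systems-architect": {"DEPLOY BLOCKER"},
    "esg-analyst": {"EXCLUSION LIST BREACH", "EXCLUSION BREACH"},
}

# Assumption: verdict strings that count as an agent being clear.
CLEAR_VERDICTS = {"CLEAR", "ALIGNED", "MONITOR", "ESG COMPLIANT"}

def crucible_verdict(verdicts: dict) -> str:
    """Combine per-agent verdicts into GO / CONDITIONAL GO / NO-GO."""
    for agent, verdict in verdicts.items():
        if verdict in HARD_BLOCKS.get(agent, set()):
            return "NO-GO"
    if any(v not in CLEAR_VERDICTS for v in verdicts.values()):
        # Soft flags, warnings, or Macro OPPOSED without a hard block.
        return "CONDITIONAL GO"
    return "GO"

panel = {
    "risk-officer": "CLEAR",
    "signal-researcher": "CLEAR",
    "systems-architect": "MONITOR",
    "compliance-officer": "CLEAR",
    "macro-analyst": "OPPOSED",
}
print(crucible_verdict(panel))  # CONDITIONAL GO
```

Under this reading, Macro OPPOSED on its own never produces NO-GO; it only downgrades an otherwise clean panel to CONDITIONAL GO.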
crucible-cio-team/
├── agents/ (45 agent persona files)
├── .claude/commands/ (53 slash commands)
├── orchestrator/ (pipeline.md, context-bus.md, halt-protocol.md)
├── db/ (init.py, query.py, README.md — SQLite persistence)
├── scripts/ (update-context.py, sync-ibkr.py, verify-fred.py, verify-ibkr.py)
├── context/ (fund-mandate.md, risk-limits.md, portfolio-state.md, macro-state.md)
├── legal/ (Delaware LLC checklist, NFA guide, Series 65 plan, templates, DDQ)
├── fundraising/ (pitch deck outline, LP letter templates, targeting guide)
├── infrastructure/ (IBKR, FRED, Norgate, Kalshi setup guides, Docker, environment)
├── strategies/ (TSMOM, Carry, Macro Discretionary starter packs)
├── tests/ (manual testing guide and scenarios)
├── .devcontainer/ (Codespace config and Dockerfile)
├── AGENTS.md (complete agent registry)
├── PLAYBOOK.md (day-zero to running fund operational guide)
├── CONTRIBUTING.md (contribution guide and quality bar)
└── CLAUDE.md (Claude Code entry point)
Crucible is open for contributions. The highest-value additions are agents for strategies not yet covered: options, crypto, equity long/short, fixed income credit, family office structure.
See CONTRIBUTING.md for the quality bar, issue templates, and PR checklist. Open an Agent Proposal issue before building — the proposal template requires named formulas and escalation thresholds before any code is written.
GitHub Discussions is enabled for ideas, Q&A, and Show and Tell.
MIT. Use it, fork it, adapt it to your fund.
Built with Claude Code. The most important question before any trade is not "why should I do this" — it is "what would make this wrong."