AidanColvin/Adaptive-Reasoning-Intelligence-Assembly

ARIA — Adaptive Reasoning Intelligence Assembly

340-agent adversarial clinical documentation system

17 features x 4 tiers x 5 providers = 340 concurrent agents

Python 3.14 · FastAPI · React 19 · transformers 5.7.0 · Pydantic v2 · SQLite


What It Does

ARIA converts a raw clinical transcript into a validated, structured SOAP note.

It does not use a single model. It runs 340 specialized agents in parallel via asyncio. Each agent has one job. Agents from different model families argue over every extracted fact. A fact advances only if it maps to a span of the source transcript, either verbatim or by high-confidence semantic match. Facts that fail this check are dropped at the P2 tier before any P3 or P4 agent sees them.

The final output is a SOAPNote Pydantic model. Every field links to the exact words in the source transcript that produced it.


Architecture

Agent Grid

17 features x 4 tiers x 5 providers = 340 total agents
| Dimension | Count | Composition |
| --- | --- | --- |
| Total agents | 340 | 17 features x 4 tiers x 5 providers |
| Agents per provider | 68 | 17 features x 4 tiers |
| Agents per feature | 20 | 4 tiers x 5 providers |
| Named system parts | 22 | layers, modules, tiers, providers, UX |

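All 340 agents are scheduled on one asyncio task pool. Agents within a tier run concurrently; the tiers themselves run in order, P1 through P4. Full deliberation, from raw transcript to validated SOAPNote, is targeted to complete in under 15 seconds.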
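
The grid arithmetic above can be checked with a short sketch. The feature ids below are placeholders (the real modules live in src/features/), and `AgentId` and `build_grid` are hypothetical names used only for illustration.

```python
from itertools import product
from typing import NamedTuple

PROVIDERS = ("openai", "anthropic", "xai", "google", "cerebras")
TIERS = ("P1", "P2", "P3", "P4")
# Placeholder feature ids; the real 17 feature modules live in src/features/.
FEATURES = tuple(f"feature_{i:02d}" for i in range(1, 18))


class AgentId(NamedTuple):
    """Hypothetical coordinate of one agent in the grid."""
    feature: str
    tier: str
    provider: str


def build_grid() -> list[AgentId]:
    # One agent per (feature, tier, provider) combination.
    return [AgentId(f, t, p) for f, t, p in product(FEATURES, TIERS, PROVIDERS)]


grid = build_grid()
per_provider = sum(1 for a in grid if a.provider == "openai")
per_feature = sum(1 for a in grid if a.feature == "feature_01")
print(len(grid), per_provider, per_feature)  # 340 68 20
```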

Providers

No provider is treated as authoritative. Each fills a distinct role.

| Provider | Model | Role | Agents | Env Key |
| --- | --- | --- | --- | --- |
| OpenAI | GPT-4o | General clinical inference | 68 | OPENAI_API_KEY |
| Anthropic | Claude 3.5 | Nuance detection, safety reasoning | 68 | ANTHROPIC_API_KEY |
| xAI | Grok | Adversarial edge-case detection | 68 | GROK_API_KEY |
| Google AI | Gemini 1.5 | Deep medical context and knowledge retrieval | 68 | GOOGLE_AI_STUDIO_API_KEY |
| Cerebras | Cerebras-native | Low-latency adversarial tier execution | 68 | CEREBRAS_API_KEY |

Agent Tiers

Four tiers run per provider, per feature. Each has a fixed behavioral contract. No tier can be skipped or overridden.

P1 — Specialist

  • Performs first-pass extraction from the raw transcript.
  • Optimized for high recall over high precision.
  • Produces the initial structured clinical hypothesis.
  • System prompt grounded in reasoning patterns from OpenMed/MedDialog via src/operations/distillation.py.

P2 — Attending

  • Checks every P1 claim against the source transcript.
  • A claim that cannot be matched to a verbatim or high-confidence semantic span is blocked here. It does not advance.
  • Can challenge P1 output from any provider, not only its own.
  • This is the primary fact-filtering gate.

P3 — Chief

  • Reviews the full P1 and P2 argument record within its provider family.
  • Resolves conflicting interpretations.
  • Produces a single consolidated position per provider.
  • Weights by logical grounding of the argument, not by model confidence score.

P4 — Synthesis

  • Runs after all five P3 outputs are complete.
  • Aggregates across all 340 deliberation points.
  • Produces the final SOAPNote Pydantic model.
  • Enforces that every ClinicalFact carries a populated source_quote before the model is returned.

Deliberation Protocol

Two axes run simultaneously for every feature.

Vertical — Intra-Provider

Each provider runs its own internal argument sequence before any result leaves that provider team:

P1 [extract] -> P2 [challenge] -> P3 [resolve] -> P4 [converge]

Horizontal — Inter-Provider

At each tier, all five providers see and challenge each other's outputs:

Round 1  P1 x5    All five specialists generate independent analyses in parallel
Round 2  P2 x5    All five attendings attack all P1 outputs across all providers
Round 3  P3 x5    All five chiefs evaluate the full cross-provider P1/P2 argument log
Round 4  P4 x5    Final cross-provider synthesis; all claims verified against source

Provider disagreement is treated as a signal. Convergence is the result of repeated adversarial challenge — not averaging or voting.
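
The four rounds can be sketched with asyncio as follows. This is a minimal illustration, not the orchestrator's actual code: `run_agent` is a stub that stands in for a provider API call, and the assumption that each round sees the full record of earlier rounds is taken from the round descriptions above.

```python
import asyncio

PROVIDERS = ("openai", "anthropic", "xai", "google", "cerebras")
TIERS = ("P1", "P2", "P3", "P4")


async def run_agent(provider: str, tier: str, history: tuple[str, ...]) -> str:
    """Stub agent; a real one would call the provider API here."""
    await asyncio.sleep(0)  # yield control, standing in for network I/O
    return f"{tier}/{provider} saw {len(history)} prior outputs"


async def deliberate() -> list[list[str]]:
    rounds: list[list[str]] = []
    history: list[str] = []  # cross-provider argument record so far
    for tier in TIERS:
        snapshot = tuple(history)  # every provider in a round sees the same record
        # All five providers run the same tier concurrently (one round).
        outputs = await asyncio.gather(
            *(run_agent(p, tier, snapshot) for p in PROVIDERS)
        )
        rounds.append(list(outputs))
        history.extend(outputs)  # the next tier can challenge all of these
    return rounds


rounds = asyncio.run(deliberate())
for tier_outputs in rounds:
    print(tier_outputs[0])
```

Within a round the five calls overlap; the `for` loop is what keeps P2 from starting before every P1 output exists.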


Source Anchoring

Every ClinicalFact carries a required source_quote field. This field holds the verbatim transcript span that supports the fact. It is not optional. Pydantic v2 rejects any ClinicalFact where this field is empty or missing.

The P2 tier enforces this at runtime. If a claim cannot be mapped to the transcript via exact match or high-confidence semantic similarity (computed by the local transformers embedding layer), the fact is dropped silently. It does not reach P3.
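
A minimal sketch of this gate, under stated assumptions: the bag-of-words `embed` function is a toy stand-in for the sentence embeddings the local transformers layer would supply, and the 0.8 threshold is hypothetical (the source does not state a cutoff). `gate` is an illustrative name, not the project's API.

```python
import math
import re
from collections import Counter
from typing import Optional

SIMILARITY_THRESHOLD = 0.8  # hypothetical cutoff; not taken from the source


def embed(text: str) -> Counter:
    """Toy bag-of-words vector standing in for a real sentence embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def gate(source_quote: str, transcript: str) -> Optional[str]:
    """Return the supporting transcript span, or None if the claim is blocked."""
    if source_quote and source_quote in transcript:
        return source_quote  # exact verbatim match
    # Otherwise fall back to the most similar transcript sentence.
    sentences = [s.strip() for s in re.split(r"(?<=[.?!])\s+", transcript) if s.strip()]
    quote_vec = embed(source_quote)
    best = max(sentences, key=lambda s: cosine(quote_vec, embed(s)), default=None)
    if best is not None and cosine(quote_vec, embed(best)) >= SIMILARITY_THRESHOLD:
        return best
    return None


transcript = (
    "Patient: I have had chest pain for two days. "
    "MD: Any shortness of breath? Patient: No."
)
exact = gate("chest pain for two days", transcript)
near = gate("I had chest pain for two days", transcript)
blocked = gate("history of diabetes", transcript)
print(exact, near, blocked, sep="\n")
```

When the fallback path succeeds, the function returns the transcript sentence rather than the agent's paraphrase, so the stored source_quote stays verbatim.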

Data Models

File: src/data/schemas.py

from typing import List, Optional

from pydantic import BaseModel, Field

class ClinicalFact(BaseModel):
    category: str
    fact: str
    # Verbatim transcript span. min_length=1 makes Pydantic v2 reject an
    # empty string as well as a missing field.
    source_quote: str = Field(min_length=1)
    # Nullable but still required in Pydantic v2; add "= None" to allow omission.
    timestamp: Optional[str]
    confidence: float

class SOAPNote(BaseModel):
    subjective: List[ClinicalFact]
    objective:  List[ClinicalFact]
    assessment: List[ClinicalFact]
    plan:       List[ClinicalFact]
    agent_deliberation_log: str   # full argument record across all 340 agents

Pydantic v2 enforces this schema at every tier boundary. An invalid model is rejected before it advances.
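
As an illustration of boundary rejection, the sketch below builds a fact three ways. It assumes the non-empty rule is expressed with `Field(min_length=1)`, since a bare `str` annotation accepts an empty string; `try_build` is a hypothetical helper, and `timestamp` is given a default here only to keep the example short.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class ClinicalFact(BaseModel):
    category: str
    fact: str
    source_quote: str = Field(min_length=1)  # assumption: non-empty via min_length
    timestamp: Optional[str] = None
    confidence: float


def try_build(**fields) -> Optional[ClinicalFact]:
    """Return a validated fact, or None when the tier boundary rejects it."""
    try:
        return ClinicalFact(**fields)
    except ValidationError:
        return None


ok = try_build(category="symptom", fact="chest pain",
               source_quote="my chest hurts", confidence=0.9)
empty = try_build(category="symptom", fact="chest pain",
                  source_quote="", confidence=0.9)
missing = try_build(category="symptom", fact="chest pain", confidence=0.9)
print(ok is not None, empty is None, missing is None)
```

Note that `min_length=1` still accepts a whitespace-only string; stripping before validation would need an extra validator.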


Local Intelligence Layer

Package: transformers 5.7.0
Location: venv/lib/python3.14/site-packages/transformers/
Execution: Runs entirely inside the venv. No external API call is made.

Three jobs run locally before any payload leaves the machine:

Named Entity Recognition (NER): Tags medical terms in the transcript stream in real time: symptoms, durations, medications, procedures. Tags are returned as JSON to the frontend to trigger live visual formatting.

PII Scrubbing: Strips patient identifiers from the transcript payload. External provider API calls receive only the clinical-signal text. No patient identifier reaches an external endpoint.

Embedding Generation: Computes sentence embeddings used by the P2 tier to perform semantic similarity matching between a ClinicalFact.fact string and the candidate source_quote span. This is how high-confidence non-exact matches are verified.

These three functions are exposed through src/features/ner_pipeline.py, which wraps the transformers library. The Whisper model from the same package handles speech-to-text transcription for the live voice input path.
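
To make the scrub-before-send step concrete, here is a deliberately simplified sketch. It swaps in regular expressions for the NER model the project describes, so it only shows where scrubbing sits in the flow; the patterns and the `scrub` name are hypothetical and far from a complete identifier list.

```python
import re

# Hypothetical patterns for illustration only. The project describes
# NER-based scrubbing; regular expressions are a stand-in here.
PII_PATTERNS = {
    "MRN": re.compile(r"\bMRN[:#]?\s*\d{6,10}\b", re.IGNORECASE),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
}


def scrub(text: str) -> str:
    """Replace each matched identifier with a bracketed label."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


raw = "MRN: 00123456, callback 919-555-0142, seen 3/14/2025 for chest pain."
clean = scrub(raw)
print(clean)  # [MRN], callback [PHONE], seen [DATE] for chest pain.
```

Only the output of a function like `scrub` would be passed to an external provider call.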


Knowledge Substrate

File: src/operations/distillation.py
Dataset: OpenMed/MedDialog on Hugging Face (requires HF_TOKEN)

All 340 agents are initialized with reasoning patterns extracted from MedDialog. The pipeline:

  1. Streams gold-standard clinical dialogues from MedDialog.
  2. Extracts canonical reasoning patterns — for example: resolving conflicting lab values, handling ambiguous symptom clusters, weighing negative findings.
  3. Injects those patterns into the system prompt of every agent at startup via src/core/config.py.

Every agent's baseline reasoning reflects documented clinical logic rather than generic language model priors.
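
Step 3 might look roughly like the sketch below. The base prompts, the `build_system_prompt` helper, and the prompt layout are all hypothetical; only the three pattern labels come from the examples listed above.

```python
# Hypothetical per-tier base instructions; the real prompts are defined elsewhere.
BASE_PROMPTS = {
    "P1": "Extract every candidate clinical fact from the transcript.",
    "P2": "Challenge each claim and require a supporting transcript span.",
}


def build_system_prompt(tier: str, patterns: list[str]) -> str:
    """Append distilled reasoning patterns to a tier's base instruction."""
    lines = [BASE_PROMPTS[tier], "", "Reasoning patterns:"]
    lines += [f"{i}. {p}" for i, p in enumerate(patterns, start=1)]
    return "\n".join(lines)


patterns = [
    "Resolving conflicting lab values",
    "Handling ambiguous symptom clusters",
    "Weighing negative findings",
]
prompt = build_system_prompt("P2", patterns)
print(prompt)
```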


Feature Layer

17 independent clinical modules. Each lives in src/features/. Each is governed by its own 20-agent team (4 tiers x 5 providers).

| Category | Feature | File |
| --- | --- | --- |
| Performance | Parallel agent execution via asyncio | parallel_execution.py |
| Performance | Sub-15s transcript processing | sub15s_processing.py |
| Performance | Cross-provider output comparison and ranking | multi_provider_competition.py |
| Traceability | Click-to-highlight transcript span mapping | click_to_highlight.py |
| Traceability | source_quote verification and storage | source_quote_mapping.py |
| Traceability | SQLAlchemy audit log to SQLite | sqlite_audit_trail.py |
| Safety | Safety gate; blocks all downstream processing | stop_first_safety.py |
| Safety | P2 adversarial critique behavioral logic | adversarial_critique.py |
| Safety | System self-evaluation and reliability scoring | ai_readiness_audit.py |
| Reasoning | Argument weighting and convergence logic | deliberative_convergence.py |
| Reasoning | 340 distinct per-agent system prompt definitions | per_role_prompts.py |
| Reasoning | Leadership briefing format and delivery | leadership_briefings.py |
| Transparency | Live deliberation streaming to Glass Box | realtime_reasoning_display.py |
| Transparency | Clinician-facing reasoning explanation | staff_transparency.py |
| Transparency | Glass Box and Clinician Dashboard UX backends | src/ux/glass_box.py, src/ux/dashboard.py |
| Data | Raw transcript to SOAPNote conversion | unstructured_to_structured.py |
| Data | Tier-boundary Pydantic v2 schema enforcement | pydantic_validation.py |

Web Stack

| Layer | Technology | Version |
| --- | --- | --- |
| Backend framework | FastAPI with asyncio | latest |
| ASGI server | Uvicorn | 0.46.0 |
| Data validation | Pydantic | v2 |
| Audit storage | SQLite + SQLAlchemy | latest |
| Local NLP | transformers | 5.7.0 |
| Content hashing | xxhash | 3.7.0 |
| HTTP client | urllib3 | 2.6.3 |
| CLI interface | typer | 0.25.0 |
| Config parsing | PyYAML | latest |
| URL handling | yarl | 1.23.0 |
| Type support | typing_extensions | 4.15.0 |
| Type introspection | typing_inspection | 0.4.2 |
| Frontend | React 19 SPA | latest |
| Frontend build | Vite | see vite.config.js |
| Python runtime | CPython | 3.14 |

Communication Channels

REST (POST): Uploads transcripts. Retrieves final SOAPNote Pydantic models.

WebSocket: Streams live agent arguments from the asyncio task pool to the frontend as they execute. Required for the Glass Box real-time deliberation view. Primary endpoint: /ws/scribe.


Encounter Studio Interface

Three panes. No global scroll. Each pane scrolls independently. Fixed layout: 30% / 30% / 40%.

Every interface element serves a clinical function. The interface renders nothing before data is ready. No placeholders. No skeleton states with dummy content.

Global Patient Banner (Fixed Top)

  • Patient name (Semibold SF Pro), MRN, age, sex.
  • Time in ED clock, top right. Turns red after 4 hours.
  • Five provider status dots. Green when idle. Pulsing when deliberating.
  • Background: #F5F5F7. Pane surfaces: white at 80% opacity with backdrop-blur.

Left Pane — Live Canvas

File: src/components/LiveCanvas.jsx

  • Circular record button, bottom center. Three states:
    • Idle: microphone icon, no animation.
    • Recording: pulsing blue ring. getUserMedia active. Audio streaming to /ws/scribe.
    • Processing: shimmer effect. asyncio task pool running.
  • Transcript renders in real time with diarization labels: MD: in dark gray, Patient: in standard weight.
  • NER tags from the local transformers layer trigger live inline formatting: symptoms bold in blue, durations bold in green. Tags arrive as JSON over the WebSocket.
  • "Import Transcript" button: routes a text block directly to src/core/orchestrator.py, bypassing voice and Whisper entirely.

File: src/api/ws_scribe.py

  • Accepts binary audio chunks.
  • Routes to Whisper inside transformers.
  • Yields text blocks to the frontend.
  • Simultaneously routes text to src/features/ner_pipeline.py for NER tagging.

Center Pane — Clinical Source

File: src/components/ClinicalSource.jsx

  • Three cards: Chief Complaint (top, high-visibility), Patient Story / HPI, Medical and Family History (two side-by-side).
  • All cards are hidden on load. No placeholder text is shown.
  • Each card fades in independently when the P1 Specialist agents for that feature stream a convergence event over the WebSocket.
  • Fade-in is triggered by a confirmed backend event, not a timer or word count threshold.

Right Pane — Intelligence Matrix

File: src/components/IntelligenceMatrix.jsx

  • Renders the validated SOAPNote Pydantic model field by field.
  • Toggle at top: [ SOAP | APSO ]. APSO reorders the display to show Assessment and Plan first. Reorder is immediate — no re-fetch, no full component re-render.
  • Each rendered fact is a clickable element bound to src/features/click_to_highlight.py.

File: src/components/GlassBox.jsx

  • Slide-out drawer. Opens only on fact click.
  • Displays the verbatim source_quote from the ClinicalFact model field.
  • Displays the one-line conflict resolution log from agent_deliberation_log. Example: Grok challenged OpenAI on MI risk; converged on UA due to negative Trop.
  • Simultaneously highlights the source span in the Live Canvas via src/features/click_to_highlight.py.
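
The span lookup behind click-to-highlight can be sketched as a plain substring search, assuming the stored source_quote is verbatim. `locate_span` is a hypothetical helper, not the actual contents of src/features/click_to_highlight.py.

```python
from typing import Optional


def locate_span(transcript: str, source_quote: str) -> Optional[tuple[int, int]]:
    """Return (start, end) character offsets of the quote, or None if absent."""
    if not source_quote:
        return None
    start = transcript.find(source_quote)
    if start == -1:
        return None
    return start, start + len(source_quote)


transcript = "MD: What brings you in? Patient: Chest pain since yesterday."
quote = "Chest pain since yesterday"
span = locate_span(transcript, quote)
if span is not None:
    start, end = span
    print(transcript[start:end])  # the text the Live Canvas would highlight
```

`str.find` returns the first occurrence only; a quote that repeats in the transcript would need the stored offset or timestamp to disambiguate.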

Directory Structure

Adaptive-Reasoning-Intelligence-Assembly/
│
├── run.py                                 # System entry point — launches FastAPI + Vite
├── vite.config.js                         # Vite frontend build configuration
├── .env                                   # API keys — git-ignored, never committed
├── .gitignore                             # Enforces key exclusion on every commit
├── requirements.txt                       # Python dependencies
│
├── src/
│   ├── core/
│   │   ├── config.py                      # Env loading, model config, tier config, prompt injection
│   │   └── orchestrator.py                # Launches and manages the 340-agent asyncio task pool
│   │
│   ├── agents/
│   │   ├── base.py                        # Shared agent interface, lifecycle hooks, logging
│   │   ├── specialist.py                  # P1 — first-pass extraction, high-recall mode
│   │   ├── attending.py                   # P2 — adversarial critique, source-quote gating
│   │   ├── chief.py                       # P3 — intra-provider argument synthesis
│   │   └── synthesis.py                   # P4 — cross-provider convergence, final SOAPNote output
│   │
│   ├── features/
│   │   ├── parallel_execution.py          # asyncio task pool management across 340 agents
│   │   ├── sub15s_processing.py           # Latency tracking and timeout enforcement
│   │   ├── multi_provider_competition.py  # Cross-provider output comparison and ranking
│   │   ├── click_to_highlight.py          # Maps SOAPNote fact coordinates to transcript spans
│   │   ├── source_quote_mapping.py        # Fact-to-transcript verification; drops unverifiable facts
│   │   ├── sqlite_audit_trail.py          # SQLAlchemy models; logs inputs, outputs, agent IDs, versions
│   │   ├── stop_first_safety.py           # Safety gate; blocks all downstream execution if triggered
│   │   ├── adversarial_critique.py        # P2 behavioral constraints and critique protocol
│   │   ├── ai_readiness_audit.py          # System self-scoring on output reliability
│   │   ├── deliberative_convergence.py    # Argument weighting; produces P3/P4 positions
│   │   ├── per_role_prompts.py            # Defines all 340 distinct agent system prompts
│   │   ├── leadership_briefings.py        # Formats executive summaries from SOAPNote output
│   │   ├── realtime_reasoning_display.py  # Streams deliberation events to the Glass Box WebSocket
│   │   ├── staff_transparency.py          # Formats per-fact reasoning for clinician-facing display
│   │   ├── unstructured_to_structured.py  # Converts raw transcript text to SOAPNote schema input
│   │   └── pydantic_validation.py         # Enforces schema at every tier boundary; rejects invalid output
│   │
│   ├── api/
│   │   └── ws_scribe.py                   # WebSocket /ws/scribe — audio -> Whisper -> text stream
│   │
│   ├── data/
│   │   ├── schemas.py                     # Pydantic v2: ClinicalFact, SOAPNote
│   │   └── knowledge_base/                # MedDialog-distilled reasoning patterns; loaded at init
│   │
│   ├── ux/
│   │   ├── glass_box.py                   # Backend for Glass Box deliberation stream
│   │   └── dashboard.py                   # Backend for Clinician Dashboard structured output
│   │
│   └── operations/
│       ├── distillation.py                # MedDialog streaming, pattern extraction, prompt injection
│       └── system_audit.py                # Health checks, latency logging, system maintenance
│
├── tests/
│   ├── test_agents.py                     # P1–P4 behavioral contract tests
│   ├── test_features.py                   # Per-feature output validation
│   ├── test_validation.py                 # Pydantic schema enforcement tests
│   └── test_convergence.py                # Cross-provider convergence stability tests
│
└── venv/                                  # CPython 3.14 virtual environment (git-ignored)
    ├── pyvenv.cfg
    └── lib/
        └── python3.14/
            └── site-packages/
                ├── transformers/                   # v5.7.0
                │   ├── models/                     # whisper/, wav2vec2/, bert/, etc.
                │   ├── pipelines/                  # automatic_speech_recognition, token_classification, etc.
                │   ├── quantizers/                 # bnb, gptq, awq, torchao, etc.
                │   └── utils/                      # logging, hub, import_utils, etc.
                ├── transformers-5.7.0.dist-info/
                ├── uvicorn/                        # v0.46.0
                │   ├── protocols/http/
                │   ├── protocols/websockets/
                │   ├── loops/
                │   ├── middleware/
                │   └── supervisors/
                ├── uvicorn-0.46.0.dist-info/
                ├── typer/                          # v0.25.0
                ├── typer-0.25.0.dist-info/
                ├── xxhash/                         # v3.7.0
                │   └── _xxhash.cpython-314-darwin.so
                ├── xxhash-3.7.0.dist-info/
                ├── urllib3/                        # v2.6.3
                │   ├── contrib/emscripten/
                │   ├── http2/
                │   └── util/
                ├── urllib3-2.6.3.dist-info/
                ├── yaml/                           # PyYAML
                ├── yarl/                           # v1.23.0
                │   └── _quoting_c.cpython-314-darwin.so
                ├── yarl-1.23.0.dist-info/
                ├── typing_extensions.py            # v4.15.0
                ├── typing_extensions-4.15.0.dist-info/
                ├── typing_inspection/              # v0.4.2
                ├── typing_inspection-0.4.2.dist-info/
                └── ...                             # fastapi, pydantic, sqlalchemy, aiohttp, etc.

Setup

Requirements: Python 3.14, Node.js.

# Clone
git clone <repo-url>
cd Adaptive-Reasoning-Intelligence-Assembly

# Python environment
python3.14 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Frontend
npm install

# Module resolution
export PYTHONPATH=$PYTHONPATH:.

# Launch
python3 run.py

.env file (git-ignored, must never be committed):

OPENAI_API_KEY=
ANTHROPIC_API_KEY=
GROK_API_KEY=
GOOGLE_AI_STUDIO_API_KEY=
CEREBRAS_API_KEY=
HF_TOKEN=
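
A startup check for these keys could look like the sketch below. `missing_keys` is a hypothetical helper; it assumes the values have already been loaded into the process environment.

```python
import os
from typing import Mapping

REQUIRED_KEYS = (
    "OPENAI_API_KEY",
    "ANTHROPIC_API_KEY",
    "GROK_API_KEY",
    "GOOGLE_AI_STUDIO_API_KEY",
    "CEREBRAS_API_KEY",
    "HF_TOKEN",
)


def missing_keys(env: Mapping[str, str] = os.environ) -> list[str]:
    """List required keys that are unset or blank."""
    return [k for k in REQUIRED_KEYS if not env.get(k, "").strip()]


absent = missing_keys({"OPENAI_API_KEY": "sk-test", "HF_TOKEN": " "})
print(absent)
```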

Security

  • API keys must not appear in Git history.
  • .gitignore keeps .env out of commits. It does not remove keys that were already committed or hard-coded in source files.
  • venv/ is git-ignored and never committed.
  • The local transformers layer runs PII scrubbing before any payload leaves the machine.
  • No external API call is made until local NER and scrubbing complete.
  • If a provider fails mid-run, the remaining four continue. The matrix does not halt.
  • agent_deliberation_log is preserved on every SOAPNote for audit purposes.

Data Governance

| Control | Mechanism |
| --- | --- |
| Fact traceability | Every ClinicalFact requires a non-empty source_quote; Pydantic v2 rejects violations |
| Agent-level logging | Inputs, outputs, model versions, and agent IDs written to SQLite via SQLAlchemy |
| Tier-boundary validation | Pydantic v2 validates schema at every tier transition; invalid output is rejected, not passed |
| PII protection | Local NER strips patient identifiers before any text reaches an external API endpoint |
| Provider failover | If one provider fails mid-deliberation, remaining four continue the full matrix |
| Deliberation record | agent_deliberation_log string preserved on every SOAPNote output for audit |
| Model versioning | Provider model version strings logged per API call for reproducibility |
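
The agent-level logging row can be illustrated with the standard-library sqlite3 module. This is a stand-in: the project uses SQLAlchemy, and the table name, columns, and `log_call` helper below are assumptions made for the example.

```python
import sqlite3

# Hypothetical audit table; column names are assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS agent_audit (
    id            INTEGER PRIMARY KEY AUTOINCREMENT,
    agent_id      TEXT NOT NULL,
    model_version TEXT NOT NULL,
    input_text    TEXT NOT NULL,
    output_text   TEXT NOT NULL,
    created_at    TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
)
"""


def log_call(conn: sqlite3.Connection, agent_id: str, model_version: str,
             input_text: str, output_text: str) -> None:
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO agent_audit "
            "(agent_id, model_version, input_text, output_text) "
            "VALUES (?, ?, ?, ?)",
            (agent_id, model_version, input_text, output_text),
        )


conn = sqlite3.connect(":memory:")  # in-memory database for the example
conn.execute(SCHEMA)
log_call(conn, "feature_01/P2/openai", "example-model-v1",
         "scrubbed transcript", "claim blocked")
rows = conn.execute("SELECT agent_id, model_version FROM agent_audit").fetchall()
print(rows)
```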

Performance Targets

| Metric | Target | Mechanism |
| --- | --- | --- |
| Processing time | Under 15 seconds per transcript | All 340 agents run concurrently via asyncio |
| Source verification | Every output fact carries a transcript-backed source_quote | P2 gate + Pydantic v2 enforcement |
| Recall | High; no premature filtering at P1 | P1 Specialists optimized for full structured extraction |
| Convergence stability | Stable across all five providers | Iterative cross-provider argument revision before P4 fires |
| Provider fault tolerance | System continues with 4 of 5 providers | Failover handled in src/core/orchestrator.py |
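
The processing-time and fault-tolerance targets can be sketched together with asyncio. This is one plausible shape, not the orchestrator's actual logic: `call_provider` is a stub, and the xAI failure is simulated purely to exercise the failover path.

```python
import asyncio

PROVIDERS = ("openai", "anthropic", "xai", "google", "cerebras")
DEADLINE_SECONDS = 15.0  # the sub-15s target from the table above


async def call_provider(name: str) -> str:
    """Stub provider call; 'xai' fails here only to demonstrate failover."""
    await asyncio.sleep(0.01)
    if name == "xai":
        raise ConnectionError(f"{name} unavailable")
    return f"{name}: ok"


async def run_round() -> tuple[list[str], list[str]]:
    # return_exceptions=True keeps one failure from cancelling the others;
    # wait_for bounds the whole round by the deadline.
    results = await asyncio.wait_for(
        asyncio.gather(*(call_provider(p) for p in PROVIDERS), return_exceptions=True),
        timeout=DEADLINE_SECONDS,
    )
    succeeded = [r for r in results if not isinstance(r, BaseException)]
    failed = [p for p, r in zip(PROVIDERS, results) if isinstance(r, BaseException)]
    return succeeded, failed


succeeded, failed = asyncio.run(run_round())
print(succeeded, failed)
```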

Tests

pytest tests/ -v

Covers 110 parameters: agent behavioral contracts (P1–P4), per-feature output validation, Pydantic schema boundary enforcement, and cross-provider convergence stability.


AI Builder Verification Checklist

Binary conditions. Each must pass before the module is marked complete.

Layout and Visual

  • Background is exactly #F5F5F7. Pane surfaces are white at 80% opacity with backdrop-blur.
  • Layout is 30% / 30% / 40%. Global scroll is disabled. Each pane scrolls independently.
  • Patient Banner is fixed at the top with translucent styling.
  • Time in ED clock turns red at exactly 4 hours. Change is driven by elapsed time, not a static flag.
  • Five provider dots reflect live backend polling: green when idle, pulsing when asyncio tasks are active.

Left Pane — Live Canvas

  • The circular record button (the Mercury button) cycles correctly through all three states with no intermediate stuck states.
  • Audio chunks stream over WebSocket to /ws/scribe without blocking the main UI thread.
  • Transcript renders MD: and Patient: labels in real time as text arrives.
  • NER JSON tags from src/features/ner_pipeline.py trigger live inline formatting: symptoms blue, durations green.
  • "Import Transcript" bypasses getUserMedia and Whisper; routes text block directly to src/core/orchestrator.py.

Center Pane — Clinical Source

  • All three cards are hidden on load. No placeholder or skeleton content is visible.
  • Each card fades in only after the P1 Specialists for that feature stream a convergence event from the backend.
  • Cards do not appear based on a timer or word count threshold.

Right Pane — Intelligence Matrix

  • SOAPNote fields render from the Pydantic model, not from mock data.
  • SOAP/APSO toggle reorders the section array instantly with no re-fetch.
  • Clicking a ClinicalFact triggers src/features/click_to_highlight.py and opens the Glass Box drawer.
  • Glass Box displays the verbatim source_quote string from the ClinicalFact model field.
  • Glass Box displays the one-line conflict resolution entry from agent_deliberation_log.
  • The corresponding transcript span is highlighted in the Live Canvas simultaneously with the Glass Box opening.

Backend and Integration

  • Clicking "Stop" triggers src/core/orchestrator.py and all 340 agents begin executing.
  • A deliberation progress indicator updates to reflect the live status of all five provider families.
  • src/agents/synthesis.py populates the Right Pane once P4 completes.
  • No external API call is made before src/features/ner_pipeline.py completes PII scrubbing.
  • Patient View is absent from all frontend routes and all backend handlers.
  • No external tracking, analytics, or unapproved API endpoints exist in the codebase.
  • venv/ is in .gitignore and does not appear in Git history.
  • .env is in .gitignore and does not appear in Git history.

About

An adversarial intelligence assembly that orchestrates 340 concurrent agents across five provider families to transform raw transcripts into validated, source-anchored SOAP notes.
