ARIA — Adaptive Reasoning Intelligence Assembly

340-agent adversarial clinical documentation system

17 features x 4 tiers x 5 providers = 340 concurrent agents

Python 3.14 · FastAPI · React 19 · transformers 5.7.0 · Pydantic v2 · SQLite

What It Does

ARIA converts a raw clinical transcript into a validated, structured SOAP note.

It does not use a single model. It runs 340 specialized agents in parallel via asyncio. Each agent has one job. Agents from different model families argue over every extracted fact. A fact advances only if it maps to a span of the source transcript, either verbatim or by high-confidence semantic match. Facts that fail this check are dropped at the P2 tier before any downstream agent sees them.

The final output is a SOAPNote Pydantic model. Every field links to the exact words in the source transcript that produced it.

Architecture

Agent Grid

17 features x 4 tiers x 5 providers = 340 total agents

Dimension	Count	Composition
Total agents	340	17 features x 4 tiers x 5 providers
Agents per provider	68	17 features x 4 tiers
Agents per feature	20	4 tiers x 5 providers
Named system parts	22	layers, modules, tiers, providers, UX

All 340 agents run as concurrent asyncio tasks; agents in later tiers wait on the outputs of earlier tiers. Full deliberation, from raw transcript to validated SOAPNote, is targeted to complete in under 15 seconds.
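The grid arithmetic can be checked by enumerating every (feature, tier, provider) combination. The sketch below is illustrative only: `AgentId`, `build_grid`, and the lowercase provider labels are hypothetical names, and features are represented by index because the counts do not depend on their names.

```python
from itertools import product
from typing import NamedTuple

FEATURE_COUNT = 17
TIERS = ("P1", "P2", "P3", "P4")
PROVIDERS = ("openai", "anthropic", "xai", "google", "cerebras")  # hypothetical labels


class AgentId(NamedTuple):
    """Hypothetical identifier for one cell of the agent grid."""
    feature: int
    tier: str
    provider: str


def build_grid() -> list[AgentId]:
    # One agent per (feature, tier, provider) combination.
    return [AgentId(f, t, p) for f, t, p in product(range(FEATURE_COUNT), TIERS, PROVIDERS)]


grid = build_grid()
per_provider = sum(1 for agent in grid if agent.provider == "openai")
per_feature = sum(1 for agent in grid if agent.feature == 0)
print(len(grid), per_provider, per_feature)  # 340 68 20
```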

Providers

No provider is treated as authoritative. Each fills a distinct role.

Provider	Model	Role	Agents	Env Key
OpenAI	GPT-4o	General clinical inference	68	`OPENAI_API_KEY`
Anthropic	Claude 3.5	Nuance detection, safety reasoning	68	`ANTHROPIC_API_KEY`
xAI	Grok	Adversarial edge-case detection	68	`GROK_API_KEY`
Google AI	Gemini 1.5	Deep medical context and knowledge retrieval	68	`GOOGLE_AI_STUDIO_API_KEY`
Cerebras	Cerebras-native	Low-latency adversarial tier execution	68	`CEREBRAS_API_KEY`

Agent Tiers

Four tiers run per provider, per feature. Each has a fixed behavioral contract. No tier can be skipped or overridden.

P1 — Specialist

Performs first-pass extraction from the raw transcript.
Optimized for high recall over high precision.
Produces the initial structured clinical hypothesis.
System prompt grounded in reasoning patterns from OpenMed/MedDialog via src/operations/distillation.py.

P2 — Attending

Checks every P1 claim against the source transcript.
A claim that cannot be matched to a verbatim or high-confidence semantic span is blocked here. It does not advance.
Can challenge P1 output from any provider, not only its own.
This is the primary fact-filtering gate.

P3 — Chief

Reviews the full P1 and P2 argument record within its provider family.
Resolves conflicting interpretations.
Produces a single consolidated position per provider.
Weights by logical grounding of the argument, not by model confidence score.

P4 — Synthesis

Runs after all five P3 outputs are complete.
Aggregates across all 340 deliberation points.
Produces the final SOAPNote Pydantic model.
Enforces that every ClinicalFact carries a populated source_quote before the model is returned.

Deliberation Protocol

Two axes run simultaneously for every feature.

Vertical — Intra-Provider

Each provider runs its own internal argument sequence before any result leaves that provider team:

P1 [extract] -> P2 [challenge] -> P3 [resolve] -> P4 [converge]

Horizontal — Inter-Provider

At each tier, all five providers see and challenge each other's outputs:

Round 1  P1 x5    All five specialists generate independent analyses in parallel
Round 2  P2 x5    All five attendings attack all P1 outputs across all providers
Round 3  P3 x5    All five chiefs evaluate the full cross-provider P1/P2 argument log
Round 4  P4 x5    Final cross-provider synthesis; all claims verified against source

Provider disagreement is treated as a signal. Convergence is the result of repeated adversarial challenge — not averaging or voting.
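The four rounds above can be sketched as a loop over tiers in which all five providers run concurrently within a round and each agent receives the previous round's outputs from every provider. This is a minimal sketch under stated assumptions: `run_agent` is a hypothetical stand-in for a real provider call, and the returned strings are placeholders rather than the project's actual message format.

```python
import asyncio

PROVIDERS = ("openai", "anthropic", "xai", "google", "cerebras")  # hypothetical labels
TIERS = ("P1", "P2", "P3", "P4")


async def run_agent(tier: str, provider: str, prior: list[str]) -> str:
    """Hypothetical stand-in for one agent; a real agent would call its provider API."""
    await asyncio.sleep(0)  # yield to the event loop, as a network call would
    return f"{tier}:{provider} saw {len(prior)} prior outputs"


async def deliberate() -> dict[str, list[str]]:
    log: dict[str, list[str]] = {}
    prior: list[str] = []
    for tier in TIERS:
        # Horizontal axis: the five providers run this tier concurrently.
        # Each one is handed every provider's output from the previous round.
        outputs = await asyncio.gather(*(run_agent(tier, p, prior) for p in PROVIDERS))
        log[tier] = list(outputs)
        prior = log[tier]
    return log


rounds = asyncio.run(deliberate())
print(rounds["P2"][0])  # P2:openai saw 5 prior outputs
```

Sequencing the tiers while running providers in parallel keeps each round's input complete before the next round challenges it.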

Source Anchoring

Every ClinicalFact carries a required source_quote field. This field holds the verbatim transcript span that supports the fact. It is not optional. Pydantic v2 rejects any ClinicalFact where this field is empty or missing.

The P2 tier enforces this at runtime. If a claim cannot be mapped to the transcript via exact match or high-confidence semantic similarity (computed by the local transformers embedding layer), the fact is dropped silently. It does not reach P3.
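One way to read the gate described above: the quote must appear verbatim in the transcript, and the fact must either equal the quote or be sufficiently similar to it. The sketch below is an assumption-laden illustration, not the project's implementation. `gate`, `embed`, `cosine`, and `SIMILARITY_THRESHOLD` are hypothetical names, the threshold value is invented, and a toy bag-of-words vector stands in for the sentence embeddings the local transformers layer would produce.

```python
import math
from collections import Counter
from typing import Optional

SIMILARITY_THRESHOLD = 0.8  # hypothetical cutoff; the README does not state a value


def embed(text: str) -> Counter:
    """Toy bag-of-words vector standing in for a real sentence embedding."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def gate(fact: str, source_quote: str, transcript: str) -> Optional[str]:
    """Return the anchoring quote if the claim passes, else None (claim dropped)."""
    if not source_quote or source_quote not in transcript:
        return None  # the quote must be a verbatim transcript span
    if fact.strip().lower() == source_quote.strip().lower():
        return source_quote  # exact match
    if cosine(embed(fact), embed(source_quote)) >= SIMILARITY_THRESHOLD:
        return source_quote  # high-confidence similarity match
    return None


transcript = "MD: Any chest pain? Patient: Yes, chest pain for two days."
quote = "chest pain for two days"
kept = gate("chest pain two days", quote, transcript)
dropped = gate("patient reports fever", quote, transcript)
```

Here `kept` holds the quote and `dropped` is `None`, because the second fact shares no tokens with the quote it claims as support.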

Data Models

File: src/data/schemas.py

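from typing import List, Optional

from pydantic import BaseModel, Field


class ClinicalFact(BaseModel):
    category: str
    fact: str
    # Verbatim transcript span. min_length=1 makes Pydantic v2 reject an empty
    # string; a bare `str` annotation would only reject a missing field.
    source_quote: str = Field(min_length=1)
    # In Pydantic v2, Optional[...] without a default is still a required field,
    # so the default is given explicitly.
    timestamp: Optional[str] = None
    confidence: float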

class SOAPNote(BaseModel):
    subjective: List[ClinicalFact]
    objective:  List[ClinicalFact]
    assessment: List[ClinicalFact]
    plan:       List[ClinicalFact]
    agent_deliberation_log: str   # full argument record across all 340 agents

Pydantic v2 enforces this schema at every tier boundary. An invalid model is rejected before it advances.

Local Intelligence Layer

Package: transformers 5.7.0
Location: venv/lib/python3.14/site-packages/transformers/
Execution: Runs entirely inside the venv. No external API call is made.

Three jobs run locally before any payload leaves the machine:

Named Entity Recognition (NER): Tags medical terms (symptoms, durations, medications, procedures) in the transcript stream in real time. Tags are returned as JSON to the frontend to trigger live visual formatting.

PII Scrubbing: Strips patient identifiers from the transcript payload. External provider API calls receive only the clinical-signal text. No patient identifier reaches an external endpoint.

Embedding Generation: Computes sentence embeddings used by the P2 tier to perform semantic similarity matching between a ClinicalFact.fact string and the candidate source_quote span. This is how high-confidence non-exact matches are verified.

The transformers library exposes these three functions through src/features/ner_pipeline.py. The Whisper model inside the same package handles speech-to-text transcription for the live voice input path.
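To make the scrubbing step concrete, the sketch below replaces identifier-like substrings with placeholder labels before text is handed to any external call. It deliberately swaps in regular expressions for the model-based scrubber the README describes, so it shows the shape of the step and not its method. `PII_PATTERNS` and `scrub` are hypothetical names, and the two patterns are illustrative and far from exhaustive.

```python
import re

# Hypothetical patterns for illustration only; the README's scrubber is model-based.
PII_PATTERNS = {
    "MRN": re.compile(r"\bMRN[:\s#]*\d{6,10}\b", re.IGNORECASE),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def scrub(text: str) -> str:
    """Replace each matched identifier with a bracketed label."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


raw = "MRN: 1234567, call 555-123-4567, chest pain x2 days"
clean = scrub(raw)
print(clean)  # [MRN], call [PHONE], chest pain x2 days
```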

Knowledge Substrate

File: src/operations/distillation.py
Dataset: OpenMed/MedDialog on Hugging Face (requires HF_TOKEN)

All 340 agents are initialized with reasoning patterns extracted from MedDialog. The pipeline:

Streams gold-standard clinical dialogues from MedDialog.
Extracts canonical reasoning patterns — for example: resolving conflicting lab values, handling ambiguous symptom clusters, weighing negative findings.
Injects those patterns into the system prompt of every agent at startup via src/core/config.py.

Every agent's baseline reasoning reflects documented clinical logic rather than generic language model priors.
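The injection step in the pipeline above can be pictured as assembling each agent's system prompt from its role plus the distilled patterns. This is a hypothetical sketch: `build_system_prompt` and the prompt wording are invented for illustration, and the pattern strings are taken from the examples listed above rather than from the dataset.

```python
def build_system_prompt(role: str, provider: str, patterns: list[str]) -> str:
    """Assemble a system prompt from a role header and a list of reasoning patterns."""
    header = f"You are the {role} agent for provider {provider}."
    body = "\n".join(f"- {pattern}" for pattern in patterns)
    return f"{header}\nApply these clinical reasoning patterns:\n{body}"


patterns = [
    "Resolve conflicting lab values.",
    "Weigh negative findings.",
]
prompt = build_system_prompt("P2 Attending", "anthropic", patterns)
print(prompt)
```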

Feature Layer

17 independent clinical modules. Each lives in src/features/, except the Glass Box and Clinician Dashboard backends, which live in src/ux/. Each is governed by its own 20-agent team (4 tiers x 5 providers).

Category	Feature	File
Performance	Parallel agent execution via `asyncio`	`parallel_execution.py`
Performance	Sub-15s transcript processing	`sub15s_processing.py`
Performance	Cross-provider output comparison and ranking	`multi_provider_competition.py`
Traceability	Click-to-highlight transcript span mapping	`click_to_highlight.py`
Traceability	`source_quote` verification and storage	`source_quote_mapping.py`
Traceability	SQLAlchemy audit log to SQLite	`sqlite_audit_trail.py`
Safety	Safety gate — blocks all downstream processing	`stop_first_safety.py`
Safety	P2 adversarial critique behavioral logic	`adversarial_critique.py`
Safety	System self-evaluation and reliability scoring	`ai_readiness_audit.py`
Reasoning	Argument weighting and convergence logic	`deliberative_convergence.py`
Reasoning	340 distinct per-agent system prompt definitions	`per_role_prompts.py`
Reasoning	Leadership briefing format and delivery	`leadership_briefings.py`
Transparency	Live deliberation streaming to Glass Box	`realtime_reasoning_display.py`
Transparency	Clinician-facing reasoning explanation	`staff_transparency.py`
Transparency	Glass Box and Clinician Dashboard UX backends	`src/ux/glass_box.py`, `src/ux/dashboard.py`
Data	Raw transcript to `SOAPNote` conversion	`unstructured_to_structured.py`
Data	Tier-boundary Pydantic v2 schema enforcement	`pydantic_validation.py`

Web Stack

Layer	Technology	Version
Backend framework	FastAPI with `asyncio`	latest
ASGI server	Uvicorn	0.46.0
Data validation	Pydantic	v2
Audit storage	SQLite + SQLAlchemy	latest
Local NLP	transformers	5.7.0
Content hashing	xxhash	3.7.0
HTTP client	urllib3	2.6.3
CLI interface	typer	0.25.0
Config parsing	PyYAML	latest
URL handling	yarl	1.23.0
Type support	typing_extensions	4.15.0
Type introspection	typing_inspection	0.4.2
Frontend	React 19 SPA	latest
Frontend build	Vite	see `vite.config.js`
Python runtime	CPython	3.14

Communication Channels

REST (POST): Uploads transcripts. Retrieves final SOAPNote Pydantic models.

WebSocket: Streams live agent arguments from the asyncio task pool to the frontend as they execute. Required for the Glass Box real-time deliberation view. Primary endpoint: /ws/scribe.

Encounter Studio Interface

Three panes. No global scroll. Each pane scrolls independently. Fixed layout: 30% / 30% / 40%.

Every interface element serves a clinical function. The interface renders nothing before data is ready. No placeholders. No skeleton states with dummy content.

Global Patient Banner (Fixed Top)

Patient name (Semibold SF Pro), MRN, age, sex.
Time in ED clock, top right. Turns red after 4 hours.
Five provider status dots. Green when idle. Pulsing when deliberating.
Background: #F5F5F7. Pane surfaces: white at 80% opacity with backdrop-blur.

Left Pane — Live Canvas

File: src/components/LiveCanvas.jsx

Circular record button, bottom center. Three states:
- Idle: microphone icon, no animation.
- Recording: pulsing blue ring. getUserMedia active. Audio streaming to /ws/scribe.
- Processing: shimmer effect. asyncio task pool running.
Transcript renders in real time with diarization labels: MD: in dark gray, Patient: in standard weight.
NER tags from the local transformers layer trigger live inline formatting: symptoms bold in blue, durations bold in green. Tags arrive as JSON over the WebSocket.
"Import Transcript" button: routes a text block directly to src/core/orchestrator.py, bypassing voice and Whisper entirely.

File: src/api/ws_scribe.py

Accepts binary audio chunks.
Routes to Whisper inside transformers.
Yields text blocks to the frontend.
Simultaneously routes text to src/features/ner_pipeline.py for NER tagging.

Center Pane — Clinical Source

File: src/components/ClinicalSource.jsx

Three cards: Chief Complaint (top, high-visibility), Patient Story / HPI, Medical and Family History (two side-by-side).
All cards are hidden on load. No placeholder text is shown.
Each card fades in independently when the P1 Specialist agents for that feature stream a convergence event over the WebSocket.
Fade-in is triggered by a confirmed backend event, not a timer or word count threshold.

Right Pane — Intelligence Matrix

File: src/components/IntelligenceMatrix.jsx

Renders the validated SOAPNote Pydantic model field by field.
Toggle at top: [ SOAP | APSO ]. APSO reorders the display to show Assessment and Plan first. Reorder is immediate — no re-fetch, no full component re-render.
Each rendered fact is a clickable element bound to src/features/click_to_highlight.py.

File: src/components/GlassBox.jsx

Slide-out drawer. Opens only on fact click.
Displays the verbatim source_quote from the ClinicalFact model field.
Displays the one-line conflict resolution log from agent_deliberation_log. Example: Grok challenged OpenAI on MI risk; converged on UA due to negative Trop.
Simultaneously highlights the source span in the Live Canvas via src/features/click_to_highlight.py.
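Because source_quote is a verbatim span, the highlight range can be recovered with a plain substring search. The sketch below assumes character offsets are what the frontend needs; `locate_span` is a hypothetical helper, not a function confirmed to exist in src/features/click_to_highlight.py.

```python
from typing import Optional


def locate_span(transcript: str, source_quote: str) -> Optional[tuple[int, int]]:
    """Return (start, end) character offsets of the quote, or None if absent."""
    if not source_quote:
        return None  # str.find("") returns 0, which would be a false hit
    start = transcript.find(source_quote)
    if start == -1:
        return None
    return start, start + len(source_quote)


transcript = "Patient: chest pain for two days."
span = locate_span(transcript, "chest pain")
print(span)  # (9, 19)
```

Slicing the transcript with the returned offsets yields the quote itself, which gives a cheap round-trip check before highlighting.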

Directory Structure

Adaptive-Reasoning-Intelligence-Assembly/
│
├── run.py                                 # System entry point — launches FastAPI + Vite
├── vite.config.js                         # Vite frontend build configuration
├── .env                                   # API keys — git-ignored, never committed
├── .gitignore                             # Enforces key exclusion on every commit
├── requirements.txt                       # Python dependencies
│
├── src/
│   ├── core/
│   │   ├── config.py                      # Env loading, model config, tier config, prompt injection
│   │   └── orchestrator.py                # Launches and manages the 340-agent asyncio task pool
│   │
│   ├── agents/
│   │   ├── base.py                        # Shared agent interface, lifecycle hooks, logging
│   │   ├── specialist.py                  # P1 — first-pass extraction, high-recall mode
│   │   ├── attending.py                   # P2 — adversarial critique, source-quote gating
│   │   ├── chief.py                       # P3 — intra-provider argument synthesis
│   │   └── synthesis.py                   # P4 — cross-provider convergence, final SOAPNote output
│   │
│   ├── features/
│   │   ├── parallel_execution.py          # asyncio task pool management across 340 agents
│   │   ├── sub15s_processing.py           # Latency tracking and timeout enforcement
│   │   ├── multi_provider_competition.py  # Cross-provider output comparison and ranking
│   │   ├── click_to_highlight.py          # Maps SOAPNote fact coordinates to transcript spans
│   │   ├── source_quote_mapping.py        # Fact-to-transcript verification; drops unverifiable facts
│   │   ├── sqlite_audit_trail.py          # SQLAlchemy models; logs inputs, outputs, agent IDs, versions
│   │   ├── stop_first_safety.py           # Safety gate; blocks all downstream execution if triggered
│   │   ├── adversarial_critique.py        # P2 behavioral constraints and critique protocol
│   │   ├── ai_readiness_audit.py          # System self-scoring on output reliability
│   │   ├── deliberative_convergence.py    # Argument weighting; produces P3/P4 positions
│   │   ├── per_role_prompts.py            # Defines all 340 distinct agent system prompts
│   │   ├── leadership_briefings.py        # Formats executive summaries from SOAPNote output
│   │   ├── realtime_reasoning_display.py  # Streams deliberation events to the Glass Box WebSocket
│   │   ├── staff_transparency.py          # Formats per-fact reasoning for clinician-facing display
│   │   ├── unstructured_to_structured.py  # Converts raw transcript text to SOAPNote schema input
│   │   └── pydantic_validation.py         # Enforces schema at every tier boundary; rejects invalid output
│   │
│   ├── api/
│   │   └── ws_scribe.py                   # WebSocket /ws/scribe — audio -> Whisper -> text stream
│   │
│   ├── data/
│   │   ├── schemas.py                     # Pydantic v2: ClinicalFact, SOAPNote
│   │   └── knowledge_base/                # MedDialog-distilled reasoning patterns; loaded at init
│   │
│   ├── ux/
│   │   ├── glass_box.py                   # Backend for Glass Box deliberation stream
│   │   └── dashboard.py                   # Backend for Clinician Dashboard structured output
│   │
│   └── operations/
│       ├── distillation.py                # MedDialog streaming, pattern extraction, prompt injection
│       └── system_audit.py                # Health checks, latency logging, system maintenance
│
├── tests/
│   ├── test_agents.py                     # P1–P4 behavioral contract tests
│   ├── test_features.py                   # Per-feature output validation
│   ├── test_validation.py                 # Pydantic schema enforcement tests
│   └── test_convergence.py                # Cross-provider convergence stability tests
│
└── venv/                                  # CPython 3.14 virtual environment (git-ignored)
    ├── pyvenv.cfg
    └── lib/
        └── python3.14/
            └── site-packages/
                ├── transformers/                   # v5.7.0
                │   ├── models/                     # whisper/, wav2vec2/, bert/, etc.
                │   ├── pipelines/                  # automatic_speech_recognition, token_classification, etc.
                │   ├── quantizers/                 # bnb, gptq, awq, torchao, etc.
                │   └── utils/                      # logging, hub, import_utils, etc.
                ├── transformers-5.7.0.dist-info/
                ├── uvicorn/                        # v0.46.0
                │   ├── protocols/http/
                │   ├── protocols/websockets/
                │   ├── loops/
                │   ├── middleware/
                │   └── supervisors/
                ├── uvicorn-0.46.0.dist-info/
                ├── typer/                          # v0.25.0
                ├── typer-0.25.0.dist-info/
                ├── xxhash/                         # v3.7.0
                │   └── _xxhash.cpython-314-darwin.so
                ├── xxhash-3.7.0.dist-info/
                ├── urllib3/                        # v2.6.3
                │   ├── contrib/emscripten/
                │   ├── http2/
                │   └── util/
                ├── urllib3-2.6.3.dist-info/
                ├── yaml/                           # PyYAML
                ├── yarl/                           # v1.23.0
                │   └── _quoting_c.cpython-314-darwin.so
                ├── yarl-1.23.0.dist-info/
                ├── typing_extensions.py            # v4.15.0
                ├── typing_extensions-4.15.0.dist-info/
                ├── typing_inspection/              # v0.4.2
                ├── typing_inspection-0.4.2.dist-info/
                └── ...                             # fastapi, pydantic, sqlalchemy, aiohttp, etc.

Setup

Requirements: Python 3.14, Node.js.

# Clone
git clone <repo-url>
cd Adaptive-Reasoning-Intelligence-Assembly

# Python environment
python3.14 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Frontend
npm install

# Module resolution
export PYTHONPATH=$PYTHONPATH:.

# Launch
python3 run.py

.env file (git-ignored, must never be committed):

OPENAI_API_KEY=
ANTHROPIC_API_KEY=
GROK_API_KEY=
GOOGLE_AI_STUDIO_API_KEY=
CEREBRAS_API_KEY=
HF_TOKEN=

Security

API keys must not appear in Git history.
.gitignore excludes .env, so keys are not staged unless the file is force-added.
venv/ is git-ignored and never committed.
The local transformers layer runs PII scrubbing before any payload leaves the machine.
No external API call is made until local NER and scrubbing complete.
If a provider fails mid-run, the remaining four continue. The matrix does not halt.
agent_deliberation_log is preserved on every SOAPNote for audit purposes.
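The provider-failure behavior above can be sketched with `asyncio.gather(..., return_exceptions=True)`, which returns exceptions as results instead of cancelling the remaining calls. This is an illustration under assumptions: `call_provider` and `run_round` are hypothetical names, and the failure is simulated.

```python
import asyncio

PROVIDERS = ("openai", "anthropic", "xai", "google", "cerebras")  # hypothetical labels


async def call_provider(name: str) -> str:
    """Hypothetical provider call; one provider is made to fail for the demo."""
    await asyncio.sleep(0)
    if name == "xai":
        raise ConnectionError(f"{name} unavailable")
    return f"{name}: ok"


async def run_round() -> tuple[dict[str, str], dict[str, str]]:
    # return_exceptions=True keeps one failure from aborting the other calls.
    results = await asyncio.gather(
        *(call_provider(p) for p in PROVIDERS), return_exceptions=True
    )
    succeeded: dict[str, str] = {}
    failed: dict[str, str] = {}
    for name, result in zip(PROVIDERS, results):
        if isinstance(result, Exception):
            failed[name] = repr(result)
        else:
            succeeded[name] = result
    return succeeded, failed


succeeded, failed = asyncio.run(run_round())
print(sorted(succeeded), sorted(failed))
```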

Data Governance

Control	Mechanism
Fact traceability	Every `ClinicalFact` requires a non-empty `source_quote`; Pydantic v2 rejects violations
Agent-level logging	Inputs, outputs, model versions, and agent IDs written to SQLite via SQLAlchemy
Tier-boundary validation	Pydantic v2 validates schema at every tier transition; invalid output is rejected, not passed
PII protection	Local NER strips patient identifiers before any text reaches an external API endpoint
Provider failover	If one provider fails mid-deliberation, remaining four continue the full matrix
Deliberation record	`agent_deliberation_log` string preserved on every `SOAPNote` output for audit
Model versioning	Provider model version strings logged per API call for reproducibility
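The agent-level logging control can be illustrated with a single audit table. The README specifies SQLAlchemy; the sketch below uses the standard-library `sqlite3` module instead so that it runs without extra dependencies. The table name, column names, and the agent ID format are all hypothetical.

```python
import sqlite3


def open_audit_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the audit database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS agent_audit (
               id INTEGER PRIMARY KEY,
               agent_id TEXT NOT NULL,
               model_version TEXT NOT NULL,
               input_text TEXT NOT NULL,
               output_text TEXT NOT NULL
           )"""
    )
    return conn


def log_call(conn: sqlite3.Connection, agent_id: str, model_version: str,
             input_text: str, output_text: str) -> None:
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO agent_audit (agent_id, model_version, input_text, output_text) "
            "VALUES (?, ?, ?, ?)",
            (agent_id, model_version, input_text, output_text),
        )


conn = open_audit_db()
log_call(conn, "f03-P2-openai", "gpt-4o", "scrubbed transcript", "claim blocked")
rows = conn.execute("SELECT agent_id, model_version FROM agent_audit").fetchall()
print(rows)  # [('f03-P2-openai', 'gpt-4o')]
```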

Performance Targets

Metric	Target	Mechanism
Processing time	Under 15 seconds per transcript	All 340 agents run concurrently via `asyncio`
Source verification	Every output fact carries a transcript-backed `source_quote`	P2 gate + Pydantic v2 enforcement
Recall	High — no premature filtering at P1	P1 Specialists optimized for full structured extraction
Convergence stability	Stable across all five providers	Iterative cross-provider argument revision before P4 fires
Provider fault tolerance	System continues with 4 of 5 providers	Failover handled in `src/core/orchestrator.py`
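A processing-time target like the one in the table is commonly enforced with `asyncio.wait_for`, which cancels the awaited work once the budget is spent. The sketch below uses tiny budgets so it finishes quickly. `deliberate` and `run_with_budget` are hypothetical names, and the README does not say how src/features/sub15s_processing.py actually enforces its timeout.

```python
import asyncio


async def deliberate(delay: float) -> str:
    """Hypothetical stand-in for a full deliberation run."""
    await asyncio.sleep(delay)
    return "SOAPNote ready"


async def run_with_budget(delay: float, budget_s: float) -> str:
    try:
        # wait_for cancels the inner coroutine if the budget is exceeded.
        return await asyncio.wait_for(deliberate(delay), timeout=budget_s)
    except asyncio.TimeoutError:
        return "timed out"


fast = asyncio.run(run_with_budget(0.0, 0.5))
slow = asyncio.run(run_with_budget(0.5, 0.01))
print(fast, "|", slow)  # SOAPNote ready | timed out
```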

Tests

pytest tests/ -v

Covers 110 parameters: agent behavioral contracts (P1–P4), per-feature output validation, Pydantic schema boundary enforcement, and cross-provider convergence stability.

AI Builder Verification Checklist

Binary conditions. Each must pass before the module is marked complete.

Layout and Visual

Background is exactly #F5F5F7. Pane surfaces are white at 80% opacity with backdrop-blur.
Layout is 30% / 30% / 40%. Global scroll is disabled. Each pane scrolls independently.
Patient Banner is fixed at the top with translucent styling.
Time in ED clock turns red at exactly 4 hours. Change is driven by elapsed time, not a static flag.
Five provider dots reflect live backend polling: green when idle, pulsing when asyncio tasks are active.

Left Pane — Live Canvas

The circular record button (Mercury button) cycles correctly through all three states with no intermediate stuck states.
Audio chunks stream over WebSocket to /ws/scribe without blocking the main UI thread.
Transcript renders MD: and Patient: labels in real time as text arrives.
NER JSON tags from src/features/ner_pipeline.py trigger live inline formatting: symptoms blue, durations green.
"Import Transcript" bypasses getUserMedia and Whisper; routes text block directly to src/core/orchestrator.py.

Center Pane — Clinical Source

All three cards are hidden on load. No placeholder or skeleton content is visible.
Each card fades in only after the P1 Specialists for that feature stream a convergence event from the backend.
Cards do not appear based on a timer or word count threshold.

Right Pane — Intelligence Matrix

SOAPNote fields render from the Pydantic model, not from mock data.
SOAP/APSO toggle reorders the section array instantly with no re-fetch.
Clicking a ClinicalFact triggers src/features/click_to_highlight.py and opens the Glass Box drawer.
Glass Box displays the verbatim source_quote string from the ClinicalFact model field.
Glass Box displays the one-line conflict resolution entry from agent_deliberation_log.
The corresponding transcript span is highlighted in the Live Canvas simultaneously with the Glass Box opening.

Backend and Integration

Clicking "Stop" invokes src/core/orchestrator.py, which starts all 340 agents.
A deliberation progress indicator updates to reflect the live status of all five provider families.
src/agents/synthesis.py populates the Right Pane once P4 completes.
No external API call is made before src/features/ner_pipeline.py completes PII scrubbing.
Patient View is absent from all frontend routes and all backend handlers.
No external tracking, analytics, or unapproved API endpoints exist in the codebase.
venv/ is in .gitignore and does not appear in Git history.
.env is in .gitignore and does not appear in Git history.
