AI Engineering Bootcamp Capstone Project
A professional-grade, Human-in-the-Loop (HITL) multi-agent intelligence platform that gathers, ingests, and analyzes Node.js and Python code repositories and generates structured insights, using the Model Context Protocol (MCP), OpenAI's Agent Swarm patterns, and a real-time Director Console UI.
Developers struggle to understand unfamiliar codebases quickly. Documentation is often outdated, onboarding is slow, and identifying technical debt, security risks, or architectural drift requires significant manual effort — especially across evolving repos.
Ouroboros RIS is a multi-agent system designed to analyze software repositories and generate structured insights. Given a repository path, the system scans the codebase, builds a searchable representation, and produces reports on structure, dependencies, and technical debt.
The goal is to reduce the time engineers spend understanding unfamiliar codebases by providing automated, context-aware analysis.
flowchart TD
%% User Input
User["User Input (CLI / UI)"]
%% Core Flow
Coordinator["Coordinator Agent\n(Workflow Orchestration)"]
Scout["Scout Agent\n(Repo Parsing + Embeddings)"]
VectorDB[("Vector Store\nChromaDB")]
Analyst["Analyst Agent\n(RAG + Code Analysis)"]
%% Tools
subgraph MCP["MCP Tools"]
direction TB
Topology["repo_topology_mapper"]
Efficiency["efficiency_analyzer"]
end
%% Output
Report["Final Report"]
%% Flow Connections
User --> Coordinator
Coordinator --> Scout
Scout --> VectorDB
VectorDB --> Analyst
Analyst --> MCP
MCP --> Analyst
Analyst --> Report
%% Styling
classDef main fill:#3b82f6,stroke:#2563eb,color:#fff;
classDef storage fill:#10b981,stroke:#059669,color:#fff;
classDef tools fill:#f59e0b,stroke:#d97706,color:#fff;
classDef output fill:#8b5cf6,stroke:#7c3aed,color:#fff;
class User,Coordinator,Scout,Analyst main;
class VectorDB storage;
class Topology,Efficiency tools;
class Report output;
We use uv as a fast, modern package manager to orchestrate the environment securely.
- Multi-Agent Swarm (`openai` SDK): Native Agent Handoff / Swarm pattern connecting through OpenRouter, orchestrated by stateful `sqlite3` — no heavyweight LangGraph overhead.
- MCP Integration (`fastmcp`): A Model Context Protocol server (`mcp_integration/server.py`) exposing `efficiency_analyzer` and `repo_topology_mapper`. The Analyst Agent acts as an MCP Client and invokes these tools on every run.
- Guardrails & Web Search (`serper`): The SWE Analyst Agent is equipped with live web search to identify CVEs and gather dependency intelligence. Input validation restricts repos to ≤250 files to preserve LLM token budgets.
- State Management (`sqlite3` & `chromadb`): SQLite tracks workflow tasks, UI cards, and markdown diff states between runs. ChromaDB stores architectural embeddings as a persistent RAG knowledge base — only updated when diffs are detected.
- Real-time UI (`fastapi` / `sse-starlette`):
  - `/stream` — streams high-level agent event cards to the Director Console.
  - `/log-stream` — streams raw Python logging output to the live terminal panel.
- Pushover — fire-and-forget mobile push alerts for critical events (Dev Mode access, intrusion attempts).
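The 250-file input guardrail mentioned above could look roughly like the following. This is a minimal sketch, not the project's actual `guardrails/repo_validator.py`: the names `validate_repo`, `count_repo_files`, `MAX_FILES`, and `IGNORED_DIRS` are hypothetical, and it assumes that ignored directories (including `.git`, which this README does not mention) are excluded from the count.

```python
import os

MAX_FILES = 250  # limit stated in this README
# Assumed ignore list; ".git" is an addition for illustration.
IGNORED_DIRS = {"node_modules", ".venv", "__pycache__", ".git"}


def count_repo_files(repo_path: str) -> int:
    """Count files under repo_path, skipping ignored directories."""
    total = 0
    for _root, dirs, files in os.walk(repo_path):
        # Prune in place so os.walk does not descend into ignored directories.
        dirs[:] = [d for d in dirs if d not in IGNORED_DIRS]
        total += len(files)
    return total


def validate_repo(repo_path: str, max_files: int = MAX_FILES) -> int:
    """Return the file count, or raise ValueError if the repo is rejected."""
    if not os.path.isdir(repo_path):
        raise ValueError(f"Repository path does not exist: {repo_path}")
    count = count_repo_files(repo_path)
    if count > max_files:
        raise ValueError(f"Repository has {count} files; limit is {max_files}")
    return count
```

Rejecting oversized repositories before any agent starts keeps the token cost of a run bounded and predictable.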
- Python 3.10+ / FastAPI — async backend and dual-channel SSE streaming
- OpenRouter (OpenAI Models) — `gpt-4o` for reporting, `gpt-4o-mini` for SWE chat
- FastMCP (stdio subprocess) — dynamic async tooling via MCP protocol
- SQLite Tracker — long-running lifecycle and diff management
- ChromaDB / NLTK — topological file embeddings and RAG context retrieval
- Pushover — background-threaded mobile alerting (fire-and-forget, never blocks the event loop)
- Glassmorphism Vanilla UI — zero-build premium dashboard (dark mode, animations, two-column layout)
The system runs 4 agents in a strict conditional sequential handoff:
Boots the pipeline, manages phase transitions, emits SSE events with UTC timestamps, and maintains a background poll loop that periodically re-broadcasts active task status to keep the UI alive.
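To illustrate what an SSE event with a UTC timestamp might look like on the wire, here is a small standard-library sketch. The function name `format_sse_event` and the JSON field names (`type`, `agent`, `timestamp`, `payload`) are assumptions for illustration; the real emitter and its card schema may differ.

```python
import json
from datetime import datetime, timezone


def format_sse_event(card_type: str, agent: str, payload: dict) -> str:
    """Serialise one agent card as a Server-Sent Events frame."""
    body = {
        "type": card_type,
        "agent": agent,
        # Timezone-aware UTC timestamp in ISO 8601 form.
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "payload": payload,
    }
    # An SSE frame is a set of "field: value" lines ended by a blank line.
    return f"event: {card_type}\ndata: {json.dumps(body)}\n\n"


if __name__ == "__main__":
    print(format_sse_event("HEARTBEAT", "Governor", {"task_id": 1}), end="")
```

Because `json.dumps` emits no newlines by default, the whole card fits on a single `data:` line, which keeps client-side parsing simple.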
Recursively scans the repo, ignoring bloat (node_modules, .venv, __pycache__, etc.). Categorises files into four structural layers:
- Skeleton — complete directory tree (rendered as interactive file tree in UI)
- Brain — dependency/config files (`requirements.txt`, `package.json`, `pyproject.toml`)
- Instructions — `README.md`, `.env.example`
- Entry Points — primary execution paths (`main.py`, `app.py`, `index.js`)
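The four-layer categorisation above can be sketched with `os.walk`. This is an illustrative approximation rather than the actual `gatherer_agent.py`: `categorise_repo` and the constant names are hypothetical, and the file sets contain only the examples this README lists.

```python
import os

IGNORED_DIRS = {"node_modules", ".venv", "__pycache__"}
BRAIN_FILES = {"requirements.txt", "package.json", "pyproject.toml"}
INSTRUCTION_FILES = {"README.md", ".env.example"}
ENTRY_POINT_FILES = {"main.py", "app.py", "index.js"}


def categorise_repo(repo_path: str) -> dict[str, list[str]]:
    """Walk the repo and sort relative file paths into the four layers."""
    layers = {"skeleton": [], "brain": [], "instructions": [], "entry_points": []}
    for root, dirs, files in os.walk(repo_path):
        # Prune ignored directories and fix the traversal order.
        dirs[:] = sorted(d for d in dirs if d not in IGNORED_DIRS)
        for name in sorted(files):
            full = os.path.join(root, name)
            rel = os.path.relpath(full, repo_path).replace(os.sep, "/")
            layers["skeleton"].append(rel)  # every file belongs to the tree
            if name in BRAIN_FILES:
                layers["brain"].append(rel)
            elif name in INSTRUCTION_FILES:
                layers["instructions"].append(rel)
            elif name in ENTRY_POINT_FILES:
                layers["entry_points"].append(rel)
    return layers
```

Sorting directories and files makes the output deterministic, which matters if the result is later hashed for diff detection.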
Responsible for repository understanding and storage.
- Scans repository files while ignoring irrelevant directories
- Breaks files into chunks
- Generates embeddings using OpenAI
- Stores embeddings in ChromaDB for retrieval

Input: Local repository path (e.g. `./sample_repo`)

Output: Searchable vector database of code chunks with metadata
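The chunking step is not specified in detail here, so the following is only one plausible approach: fixed-size line windows with a small overlap, each carrying path and line-range metadata. The function name `chunk_text` and the defaults (`max_lines=40`, `overlap=5`) are assumptions; embedding and storage are deliberately left out.

```python
def chunk_text(text: str, path: str, max_lines: int = 40, overlap: int = 5) -> list[dict]:
    """Split text into overlapping line windows with location metadata."""
    if not 0 <= overlap < max_lines:
        raise ValueError("overlap must be non-negative and smaller than max_lines")
    lines = text.splitlines()
    step = max_lines - overlap
    chunks = []
    for start in range(0, len(lines), step):
        window = lines[start:start + max_lines]
        chunks.append({
            "path": path,
            "start_line": start + 1,            # 1-based, inclusive
            "end_line": start + len(window),    # 1-based, inclusive
            "text": "\n".join(window),
        })
        if start + max_lines >= len(lines):
            break  # the last window already reached the end of the file
    return chunks
```

Keeping line ranges in the metadata lets a later report cite the exact location a retrieved chunk came from.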
Compares the current scan against SQLite-cached prior runs using text-diff hashing:
| Run Type | Behaviour |
|---|---|
| First run | Embeds all layers into ChromaDB, establishes baseline |
| Repeat run — diffs found | Emits a DIFF card showing exact per-layer changes, flags Analyst |
| Repeat run — no diffs | Skips Analyst entirely to preserve LLM tokens |
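The run-type decision in the table above can be sketched with `hashlib` and `sqlite3`. This is a simplified illustration, not the actual `diff_checker.py` or `database.py`: the `md_cache` schema, the function names, and the status strings are assumptions, and it detects only which layers changed, not the per-line diff.

```python
import hashlib
import sqlite3


def layer_hash(text: str) -> str:
    """Stable content hash for one layer's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def detect_changes(conn: sqlite3.Connection, repo: str,
                   layers: dict[str, str]) -> tuple[str, list[str]]:
    """Compare layer hashes with the cached run and update the cache."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS md_cache ("
        "repo TEXT, layer TEXT, digest TEXT, PRIMARY KEY (repo, layer))"
    )
    rows = conn.execute(
        "SELECT layer, digest FROM md_cache WHERE repo = ?", (repo,)
    ).fetchall()
    cached = dict(rows)
    current = {name: layer_hash(text) for name, text in layers.items()}
    changed = sorted(n for n, d in current.items() if cached.get(n) != d)
    conn.executemany(
        "INSERT OR REPLACE INTO md_cache (repo, layer, digest) VALUES (?, ?, ?)",
        [(repo, n, d) for n, d in current.items()],
    )
    conn.commit()
    if not cached:
        return "first_run", changed   # baseline: every layer counts as new
    if changed:
        return "diff", changed        # the Analyst should run
    return "no_diff", []              # the Analyst can be skipped
```

A caller would start the Analyst only when the returned status is not `"no_diff"`.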
Only activated when diffs are present or on first run. Evaluates the codebase across exactly 4 pillars:
- Security Implications — dependency CVEs; uses web search for live vulnerability data
- Missing Documentation — entry points and random file sampling against folder skeleton
- Test Coverage — skeleton-based test detection matched to detected framework
- Best Practices — entry point + file tree derived recommendations
On diff runs, the detected diffs are injected verbatim into the system prompt; the Analyst explicitly assesses what changed, the security impact, stability implications, and architectural drift.

The MCP tools provide analysis capabilities used by the agents:
- `repo_topology_mapper`: Builds dependency relationships between files
- `efficiency_analyzer`: Identifies large files, unused imports, and inefficiencies

Input: Code text or repository path

Output: Structured JSON reports

The Analyst Agent is responsible for reasoning and insight generation:
- Retrieves relevant code from the vector store
- Calls MCP tools for deeper analysis
- Uses an LLM to generate structured reports
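As a rough idea of what a tool like `efficiency_analyzer` might compute for Python source, the sketch below uses the standard `ast` module to flag unused imports and oversized files, returning a JSON string. It is not the project's implementation: the function names, the 500-line threshold, and the report fields are assumptions, and registration with FastMCP is omitted.

```python
import ast
import json


def find_unused_imports(source: str) -> list[str]:
    """Return imported names that are never referenced in the module."""
    tree = ast.parse(source)
    imported = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds the top-level name "a".
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                if alias.name != "*":  # star imports cannot be tracked by name
                    imported.add(alias.asname or alias.name)
    used = {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}
    return sorted(imported - used)


def efficiency_report(source: str, max_lines: int = 500) -> str:
    """Build a small JSON report for one Python source string."""
    line_count = len(source.splitlines())
    report = {
        "line_count": line_count,
        "is_large": line_count > max_lines,  # threshold is an assumption
        "unused_imports": find_unused_imports(source),
    }
    return json.dumps(report)
```

This name-based check is a heuristic: it ignores names re-exported via `__all__` and imports used only inside strings or type comments.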
A premium real-time UI served at http://localhost:8000:
- Left Panel — Live SYS LOG terminal (streams all Python `logging` output in real time via SSE)
- Right Panel — Director event feed: agent cards stack newest-first with UTC timestamps, expandable insights, and a Deploy Swarm input
| Type | Agent | Description |
|---|---|---|
| `TREE` | Gatherer | Interactive nested file tree with emoji icons, entry point & dependency badges |
| `DIFF` | Librarian | Per-layer diff blocks with layer badges; baseline notice on first run |
| `REPORT` | Analyst | 4-pillar structured report with evidence citations and collapsible system prompt |
| `HEARTBEAT` | Governor | Background poll status pings for long-running tasks |
The Dev Mode SWE console streams responses token-by-token directly from the LLM:
- Thinking animation: three pulsing green dots appear while the model processes
- Blinking cursor: trails the streaming text until the response completes
- Input + Execute are locked during streaming to prevent race conditions
A strictly gated side-panel console unlocking destructive SWE filesystem operations:
- Click the 🔒 Locked: Dev Mode toggle in the header
- A secure password modal appears (masked `type="password"` input)
- Your `.env` `SECRET_PHRASE` is validated server-side
- On success: a Pushover mobile alert fires (background thread, non-blocking)
- The side panel opens with a direct chat channel to the Analyst with `write_file`, `delete_file`, and `create_directory` tools enabled.
- Direct Commitment: Filesystem operations are applied immediately to the disk. There is no staging step; the agent acts as an autonomous SWE under your direct instruction.
- Thinking Visibility: The UI displays pulsing animations while the agent is reasoning and explicit status logs for tool execution (e.g., `Executing tool: write_file...`).
- No Restrictions: Within the root directory, the agent is authorized to perform recursive deletions and create complex directory structures.
- Path Traversal Guardrails: Hardened `os.path.commonpath` validation prevents any escape from the repository root, ensuring high-security sandbox operations.
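A path-traversal check built on `os.path.commonpath` typically resolves both paths first and then confirms the target still sits under the root. The sketch below shows that pattern; the function name `resolve_inside_root` and the choice of `PermissionError` are assumptions, not the project's actual code in `tools/swe_actions.py`.

```python
import os


def resolve_inside_root(repo_root: str, user_path: str) -> str:
    """Resolve user_path under repo_root, refusing any escape from the root."""
    # realpath collapses ".." segments and follows symlinks.
    root = os.path.realpath(repo_root)
    # If user_path is absolute, os.path.join discards root; the check below
    # still rejects it unless it happens to lie inside the root.
    target = os.path.realpath(os.path.join(root, user_path))
    try:
        inside = os.path.commonpath([root, target]) == root
    except ValueError:
        # Raised on Windows when the paths are on different drives.
        inside = False
    if not inside:
        raise PermissionError(f"Path escapes repository root: {user_path}")
    return target
```

Every filesystem tool (`write_file`, `delete_file`, `create_directory`) would call a helper like this before touching the disk, so the check lives in one place.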
- Persistence: Task polling states (`poll_paused`) are persisted in SQLite, allowing for stateful session resumes.
- UI Controls: Every `Governor` card in the feed includes a Pause/Resume toggle, allowing the Director to halt background scanning instantly.
- Live Sync: Heartbeat counters automatically stop incrementing when a task is paused, providing a real-time visual indicator of the swarm's activity state.
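The pause/resume behaviour can be modelled as a `poll_paused` flag stored next to the task, with the heartbeat update guarded by that flag. The sketch below uses an assumed `tasks` schema and hypothetical helper names; only the `poll_paused` column name comes from this README.

```python
import sqlite3


def init_tasks(conn: sqlite3.Connection) -> None:
    """Create a minimal tasks table (schema is an assumption)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "repo_path TEXT NOT NULL, "
        "poll_paused INTEGER NOT NULL DEFAULT 0, "
        "heartbeat_count INTEGER NOT NULL DEFAULT 0)"
    )
    conn.commit()


def create_task(conn: sqlite3.Connection, repo_path: str) -> int:
    cursor = conn.execute("INSERT INTO tasks (repo_path) VALUES (?)", (repo_path,))
    conn.commit()
    return cursor.lastrowid


def set_poll_paused(conn: sqlite3.Connection, task_id: int, paused: bool) -> None:
    """Persist the pause state so it survives a restart."""
    conn.execute("UPDATE tasks SET poll_paused = ? WHERE id = ?", (int(paused), task_id))
    conn.commit()


def heartbeat(conn: sqlite3.Connection, task_id: int) -> int:
    """Increment the counter unless the task is paused; return the counter."""
    conn.execute(
        "UPDATE tasks SET heartbeat_count = heartbeat_count + 1 "
        "WHERE id = ? AND poll_paused = 0",
        (task_id,),
    )
    conn.commit()
    row = conn.execute(
        "SELECT heartbeat_count FROM tasks WHERE id = ?", (task_id,)
    ).fetchone()
    if row is None:
        raise KeyError(f"Unknown task id: {task_id}")
    return row[0]
```

Putting the pause check in the `WHERE` clause keeps the read-and-increment atomic, so a toggle from the UI cannot race with the poll loop.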
## Task Lifecycle Management
The Director Console provides granular control over historical platform data:
- Active Protection: The most recent task (highest ID) is always protected from deletion to prevent accidental disruption of a live research swarm.
- Selective Purging: Older Governor cards (Past Polls) feature a red 🗑 Delete Records button.
- Deep Clean: Deleting a record purges all associated UI cards, architectural reports, and diff-caches from the SQLite database, and clears active Analysts from memory.
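The deletion rules above (protect the newest task, purge everything tied to an older one) can be sketched as follows. Table names `cards`, `reports`, and `diff_cache`, the `active_analysts` dictionary, and the function names are all hypothetical stand-ins for whatever the real database layer uses.

```python
import sqlite3

CHILD_TABLES = ("cards", "reports", "diff_cache")  # hypothetical table names


def init_schema(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, repo_path TEXT)"
    )
    for table in CHILD_TABLES:
        conn.execute(f"CREATE TABLE IF NOT EXISTS {table} (task_id INTEGER, body TEXT)")
    conn.commit()


def delete_task(conn: sqlite3.Connection, task_id: int, active_analysts: dict) -> bool:
    """Purge an old task and its records; refuse to delete the newest task."""
    latest = conn.execute("SELECT MAX(id) FROM tasks").fetchone()[0]
    if latest is None or task_id == latest:
        return False  # active protection: the highest ID is never deleted
    for table in CHILD_TABLES:
        # Table names come from the constant above, never from user input.
        conn.execute(f"DELETE FROM {table} WHERE task_id = ?", (task_id,))
    cursor = conn.execute("DELETE FROM tasks WHERE id = ?", (task_id,))
    conn.commit()
    active_analysts.pop(task_id, None)  # drop any in-memory Analyst
    return cursor.rowcount > 0
```

Returning a boolean lets the UI distinguish a refused or unknown deletion from a completed one without raising.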
Ensure uv is installed (pip install uv or via curl):
git clone <your-repo-link>
cd Repo-Intelligent-System-RIS-
uv sync
cp .env.example .env

Fill in your credentials in .env:
# LLM via OpenRouter
OPENROUTER_API_KEY=sk-or-v1-...
MODEL="openai/gpt-4o"
# Pushover Mobile Alerts
PUSHOVER_USER=your_pushover_user_key
PUSHOVER_TOKEN=your_pushover_app_token
# Advanced Dev Mode Security Gate
SECRET_PHRASE=your-strong-passphrase
# Optional Web Research
SERPER_API_KEY=your_serper_key
# Background poll interval (seconds)
BACKGROUND_POLL_INTERVAL=30
# Other
CHROMA_PERSIST_DIR=./chroma_db
DEV_ENV=development

Recommended — UI Director Console:

uv run python main.py run-local --repo-path ./sample_repo

When prompted "Would you like to start the Web Dashboard?", answer Y. The browser opens automatically at http://localhost:8000 and the swarm auto-deploys immediately for the validated repo path — no re-typing required.
Headless CLI mode (no browser):
uv run python main.py run-local --repo-path ./sample_repo
# Answer N at the dashboard prompt

uv run python main.py run-local --repo-path ./sample_repo
| Step | Component | Action |
|---|---|---|
| 0 | CLI | Validates repo (file count, path existence) |
| 0b | CLI | Stores validated path in app.state.auto_repo_path |
| 0c | Browser | JS calls /config, pre-fills input, auto-fires Deploy |
| 1 | Governor | Creates task in SQLite, emits INFO SSE card |
| 2 | Gatherer | Scans repo → builds nested file tree → emits TREE card |
| 3 | Librarian | Diff-checks vs cache → emits DIFF or INFO card |
| 4a | Analyst | (First/Diff run only) Full 4-pillar analysis + diff examination |
| 4b | Analyst | Calls MCP tools (efficiency_analyzer, repo_topology_mapper) |
| 4c | Analyst | Web searches for CVEs if vulnerable packages detected |
| 5 | Governor | Emits REPORT card, enters AWAITING_HIL state |
| 6 | Director | Human instructs Analyst via modal input (scoped to task context) |
uv run python main.py view-cards

Repo-Intelligent-System-RIS-/
├── agents/
│ ├── governor_messenger.py # Orchestrator, SSE emitter, poll loop
│ ├── gatherer_agent.py # Repo scanner, file tree builder
│ ├── ingester_agent.py # Diff engine, ChromaDB embedder
│ └── analyst_agent.py # SWE Analyst, MCP Client, web searcher
├── guardrails/
│ └── repo_validator.py # File count & path validation
├── mcp_integration/
│ ├── server.py # FastMCP server (efficiency + topology tools)
│ └── client_bridge.py # Async stdio MCP client bridge
├── rag/
│ ├── database.py # SQLite manager (tasks, cards, md_cache)
│ └── vector_store.py # ChromaDB interface
├── schemas/
│ └── analyst_schema.py # OpenAI structured output & tool schemas
├── static/
│ ├── index.html # Director Console UI (two-column layout)
│ └── app.js # SSE listeners, streaming chat, card renderers
├── tools/
│ ├── diff_checker.py # Text diff & hash utilities
│ └── swe_actions.py # Staged filesystem write/delete execution
├── utils/
│ └── alerts.py # Pushover fire-and-forget alerting
├── main.py # FastAPI app, CLI entrypoint (typer)
├── .env.example # All required keys (no values)
└── uroboros.db # SQLite runtime state (auto-created)
| Key | Purpose |
|---|---|
| `OPENROUTER_API_KEY` | LLM access via OpenRouter |
| `MODEL` | OpenRouter model ID (e.g. "openai/gpt-4o") |
| `PUSHOVER_USER` | Pushover user key for mobile alerts |
| `PUSHOVER_TOKEN` | Pushover app token |
| `SECRET_PHRASE` | Passphrase to unlock Advanced Dev Mode |
| `SERPER_API_KEY` | Google Serper for live CVE/web research |
| `BACKGROUND_POLL_INTERVAL` | Heartbeat interval in seconds (default: 30) |
| `CHROMA_PERSIST_DIR` | ChromaDB storage path (default: ./chroma_db) |
| `DEV_ENV` | development or production |
| `HF_TOKEN` | HuggingFace token (optional, for embeddings) |
| Name | Role |
|---|---|
| Michael Onyekanma | Team Lead, Coordinator Integration & Code Quality |
| Mwende Mugao | Product, PRD, Scout Agent (repo parsing, chunking, embeddings, vector store) |
| Michael Aigbovbiosa | Analyst Agent (RAG, reporting) and Coordinator Agent (workflow orchestration) |
| Sodiq Alabi | MCP Tools (topology mapper, efficiency analyzer, API layer) |
| Nsikan Ikpoh | Presentation slides |