32 changes: 32 additions & 0 deletions .github/workflows/docs.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
name: Documentation

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
  workflow_dispatch:

jobs:
  docs:
    runs-on: ubuntu-latest
    permissions:
      contents: write
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Install docs dependencies
        run: pip install mkdocs-material

      - name: Build docs
        run: mkdocs build --strict

      - name: Deploy to GitHub Pages
        if: github.ref == 'refs/heads/main' && github.event_name != 'pull_request'
        run: mkdocs gh-deploy --force
43 changes: 43 additions & 0 deletions docs/architecture.md
@@ -0,0 +1,43 @@
# Architecture

![DSAgt architecture](assets/architecture.png)

DSAgt wraps an unmodified agent CLI with four independently operable layers. The Tool Registry and Knowledge Base layers each expose an MCP server, so the agent discovers and invokes their capabilities through the standard MCP tool protocol.

## Layers

**Tool Registry** (`dsagt-registry-server`)
The agent registers CLI tools as markdown files with YAML frontmatter under `<project>/tools/`. The registry server handles dependency installation via `uv run --with` and wraps every execution with `dsagt-run` for provenance capture. The agent discovers tools via `search_registry`.

**Knowledge Base** (`dsagt-knowledge-server`)
Semantic search over six independently partitioned ChromaDB collections. Three are global (populated by `dsagt setup-kb`); three are per-project (filled automatically during use). Background jobs handle long ingest operations. The agent searches via `kb_search`, ingests via `kb_ingest`, and saves user-confirmed facts via `kb_remember`.

**Provenance** (`dsagt-run`)
A thin wrapper that the registry server invokes around every tool execution. It records the command, arguments, exit code, duration, file counts, and truncated stderr to `<project>/trace_archive/<record_id>.json` and emits an OTLP span to MLflow. The agent calls `reconstruct_pipeline` to render the trace archive as a reproducible bash script or Snakemake workflow.
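
To make the shape of such a record concrete, here is a minimal sketch of a provenance wrapper. It is not the `dsagt-run` implementation: the function name `run_with_provenance`, the JSON field names, and the stderr limit are assumptions chosen for illustration, and file counting and the OTLP span are left out.

```python
import json
import subprocess
import sys
import time
import uuid
from pathlib import Path


def run_with_provenance(argv, archive_dir, stderr_limit=2000):
    """Run argv and write one JSON execution record into archive_dir.

    Field names are hypothetical; they mirror the items the docs list
    (command, arguments, exit code, duration, truncated stderr).
    """
    archive = Path(archive_dir)
    archive.mkdir(parents=True, exist_ok=True)

    record_id = uuid.uuid4().hex
    started_at = time.time()
    t0 = time.monotonic()
    proc = subprocess.run(argv, capture_output=True, text=True)

    record = {
        "record_id": record_id,
        "started_at": started_at,
        "command": argv[0],
        "arguments": list(argv[1:]),
        "exit_code": proc.returncode,
        "duration_s": round(time.monotonic() - t0, 3),
        # Keep only the tail of stderr so records stay small.
        "stderr": proc.stderr[-stderr_limit:],
    }
    (archive / f"{record_id}.json").write_text(json.dumps(record, indent=2))
    return record
```

Calling `run_with_provenance([sys.executable, "--version"], "trace_archive")` would leave one `<record_id>.json` file in `trace_archive/`.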

**Observability** (MLflow + OTLP)
MLflow runs locally on a port pinned at `dsagt init` time. The other three layers emit OTLP HTTP spans to MLflow's `/v1/traces` endpoint. The agent's own LLM-call traces land in the same store when you export the `OTEL_EXPORTER_OTLP_ENDPOINT` printed by `dsagt init`.

## Project Layout

```
~/dsagt-projects/<name>/
dsagt_config.yaml # project configuration
tools/ # registered CLI tool specs (markdown + YAML frontmatter)
tools/code/ # agent-written tool scripts
skills/ # agent skills (SKILL.md + reference docs)
trace_archive/ # tool execution records (JSON, from dsagt-run)
mlflow/ # MLflow traces, metrics, artifacts
kb_index/ # knowledge base vector collections
explicit_memories.yaml # user-confirmed facts

# Per-agent runtime config (one of, generated by dsagt init):
# claude: CLAUDE.md, .mcp.json
# goose: goose.yaml, .goosehints
# codex: AGENTS.md, .codex-data/config.toml
# opencode: AGENTS.md, opencode.json
# roo: .roomodes, .roo/mcp.json
# cline: .clinerules/, cline_mcp_settings.json
```

Projects are registered in `~/.dsagt/projects.yaml` so `dsagt mlflow <name>` and `dsagt info <name>` work from any directory. The data layer is agent-agnostic — re-running `dsagt init <same-name> --agent <other>` switches agent platforms while preserving all accumulated knowledge and traces.
Binary file added docs/assets/architecture.png
49 changes: 49 additions & 0 deletions docs/cli.md
@@ -0,0 +1,49 @@
# CLI Reference

All commands are available after running `uv sync` and activating the virtual environment (`source .venv/bin/activate`).

## Project Management

| Command | Description |
|---------|-------------|
| `dsagt init <name> --agent <platform> [--location <path>] [--mlflow-port N]` | Create a project; write per-agent MCP config; print the launch one-liner |
| `dsagt list` | List all projects with agent, status, and path |
| `dsagt info <name> [--json]` | Show the resolved config (with the source of each value) and a session/error summary |
| `dsagt mv <name> <new-location>` | Move a project to a new location |
| `dsagt rm <name> [-y] [--keep-files]` | Unregister a project and optionally delete its directory |

## Session Lifecycle

| Command | Description |
|---------|-------------|
| `dsagt mlflow <name>` | Start MLflow for a project and print OTel routing exports |
| `dsagt stop <name>` | Stop the MLflow daemon |
| `dsagt memory --project <name>` | Distill new traces from MLflow into episodic memory |

## Setup

| Command | Description |
|---------|-------------|
| `dsagt setup-kb [--collection <name>]` | Build the shared core knowledge base collections |
| `dsagt smoke-test [--agent claude\|goose\|codex\|opencode]` | End-to-end install verification |

## Project Location

The default project location is `~/dsagt-projects/<name>/`. Override with `--location`:

```bash
dsagt init my-project --agent claude --location /data/runs # /data/runs/my-project/
dsagt init my-project --agent claude --location . # ./my-project/
```
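
The rule implied by the two examples above is that the project name is appended to whatever `--location` points at. A minimal sketch of that rule, assuming a hypothetical helper `resolve_project_dir` (not a DSAgt function):

```python
from pathlib import Path


def resolve_project_dir(name, location=None):
    """Return the directory a project would live in.

    Assumption: with no location the base is ~/dsagt-projects;
    otherwise the project name is appended to the given location.
    """
    base = Path(location).expanduser() if location else Path.home() / "dsagt-projects"
    return base / name


print(resolve_project_dir("my-project", "/data/runs"))  # /data/runs/my-project
print(resolve_project_dir("my-project", "."))           # my-project (relative to cwd)
```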

## Server Commands

These are launched automatically by `dsagt init` via the per-agent MCP config and are not typically run directly.

| Command | Description |
|---------|-------------|
| `dsagt-registry-server` | Tool registry MCP server |
| `dsagt-knowledge-server` | Knowledge base MCP server |
| `dsagt-run` | Provenance-capturing tool execution wrapper |
| `dsagt-proxy` | LiteLLM proxy server (proxy mode only) |
| `dsagt-setup-kb` | Core knowledge base setup (called by `dsagt setup-kb`) |
47 changes: 47 additions & 0 deletions docs/developer.md
@@ -0,0 +1,47 @@
# Developer Guide

Material for contributors and users who are working beyond the default `dsagt init` → `dsagt mlflow` → agent flow.

## Tests

```bash
uv run python -m pytest -m "not integration" # unit tests, no creds required
uv run python -m pytest -m integration -v # integration tests (require .env)
```

Integration tests read endpoint and key values from `.env` at the repo root. Copy `.env.example` to `.env` and fill in your values.

For per-flow hand-tests (CLI, proxy mode, VS Code extensions), see the scripts under [`tests/smoke_test/manual_runs/`](https://github.com/AI-ModCon/dsagt/tree/main/tests/smoke_test/manual_runs/).

## Proxy Mode

`dsagt init` followed by `dsagt start <project> --enable-proxy` spawns a LiteLLM proxy in front of your agent's LLM calls. This adds:

- Full LLM-call traces (request bodies, tool-use blocks, response payloads) in MLflow for agents whose native OTel does not emit those payloads (codex, opencode).
- Cache-breakpoint injection on outgoing requests (Anthropic prompt caching).
- Sidechannel detection for agent-internal title-generator / session-namer calls.
- Model-name aliasing — useful when an agent CLI hardcodes a model whitelist incompatible with your gateway's served names (cline, roo).

Proxy mode reads upstream LLM credentials from `.env` or the shell. See [`tests/smoke_test/manual_runs/proxy_walkthrough.md`](https://github.com/AI-ModCon/dsagt/blob/main/tests/smoke_test/manual_runs/proxy_walkthrough.md) for the full setup walkthrough.

## Troubleshooting

**Agent command not found.** The agent CLI is not installed or is not on PATH. See the [supported agents table](index.md#supported-agents).

**MCP servers not connecting.** Verify uv resolves the server commands:

```bash
uv run which dsagt-registry-server
uv run which dsagt-knowledge-server
```

If either command is missing, reinstall: `uv sync --reinstall`.

**MLflow UI empty.** Confirm MLflow is running for the right project:

```bash
dsagt info <name> # shows the pinned port
curl http://localhost:<mlflow_port>
```

**Claude keychain conflict.** If `claude` will not authenticate against a non-default gateway, run `claude /logout` to clear the macOS Keychain OAuth token, then re-export `ANTHROPIC_BASE_URL` / `ANTHROPIC_API_KEY` and re-launch.
44 changes: 44 additions & 0 deletions docs/index.md
@@ -0,0 +1,44 @@
# DSAgt

**D**ata**S**mith **Ag**en**t** — AI-assisted data pipeline builder.

DSAgt connects an MCP-compatible AI coding agent to tool registration, a semantic knowledge base, execution provenance, and observability infrastructure. It provides data-pipeline scaffolding around your existing agent CLI or VS Code extension (Claude Code, Goose, Codex, and others).

## Supported Agents

| Agent | Install | Verify |
|-------|---------|--------|
| [Claude Code](https://github.com/anthropics/claude-code) | `npm i -g @anthropic-ai/claude-code` | `claude --version` |
| [Goose](https://github.com/block/goose) | See [Goose docs](https://github.com/block/goose#installation) | `goose --version` |
| [Codex](https://github.com/openai/codex) | `npm i -g @openai/codex` | `codex --version` |
| [opencode](https://github.com/sst/opencode) | See [opencode docs](https://opencode.ai/docs/) | `opencode --version` |
| [Roo Code](https://github.com/RooCodeInc/Roo-Code) | `npm i -g @roo-code/cli` | `roo --version` |
| [Cline](https://github.com/cline/cline) | `npm i -g cline` | `cline --version` |

## Prerequisites

- Python 3.12–3.13
- [uv](https://github.com/astral-sh/uv)
- One of the supported agent platforms above, installed and authenticated against your LLM provider

## Installation

```bash
git clone https://github.com/AI-ModCon/dsagt.git
cd dsagt
uv sync
source .venv/bin/activate
```

## Key Capabilities

| Layer | What it does |
|-------|-------------|
| **Tool Registry** | Register CLI tools as markdown specs; the agent discovers and runs them via `search_registry` |
| **Knowledge Base** | Semantic search over indexed document collections (ChromaDB + FAISS) |
| **Provenance** | `dsagt-run` wrapper records every tool execution to `trace_archive/` and MLflow |
| **Explicit Memory** | User-confirmed facts persisted to YAML and the knowledge base |
| **Episodic Memory** | Session distillation via outlier detection over MLflow traces |
| **Observability** | Full OTLP tracing to a local MLflow instance |

See the [Quick Start](quickstart.md) to try all of these in a single session.
40 changes: 40 additions & 0 deletions docs/knowledge-base.md
@@ -0,0 +1,40 @@
# Knowledge Base

DSAgt maintains six independently partitioned ChromaDB collections. The first three are global (under `~/.dsagt/kb_index/`, populated by `dsagt setup-kb`); the last three are per-project (under `<project>/kb_index/`, populated automatically during use).

## Collections

| Collection | Source | Populated by |
|---|---|---|
| **Tool Specs** | Bundled CLI tool specs in `src/dsagt/tools/` | `dsagt setup-kb` |
| **Skills** | Bundled skill workflows in `src/dsagt/skills/` | `dsagt setup-kb` |
| **Domain Knowledge** | NeMo Curator + AIDRIN reference corpora; user-ingested docs | `dsagt setup-kb` + agent's `kb_ingest` |
| **Explicit Memory** | User-confirmed facts | Agent's `kb_remember` (also written to `<project>/explicit_memories.yaml`) |
| **Episodic Memory** | Distilled facts from MLflow traces | `dsagt memory --project <name>` |
| **Tool Use Records** | `dsagt-run` execution traces | `dsagt-run` wrapper writes JSON to `<project>/trace_archive/`; indexed by `dsagt memory` |

## Explicit Memory

Explicit memories are facts the user confirms during a session. The agent saves them via `kb_remember`, which writes to both the ChromaDB collection and `<project>/explicit_memories.yaml`. The agent fetches them via `kb_get_memories` on demand (typically when you ask it to recall something) — they are not auto-loaded at session start.

## Episodic Memory

`dsagt memory --project <name>` distills new traces from the project's MLflow store into episodic memory using per-category outlier detection over embedding centroids. Run this after each session to accumulate cross-session memory.
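
The docs do not spell out the detection rule, so the following is only a sketch of one common way to do per-category outlier detection against a centroid: compute each category's mean embedding, measure every item's distance to it, and flag items whose distance is unusually large. The function names, the Euclidean metric, and the z-score threshold are all assumptions, not DSAgt's actual method.

```python
import math
from statistics import mean, pstdev


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def find_outliers(vectors_by_category, z_threshold=2.0):
    """Return {category: [indices of vectors far from that category's centroid]}."""
    outliers = {}
    for category, vectors in vectors_by_category.items():
        center = centroid(vectors)
        dists = [math.dist(v, center) for v in vectors]
        mu, sigma = mean(dists), pstdev(dists)
        if sigma == 0:
            # All items are equally far from the centroid: nothing stands out.
            outliers[category] = []
            continue
        outliers[category] = [
            i for i, d in enumerate(dists) if (d - mu) / sigma > z_threshold
        ]
    return outliers
```

Because the centroid is computed per category, a trace that is ordinary for one category does not mask an unusual trace in another.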

## Search

The agent searches all collections via `kb_search` (knowledge MCP server) and writes via `kb_ingest` / `kb_remember`. Tool Specs and Skills are queried through specialized routes (`search_registry`, `search_skills`) over the same backend.

Hybrid search (dense embeddings + sparse BM25 via Reciprocal Rank Fusion) is on by default per collection route. Cross-encoder reranking is optional.
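
Reciprocal Rank Fusion combines ranked lists by summing `1 / (k + rank)` for each document across the lists. The sketch below shows the general technique; the function name and the constant `k = 60` (a commonly used default) are assumptions, and DSAgt's own fusion code may differ.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Ties are broken by document id for determinism.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


dense_hits = ["a", "b", "c"]   # e.g. from embedding similarity
sparse_hits = ["b", "d", "a"]  # e.g. from BM25
print(reciprocal_rank_fusion([dense_hits, sparse_hits]))  # ['b', 'a', 'd', 'c']
```

Documents that appear near the top of both lists (`b` and `a` here) outrank documents that appear in only one, which is why RRF needs no score normalization between the dense and sparse retrievers.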

## Setup

```bash
dsagt setup-kb # all global collections (local embedder)
dsagt setup-kb --collection nemo_curator
dsagt setup-kb --embedding-backend api \
--embedding-base-url <url> \
--embedding-api-key <key>
```

The Tool Specs and Skills collections are wiped and rebuilt on every `setup-kb` run — re-run after upgrading DSAgt to pick up new bundled assets.
38 changes: 38 additions & 0 deletions docs/mcp-servers.md
@@ -0,0 +1,38 @@
# MCP Servers

DSAgt exposes its capabilities through two MCP servers. Both are launched automatically by `dsagt init` and configured in the per-agent runtime file (`.mcp.json` for Claude Code, `goose.yaml` for Goose, etc.).

## Registry Server

**Command:** `dsagt-registry-server`

Handles tool registration, dependency installation, and tool discovery.

| Tool | Description |
|------|-------------|
| `search_registry` | Semantic search over registered tool specs |
| `save_tool_spec` | Register a new CLI tool as a markdown file with YAML frontmatter |
| `install_dependencies` | Install tool dependencies via `uv run --with` |
| `reconstruct_pipeline` | Render the trace archive as a bash script or Snakemake workflow |

Tools are markdown files with YAML frontmatter under `<project>/tools/`. Executables are wrapped with `dsagt-run` for provenance and `uv run --with` for Python dependencies.
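
As a rough illustration of the "markdown with YAML frontmatter" layout, the sketch below splits a spec into its frontmatter and body. The field names (`name`, `command`, `dependencies`), the example tool, and the `parse_tool_spec` helper are hypothetical, not DSAgt's actual schema, and a flat `key: value` parser stands in for a real YAML library.

```python
EXAMPLE_SPEC = """---
name: csv-dedupe
command: python tools/code/csv_dedupe.py
dependencies: pandas
---
# csv-dedupe

Remove duplicate rows from a CSV file.
"""


def parse_tool_spec(text):
    """Split a markdown tool spec into (frontmatter dict, markdown body).

    Only flat `key: value` pairs are handled; nested YAML is out of scope.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter block
    meta = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":  # closing delimiter
            return meta, "\n".join(lines[i + 1:]).strip()
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}, text  # unterminated frontmatter: treat as plain markdown


meta, body = parse_tool_spec(EXAMPLE_SPEC)
print(meta["name"], "->", meta["command"])
```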

## Knowledge Server

**Command:** `dsagt-knowledge-server`

Semantic search and ingestion over indexed document collections.

| Tool | Description |
|------|-------------|
| `kb_search` | Search across one or more knowledge collections |
| `kb_ingest` | Index a file or directory into a named collection (runs in background for large corpora) |
| `kb_remember` | Save a user-confirmed fact to explicit memory |
| `kb_get_memories` | Retrieve explicit memories for the current project |
| `search_skills` | Discover agent skill workflows |

### Backend

The default embedding backend is local (`sentence-transformers`, CPU-only, no API key needed). Switch to `embedding.backend: api` in `dsagt_config.yaml` to route through a hosted embedder via LiteLLM. Cross-encoder reranking is available via `knowledge.rerank: true`.

Hybrid search (dense + sparse BM25) is on by default and controlled per-route via the `hybrid` flag.
45 changes: 45 additions & 0 deletions docs/observability.md
@@ -0,0 +1,45 @@
# Observability

DSAgt provides end-to-end trace visibility through a local MLflow instance. All internal layers emit OTLP HTTP spans to MLflow's `/v1/traces` endpoint.

## Starting MLflow

```bash
dsagt mlflow <project-name>
```

The command prints the MLflow UI URL and the `export` block for routing agent OTel output. The port is pinned at `dsagt init` time and shown by `dsagt info <name>`.

## Trace Coverage

| Source | Span type | Contents |
|--------|-----------|----------|
| Knowledge base | `kb.search`, `kb.embed`, `kb.index_search`, `kb.rerank` | Per-phase timing trees |
| Tool executions | `tool.execute` | Exit code, duration, file counts, truncated stderr. Full payload in `trace_archive/<record_id>.json` |
| Registry events | `save_tool_spec`, `install_dependencies`, `reconstruct_pipeline` | Span metadata |
| Native agent OTel | LLM call spans | Coverage varies by agent (see below) |

### Agent OTel Coverage

Export the variables printed by `dsagt mlflow` before launching your agent:

| Agent | Coverage |
|-------|----------|
| claude | Full request/response payloads |
| goose | Full request/response payloads |
| codex | Token counts and tool names |
| opencode | None natively |

Every span carries the project's `session.id` for filtering in the MLflow trace view.

## Provenance and Reconstruction

Tool execution records on disk (`trace_archive/<record_id>.json`) provide the canonical provenance chain. The agent calls `reconstruct_pipeline` to render the archive as a reproducible bash script or Snakemake workflow.
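
To illustrate the idea of rendering an archive as a script, here is a minimal sketch that reads JSON records and emits a bash script. It is not the `reconstruct_pipeline` implementation: the record fields (`started_at`, `command`, `arguments`, `exit_code`) and the choice to skip failed runs are assumptions.

```python
import json
import shlex
from pathlib import Path


def render_bash_script(archive_dir):
    """Render successful tool executions, oldest first, as a bash script."""
    records = [json.loads(p.read_text()) for p in Path(archive_dir).glob("*.json")]
    records.sort(key=lambda r: r["started_at"])

    lines = ["#!/usr/bin/env bash", "set -euo pipefail", ""]
    for record in records:
        if record["exit_code"] != 0:
            continue  # assumption: failed runs are not part of the pipeline
        # shlex.join quotes arguments so paths with spaces survive.
        lines.append(shlex.join([record["command"], *record["arguments"]]))
    return "\n".join(lines) + "\n"
```

Sorting by start time keeps the script in execution order regardless of how the record files are named.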

## Stopping MLflow

```bash
dsagt stop <project-name>
```

Releases the port and stops the gunicorn workers. The PID is stored in `<project>/.runtime`.