3 changes: 2 additions & 1 deletion CLAUDE.md
@@ -50,7 +50,8 @@ These are settled. Do not revisit without explicit discussion.
- **Prometheus metrics** — `MetricsCollector` in `metrics.py` follows the TraceCollector observer pattern. Records `agent_requests_total`, `agent_request_duration_seconds`, `agent_model_call_duration_seconds`, `agent_tool_call_total`, `agent_tokens_total`. Exposed at `GET /metrics` in Prometheus text format. Optional `[metrics]` extra (`prometheus_client`). `NullMetricsCollector` when disabled.
- **OTEL trace export** — `OTELTraceStore` in `otel.py` wraps an inner `TraceStore` with OpenTelemetry span export via OTLP. Span IDs are deterministically hashed (SHA-256) from internal string IDs. Monotonic-to-wallclock conversion anchored on `Trace.started_at`. Optional `[otel]` extra (`opentelemetry-sdk`, `opentelemetry-exporter-otlp-proto-grpc`). Configure via `traces.exporter: otel` in agent.yaml.
- **Distributed trace propagation** — W3C Trace Context (`traceparent` header) extracted from incoming requests and injected into outgoing `RemoteNode` HTTP calls. `propagation.py` provides `extract_trace_context()` and `inject_trace_context()`. `TraceCollector` accepts `parent_trace_id`/`parent_span_id` to join distributed traces. `RemoteNode.set_trace_context()` injects headers.
-- **Server module structure** — `fipsagents.server` is a proper package: `app.py` (OpenAIChatServer), `models.py` (request/response schemas), `sessions.py` (session stores), `tracing.py` (trace model + stores), `collector.py` (TraceCollector), `metrics.py` (Prometheus metrics), `otel.py` (OTEL export), `propagation.py` (W3C Trace Context), `sqlite.py` (shared connection manager). `__init__.py` re-exports `OpenAIChatServer`, `ChatCompletionRequest`, `ChatMessage`.
+- **Server module structure** — `fipsagents.server` is a proper package: `app.py` (OpenAIChatServer), `models.py` (request/response schemas), `sessions.py` (session stores), `tracing.py` (trace model + stores), `collector.py` (TraceCollector), `metrics.py` (Prometheus metrics), `otel.py` (OTEL export), `propagation.py` (W3C Trace Context), `sqlite.py` (shared connection manager), `files.py` (FileStore + FileRecord), `bytes_store.py` (BytesStore), `parser.py` (FileParser), `scanner.py` (VirusScanner), `feedback.py`, `pricing.py`, `budget.py`, `http.py`. `__init__.py` re-exports `OpenAIChatServer`, `ChatCompletionRequest`, `ChatMessage`.
+- **File upload subsystem** — server-layer only, mirrors the sessions/traces pattern. Four ABCs: `FileStore` (metadata), `BytesStore` (raw bytes), `FileParser` (extract text by MIME), `VirusScanner` (HTTP sidecar). `FileStore` and `BytesStore` are deliberately split per [ADR-0001](docs/adr/0001-s3-bytes-backend.md): `SqliteFileStore` and `PostgresFileStore` compose with any `BytesStore` (`NullBytesStore` / `LocalFsBytesStore` / `S3BytesStore`). Multi-replica deployments use `bytes_backend.type: s3`; single-replica edge/dev keeps `local_fs` on a PVC. Parser dispatch is two-tier — `PlaintextParser` (text/* + JSON/YAML/XML, no extra deps) → `DoclingParser` (binary formats, opt-in via `[files]` extra ~5–6 GB). Optional extras: `[files]` (Docling), `[s3]` (aioboto3 ~30 MB). `OpenAIChatServer` builds and owns all four; closed on shutdown. Endpoints: `POST/GET/DELETE/LIST /v1/files`, plus `file_ids: [...]` on `ChatCompletionRequest` resolves to system messages with extracted text injected before the agent runs. `SqliteFileStore` metadata DB co-locates with bytes via `FilesConfig.sqlite_path` (chart sets `FILES_SQLITE_DB_PATH=<mount>/.metadata/agent.db` when persistence is on). Chunking + pgvector retrieval for large files is the next bullet of #100, designed in [ADR-0002](docs/adr/0002-large-file-chunking-pgvector.md), tracked in #137 (blocked on the ADR).
- **`probe_role_support()`** is a diagnostic utility in `fipsagents.baseagent.diagnostics` -- probes whether a deployed model supports a given message role (e.g. `developer`). Template inspection (best-effort, checks vLLM model metadata) + canary completion (prompt token delta). Not on the hot path.
- **`ThinkTagParser`** in `fipsagents.baseagent.reasoning` -- streaming parser that separates `<think>…</think>` blocks from content deltas. Auto-enabled for Granite and DeepSeek models (via `create_reasoning_parser(model_name)`). Wired in `setup()` step 11 and `astep_stream`. Falls back gracefully when vLLM's `--reasoning-parser` already handles extraction server-side.
- **`McpServerConfig`** supports two YAML-configurable transports: HTTP (`url`) and stdio (`command`/`args`/`env`/`cwd`). Pydantic validator enforces exactly one. `connect_mcp()` also accepts FastMCP server objects for in-process transport (programmatic, not YAML).
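For readers unfamiliar with the OTEL export and trace propagation bullets above, here is a minimal sketch of how deterministic IDs and the W3C `traceparent` header fit together. It is not the code in `otel.py` or `propagation.py`: the helper names (`hashed_hex_id`, `inject_traceparent`, `extract_traceparent`) are hypothetical, and truncating the SHA-256 digest to 16 bytes (trace ID) or 8 bytes (span ID) is an assumption, since the bullet only says the IDs are hashed with SHA-256. The sketch accepts only version `00` of the header.

```python
import hashlib
import re
from typing import Optional, Tuple

# W3C Trace Context, version 00: version-traceid-parentid-flags, lowercase hex.
_TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def hashed_hex_id(internal_id: str, num_bytes: int) -> str:
    """Derive a stable hex ID from an internal string ID.

    Assumption: the SHA-256 digest is truncated to num_bytes; the real
    implementation may derive its IDs differently.
    """
    digest = hashlib.sha256(internal_id.encode("utf-8")).digest()
    return digest[:num_bytes].hex()


def inject_traceparent(headers: dict, trace_id: str, span_id: str,
                       sampled: bool = True) -> dict:
    """Return a copy of headers with a traceparent header added."""
    flags = "01" if sampled else "00"
    out = dict(headers)
    out["traceparent"] = f"00-{trace_id}-{span_id}-{flags}"
    return out


def extract_traceparent(headers: dict) -> Optional[Tuple[str, str]]:
    """Return (trace_id, parent_span_id), or None if missing or invalid."""
    # HTTP header names are case-insensitive.
    value = next((v for k, v in headers.items() if k.lower() == "traceparent"), None)
    if value is None:
        return None
    match = _TRACEPARENT_RE.match(value.strip())
    if match is None:
        return None
    trace_id, span_id, _flags = match.groups()
    # The spec treats all-zero trace and parent IDs as invalid.
    if set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None
    return trace_id, span_id
```

Because the IDs are pure functions of the internal string IDs, re-exporting the same trace yields the same span IDs, and a downstream service that extracts the header can attach its spans to the same trace.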
1 change: 0 additions & 1 deletion packages/fipsagents/tests/test_files_endpoint.py
@@ -256,7 +256,6 @@ def test_empty_files_sqlite_path_falls_back_to_storage(self, tmp_path):
         # Default behavior — covered by the existing _build_server_with_files
         # fixture which sets storage.sqlite_path and leaves files.sqlite_path
         # empty. Just confirm uploads land at the storage path.
-        from pathlib import Path
         server = _build_server_with_files(tmp_path)
         storage_db = tmp_path / "agent.db"
         with TestClient(server.app) as client: