6 changes: 6 additions & 0 deletions docs/contract/agent-interface.md
@@ -385,6 +385,12 @@ real-LLM driver `agent_tools.py`):
None of these are large lifts. The biggest addition is the `escurel`
meta-skill, which is just one markdown file.

+> **Status (delivered).** Every row above now ships in the Rust
+> gateway: `neighbours`, `validate`, the CRDT write path + `update_page`
+> fallback, `run_stored_query` over MCP, the mandatory auto-shipped
+> `escurel` meta-skill, and `search(granularity='block'|'page')` with
+> the frontmatter `filter`.
+
## Decisions locked (2026-05-17)

The four open questions above were resolved in the design
7 changes: 5 additions & 2 deletions docs/operations.md
@@ -18,12 +18,15 @@ on first boot of a fresh host (see [Node loss](#node-loss--fresh-host)).
| `GET /healthz` | Liveness. Always `200 OK` while the process is up; dependency-free. Wire this to the Nomad/Consul check. |
| `GET /readyz` | Readiness. `200` only when LaneStore + indexer + embedder are all up; `503` with a per-component JSON body otherwise. A degraded embedder (model failed to load) shows here as `{"components":{"embedder":false}}` — the process still serves liveness and read traffic. |
| `GET /version` | The build version (`VERSION` / `ESCUREL_VERSION`). |
-| `GET /metrics` | Prometheus exposition. `escurel_up`, `escurel_requests_total{route,status}`, plus the OTel-exported request metrics. Scrape via the tailnet-only `escurel-metrics` Consul service. |
+| `GET /metrics` | Prometheus exposition on a **dedicated listener** (`ESCUREL_OBSERVABILITY_METRICS_LISTEN`, default `:9090`) — *not* the main HTTP port. Exposes `escurel_up`, `escurel_requests_total{route,status}`, and the per-tool families `escurel_tool_calls{tenant,tool,transport,status}`, `escurel_tool_latency_ms`, `escurel_live_sessions_open`, `escurel_audit_drift`. Scrape via the tailnet-only `escurel-metrics` Consul service. |

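The liveness and readiness rows in the table above can be sketched as a small status function. This is a minimal illustration, not the gateway's code: the `Components` type, its field names, and the full JSON body shape are assumptions (the text only shows the `embedder` key for the degraded case).

```rust
/// Health of the three dependencies `/readyz` reports on.
/// Hypothetical type; field names other than `embedder` are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Components {
    pub lane_store: bool,
    pub indexer: bool,
    pub embedder: bool,
}

impl Components {
    /// Ready only when every component is up.
    pub fn is_ready(&self) -> bool {
        self.lane_store && self.indexer && self.embedder
    }

    /// HTTP status for `/readyz`: 200 when ready, 503 otherwise.
    pub fn readyz_status(&self) -> u16 {
        if self.is_ready() { 200 } else { 503 }
    }

    /// Per-component JSON body (assumed shape: one boolean per component).
    pub fn readyz_body(&self) -> String {
        format!(
            "{{\"components\":{{\"lane_store\":{},\"indexer\":{},\"embedder\":{}}}}}",
            self.lane_store, self.indexer, self.embedder
        )
    }
}

/// `/healthz` is dependency-free, so component state is ignored.
pub fn healthz_status(_components: &Components) -> u16 {
    200
}
```

With `embedder: false`, `readyz_status()` yields `503` while `healthz_status()` still yields `200`, which mirrors the degraded-embedder case described in the table.
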
Logs are structured JSON on stdout with `ts`, `level`, `msg`, `app`,
`env`, `version`, `request_id` (per [`spec/platform.md`](spec/platform.md)).
Every `/mcp` request carries an `X-Request-Id` (inbound header honoured,
-else a fresh ULID) threaded into a `mcp.request` span.
+else a fresh ULID) threaded into a `mcp.request` span (which also carries
+`transport` + `trace_id`). Each `tools/call` emits a `tool.completed`
+record adding `tenant`, `tool`, `subject`, `status`, and `duration_ms` —
+the per-call audit line.

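The request-id rule described above (honour the inbound `X-Request-Id`, else mint a fresh ULID) can be sketched as follows. The function name and the closure-based generator are hypothetical, and treating a blank header as absent is an assumption the text does not state. ULID generation itself is left to the caller so the sketch needs only the standard library.

```rust
/// Pick the request id for an `/mcp` call: honour a usable inbound
/// `X-Request-Id`, otherwise mint a fresh one via `mint`.
/// Assumption: a blank or whitespace-only header counts as absent.
pub fn resolve_request_id<F>(inbound: Option<&str>, mint: F) -> String
where
    F: FnOnce() -> String,
{
    match inbound.map(str::trim) {
        // Inbound header present and non-empty: reuse it as-is.
        Some(id) if !id.is_empty() => id.to_string(),
        // Missing or blank: fall back to the generator (a ULID in the gateway).
        _ => mint(),
    }
}
```

Passing the generator as a closure keeps the fallback lazy, so no id is minted when the caller already supplied one.
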
## The admin surface

16 changes: 8 additions & 8 deletions docs/spec/dx.md
@@ -1,6 +1,6 @@
# Downstream-app integration contract

-**Status:** Proposal. Locked items move into the table in [`README.md`](README.md#locked-design-decisions); open items live at the bottom of this file.
+**Status:** Delivered (contract honoured by `escurel-client` + `escurel-test-support`). Locked items move into the table in [`README.md`](README.md#locked-design-decisions); open items live at the bottom of this file.
**Scope:** The contract escurel commits to for *applications built on top of escurel* — specifically, what their integration test harness can rely on. The rest of the spec describes the service from the operator's and implementer's seat; this doc describes it from the seat of someone wiring escurel into another product's tests.

The motivating shape is concrete: a new application — frontend + backend — that uses escurel as its store and chains through triton (the DataZoo agent-ingress gateway) to its agents. The integration test the application's harness needs to write is:
@@ -35,14 +35,14 @@ The escurel workspace already contains the *primitives* a downstream test needs;
| Typed MCP test client | Raw JSON-RPC `POST /mcp` in `tests/mcp.rs` (`call_tool`). | `McpTestClient` in `escurel-test-support`, wrapping `escurel-client`. |
| Recipe for `escurel + X` chaining | Not present. | §"Chaining recipe" below. |

-The implementation of `escurel-test-support` and `escurel-client` is a separate milestone (see §"Implementation status"). This doc fixes the *contract* so the implementing PRs and the first consuming application can land in parallel.
+`escurel-test-support` and `escurel-client` are now implemented (see §"Implementation status"); this doc remains the *contract* both honour.

## Test-process façade

A downstream test imports one crate (`escurel-test-support` as a `dev-dependency`) and uses one type to bring escurel up. The contract is:

```rust
-// not yet implemented; this is the committed shape.
+// the shipped shape (crates/escurel-test-support).

pub struct EscurelProcess { /* opaque */ }

@@ -242,13 +242,13 @@ What it does **not** guarantee:

## Implementation status

-Not yet implemented. This document fixes the contract; the implementing milestone delivers:
+**Delivered.** All three pieces ship in the workspace:

-1. **`crates/escurel-test-support/`** — `EscurelProcess`, `Opts`, `AuthMode`, `FixtureBuilder`, `McpTestClient`. Reuses the helpers already in `tests/auth_quota.rs` and `tests/mcp.rs`.
-2. **`crates/escurel-client/`** — typed wrapper around `escurel-proto`'s tonic codegen, with HTTP and gRPC transports.
-3. **`examples/echo-app/`** (or a sibling repo) — a minimal application demonstrating the full chaining recipe above, with its `tests/e2e.rs` as the executable proof that the contract holds.
+1. **`crates/escurel-test-support/`** — `EscurelProcess`, `Opts`, `AuthMode`, `FixtureBuilder`, `McpTestClient`. Drives the gateway's own no-mock integration tests.
+2. **`crates/escurel-client/`** — typed wrapper around `escurel-proto`'s tonic codegen (exercised by `crates/escurel-client/tests/client_roundtrip.rs`).
+3. **`examples/echo-app/`** — a minimal application demonstrating the chaining recipe above, with its `tests/e2e.rs` as the executable proof that the contract holds.

-The order is `escurel-client` → `escurel-test-support` (which depends on it) → example app. The example app's `tests/e2e.rs` is the acceptance test for this contract: if it does not read roughly like the §"Chaining recipe" snippet above, the contract has drifted from the implementation and one of them needs to move.
+The dependency order is `escurel-client` → `escurel-test-support` (which depends on it) → example app. The example app's `tests/e2e.rs` is the acceptance test for this contract: if it drifts from the §"Chaining recipe" snippet above, the contract has diverged from the implementation and one of them needs to move.

## Open questions

18 changes: 14 additions & 4 deletions docs/spec/platform.md
@@ -250,10 +250,20 @@ A small set of OTel-conventional metrics:
| `escurel.storage_bytes` | gauge | `tenant`, `lane` (`markdown` / `duckdb` / `external_ducklake`) |
| `escurel.audit_drift` | gauge | `tenant`, `category` (`mn-d` markdown-not-in-duckdb, `i-no-m` indexed-but-no-markdown) |

-Exported via OTLP **and** scraped at `/metrics` on a separate
-port (default `:9090`) for Prometheus operators who don't run
-an OTLP collector. The `/metrics` endpoint is a thin Prometheus
-text-format adapter over the same OTel metrics SDK.
+Scraped at `/metrics` on a dedicated listener (default `:9090`,
+tailnet-only — see [`operations.md`](../operations.md)). The live
+gateway renders these through a Prometheus registry, so the wire
+names are `_`-separated: `escurel.tool_calls` is exposed as
+`escurel_tool_calls`, etc. Trace spans are exported via OTLP;
+metric OTLP export is not yet wired.
+
+**Implemented today:** `escurel_tool_calls`,
+`escurel_tool_latency_ms`, `escurel_live_sessions_open`, and
+`escurel_audit_drift`, plus the gateway-level `escurel_up` and
+`escurel_requests_total{route,status}`. The remaining
+histograms/gauges in the table above (`write_lock_wait_ms`,
+`embed_batch_size`, `embed_queue_depth`, `storage_bytes`) are
+**reserved** — specified here, not yet populated.

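The naming rule above (dotted spec names, `_`-separated wire names) and the implemented/reserved split can be sketched like this. The helper names and the `IMPLEMENTED` constant are hypothetical; the list simply copies the families the text says are populated today.

```rust
/// Translate a dotted spec name into its Prometheus wire name,
/// e.g. `escurel.tool_calls` becomes `escurel_tool_calls`.
pub fn prometheus_name(otel_name: &str) -> String {
    otel_name.replace('.', "_")
}

/// Wire names the text lists as implemented today.
pub const IMPLEMENTED: [&str; 6] = [
    "escurel_tool_calls",
    "escurel_tool_latency_ms",
    "escurel_live_sessions_open",
    "escurel_audit_drift",
    "escurel_up",
    "escurel_requests_total",
];

/// True when a spec-level metric name is expected to appear at `/metrics`;
/// reserved families (for example `escurel.storage_bytes`) return false.
pub fn is_populated(otel_name: &str) -> bool {
    let wire = prometheus_name(otel_name);
    IMPLEMENTED.contains(&wire.as_str())
}
```

A dashboard or alert rule written against the spec table could use a check like `is_populated` to avoid querying a reserved family that the gateway does not yet emit.
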
### Logs
