UI data contract (v0.1)

The UI must not depend on raw internal logs or ad-hoc shapes. It consumes ui-export output as the primary input. This document specifies expected folder layouts, required files, relationships, and schema version handling.

Primary input: ui-export bundle

Run:

labtrust ui-export --run <dir> --out <ui_bundle.zip>

The bundle contains normalized, UI-ready JSON:

| File | Description |
| --- | --- |
| index.json | Episodes, tasks, baselines; file refs (results path, log path, receipts path) per episode. When the run dir contains coordination pack or lab report output, includes `coordination_artifacts`: list of `{ path, label }`; those files are also included in the zip under `coordination/`. |
| events.json | All step outcomes in one array: normalized gate fields (status, blocked_reason_code, violations, emits, token_consumed, t_s, agent_id, action_type, event_id). May be chunked by episode in a future version. |
| receipts_index.json | List of receipt locations: task/label → path and list of receipt filenames (e.g. receipt_specimen_S1.v0.1.json). |
| reason_codes.json | Full reason code registry (code → namespace, severity, description, etc.) so the UI does not parse policy YAML. |

Acceptance: UI can depend on ui-export output as primary input, not raw internal logs.
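
As an illustration of consuming the bundle, the following is a minimal sketch of a loader. It assumes the four JSON files sit at the root of the zip under the names listed above; this document does not state the member paths, and `load_ui_bundle` is a hypothetical helper name, not part of labtrust.

```python
import json
import zipfile

# Assumption: the four files are stored at the zip root under these names.
BUNDLE_FILES = ("index.json", "events.json", "receipts_index.json", "reason_codes.json")


def load_ui_bundle(source):
    """Read the four normalized JSON files from a ui-export zip.

    `source` is a filesystem path or a binary file-like object.
    Returns a dict keyed by file name without the .json suffix.
    """
    bundle = {}
    with zipfile.ZipFile(source) as zf:
        names = set(zf.namelist())
        missing = [name for name in BUNDLE_FILES if name not in names]
        if missing:
            raise ValueError("bundle is missing: " + ", ".join(missing))
        for name in BUNDLE_FILES:
            key = name[: -len(".json")]
            bundle[key] = json.loads(zf.read(name).decode("utf-8"))
    return bundle
```

Failing fast on a missing file keeps the UI from rendering a partially loaded bundle.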

Expected folder layouts (run directories)

--run <dir> accepts either a labtrust_runs run or a package-release output directory.

1. labtrust_runs (quick-eval)

Typical path: labtrust_runs/quick_eval_YYYYMMDD_HHMMSS/.

| Path | Description |
| --- | --- |
| `throughput_sla.json`, `adversarial_disruption.json`, `multi_site_stat.json` | Results files (schema results.v0.2). One file per task; each may contain multiple episodes. |
| `logs/throughput_sla.jsonl`, `logs/adversarial_disruption.jsonl`, `logs/multi_site_stat.jsonl` | Episode log (JSONL): one line per step; same order as steps in run. |
| `summary.md` | Human-readable summary (optional). |

Relationships:

For each X.json in run root, there may be logs/X.jsonl (task id derived from filename: e.g. throughput_sla.json → task throughput_sla).
Episodes in X.json are ordered, and logs/X.jsonl records the steps of the same run in the same order (when num_episodes is 1, the whole JSONL is one episode).
No receipts directory in plain quick-eval; receipts_index.json in the ui-export will be empty or omit this run’s receipts.
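
The filename-based pairing above can be sketched as follows. This is an illustration only, not the ui-export implementation: it assumes every `*.json` file in the run root is a results file holding an `episodes` array, and `discover_quick_eval_tasks` is a hypothetical name.

```python
import json
from pathlib import Path


def discover_quick_eval_tasks(run_dir):
    """Map task id -> results ref, log ref (or None), and episode count."""
    run_dir = Path(run_dir)
    tasks = {}
    for results_path in sorted(run_dir.glob("*.json")):
        task = results_path.stem  # throughput_sla.json -> throughput_sla
        with results_path.open(encoding="utf-8") as fh:
            results = json.load(fh)
        episodes = results.get("episodes", []) if isinstance(results, dict) else []
        has_log = (run_dir / "logs" / f"{task}.jsonl").is_file()
        tasks[task] = {
            "results_ref": results_path.name,
            # The log is optional, so a missing file maps to None.
            "log_ref": f"logs/{task}.jsonl" if has_log else None,
            "num_episodes": len(episodes),
        }
    return tasks
```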

2. package-release (paper_v0.1)

Typical path: <out>/ from labtrust package-release --profile paper_v0.1 --out <out>.

| Path | Description |
| --- | --- |
| `_baselines/` | Official baselines: `results/*.json`, `summary.csv`, `summary.md`, `metadata.json`. |
| `_study/` | Study run: `manifest.json`, `results/`, `logs/` (per condition), `figures/`. |
| `_repr/<task>/` | Representative run per task: `episodes.jsonl`, `results.json`. |
| `receipts/<task>/` | Receipts and EvidenceBundle.v0.1 per task (e.g. `receipts/throughput_sla/EvidenceBundle.v0.1/`, `receipts/throughput_sla/receipt_*.v0.1.json`). |
| `FIGURES/`, `TABLES/` | Plots and summary tables. |
| `metadata.json`, `RELEASE_NOTES.md` | Run metadata. |

Relationships:

Episodes / tasks: From _repr/<task>/results.json and _repr/<task>/episodes.jsonl; from _baselines/results/*.json (task from filename); from _study/results/ and _study/logs/ (condition_ids from manifest).
Receipts: For each receipts/<task>/, list EvidenceBundle.v0.1/*.json and any receipt_*.v0.1.json in receipts/<task>/; link to episode by task (and optionally condition_id for study).
Baselines: From _baselines/results/*.json and _baselines/metadata.json; baseline names from metadata or filenames.

Inferring relationships:

Task → results: results.json or <TaskName>.json; schema version in schema_version (e.g. 0.2).
Task → log: Same directory as results: episodes.jsonl or logs/<TaskName>.jsonl.
Task → receipts: receipts/<task>/; receipt files match receipt_*.v0.1.json or live inside EvidenceBundle.v0.1/.
Event → episode: Events in events.json can carry episode_key (e.g. task + episode_index) so UI can group by episode.
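
The Task → receipts rule can be sketched as a directory scan that yields entries in the shape of receipts_index.json. This is a hedged illustration: `build_receipts_entries` is a hypothetical name, and recording EvidenceBundle files with their `EvidenceBundle.v0.1/` prefix is an assumption rather than documented ui-export behavior.

```python
from pathlib import Path


def build_receipts_entries(run_dir):
    """Scan receipts/<task>/ and return receipts_index-style entries."""
    receipts_root = Path(run_dir) / "receipts"
    entries = []
    if not receipts_root.is_dir():
        return entries  # e.g. plain quick-eval runs have no receipts
    for task_dir in sorted(p for p in receipts_root.iterdir() if p.is_dir()):
        # Receipt files directly inside receipts/<task>/.
        files = sorted(p.name for p in task_dir.glob("receipt_*.v0.1.json"))
        bundle_dir = task_dir / "EvidenceBundle.v0.1"
        if bundle_dir.is_dir():
            # Assumption: bundle files are listed with their subdirectory.
            files += sorted(
                f"EvidenceBundle.v0.1/{p.name}" for p in bundle_dir.glob("*.json")
            )
        entries.append(
            {
                "task": task_dir.name,
                "path": f"receipts/{task_dir.name}",
                "receipt_files": files,
            }
        )
    return entries
```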

3. Run dirs with coordination pack or lab report

When --run <dir> is a directory that contains coordination security pack output or a lab report (e.g. from labtrust run-coordination-security-pack plus labtrust build-lab-coordination-report, or labtrust run-official-pack --include-coordination-pack which writes into coordination_pack/), ui-export scans for these artifacts and adds them to the bundle.

| Path (relative to run dir) | Description |
| --- | --- |
| `pack_summary.csv` | One row per cell (scale x method x injection). |
| `pack_gate.md` | PASS/FAIL/not_supported per cell. |
| `SECURITY/coordination_risk_matrix.csv`, `.md` | Method x injection x phase outcomes. |
| `LAB_COORDINATION_REPORT.md` | Single lab report with scope, decision, artifact table. |
| `COORDINATION_DECISION.v0.1.json`, `.md` | Chosen method per scale. |
| `summary/sota_leaderboard.md` | SOTA leaderboard (main): compact table with key metrics and optional run metadata. |
| `summary/sota_leaderboard_full.md` | SOTA leaderboard (full metrics): all aggregated numeric columns. |
| `summary/sota_leaderboard_full.csv` | Full leaderboard in CSV form for programmatic use. |
| `summary/method_class_comparison.md` | Method-class comparison (throughput, violations, blocks, resilience, attack_success_rate, stealth, n_cells). |
| `graphs/sota_key_metrics.html` | Primary chart: key metrics by method (throughput, resilience, safety, security) in one view. |
| `graphs/throughput_by_method.html` | Throughput (mean) by method. |
| `graphs/violations_by_method.html` | Violations (mean) by method. |
| `graphs/resilience_by_method.html` | Resilience score by method. |
| `graphs/method_class_comparison.html` | Method-class comparison (throughput and resilience by class). |

When present, index.json includes coordination_artifacts: a list of { "path": "<rel>", "label": "..." } for each found file. Graph HTML files are generated at export time from pack_summary.csv (or summary_coord.csv) and included under coordination/graphs/. Paths may be under coordination_pack/ when the run is an official pack with --include-coordination-pack. The same files are included in the zip under the prefix coordination/ (e.g. coordination/pack_summary.csv, coordination/summary/sota_leaderboard_full.md) so the UI can link to or load them without reading the raw run dir.
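
A UI might resolve each coordination artifact to its zip member along these lines. This is a sketch under a loud assumption: this document does not state how a `coordination_pack/` path maps into the zip, so the code guesses that the prefix is dropped before `coordination/` is prepended. `coordination_members` is a hypothetical helper.

```python
def coordination_members(index, zip_names):
    """Return (label, zip_member) pairs for artifacts present in the zip."""
    names = set(zip_names)
    found = []
    # coordination_artifacts is optional, so default to an empty list.
    for artifact in index.get("coordination_artifacts", []):
        rel = artifact["path"]
        # Assumption: official-pack paths lose their coordination_pack/ prefix.
        if rel.startswith("coordination_pack/"):
            rel = rel[len("coordination_pack/"):]
        member = f"coordination/{rel}"
        if member in names:
            found.append((artifact["label"], member))
    return found
```

Artifacts whose member cannot be found are skipped rather than raising, so a link list in the UI only shows files that can actually be opened.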

SOTA leaderboard (main and full) and method-class comparison

Main leaderboard (summary/sota_leaderboard.md, .csv): Single table with the most important hospital-lab metrics per method: throughput_mean, throughput_std, violations_mean, blocks_mean, resilience_score_mean, resilience_score_std, p95_tat_mean, on_time_rate_mean, critical_compliance_mean, attack_success_rate_mean, stealth_success_rate_mean, n_cells. When pack_manifest.json exists, the Markdown includes a Run metadata line (seed_base, git_sha) at the top.
Full leaderboard (summary/sota_leaderboard_full.md, .csv): All aggregated numeric metrics per method; columns depend on the data source (pack_summary vs summary_coord). Use for detailed analysis (security detection/containment, comm, LLM economics). See Hospital lab key metrics.
Method-class comparison (summary/method_class_comparison.md, .csv): Same metrics aggregated by coordination class (e.g. kernel_schedulers, centralized, llm), including blocks_mean and attack_success_rate_mean. The UI may show the main leaderboard by default and link to the full leaderboard and method-class comparison for drill-down.
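
For programmatic use of the leaderboard CSV, a ranking helper could look like the sketch below. It assumes the CSV has a `method` column next to the metric columns named above; the exact header is not specified in this document, and `rank_methods` is a hypothetical name.

```python
import csv
import io


def rank_methods(csv_text, metric="throughput_mean", descending=True):
    """Return (method, value) pairs sorted by one numeric leaderboard column."""
    scored = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            value = float(row[metric])
        except (KeyError, TypeError, ValueError):
            # Columns depend on the data source, so skip rows without the metric.
            continue
        # Assumption: the method name lives in a column called "method".
        scored.append((row.get("method", ""), value))
    return sorted(scored, key=lambda item: item[1], reverse=descending)
```

For metrics where lower is better, such as violations_mean, pass `descending=False`.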

Coordination graphs (UI bundle)

When the run contains pack_summary.csv (or summary_coord.csv), ui-export generates self-contained HTML charts (Chart.js via CDN) and adds them to the bundle under coordination/graphs/:

Primary: graphs/sota_key_metrics.html, a single chart showing four normalized metrics (throughput, resilience, safety, security) per method for at-a-glance comparison.
Additional: graphs/throughput_by_method.html, graphs/violations_by_method.html, graphs/resilience_by_method.html, graphs/method_class_comparison.html for single-metric and method-class views.

Required files and how to infer relationships

| Need | Source |
| --- | --- |
| List of tasks | From result filenames (e.g. `throughput_sla.json`) or from `_repr/`, `_baselines/results/`, `_study/results/`. |
| Episodes per task | From `results.json` / `TaskX.json` → `episodes` array; length = number of episodes. |
| Step-level outcomes | From episode log JSONL; each line = one step. ui-export normalizes these into `events.json` with stable field names. |
| Receipts per task | From `receipts/<task>/` and `EvidenceBundle.v0.1/` contents; list in `receipts_index.json`. |
| Reason code labels | From `reason_codes.json` (exported from policy); key = code, value = { namespace, severity, description, ... }. |

index.json (logical shape):

ui_bundle_version: string (e.g. "0.1"). Always present.
run_type: "quick_eval" | "package_release" | "full_pipeline". Always present.
tasks: list of task ids. Always present (may be empty).
episodes: list of episode objects. Always present (may be empty).
baselines: list of baseline ids. Always present (may be empty).
coordination_artifacts (optional): list of { "path": "<rel>", "label": "..." } when run dir contains pack_summary.csv, LAB_COORDINATION_REPORT.md, or related files; paths are relative to run dir; files are also in the zip under coordination/.
pipeline_mode, llm_backend_id, llm_model_id, allow_network (optional): present when run is from official pack or full pipeline.
receipts_note (optional): present for full_pipeline when there are no receipts (explains why receipts_index is empty).
coord_telemetry (optional): present when episode logs have coord_decisions.jsonl.
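
The always-present fields above translate into a small structural check. The following is a minimal sketch, not a complete schema validator; `validate_index` is a hypothetical name and the error strings are illustrative.

```python
RUN_TYPES = {"quick_eval", "package_release", "full_pipeline"}
REQUIRED_LISTS = ("tasks", "episodes", "baselines")


def validate_index(index):
    """Return a list of problems found in index.json (empty when valid)."""
    errors = []
    if not isinstance(index.get("ui_bundle_version"), str):
        errors.append("ui_bundle_version must be a string")
    run_type = index.get("run_type")
    if not isinstance(run_type, str) or run_type not in RUN_TYPES:
        errors.append("run_type must be one of " + ", ".join(sorted(RUN_TYPES)))
    for key in REQUIRED_LISTS:
        # These lists are always present but may be empty.
        if not isinstance(index.get(key), list):
            errors.append(f"{key} must be a list")
    # Optional field: only checked when the key is present.
    for i, artifact in enumerate(index.get("coordination_artifacts", [])):
        ok = (
            isinstance(artifact, dict)
            and isinstance(artifact.get("path"), str)
            and isinstance(artifact.get("label"), str)
        )
        if not ok:
            errors.append(f"coordination_artifacts[{i}] needs string path and label")
    return errors
```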

Episode object (each entry in episodes):

task: string. Required.
episode_index: number. Required.
episode_key: string (e.g. "<task>_<episode_index>"). Optional but emitted by backend.
results_ref: string (path relative to run dir). Required.
log_ref: string or null (path to episode log JSONL, or null when no log). Must accept null for full_pipeline and quick_eval without logs.
receipts_ref: string or null (path to receipts dir, or null). Must accept null for runs without receipts.

Frontend validation: The UI bundle loader must treat log_ref and receipts_ref as optional or nullable (string | null). Do not require them to be non-empty strings, or validation will fail for bundles from full_pipeline or LLM live official pack runs.
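
The nullable handling described above can be sketched as follows. `validate_episode` is a hypothetical helper; treating an absent `log_ref` or `receipts_ref` key the same as null is an assumption based on the "optional or nullable" wording.

```python
def validate_episode(episode):
    """Return a list of problems with one episode object (empty when valid)."""
    errors = []
    if not isinstance(episode.get("task"), str):
        errors.append("task must be a string")
    idx = episode.get("episode_index")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(idx, bool) or not isinstance(idx, (int, float)):
        errors.append("episode_index must be a number")
    if not isinstance(episode.get("results_ref"), str):
        errors.append("results_ref must be a string")
    for key in ("log_ref", "receipts_ref"):
        value = episode.get(key)  # a missing key is treated like null
        if value is not None and not isinstance(value, str):
            errors.append(f"{key} must be a string or null")
    if "episode_key" in episode and not isinstance(episode["episode_key"], str):
        errors.append("episode_key must be a string when present")
    return errors
```

Note that the check never rejects a null or empty `log_ref` or `receipts_ref`, which is what lets full_pipeline bundles pass.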

events.json:

Array of normalized events; each has: t_s, agent_id, action_type, status, blocked_reason_code, emits, violations, token_consumed, event_id (if present), and optional episode_key / task / episode_index for grouping.
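
Grouping by the optional fields might look like the sketch below. It prefers `episode_key` and falls back to `task` plus `episode_index`, using the "<task>_<episode_index>" form given as an example in this document. The helper names are hypothetical, and events without any grouping field land under the key `None`.

```python
from collections import defaultdict


def episode_key_for(event):
    """Pick a grouping key for one event, or None when none is available."""
    key = event.get("episode_key")
    if key:
        return key
    task = event.get("task")
    idx = event.get("episode_index")
    if task is not None and idx is not None:
        return f"{task}_{idx}"
    return None


def group_events_by_episode(events):
    """Group events by episode while preserving their original order."""
    groups = defaultdict(list)
    for event in events:
        groups[episode_key_for(event)].append(event)
    return dict(groups)
```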

receipts_index.json:

Array of { "task", "path", "receipt_files": [...] }; path is relative to run or bundle root; receipt_files are filenames (e.g. receipt_specimen_S1.v0.1.json).
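
A lookup from task to full receipt paths can be derived from these entries. This is a sketch assuming `path` and the filenames join with forward slashes; `receipt_paths_by_task` is a hypothetical name.

```python
import posixpath


def receipt_paths_by_task(receipts_index):
    """Map task -> list of receipt paths built from path + receipt_files."""
    by_task = {}
    for entry in receipts_index:
        paths = [
            posixpath.join(entry["path"], name)
            for name in entry.get("receipt_files", [])
        ]
        # A task may appear in more than one entry, so extend, not replace.
        by_task.setdefault(entry["task"], []).extend(paths)
    return by_task
```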

reason_codes.json:

{ "version": "0.1", "codes": { "<code>": { "namespace", "severity", "description", ... } } }. Same shape as registry; UI uses it for display and validation.
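
For display, a UI could resolve a `blocked_reason_code` against this registry as sketched below. The fallback for codes missing from the registry is an illustrative choice, not a documented rule, and `describe_reason` is a hypothetical name.

```python
def describe_reason(reason_codes, code):
    """Return display info for a reason code, or None when code is None."""
    if code is None:
        return None  # the step was not blocked
    entry = reason_codes.get("codes", {}).get(code)
    if entry is None:
        # Unknown code: show the raw code rather than failing.
        return {"code": code, "known": False, "severity": None, "description": code}
    return {
        "code": code,
        "known": True,
        "severity": entry.get("severity"),
        "description": entry.get("description", code),
    }
```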

Schema version handling rules

UI bundle schema: The ui-export output (index, events, receipts_index, reason_codes) is versioned. Current version: 0.1. index.json carries a top-level ui_bundle_version so the UI can reject unknown versions.
Results: Results files follow results.v0.2 (or v0.3). UI must read schema_version, accept either version, and ignore extra fields; do not assume fields beyond the contract.
Receipts: Receipt files follow receipt.v0.1; EvidenceBundle follows evidence_bundle_manifest.v0.1. UI must not rely on internal log shapes—only on ui-export’s receipts_index.json and the receipt schema for displayed fields.
Extensible only: New schema versions (e.g. results.v0.3) add optional fields only; required fields and semantics of v0.2 remain. UI should be tolerant of missing optional fields.
Stable field names: Normalized gate outcomes in events.json use fixed names (status, blocked_reason_code, violations, emits, token_consumed). New gate fields are added as optional keys; existing keys are not renamed or removed in v0.1.
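
The version rules above could be enforced with checks like the following sketch. It assumes `schema_version` holds a value such as "0.2" (string or number); the function names and the sets of supported versions are illustrative.

```python
SUPPORTED_UI_BUNDLE_VERSIONS = {"0.1"}
SUPPORTED_RESULTS_VERSIONS = {"0.2", "0.3"}


def check_ui_bundle_version(index):
    """Return the bundle version, or raise ValueError for unknown versions."""
    version = index.get("ui_bundle_version")
    if not isinstance(version, str) or version not in SUPPORTED_UI_BUNDLE_VERSIONS:
        raise ValueError(f"unsupported ui_bundle_version: {version!r}")
    return version


def results_version_supported(results):
    """True when a results file declares a schema version the UI accepts."""
    # str() lets a numeric 0.2 and the string "0.2" compare the same way.
    return str(results.get("schema_version")) in SUPPORTED_RESULTS_VERSIONS
```

Extra fields in a results file are simply never read, which is how the sketch stays tolerant of newer optional keys.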

Summary

Run layouts: labtrust_runs (quick_eval_*) and package-release (paper_v0.1) are the two base run directory shapes; run dirs that contain coordination pack or lab report output additionally yield coordination_artifacts.
Relationships: Task → results file, task → log file, task → receipts dir; episodes from results episodes array; steps from JSONL → normalized into events.json.
Schema rules: UI bundle v0.1; results v0.2/v0.3 extensible only; stable event field names; reason_codes and receipts_index supplied so UI does not parse policy or raw logs.
Acceptance: UI uses ui-export output as primary input; raw internal logs are not part of the UI contract.
