Status: Exploratory / Draft

This repository is active design work and subject to significant rewriting. It captures an evolving specification for packaging research evidence as an RO-Crate profile. Nothing here should be treated as stable.
A specification and tooling for creating evidence bundles: self-contained data packages that combine a single research observation with its visualization, underlying data, methods, and machine-readable metadata.
Evidence bundles are defined as a profile of RO-Crate 1.2. All metadata lives in a single ro-crate-metadata.json, combining standard RO-Crate packaging with the dge: (Discourse Graph Evidence) vocabulary for domain-specific scientific semantics (observable, method, system, evidence statement).
This means every evidence bundle is a valid RO-Crate — interoperable with the FAIR ecosystem (Zenodo, DataCite, Google Dataset Search) — while also carrying the epistemological structure of a discourse graph finding.
Design discussion and open questions live in the discourse graph:
A complete, validated evidence bundle conforming to the v0.2 spec:
- EVD 1: Issue-to-experiment conversion rate — reference implementation with all required and recommended elements
This bundle demonstrates concrete choices for every spec field (observable, method, system, provenance, metrics) and passes validate_bundle(). Use it as a template when creating new bundles.
Working examples produced from the MATSUlab discourse graph analysis:
- MATSUlab evidence bundles (EVD1, EVD5, EVD6, EVD7)
Note: these predate the v0.2 spec and use the older dual-file format (evidence.jsonld + ro-crate-metadata.json).

    spec/                                      # NORMATIVE: the spec and its validation schema
      evidence-bundle-spec.md                  # Evidence Bundle RO-Crate Profile spec (v0.2 draft)
      evidence-bundle.schema.json              # JSON Schema for ro-crate-metadata.json validation
    design/                                    # RATIONALE: framing and motivation, non-normative
      phoenix-architecture-mapping.md          # Phoenix Architecture mapping (why the spec is shaped this way)
      phoenix-evidence-bundle-mapping.svg      # Visual diagram of the Phoenix Control Loop mapping
    examples/                                  # REFERENCE IMPLEMENTATION
      evidence_bundles/
        evd1-conversion-rate/                  # Reference bundle (v0.2 conformant)
    src/                                       # tooling
      evidence_bundle.py                       # Generic packaging library (BundleManifest + validation)
      create_evidence_bundle.py                # Per-EVD manifest builders for MATSUlab bundles
    .claude/commands/
      bundle-evidence.md                       # Claude Code skill for AI-orchestrated bundle creation

The spec/ folder is the source of truth for what an evidence bundle is. The design/ folder explains why the spec is shaped the way it is — read it for motivation, not for format details.
Evidence bundles are RO-Crate 1.2 packages with the dge: vocabulary for scientific semantics:
| Field | Description |
|---|---|
| Observation statement | Past-tense finding (dge:evidenceStatement on Root Data Entity) |
| Data artifact | At least one visualization (figure file entity with caption) |
| Observable | What was measured (dge:Observable contextual entity) |
| Model system | The system or dataset studied (dge:System contextual entity) |
| Method | How it was measured (dge:Method contextual entity with instrument refs) |
| Provenance | CreateAction entity with Person + SoftwareApplication agents |
| Summary metrics | variableMeasured with PropertyValue entities |
| Grounding context | Caveats, limitations, interpretive context (in README) |
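
To make the table concrete, here is a rough sketch of how these elements might sit together in a single ro-crate-metadata.json. It is an illustration under assumptions, not the normative layout: the dge: namespace IRI (`DGE_NS`), the entity identifiers, the file name `figure.png`, and all text values are placeholders invented for this example, and the properties that link the Root Data Entity to its observable, system, and method are left out because the spec defines them. Consult spec/evidence-bundle-spec.md for the actual field names.

```python
import json

# ASSUMPTION: placeholder IRI. The real dge: namespace is defined by the spec.
DGE_NS = "https://example.org/dge#"


def build_metadata(statement: str) -> dict:
    """Assemble an illustrative RO-Crate style JSON-LD document as a plain dict."""
    return {
        "@context": ["https://w3id.org/ro/crate/1.2/context", {"dge": DGE_NS}],
        "@graph": [
            {   # Metadata descriptor that points at the Root Data Entity
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "conformsTo": {"@id": "https://w3id.org/ro/crate/1.2"},
                "about": {"@id": "./"},
            },
            {   # Root Data Entity carrying the observation statement
                "@id": "./",
                "@type": "Dataset",
                "dge:evidenceStatement": statement,
                "hasPart": [{"@id": "figure.png"}],
                "variableMeasured": [{"@id": "#metric-1"}],
            },
            # Data artifact: a figure file entity with a caption (placeholder text)
            {"@id": "figure.png", "@type": "File",
             "description": "Placeholder caption."},
            # Contextual entities; names are placeholders, linking properties omitted
            {"@id": "#observable", "@type": "dge:Observable",
             "name": "Placeholder observable"},
            {"@id": "#system", "@type": "dge:System",
             "name": "Placeholder system"},
            {"@id": "#method", "@type": "dge:Method",
             "name": "Placeholder method"},
            {   # Provenance: who and what produced the bundle
                "@id": "#create",
                "@type": "CreateAction",
                "agent": {"@id": "#person"},
                "instrument": {"@id": "#software"},
                "result": {"@id": "./"},
            },
            {"@id": "#person", "@type": "Person", "name": "Placeholder author"},
            {"@id": "#software", "@type": "SoftwareApplication",
             "name": "Placeholder tool"},
            # Summary metric as a PropertyValue (placeholder value)
            {"@id": "#metric-1", "@type": "PropertyValue",
             "name": "Placeholder metric", "value": "..."},
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_metadata("Placeholder past-tense finding."), indent=2))
```

Grounding context is not represented here because, per the table, it lives in the README rather than in the metadata file.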
A BundleManifest dataclass and generic packaging functions:

    from evidence_bundle import BundleManifest, create_evidence_bundle, validate_bundle

    # Build a manifest (all prose pre-composed)
    manifest = BundleManifest(
        evd_number=8,
        slug="my-finding",
        evidence_statement="...",
        ...
    )

    # Package it
    bundle_dir = create_evidence_bundle(manifest, output_dir)

    # Validate
    errors = validate_bundle(bundle_dir)

CLI:

    python src/evidence_bundle.py validate path/to/bundle/
    python src/evidence_bundle.py create manifest.json --output-dir output/

A /bundle-evidence slash command that orchestrates the full workflow: discover artifacts, compose prose (captions, descriptions, grounding context), build a manifest, call the Python library, and validate.
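
For readers who want a feel for what the validation step might involve, the following is a minimal standalone sketch using only the standard library. It is not the repository's validate_bundle() and makes no claim about how that function works. The helper names (`check_metadata`, `check_bundle_dir`) and the list of required types are assumptions loosely derived from the required-elements table, so treat it as a rough approximation of the idea rather than the real checks.

```python
import json
from pathlib import Path

# ASSUMPTION: entity types inferred from the required-elements table,
# not taken from the actual schema in spec/.
REQUIRED_TYPES = ("dge:Observable", "dge:System", "dge:Method", "CreateAction")


def _types(entity: dict) -> set:
    """Return an entity's @type as a set, whether it is a string or a list."""
    value = entity.get("@type", [])
    return {value} if isinstance(value, str) else set(value)


def check_metadata(doc: dict) -> list:
    """Hypothetical structural check; returns human-readable error strings."""
    graph = doc.get("@graph")
    if not isinstance(graph, list):
        return ["missing @graph array"]

    entities = [e for e in graph if isinstance(e, dict)]
    errors = []

    # The Root Data Entity should exist and carry the observation statement.
    root = next((e for e in entities if e.get("@id") == "./"), None)
    if root is None:
        errors.append("missing Root Data Entity './'")
    elif not root.get("dge:evidenceStatement"):
        errors.append("Root Data Entity lacks dge:evidenceStatement")

    # Each required type should appear on at least one entity.
    present = set()
    for entity in entities:
        present |= _types(entity)
    for required in REQUIRED_TYPES:
        if required not in present:
            errors.append(f"no entity of type {required}")
    return errors


def check_bundle_dir(bundle_dir) -> list:
    """Load ro-crate-metadata.json from a bundle directory and check it."""
    path = Path(bundle_dir) / "ro-crate-metadata.json"
    if not path.is_file():
        return [f"{path.name} not found"]
    return check_metadata(json.loads(path.read_text(encoding="utf-8")))
```

Returning a list of messages instead of raising on the first problem lets a caller report every issue in one pass. An empty list means this sketch found nothing wrong, which is weaker than conformance to the v0.2 spec.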
Scientific papers are compiled artifacts. The actual sources of scientific knowledge are individual observations — the data, the methods, the grounding context. Evidence bundles make this explicit: they are the modular components that get compiled into scientific arguments, not the other way around.
This follows Chad Fowler's Phoenix Architecture: just as specifications are the durable asset in software (not code), evidence bundles are the durable asset in science (not papers). Update a bundle; the system knows which claims are affected. Rewrite the paper; the evidence persists. See design/phoenix-architecture-mapping.md for the full mapping and design/phoenix-evidence-bundle-mapping.svg for a visual diagram.