Trusted Autonomy T&E assurance-case runtime for evidence-backed AI/autonomous system evaluation.
IX-Autonomy-Assurance-Case-Runtime is a local, open-source reference runtime for making AI/autonomy evaluation artifacts traceable, reviewable, bounded, evidence-backed, and defensible.
It is aimed at a practical assurance gap:
When an AI/autonomous system acts, can the action be traced back through mission need, requirement, scenario, hazard/control, runtime telemetry, safety-gate decision, evidence bundle, assurance claim, human authority, ledger, and exportable review package?
This repository builds that chain as deterministic Python records, validators, readiness gates, rollups, and tests.
This repository is the official source for IX-Autonomy-Assurance-Case-Runtime.
Do not trust unofficial mirrors, ZIP files, installer downloads, or rehosted copies unless provenance is independently verified.
This project is an original work by Bryce Lovell and is licensed under the Apache License, Version 2.0.
This is a local reference implementation for Trusted Autonomy test, evaluation, assurance-case traceability, and review workflow experiments.
It provides:
- assurance-case domain records,
- mission-thread and scenario modeling,
- deterministic evidence bundles,
- traceability graph validation,
- runtime safety-gate decisions,
- human authority and review workflow records,
- degradation and monitoring records,
- telemetry source/schema/adapter records,
- scenario campaign planning and run reports,
- policy pack and waiver evidence checks,
- framework crosswalk and evidence coverage checks,
- signed provenance records and verification posture,
- export package and redaction validation,
- assurance dossier trace-closure checks,
- claim guardrails to prevent overstatement,
- federal/IC/DoD-style evaluation profile mapping,
- prototype maturity and readiness rollups,
- command-line interface,
- JSON schemas,
- synthetic examples,
- test coverage for the local runtime model.
This is not:
- an official government system,
- an official federal, IC, DoD, or agency-endorsed tool,
- a certified safety system,
- a production autonomy controller,
- a weapons-control system,
- a cybersecurity authorization package,
- an authority-to-operate package,
- a classified-data workflow,
- a procurement-acceptance package,
- a deployment-readiness claim,
- a replacement for formal test and evaluation,
- a replacement for legal, safety, security, or acquisition review,
- a replacement for human command authority,
- a claim that any real autonomous system is safe.
A passing report means the local runtime artifacts passed the implemented checks. It does not certify, authorize, approve, endorse, deploy, or accept a real-world system.
The runtime is designed around this review chain:
mission need
-> requirement
-> scenario
-> hazard / control
-> runtime telemetry
-> degradation / monitoring posture
-> policy and safety-gate decision
-> evidence bundle
-> provenance record
-> assurance claim
-> human review / authority state
-> ledger / report / export package
-> assurance dossier / evaluation profile
The goal is not to make autonomy look trustworthy by assertion. The goal is to force the relevant records, links, checks, and limitations into reviewable form.
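The review chain above can be illustrated as linked records with a closure check. This is a hedged sketch only: `TraceLink` and `trace_closed` are hypothetical names for illustration, not the repo's actual API.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical record type for illustration; not the repo's actual API.
@dataclass(frozen=True)
class TraceLink:
    record_id: str
    kind: str                   # e.g. "mission-need", "requirement", "scenario"
    upstream_id: Optional[str]  # the record this one traces back to (None = root)

def trace_closed(links: List[TraceLink]) -> bool:
    """True when exactly one root exists and every other record
    points at an upstream record that is present in the chain."""
    ids = {link.record_id for link in links}
    roots = [link for link in links if link.upstream_id is None]
    non_roots = [link for link in links if link.upstream_id is not None]
    return len(roots) == 1 and all(link.upstream_id in ids for link in non_roots)

chain = [
    TraceLink("MN-001", "mission-need", None),
    TraceLink("REQ-001", "requirement", "MN-001"),
    TraceLink("SCN-NAV-001", "scenario", "REQ-001"),
    TraceLink("CLM-NAV-001", "assurance-claim", "SCN-NAV-001"),
]
print(trace_closed(chain))  # True: every record traces back to the mission need
```

A dangling upstream reference (a claim pointing at a record that was never filed) makes the chain fail closure, which is the reviewable failure mode the runtime is meant to surface.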
The repo includes a local maturity gate in prototype_target.py, prototype_readiness.py, and prototype_rollup.py.
The current model uses:
- 40% baseline local reference-runtime maturity before the serious-prototype capability families are complete,
- 80% serious open-source prototype target when the original nine required capability families are complete,
- 100% local prototype maturity ceiling when all twelve local capability families are complete.
These percentages are internal project maturity markers only. They are not certification, authority to operate, operational deployment readiness, procurement acceptance, agency acceptance, or official endorsement.
The original serious-prototype target is reached by the following nine capability families:
registry-layer, policy-pack-engine, framework-crosswalks, signed-provenance, telemetry-adapters, scenario-campaign-runner, monitoring-incidents, review-workflow, audit-report-export.
The extended local model adds three hardening families:
assurance-dossier, claim-guardrails, federal-evaluation-profile.
When all twelve are complete, the local prototype model can roll up to 100% local prototype maturity while still refusing claims of certification, authority, field readiness, procurement acceptance, or official agency acceptance.
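The 40%/80%/100% model described above can be sketched as a small rollup function. The family identifiers follow this README; `local_maturity` is an illustrative name, not the repo's actual rollup API.

```python
# Capability family names as listed in this README.
CORE_NINE = {
    "registry-layer", "policy-pack-engine", "framework-crosswalks",
    "signed-provenance", "telemetry-adapters", "scenario-campaign-runner",
    "monitoring-incidents", "review-workflow", "audit-report-export",
}
HARDENING_THREE = {
    "assurance-dossier", "claim-guardrails", "federal-evaluation-profile",
}

def local_maturity(completed: set) -> int:
    """Local prototype maturity marker only -- never certification,
    authority to operate, or deployment readiness."""
    if (CORE_NINE | HARDENING_THREE) <= completed:
        return 100  # all twelve local capability families complete
    if CORE_NINE <= completed:
        return 80   # serious open-source prototype target
    return 40       # baseline local reference-runtime maturity

print(local_maturity(set()))                        # 40
print(local_maturity(CORE_NINE))                    # 80
print(local_maturity(CORE_NINE | HARDENING_THREE))  # 100
```

Note that completing only the three hardening families without the core nine still rolls up to the 40% baseline, matching the model's ordering.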
Each capability family pairs domain records with readiness checks:
registry-layer:
Models systems, models, use cases, deployments, lifecycle state, risk tiers, telemetry source references, and evidence bundle references.
Readiness checks verify catalog consistency, required evidence, lifecycle posture, and whether the registry-layer capability can count toward prototype maturity.
policy-pack-engine:
Models policy packs, rules, decisions, subject/action categories, authority requirements, waivers, and waiver evidence.
Readiness checks evaluate policy requests, denial conditions, review/waiver requirements, waiver evidence coverage, and whether the policy-pack-engine capability is complete.
framework-crosswalks:
Models framework objectives, control mappings, coverage status, expected artifact types, and evidence expectations.
Evidence coverage checks verify referenced bundles and expected evidence kinds. Readiness checks prevent framework alignment from being treated as official compliance or endorsement.
signed-provenance:
Models artifact digests, signer identities, signatures, attestations, manifest verification, and provenance readiness.
Readiness checks require verified, audit-facing artifact provenance and preserve the boundary between local provenance and external trust authority.
telemetry-adapters:
Models telemetry sources, schemas, schema fields, replay records, normalized envelopes, adapter policies, and adapter reports.
Readiness checks require at least one accepted runtime-usable normalized envelope and validate that telemetry can support reviewable runtime evaluation.
scenario-campaign-runner:
Models scenario campaigns, campaign objectives, scenario roles, tags, acceptance thresholds, stop rules, run inputs, run reports, and campaign evidence.
Readiness checks require run evidence and an acceptance posture strong enough to support the scenario-campaign-runner capability.
monitoring-incidents:
Models monitoring snapshots, drift records, incidents, revalidation triggers, and evidence-backed monitoring trails.
Readiness checks ensure current snapshots, handled incidents, satisfied revalidation triggers, and evidence coverage exist before monitoring can count as complete.
review-workflow:
Models human review workflows, findings, signoffs, dissent, authority posture, and review evidence.
Readiness checks ensure human authority and review state remain visible rather than being hidden behind automated outputs.
audit-report-export:
Models export package manifests, artifact references, redaction rules, package status, package format, evidence references, provenance references, and disclaimers.
Readiness checks require machine-readable review packages with evidence, provenance, redaction coverage, and clear non-official prototype language.
assurance-dossier:
Models trace-closure packages that connect mission threads, requirements, scenarios, hazards, controls, evidence, human review, export packages, and provenance references.
Readiness checks verify whether the review trail is closed enough to support local dossier-level assurance claims.
claim-guardrails:
Models evidence-backed claims, audiences, risk levels, review status, prohibited phrase rules, release packages, and non-endorsement limitations.
Readiness checks prevent the repo from overstating what local evidence can prove.
federal-evaluation-profile:
Models public-sector evaluation concerns, profile mappings, required artifacts, completed capabilities, evidence references, and disclaimer posture.
Readiness checks map local prototype artifacts to federal/IC/DoD-style evaluation concerns without claiming official acceptance or endorsement.
A simplified runtime path looks like this:
Scenario Catalog + Runtime Telemetry
|
v
Degradation Engine
|
v
Runtime Safety Gate
|
v
Scenario Runner
|
v
Evidence Bundle
|
v
Verification Engine
|
v
Assurance Report
|
v
Run Ledger
The expanded assurance path adds:
Registry + Policy + Framework Crosswalks
|
v
Telemetry Adapter + Campaign Runner + Monitoring Trail
|
v
Review Workflow + Export Package
|
v
Assurance Dossier + Claim Guardrails + Federal Evaluation Profile
|
v
Prototype Readiness / Rollup Gate
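The first stages of the simplified runtime path (telemetry -> degradation -> safety gate -> evidence) can be sketched with the conservative-under-uncertainty posture the runtime is built around. Function names and the 0.9 confidence threshold are assumptions for illustration, not the repo's actual API.

```python
from typing import Optional

def degradation_posture(nav_confidence: Optional[float]) -> str:
    # Missing or stale telemetry is treated as unknown, never as nominal.
    if nav_confidence is None:
        return "unknown"
    return "nominal" if nav_confidence >= 0.9 else "degraded"

def safety_gate(posture: str) -> str:
    # The gate permits action only under a nominal posture;
    # uncertainty resolves to safe-hold, not approval.
    return "proceed" if posture == "nominal" else "safe-hold"

# Absent telemetry produces a safe-hold decision in the evidence record.
posture = degradation_posture(None)
evidence = {"posture": posture, "decision": safety_gate(posture)}
print(evidence)  # {'posture': 'unknown', 'decision': 'safe-hold'}
```

The design point is that the gate never has a default-permit branch: every path other than confirmed nominal telemetry lands on safe-hold.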
Install the package in editable mode, then use the ix-assurance command.
ix-assurance validate-case --case examples/degraded-navigation-case.json
ix-assurance run-scenario \
--catalog examples/degraded-navigation-catalog.json \
--telemetry examples/telemetry-degraded-navigation.json \
--case-id CASE-NAV-001 \
--scenario-id SCN-NAV-001 \
--run-id RUN-NAV-001
ix-assurance run-scenario \
--catalog examples/degraded-navigation-catalog.json \
--telemetry examples/telemetry-degraded-navigation.json \
--case-id CASE-NAV-001 \
--scenario-id SCN-NAV-001 \
--run-id RUN-NAV-001 \
--default-degradation
ix-assurance verify-bundle --bundle examples/evidence-bundle.json
ix-assurance validate-ledger --ledger examples/run-ledger.json
ix-assurance audit-traceability \
--case examples/degraded-navigation-case.json \
--catalog examples/degraded-navigation-catalog.json \
--mission-need examples/mission-need.json \
--requirements examples/requirements.json \
--scenario-id SCN-NAV-001 \
--claim-id CLM-NAV-001
ix-assurance export-report \
--report examples/report.json \
--format markdown
python -m pip install --upgrade pip
python -m pip install -e ".[dev]"
python -m ruff check .
python -m mypy src tests
python -m pytest
The local test suite is intended to keep the runtime deterministic, typed, and conservative about claims.
The included synthetic example uses a degraded-navigation case.
It asks whether a reference autonomy function enters safe-hold when navigation confidence becomes unsafe.
The scenario links:
mission need
-> requirement
-> critical navigation hazard
-> navigation confidence gate
-> degraded-navigation scenario
-> runtime telemetry
-> safe-hold evidence
-> assurance claim
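The check behind this chain can be sketched as follows. The 0.8 threshold and the telemetry field names here are illustrative stand-ins, not values taken from the example files.

```python
from typing import List

GATE_THRESHOLD = 0.8  # assumed navigation-confidence floor, for illustration

# Synthetic confidence trace: confidence breaches the gate at t=1,
# and the function is observed in safe-hold by t=2.
telemetry = [
    {"t": 0, "nav_confidence": 0.95, "mode": "autonomous"},
    {"t": 1, "nav_confidence": 0.72, "mode": "autonomous"},
    {"t": 2, "nav_confidence": 0.70, "mode": "safe-hold"},
]

def safe_hold_entered(trace: List[dict]) -> bool:
    """Pass only if the trace actually exercises the hazard (confidence
    falls below the gate) and safe-hold is observed afterwards."""
    breach = next(
        (s["t"] for s in trace if s["nav_confidence"] < GATE_THRESHOLD), None
    )
    if breach is None:
        return False  # scenario never exercised the hazard: no evidence
    return any(s["t"] > breach and s["mode"] == "safe-hold" for s in trace)

print(safe_hold_entered(telemetry))  # True
```

A trace that never drops below the threshold fails the check rather than passing vacuously, consistent with the principle that absence of a hazard exercise is not evidence of safe behavior.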
src/ix_autonomy_assurance_case_runtime/
assurance_case.py
assurance_dossier.py
assurance_dossier_validation.py
assurance_dossier_readiness.py
authority.py
claim_guardrails.py
claim_guardrails_validation.py
claim_guardrails_readiness.py
cli.py
contracts.py
degradation.py
evidence.py
export_package.py
export_package_validation.py
export_package_readiness.py
federal_evaluation_profile.py
federal_evaluation_profile_validation.py
federal_evaluation_profile_readiness.py
framework_crosswalk.py
framework_crosswalk_evidence.py
framework_crosswalk_readiness.py
ledger.py
monitoring.py
monitoring_validation.py
monitoring_readiness.py
policy.py
policy_evaluator.py
policy_waiver_evidence.py
policy_readiness.py
prototype_target.py
prototype_readiness.py
prototype_rollup.py
provenance.py
provenance_verifier.py
provenance_readiness.py
registry.py
registry_catalog.py
registry_evidence.py
registry_readiness.py
reporting.py
review_workflow.py
review_workflow_validation.py
review_workflow_readiness.py
runner.py
safety_gate.py
scenario_campaigns.py
scenario_campaign_validation.py
scenario_campaign_runner.py
scenario_campaign_readiness.py
scenarios.py
telemetry.py
telemetry_adapter.py
telemetry_readiness.py
traceability.py
verification.py
examples/
degraded-navigation-case.json
degraded-navigation-catalog.json
evidence-bundle.json
mission-need.json
report.json
requirements.json
run-ledger.json
telemetry-degraded-navigation.json
schemas/
assurance-case.schema.json
evidence-bundle.schema.json
report.schema.json
run-ledger.schema.json
scenario-catalog.schema.json
docs/
ARCHITECTURE.md
TESTING.md
THREAT_MODEL.md
USAGE.md
- Evidence over assertion. Claims need support paths, evidence records, and validation surfaces.
- Conservative under uncertainty. Missing, stale, invalid, or degraded data should not quietly become approval.
- Traceability is first-class. Runtime behavior should connect back to mission need, requirement, hazard, control, scenario, evidence, review, and claim.
- Human authority remains visible. The system models when autonomy may act, when review is required, when dissent exists, and when authority is denied.
- Claims stay bounded. Local prototype maturity is not certification, authority to operate, deployment readiness, procurement acceptance, agency acceptance, or official endorsement.
- Integrity is local and explicit. Hashes, signatures, ledgers, and provenance records support local review. They do not replace external accreditation, legal review, or formal safety certification.
Apache License, Version 2.0.
See LICENSE and NOTICE.
Bryce Lovell