Merged
3 changes: 2 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
@@ -49,4 +49,5 @@ Thumbs.db


 list_tree.sh
-tree.txt
+tree.txt
+tmp.txt
38 changes: 38 additions & 0 deletions docs/governance/CHECKLIST_EMISSION_CONTRACT.md
@@ -0,0 +1,38 @@
# CHECKLIST EMISSION CONTRACT (AUTHORITATIVE)

Status: AUTHORITATIVE

Single Invariant
- Checklist emission is mandatory for every engine run. A persisted artifact must capture the engine's outcome for every run; acceptable artifacts include a final `checklist.json`, a well-formed `refusal` artifact, or an explicit persisted draft (`checklist_draft.json`) accompanied by a manifest entry that records the run outcome.

Policy — Forbidden Patterns
- Under no circumstances shall a run suppress emitted results via:
  - uncaught exceptions that abort the run without emitting an artifact,
  - early `return` paths that omit persisting an outcome artifact,
  - validation/sync/quality/persona/schema/test failures that cause silent termination without producing a persisted checklist or refusal artifact.
- Runs must persist an outcome artifact even when the run concludes with a refusal, validation advisory, or diagnostic-only result.
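
The forbidden patterns above can be illustrated with a minimal sketch of a run wrapper that persists an outcome on both the success path and the failure path. This is an illustration only, not the engine's implementation: the function name `run_with_guaranteed_emission`, the `generate` callable, and the choice to write failures to `refusal_report.json` are assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Callable, Dict


def run_with_guaranteed_emission(
    generate: Callable[[], Dict[str, Any]], out_dir: Path
) -> Path:
    """Run `generate` and persist one outcome artifact on every path (hypothetical helper)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    try:
        artifact = {"emitted": True, "refusal": False, "refusal_reason": None, **generate()}
        path = out_dir / "checklist.json"
    except Exception as exc:
        # Assumption: any generator failure is turned into an explicit refusal
        # instead of propagating and leaving no artifact behind.
        artifact = {
            "emitted": True,
            "refusal": True,
            "refusal_reason": f"{type(exc).__name__}: {exc}",
            "items": [],
        }
        path = out_dir / "refusal_report.json"
    # A failure of this write is the one case the sketch cannot represent as an artifact.
    path.write_text(json.dumps(artifact, indent=2, sort_keys=True), encoding="utf-8")
    return path


def failing_generator() -> Dict[str, Any]:
    raise ValueError("schema validation failed")


out = Path(tempfile.mkdtemp())
ok_path = run_with_guaranteed_emission(lambda: {"items": []}, out)
refusal_path = run_with_guaranteed_emission(failing_generator, out)
refusal = json.loads(refusal_path.read_text(encoding="utf-8"))
```

In this sketch there is no early `return` before the write, so both calls leave a file on disk.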

Allowed Checklist Outcomes (canonical)
- ACTION — implementable task(s) to advance the spec toward an invariant.
- BLOCKER — a condition that blocks safe progress and requires remediation.
- REFUSAL — an explicit, auditable refusal to emit executable artifacts because required conditions were unmet (must include `refusal_reason`).
- DIAGNOSTIC — advisory artifacts or reports (sufficiency, readiness trace, suppressed-signal report) that augment the persisted outcome.

Representation Requirements (high-level)
- Every outcome artifact must include metadata enabling auditability: `emitted` indicator, `refusal` boolean when applicable, `refusal_reason` text when refusal is true, `confidence` level, and a manifest entry linking the run fingerprint and outputs.
- Failures in validation, sync, persona vetoes, schema checks, quality gates, or test-attached enforcement MUST be represented as checklist items or an explicit refusal artifact, not as a silent hard-fail that leaves no persisted outcome.
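
As a rough illustration of these representation requirements, the sketch below models an outcome artifact as a dataclass and converts a gate failure into a refusal. The class name `OutcomeArtifact`, the helper `refusal_from_gate_failure`, and the manifest keys `run_fingerprint` and `outputs` are hypothetical; the contract only names the fields `emitted`, `refusal`, `refusal_reason`, and `confidence`.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class OutcomeArtifact:
    """Hypothetical in-memory shape of a persisted outcome artifact."""

    emitted: bool
    confidence: str  # for example "low", "medium", or "high"
    refusal: bool = False
    refusal_reason: Optional[str] = None
    items: List[Dict[str, Any]] = field(default_factory=list)
    manifest: Dict[str, Any] = field(default_factory=dict)


def refusal_from_gate_failure(gate: str, detail: str, run_fingerprint: str) -> OutcomeArtifact:
    """Represent a gate failure as an explicit refusal rather than a silent hard-fail."""
    return OutcomeArtifact(
        emitted=True,
        confidence="low",
        refusal=True,
        refusal_reason=f"{gate}: {detail}",
        items=[{"id": f"gate-{gate}", "text": detail, "blocking": True}],
        # Assumed manifest layout linking the run fingerprint to its outputs.
        manifest={"run_fingerprint": run_fingerprint, "outputs": ["refusal_report.json"]},
    )


example = asdict(refusal_from_gate_failure("schema", "missing required key 'metadata'", "abc123"))
```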

Scope and Authority
- Owner: Governance (docs/governance)
- Authority: This document is AUTHORITATIVE and governs expected engine behavior at a policy level. Implementation details and remediations are tracked separately and require implementation PRs referencing this contract.

Change History
- 2025-12-16: Document created and marked AUTHORITATIVE.

Reporting Discipline (Copilot)
- Scope: Applies to all Copilot-generated governance reports and policy summaries produced in this repository; this rule is authoritative for Copilot reporting in `docs/governance`.
- Default report format (required): concise, structured bullets only: 1) Summary — 1–2 sentences; 2) Actions Taken — bulleted list of edits; 3) Files Modified — bulleted list; 4) Next Steps — single-line recommendation.
- Forbidden reporting behaviors: no long-form narrative; no speculative analysis; no disclosure of internal system or developer instructions; no step-by-step tool/process logs; no persona or model identity claims beyond the fixed preamble rules.
- Contract enforcement: Excessive verbosity that dilutes actionable signals is a contract violation; Governance reviewers may require edits or record violations in the decision log.
- Authority & scope: Governance (docs/governance). This policy is documentation-only; no runtime, schema, or engine changes were made.
- Effective: 2025-12-16
57 changes: 57 additions & 0 deletions docs/governance/CHECKLIST_SEMANTICS.md
@@ -0,0 +1,57 @@
# CHECKLIST SEMANTICS (AUTHORITATIVE)

Status: AUTHORITATIVE

Purpose
- Provide a concise, authoritative description of checklist semantics for the ShieldCraft engine and governance surface.
- This document records the contract that implementations are expected to honor (facts and contract only). It does not prescribe code changes or implementation details.

Checklist Item Classes
- ACTION
- A checklist item representing a concrete implementable change or task that, when completed, advances the product toward satisfying a requirement.
- Typical fields: `id`, `ptr`, `text`, `action`.

- BLOCKER
- A checklist item that indicates a condition that must be resolved before safe progress can be made. Blockers are actionable but may be prioritized differently and can be blocking for automated execution.
- Typical fields: `id`, `ptr`, `text`, `blocking: true`.

- REFUSAL
- A checklist-level outcome that indicates the system deterministically refused to produce an executable artifact because required conditions (evidence, invariants, artifact producers, or safety checks) were not met.
- A well-formed refusal is an explicit, successful outcome and MUST be emitted in place of an executable artifact.
- Typical fields: `refusal: true`, `refusal_reason` (string), and contextual guidance in `items` or manifest.

- DIAGNOSTIC
- A checklist item or artifact that assists authors with context, guidance, or debugging information (e.g., sufficiency reports, readiness traces, suppressed-signal reports). Diagnostics are advisory and do not by themselves indicate readiness.

Mandatory Checklist-level Fields (emitted by the engine)
- `emitted` (boolean or timestamp): indicates whether a final checklist artifact (or refusal) has been produced. `emitted` = true signals that an explicit engine decision was persisted.
- `confidence` (string: e.g., "low" | "medium" | "high"): an explicit top-level or per-item indication of confidence in the checklist content.
- `refusal` (boolean): when true, indicates that the run concluded with a refusal outcome (an explicit, successful refusal).
- `refusal_reason` (string | null): human-readable reason for any refusal outcome; should reference which invariant, gate, or missing artifact caused the refusal.
- `safe_first_action` (object | null): when available, an advisory first safe action (or refusal_action) for authors/operators to take next; may be `null` when not applicable.
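
A reviewer or test could check the fields listed above with something like the following sketch. The function name `check_mandatory_fields` is hypothetical, and the closed set of confidence values is an assumption, since this document gives "low", "medium", and "high" only as examples.

```python
from typing import Any, Dict, List

MANDATORY_FIELDS = ("emitted", "confidence", "refusal", "refusal_reason", "safe_first_action")
# Assumption: treat the three example values as the full set.
CONFIDENCE_LEVELS = {"low", "medium", "high"}


def check_mandatory_fields(checklist: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the fields look well-formed."""
    problems = [f"missing field: {name}" for name in MANDATORY_FIELDS if name not in checklist]
    if checklist.get("refusal") is True and not checklist.get("refusal_reason"):
        problems.append("refusal is true but refusal_reason is empty")
    confidence = checklist.get("confidence")
    if confidence is not None and confidence not in CONFIDENCE_LEVELS:
        problems.append(f"unexpected confidence value: {confidence!r}")
    return problems


well_formed = {
    "emitted": True,
    "confidence": "medium",
    "refusal": False,
    "refusal_reason": None,
    "safe_first_action": None,
}
malformed = {"emitted": True, "confidence": "high", "refusal": True, "refusal_reason": None}

well_formed_problems = check_mandatory_fields(well_formed)
malformed_problems = check_mandatory_fields(malformed)
```

The checker reports problems as data instead of raising, so its findings could themselves be recorded as DIAGNOSTIC items.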

Emission Guarantee (Contract)
- Checklist emission is mandatory under all conditions.
- For successful runs, a persisted `checklist.json` under self-host outputs (and an entry in the run manifest) must be produced.
- For validation or contraction failures, a deterministic advisory artifact (e.g., `checklist_draft.json`) and/or an explicit `refusal` outcome and corresponding manifest entries must be produced.
- Under no circumstances may a run suppress emission by terminating with an uncaught exception, an early return that omits emitting artifacts, or by relying on validation failures to hide the absence of emission. Emission can be a draft, a refusal, or a final checklist; what matters is an explicit persisted artifact representing the engine decision.

Why Refusal Is a Successful Outcome
- A refusal indicates the engine made a deterministic, auditable decision to not produce an executable artifact due to missing evidence, invariant violations, safety constraints, or missing artifact producers.
- Refusal artifacts are actionable: they must contain `refusal_reason` and sufficient guidance or diagnostics so authors or operators can remediate and re-run.
- Treating refusal as a first-class success mode enables deterministic CI, reproducible auditing, and clearer governance traces.

What Went Wrong Previously (No‑Machine Failure Mode) — Facts Only
- Observed symptom: callers (trials runner) asserted "Checklist not emitted" even though a `checklist.json` artifact existed in the self-host output directory for the same run.
- Observed causes (factual):
  - Some control paths in the orchestration code relied on subsequent checks or side-effects and could overwrite detection flags or re-evaluate emission state after a previously-observed emission was detected (e.g., clearing an emission flag when `spec_feedback.json` was missing).
  - Validation-failure paths sometimes wrote only advisory preview artifacts (e.g., `checklist_draft.json`) and then applied a primary-artifact invariant that raised when both `checklist_draft.json` and `refusal_report.json` were present, producing an error instead of persisting a single canonical artifact.
  - Post-generation processing (minimality/inference/execution-plan checks) could raise fatal errors that prevented the final persistence step even when the generator had returned an in-memory checklist result.
- Net effect: the absence of a single, deterministic persisted artifact representing the engine outcome led to brittle client-side checks and false-negative detection of "no checklist emitted."
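
One way to picture the first observed cause is an emission flag that later checks are able to clear. The sketch below shows the opposite design, a latch that can be set but never reset, so a missing `spec_feedback.json` becomes a diagnostic rather than undoing a recorded emission. The class `EmissionLatch` and the "first artifact wins" rule are assumptions for illustration; this document does not prescribe a remediation.

```python
from typing import List, Optional


class EmissionLatch:
    """Hypothetical set-once record of which artifact represents the run outcome."""

    def __init__(self) -> None:
        self._artifact: Optional[str] = None

    def record(self, artifact_name: str) -> None:
        # Assumption: the first recorded artifact stays canonical; later calls are ignored.
        if self._artifact is None:
            self._artifact = artifact_name

    @property
    def emitted(self) -> bool:
        return self._artifact is not None

    @property
    def artifact(self) -> Optional[str]:
        return self._artifact


latch = EmissionLatch()
diagnostics: List[str] = []

latch.record("checklist.json")

spec_feedback_present = False
if not spec_feedback_present:
    # Advisory only: the latch is not touched by this later check.
    diagnostics.append("spec_feedback.json missing")

latch.record("checklist_draft.json")  # ignored, an artifact is already canonical
```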

Owner and Authority
- Owner: Governance (docs/governance)
- This document is AUTHORITATIVE: it records the contract that implementations must respect. Implementation-level remediation is tracked separately (decision-log / issue tracker).

Change History
- 2025-12-16: Document created and marked AUTHORITATIVE.
31 changes: 31 additions & 0 deletions docs/governance/GATE_HANDLING_POLICY.md
@@ -0,0 +1,31 @@
# GATE HANDLING POLICY

Status: Governance policy (implementation-agnostic)

Gate Classes
- Preflight gates: sync, schema, instruction validation, governance presence, and early persona veto checks.
- Generation gates: checklist generator internal validations, invariant checks, semantic gates, and test gates.
- Post-generation gates: artifact emission locks, minimality/equivalence, execution-plan verification, quality gates, and filesystem/IO failures.

Required Behavior
- Preflight gates: when a preflight gate triggers a failure that prevents normal generation, the engine MUST emit a persisted artifact recording the failure as one or more checklist items or an explicit refusal artifact (include `refusal_reason` and diagnostics).
- Generation gates: internal generator validation errors that prevent normal item synthesis MUST be reflected in the returned checklist result; the engine MUST persist the resulting checklist (possibly with `valid: false` and `reason` fields) or emit an explicit refusal artifact.
- Post-generation gates: gates that inspect emitted artifacts (quality, minimality, execution plan) MUST either:
  - annotate the checklist and persist it (if the run outcome is advisory or remediable), or
  - emit an explicit refusal artifact with `refusal_reason` if the artifact cannot be safely produced.
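
The required behavior for the three gate classes can be sketched as a single function that always returns a persistable checklist. Everything here is illustrative: the names `GateClass` and `represent_gate_failure`, the `remediable` flag, and the gate IDs used in the examples are assumptions. The sketch also picks one of the permitted options per class (refusal for preflight failures), where the policy allows either checklist items or a refusal.

```python
from enum import Enum
from typing import Any, Dict


class GateClass(Enum):
    PREFLIGHT = "preflight"
    GENERATION = "generation"
    POST_GENERATION = "post_generation"


def represent_gate_failure(
    checklist: Dict[str, Any],
    gate_class: GateClass,
    gate_id: str,
    reason: str,
    remediable: bool = True,
) -> Dict[str, Any]:
    """Fold a gate failure into the checklist instead of raising (hypothetical helper)."""
    out = {**checklist, "items": list(checklist.get("items", [])), "emitted": True}
    # Every failure is visible as a blocking item, whatever else happens.
    out["items"].append({"id": gate_id, "text": reason, "blocking": True})
    if gate_class is GateClass.GENERATION:
        # Persist the generator result, flagged as invalid.
        out.update(valid=False, reason=reason)
    elif gate_class is GateClass.PREFLIGHT or not remediable:
        # Preflight failures and non-remediable post-generation failures become refusals.
        out.update(refusal=True, refusal_reason=f"{gate_id}: {reason}")
    return out


annotated = represent_gate_failure(
    {"items": []}, GateClass.POST_GENERATION, "G17", "quality gate below threshold"
)
refused = represent_gate_failure(
    {"items": []}, GateClass.PREFLIGHT, "G2", "schema validation failed"
)
invalid = represent_gate_failure(
    {"items": []}, GateClass.GENERATION, "G9", "invariant check failed"
)
```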

Allowed Hard-Fail Categories
- Only the following are allowed to propagate as immediate hard-fail runtime errors (i.e., no checklist persistence possible):
  1. Catastrophic runtime corruption (process memory corruption, interpreter crash).
  2. Filesystem write failures that prevent any artifact persistence (disk full, permission error) as determined by a persisted IO error state.
  3. Security-critical breaches that require immediate abort and out-of-band incident handling.
- All other gate outcomes must be represented via checklist annotations or an explicit refusal artifact (policy requirement).
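
A hedged sketch of how an implementation might separate the allowed hard-fail categories from everything else follows. The exception class `SecurityBreach`, the function `is_allowed_hard_fail`, and the specific `errno` values are assumptions; `MemoryError` is only a rough stand-in for category 1, since a real interpreter crash cannot be caught at all. The sketch also does not model the "persisted IO error state" that the policy mentions.

```python
import errno


class SecurityBreach(Exception):
    """Hypothetical marker for security-critical aborts (category 3)."""


# Assumed set of errno values meaning "no artifact can be written" (category 2).
_UNWRITABLE = {errno.ENOSPC, errno.EACCES, errno.EROFS}


def is_allowed_hard_fail(exc: BaseException) -> bool:
    """Return True only for failures that may propagate without a persisted artifact."""
    if isinstance(exc, (MemoryError, SecurityBreach)):
        return True
    if isinstance(exc, OSError):
        return isinstance(exc, PermissionError) or exc.errno in _UNWRITABLE
    # Everything else must become a checklist annotation or a refusal.
    return False


disk_full = is_allowed_hard_fail(OSError(errno.ENOSPC, "No space left on device"))
permission_denied = is_allowed_hard_fail(PermissionError("cannot write outputs"))
schema_failure = is_allowed_hard_fail(ValueError("schema check failed"))
```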

Gate IDs and References
- Applicable gate IDs: G1–G22 (see Gate Inventory in `tmp.txt` for details). Implementations should consult the Gate Inventory when classifying real failures.

Policy Notes
- This policy is intentionally implementation-agnostic and does not prescribe code changes or refactors. It defines required behaviors for gate handling to ensure every run has an auditable persisted outcome.

Change History
- 2025-12-16: Policy created and published in governance docs.
10 changes: 10 additions & 0 deletions docs/governance/INVARIANTS.md
@@ -152,3 +152,13 @@ On `UNKNOWN_FAILURE`:
Verification-related invariants and properties are to be enforced by the Verification Spine (see `docs/governance/VERIFICATION_SPINE.md` and `src/shieldcraft/verification`).

This file declares the governance anchor; enforcement logic will be implemented in the Verification Spine and versioned via its governance document.

---

## Checklist Emission Invariant

- All engine execution paths MUST result in a finalized checklist artifact (final checklist or explicit refusal) that records the observed gate events. The canonical emission boundary is the centralized function `finalize_checklist(...)` in `src/shieldcraft/engine.py`.
- Exceptions may occur only after recording a gate event to the `ChecklistContext`.
- `finalize_checklist(...)` is the sole emission boundary.
- This invariant is enforced by code-level assertions and tests.
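
The invariant can be pictured with the minimal sketch below. Only the names `finalize_checklist` and `ChecklistContext` come from this document; their real signatures in `src/shieldcraft/engine.py` are not shown here, so the parameters, the `record_event` method, the `run` function, and the `input_allowed` key are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ChecklistContext:
    """Assumed shape: an append-only log of gate events for one run."""

    events: List[Dict[str, str]] = field(default_factory=list)

    def record_event(self, gate: str, message: str) -> None:
        self.events.append({"gate": gate, "message": message})


def finalize_checklist(
    context: ChecklistContext,
    items: Optional[List[Dict[str, Any]]] = None,
    error: Optional[BaseException] = None,
) -> Dict[str, Any]:
    """Sole emission boundary in this sketch: every path returns through here."""
    refused = error is not None
    return {
        "emitted": True,
        "refusal": refused,
        "refusal_reason": str(error) if refused else None,
        "items": items or [],
        "events": list(context.events),
    }


def run(spec: Dict[str, Any]) -> Dict[str, Any]:
    context = ChecklistContext()
    try:
        if spec.get("input_allowed") is False:
            # Record the gate event first; only then is raising permitted.
            context.record_event("disallowed_selfhost_input", "input path is not allowed")
            raise RuntimeError("disallowed_selfhost_input")
        return finalize_checklist(context, items=[{"id": "task-1", "text": "example task"}])
    except Exception as exc:
        return finalize_checklist(context, error=exc)


refused = run({"input_allowed": False})
completed = run({})
```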

78 changes: 78 additions & 0 deletions docs/governance/TEMPLATE_COMPILATION_CONTRACT.md
@@ -0,0 +1,78 @@
<!-- AUTHORITATIVE -->
# Template Compilation Contract (Phase 3.1)

**Owner:** Governance (docs/governance)

**Scope:** This document is **AUTHORITATIVE** for the interpretation of the product specification template (`spec/se_dsl_v1.template.json`) with respect to **checklist compilation** only. It is a policy contract (documentation-only). No runtime behavior, schema, persona, or engine code is changed by this document.

## Summary

- **Purpose:** Prevent suppression, drift, or misinterpretation of top-level template sections by establishing a strict tiering and an explicit absence policy for how the checklist compiler consumes template data.
- **Evidence base:** SE_GATE_AUDIT_V1, SE_GATE_AUDIT_V1_COMPLETENESS_CHECK, template_to_engine_mapping_report, CHECKLIST_EMISSION_CONTRACT.md, GATE_HANDLING_POLICY.md, decision_log.md entries recorded on 2025-12-16.

## Tiering Definitions

- **Tier A — Checklist-Critical:** Absence or incomplete values for these sections MUST result in emitted checklist items or safe defaults; absence MUST NEVER cause an exception, early return, refusal, or artifact suppression.
- **Tier B — Checklist-Influencing:** These sections may affect checklist priority, readiness gating, or blocking classification. Missing/incomplete values SHOULD produce checklist items or safe defaults; they MUST NOT cause silent suppression of checklist artifacts.
- **Tier C — Informational / Deferred:** Informational by intent. These sections MAY be ignored safely by the compiler when absent and MUST NOT cause checklist suppression; if a section has no runtime consumer it is Tier C and marked NOT CONSUMED.

## Tier Classification (every top-level section listed exactly once)

- **metadata — Tier A**
- Rationale (facts-only): Consumed for `product_id`, `generator_version`, `enforce_tests_attached` flags and manifest writing (see `src/shieldcraft/engine.py`, `src/shieldcraft/dsl/loader.py`, `src/shieldcraft/services/checklist/constraints.py`). Missing metadata fields are already converted into checklist tasks by constraints; therefore metadata is Checklist-Critical.

- **determinism — Tier B**
- Rationale (facts-only): Determinism snapshots are attached by the generator (`_determinism`) and checked by readiness logic (`src/shieldcraft/verification/readiness_evaluator.py`). Missing determinism results in a `determinism_replay` gate failure that influences readiness and blocking classification; therefore Tier B.

- **agents — Tier A**
- Rationale (facts-only): `agents` fields are inspected by checklist semantic & constraint checks (missing `type` → emitted checklist task `/agents/{i}/type`) and these items are actionable. Operational agent runtimes are not implemented in this repository (semantic checks exist; no agent orchestrator found). The compiler MUST emit checklist items for missing agent metadata rather than suppressing output.

- **pipeline — Tier C (NOT CONSUMED)**
- Rationale (facts-only): The template provides `pipeline.states` and `transitions` but a review found no implemented runtime state machine consumer in this repository; template presence is documented and exercised in tests/docs only. Mark as NOT CONSUMED.

- **artifact_contract — Tier B**
- Rationale (facts-only): Used by artifact summary and coverage helpers (`src/shieldcraft/services/guidance/artifact_contract.py`, `src/shieldcraft/services/io/manifest_writer.py`) and influences artifact expectations and coverage summaries. Absence should produce checklist hints/tasks; it influences readiness/CI expectations.

- **error_contract — Tier C**
- Rationale (facts-only): Present in the template and schema, but runtime usage is limited and primarily informative; canonicalizer and tooling record the schema but no centralized enforcement hook was found to justify a blocking classification.

- **evidence_bundle — Tier A**
- Rationale (facts-only): Evidence is constructed and included in manifests and checklist outputs (`src/shieldcraft/services/governance/evidence.py`, `src/shieldcraft/services/checklist/evidence.py`). Evidence absence or insufficiency is material to checklist completeness and must be represented by checklist items/annotations; the compiler MUST ensure evidence problems are represented in checklist items and MUST NOT suppress a checklist artifact when evidence is missing or invalid.

- **ci_contract — Tier C**
- Rationale (facts-only): Referenced by tests and docs and used by CI guidance; no central runtime enforcement was found in engine code. Treat as informational and classify as Tier C.

- **generation_mappings — Tier B**
- Rationale (facts-only): Used by codegen/mapping inspector and influences whether checklist items map to codegen targets (`src/shieldcraft/services/codegen/mapping_inspector.py`, `src/shieldcraft/services/codegen/generator.py`). Missing mapping can cause items to be recorded as `no_mapping` (affects generation outcomes) so this section influences checklist→codegen mapping and belongs in Tier B.

- **observability — Tier C**
- Rationale (facts-only): Emitted for audit and observability (`src/shieldcraft/observability/__init__.py`); engine wraps observability calls to avoid altering behavior. Observability signals are informative and must not be treated as blocking checklist input.

- **security — Tier B**
- Rationale (facts-only): Self-host input allowances and `allowed_paths` are consulted by self-host guards and can lead to `disallowed_selfhost_input` / refusal behavior (`src/shieldcraft/services/selfhost/__init__.py`, `src/shieldcraft/engine.py:444-456`). These affect whether a run proceeds under self-host mode and therefore influence checklist emission readiness; classify as Tier B.
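
For reviewers who want the classification in machine-readable form, the list above can be transcribed as a lookup table. The section names and tiers are copied from this contract; the names `Tier`, `SECTION_TIERS`, `NOT_CONSUMED`, and `sections_in` are hypothetical and do not refer to existing engine code.

```python
from enum import Enum
from typing import Dict, List


class Tier(Enum):
    A = "checklist-critical"
    B = "checklist-influencing"
    C = "informational"


# One entry per top-level template section, as classified in this contract.
SECTION_TIERS: Dict[str, Tier] = {
    "metadata": Tier.A,
    "determinism": Tier.B,
    "agents": Tier.A,
    "pipeline": Tier.C,
    "artifact_contract": Tier.B,
    "error_contract": Tier.C,
    "evidence_bundle": Tier.A,
    "ci_contract": Tier.C,
    "generation_mappings": Tier.B,
    "observability": Tier.C,
    "security": Tier.B,
}

# Sections explicitly marked NOT CONSUMED above.
NOT_CONSUMED = {"pipeline"}


def sections_in(tier: Tier) -> List[str]:
    """Return the section names in a tier, sorted for stable output."""
    return sorted(name for name, t in SECTION_TIERS.items() if t is tier)


tier_a = sections_in(Tier.A)
tier_b = sections_in(Tier.B)
tier_c = sections_in(Tier.C)
```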

## Absence Policy (AUTHORITATIVE)

1. Missing data MUST result in emitted checklist items or stable defaults. The checklist compiler MUST transform absence into explicit checklist items or documented defaults rather than silently suppressing artifact emission.
2. Missing data MUST NOT cause a raise, early return, silent refusal, or non-emission of the checklist artifact. Any existing code paths that raise due to missing template data are governance misalignments to be remediated via implementation work (tracked separately).
3. Schema validation failures (syntactic or structural) MUST be represented inside the checklist as checklist items (for example, `schema_error` entries) and MUST NOT be used as the sole mechanism to prevent emitting a checklist artifact. If an emitting run also needs to report structured schema failures, these failures should appear as checklist entries (with clear reason codes) and corresponding `errors.json` / `refusal_report.json` as applicable, but the engine MUST persist an outcome artifact.
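
The three rules above can be sketched as a compile step that turns absence and schema errors into checklist items and always returns an artifact. This is an illustration under assumptions: the function `compile_checklist`, the item `id` formats, and the `blocking` flags are hypothetical. The Tier A section names, the `/agents/{i}/type` pointer, and the `schema_error` entry naming are taken from this contract.

```python
from typing import Any, Dict, List

# Tier A sections as classified in this contract.
TIER_A_SECTIONS = ("metadata", "agents", "evidence_bundle")


def compile_checklist(spec: Dict[str, Any], schema_errors: List[str]) -> Dict[str, Any]:
    """Never raises on missing data; absence becomes explicit checklist items."""
    items: List[Dict[str, Any]] = []
    for section in TIER_A_SECTIONS:
        if not spec.get(section):
            items.append({
                "id": f"missing:{section}",
                "ptr": f"/{section}",
                "text": f"Provide the `{section}` section",
                "blocking": True,
            })
    # Assumes each agent entry is a mapping.
    for i, agent in enumerate(spec.get("agents") or []):
        if "type" not in agent:
            items.append({
                "id": f"missing:agents:{i}:type",
                "ptr": f"/agents/{i}/type",
                "text": "Declare the agent type",
                "blocking": False,
            })
    for n, message in enumerate(schema_errors):
        # Schema failures are reported inside the checklist, not instead of it.
        items.append({"id": f"schema_error:{n}", "text": message, "blocking": True})
    return {"emitted": True, "refusal": False, "refusal_reason": None, "items": items}


result = compile_checklist(
    {"agents": [{"name": "reviewer"}]},
    ["/metadata: required property missing"],
)
item_ids = [item["id"] for item in result["items"]]
```

An empty spec yields three `missing:` items rather than an exception.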

## Compiler Promise (exact authoritative text)

Given a syntactically valid spec, ShieldCraft MUST emit a checklist artifact. Validation failures are represented inside the checklist, not instead of it.

## Operational Notes & Rationale

- This document is policy-only and records the preferred, authoritative mapping and absence handling expectations for implementers and reviewers. Implementation changes to enforce the above expectations (converting suppressing gates to explicit checklist annotations/refusals) will be tracked as separate engineering tasks referencing this contract and the Gate Inventory (SE_GATE_AUDIT_V1).
- For any future template section additions, the author MUST update this contract and classify the section into Tier A/B/C with evidence references.

## References

- SE_GATE_AUDIT_V1
- SE_GATE_AUDIT_V1_COMPLETENESS_CHECK
- template_to_engine_mapping_report
- docs/governance/CHECKLIST_EMISSION_CONTRACT.md
- docs/governance/GATE_HANDLING_POLICY.md

Signed: Governance
Date: 2025-12-16