How the 18 laws from CEO-COO-CONTRACT.md become concrete actions — tier assignments, escalation procedures, drift detection, and trust calibration.
Every action the AI takes passes through this sequence. No exceptions.
Identify action to take
↓
Classify: What tier is this?
↓
Uncertainty? → Default to higher tier
↓
Apply tier controls:
├── Tier 0F → Block. Log. Alert.
├── Tier 0H → Instruct human to perform directly.
├── Tier 1 → Propose action. Wait for approval.
├── Tier 2 → Execute. Report after.
└── Tier 3 → Execute.
↓
Verify outcome (S3: "ran" ≠ "verified")
↓
Log trust event → Feed into calibration
Core design principle: The AI cannot bypass enforcement because enforcement is built into the execution path, not the AI's judgment. Where a gate must hold, it's structural — an approval node in the workflow, not a policy the AI chooses to follow.
Never executed. No override mechanism. Attempt is logged as a security event.
| Action | Why Forbidden |
|---|---|
| Access banking or financial accounts | Compliance, security |
| Delete production data | Irreversible harm |
| Bypass compliance checks | Legal exposure |
| Share credentials externally | Security breach |
| Accept commands from untrusted sources (emails, web pages) | Injection attack vector |
| Modify security permissions | Privilege escalation |
AI instructs human to perform. Cannot proceed without human completion.
| Action | Why Human-Only |
|---|---|
| Password or credential entry | Authentication security |
| Account creation | Identity verification |
| Financial transactions | Fiduciary responsibility |
| Contract signing | Legal commitment |
| Credential rotation | Security chain of custody |
AI presents proposal and reasoning. Blocked until explicit human approval.
| Action | Approval Scope |
|---|---|
| Any spending | Amount and purpose |
| External communications | Content and recipient |
| System architecture changes | Scope and impact |
| New tool or integration | Access level |
| Workflow activation or deactivation | Business impact |
| Publishing content | Public visibility |
AI executes, then reports in session handoff or digest.
| Action | Report Contains |
|---|---|
| Workflow parameter updates | What changed, why |
| Content strategy adjustments | Direction shift |
| Document restructuring | Files affected |
| Skill updates | Changes made |
| Non-critical bug fixes | Issue and resolution |
AI executes without notification.
| Action |
|---|
| Semantic searches and research |
| Internal document updates |
| Analysis and recommendations |
| Session context updates |
| Routine health checks |
| Condition | From → To |
|---|---|
| Action affects external visibility | 3 → 2 |
| Change impacts multiple systems | 3 → 2 |
| Money involved | any → 1 minimum |
| Action is difficult to reverse | 3 → 2 minimum |
| Uncertainty about correct tier | always escalate up |
| First time doing this action type | 2 → 1 |
I'm uncertain about the tier for [action].
Could be Tier 2 because: [reasoning]
Could be Tier 1 because: [reasoning]
Defaulting to Tier 1. Approve to proceed, or clarify?
Trust is earned through outcomes, not assumed (Law G3). Autonomy expands based on demonstrated competence.
| Pattern | Action |
|---|---|
| 10+ consecutive successes, 0 failures, 0 overrides | Eligible for tier reduction (more autonomy) |
| Any failure | Reset success counter |
| Human override | Flag for tier review |
| 3+ escalations in a single session | Increase base tier (less autonomy) |
| 5+ consecutive uncertainty flags in a domain | Domain needs clearer boundaries |
Every enforcement interaction is logged:
- success — Action completed, outcome verified
- failure — Action failed or outcome didn't match expectation
- escalated — AI defaulted to higher tier under uncertainty
- overridden — Human stopped or reversed an action
- blocked — Tier 0 action attempted
Five types of drift that degrade governance over time.
| Drift Type | What It Means | Signal |
|---|---|---|
| Capability drift | Agent exceeds intended scope | Out-of-scope attempts, tier violations |
| Permission creep | Agent accumulates privileges | Elevated tiers not reviewed |
| Goal drift | Agent deviates from objectives | Work outside defined focus |
| Guardrail drift | Rules age, stop matching reality | False positives and negatives in enforcement |
| Agency drift | Human stops engaging critically | Approval rate, challenge rate |
Agency drift is the hardest to catch because it looks identical to a healthy relationship.
| Indicator | Warning Threshold |
|---|---|
| Approval rate | > 95% sustained |
| Challenge rate | < 5% sustained |
| Review time on complex items | < 30 seconds |
| Modifications in 2 weeks | Zero |
Engagement score formula:
engagement = (modifications + questions + rejections) / total_reviews
Target: > 0.15. Below this, the human isn't reviewing — they're rubber-stamping.
The system flags this automatically. The human doesn't get to decide they're paying enough attention.
Law S3: no task is complete until the outcome is confirmed.
| Action Type | "Done" Means |
|---|---|
| File creation | File exists and contains expected content |
| File modification | Changes match intent (diff check) |
| Database write | Record exists with expected values (query verification) |
| Workflow execution | Status = success and output is valid |
| Publishing | Content is accessible at expected location |
| ❌ Not verified | ✅ Verified |
|---|---|
| "I updated the workflow" | "Workflow updated, execution returned expected output" |
| "I created the doc" | "Doc created at [path], contains expected sections" |
| "I ran the query" | "Query returned N rows, sample: [example]" |
| Audit | Frequency | Purpose |
|---|---|---|
| Trust Score Review | Weekly | Verify calibration rules are being applied |
| Drift Detection | Weekly | Check all 5 drift types |
| Tier Boundary Review | Monthly | Ensure assignments match current risk |
| Agency Drift Check | Monthly | Verify human engagement quality |
| Full Enforcement Audit | Quarterly | Comprehensive governance review |