feat(resources): #1239 Phase 3 — typed ResourceError::DiskCapacity refusal at production hot paths

**Follow-on to #1239 Phase 1 (PR #1297) + Phase 2 (broker singleton bootstrap).**

Phase 1 surfaced disk-tier pressure data; Phase 2 bootstraps the broker and alerts when usage exceeds the threshold. This card adds the **typed runtime refusal** so hot-path operations don't barrel into ENOSPC before the broker has a chance to alert.

## Acceptance

1. **New typed error variant** `ResourceError::DiskCapacity { tier: String, used_bytes: u64, capacity_bytes: u64 }` in whichever crate owns the central error types.
2. **Audit + plumb the refusal** at each production hot path that can fail with no-space:
   - Model pull (`hf_hub::api::sync::Api::repo().get(...)`) before downloading GGUF/safetensors.
   - Container start (`docker compose up`, wherever it's invoked from Rust).
   - Image build (`docker build`, if invoked from continuum-core).
   - GGUF artifact resolve (`crate::model_registry::artifacts::resolve_gguf_for_model`).
3. **Refusal logic**: each site queries the broker (or calls `DockerTierPool::snapshot_stats()` directly), compares projected post-op usage against capacity, and refuses with `ResourceError::DiskCapacity` if the operation would push usage past a configurable threshold (default 95%).
4. **Tests** with a mocked `DockerTierPool` returning controllable capacity/usage — assert refusal at threshold, success below threshold.
5. **TS-side** surfaces the typed error to the user (chat message: "Can't pull qwen3-coder: Docker disk would hit 96%, prune first") instead of an opaque "operation failed."
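
A minimal sketch of items 1, 3, and 4 together, to make the shape of the check concrete. Only `ResourceError::DiskCapacity` (with its three fields) and the name `snapshot_stats` come from this card. `TierStats`, the `TierStatsSource` trait, `check_disk_capacity`, and `MockPool` are hypothetical names introduced for illustration; the real `DockerTierPool` API may look different. The sketch reads "push past" as strictly greater than the threshold and reports current (not projected) usage in `used_bytes`; both are open choices.

```rust
/// Typed refusal, shaped as in acceptance item 1.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ResourceError {
    DiskCapacity {
        tier: String,
        used_bytes: u64,
        capacity_bytes: u64,
    },
}

/// Hypothetical snapshot shape; the real `snapshot_stats()` return type may differ.
#[derive(Debug, Clone, Copy)]
pub struct TierStats {
    pub used_bytes: u64,
    pub capacity_bytes: u64,
}

/// Hypothetical seam so tests can swap a mock in for `DockerTierPool`.
pub trait TierStatsSource {
    fn snapshot_stats(&self) -> TierStats;
}

/// Refuse when projected post-op usage would exceed `threshold_pct` of capacity.
pub fn check_disk_capacity(
    source: &dyn TierStatsSource,
    tier: &str,
    projected_extra_bytes: u64,
    threshold_pct: u8,
) -> Result<(), ResourceError> {
    let stats = source.snapshot_stats();
    let projected = stats.used_bytes.saturating_add(projected_extra_bytes);

    // Integer form of projected / capacity > threshold_pct / 100.
    // Widening to u128 keeps the multiplication from overflowing.
    let over = (projected as u128) * 100
        > (stats.capacity_bytes as u128) * (threshold_pct as u128);

    if over {
        return Err(ResourceError::DiskCapacity {
            tier: tier.to_string(),
            // Assumption: report current usage; projected usage is another option.
            used_bytes: stats.used_bytes,
            capacity_bytes: stats.capacity_bytes,
        });
    }
    Ok(())
}

/// Mock returning fixed stats, for the threshold tests in item 4.
pub struct MockPool(pub TierStats);

impl TierStatsSource for MockPool {
    fn snapshot_stats(&self) -> TierStats {
        self.0
    }
}
```

A hot path would call `check_disk_capacity(&pool, "docker", estimated_download_bytes, 95)?` before starting the pull or build. How each site estimates `projected_extra_bytes` (model file size, image layer sizes) is outside this sketch.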

## Hardest piece

Touching production hot paths is the riskiest part of this 3-phase series. Recommend landing Phase 2 first so the alert sink is observable; Phase 3 then has empirical data on which hot paths actually hit the threshold most often, informing the refusal-vs-warning policy.

## Why this matters

The 2026-05-14 incident (Docker.raw silently grew to fill the whole disk) is the exact failure mode this refusal prevents. Phase 1 = observability. Phase 2 = alerting. Phase 3 = refusal. All three are needed for the substrate to actually act on disk pressure rather than just surface it.

Lane: alpha flywheel #1272 lane 4 (substrate), or whichever lane #1239 was filed under.

