Add host capability probe so resolver actually runs in production #1075

Merged
joelteply merged 1 commit into canary from feat/host-capability-probe on May 11, 2026
@joelteply
Contributor

Position 1 PR-2 — production HostCapability detection

Built on top of #1074. Without this, every caller of ModelRequirement::standard_persona(host) constructs HostCapability by hand — the resolver is callable but not used in production.

Scope

cognition::host_capability_probe::detect_host_capability(gpu_monitor, system_info) returns a HostCapability or a typed ProbeError:

  • metal → UnifiedMemory + Apple-Silicon tier from CPU brand + memory bucket
  • cuda → Gpu + Sm70..Sm120 tier from device name (RTX 5090 → Sm120, H100 → Sm90, A100 → Sm80, T4/RTX 20xx → Sm75, V100 → Sm70, B100 → Sm100)
  • vulkan → Gpu + VulkanAmd
  • mock → M1Uma16Gb (tests)
  • unknown platform → ProbeError::UnsupportedPlatform
  • unknown CUDA SKU → ProbeError::UnknownGpuDevice (no silent CpuOnly fallback, per the NO COMPROMISE bar set in #1074, "Add sensory-bar requirement profile to model resolver (Position 1, PR #1072)")
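
As a rough illustration of the dispatch listed above, here is a minimal sketch of mapping a platform string to a silicon class with a typed error for anything unrecognized. It is not the code from host_capability_probe.rs: the TargetSilicon variants, the field types, and the helper name target_silicon are assumptions made for this example, and treating mock as unified memory is inferred from the M1Uma16Gb fixture.

```rust
use std::fmt;

/// Hypothetical stand-in for the silicon class the probe reports.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TargetSilicon {
    UnifiedMemory,
    Gpu,
}

/// Typed probe failure with the two variants named in this PR.
/// Field types are assumed to be owned strings.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ProbeError {
    UnknownGpuDevice { platform: String, device_name: String },
    UnsupportedPlatform { platform: String },
}

impl fmt::Display for ProbeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ProbeError::UnknownGpuDevice { platform, device_name } => {
                write!(f, "unknown GPU device '{device_name}' on platform '{platform}'")
            }
            ProbeError::UnsupportedPlatform { platform } => {
                write!(f, "unsupported platform '{platform}'")
            }
        }
    }
}

impl std::error::Error for ProbeError {}

/// Map a platform string to a silicon class.
/// An unrecognized platform is an error, never a silent default.
pub fn target_silicon(platform: &str) -> Result<TargetSilicon, ProbeError> {
    match platform {
        "metal" | "mock" => Ok(TargetSilicon::UnifiedMemory),
        "cuda" | "vulkan" => Ok(TargetSilicon::Gpu),
        other => Err(ProbeError::UnsupportedPlatform {
            platform: other.to_string(),
        }),
    }
}
```

Returning Result rather than a default tier keeps the failure visible to the caller, which is the behaviour the last two bullets describe.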

Pattern-ordering note: A100 must precede A10/A40 in the match chain, because "A10" is a substring of "A100" and a substring match on "A10" would otherwise claim every A100. Test nvidia_pattern_match_resolves_known_skus pins this regression.
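
To make the ordering hazard concrete, here is a minimal first-match-wins sketch. It is an illustration, not the body of nvidia_sm_tier() from this PR: the SmTier enum, the pattern table, and the Option return type are assumptions. The Sm86 tier for A10/A40 comes from NVIDIA's published compute capability for those cards and is not confirmed by this PR.

```rust
/// Hypothetical tier enum for this example only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SmTier {
    Sm70,
    Sm75,
    Sm80,
    Sm86, // assumption: A10/A40 tier is not stated in the PR
    Sm90,
    Sm100,
    Sm120,
}

/// Resolve a device name to a tier by substring match.
/// The table is scanned top to bottom and the first hit wins,
/// so a longer pattern must come before any pattern it contains.
pub fn nvidia_sm_tier(device_name: &str) -> Option<SmTier> {
    const PATTERNS: &[(&str, SmTier)] = &[
        ("RTX 5090", SmTier::Sm120),
        ("B100", SmTier::Sm100),
        ("H100", SmTier::Sm90),
        ("A100", SmTier::Sm80), // must precede "A10"
        ("A10", SmTier::Sm86),
        ("A40", SmTier::Sm86),
        ("T4", SmTier::Sm75),
        ("RTX 20", SmTier::Sm75),
        ("V100", SmTier::Sm70),
    ];

    // Normalize case so "Tesla T4" and "TESLA T4" behave the same.
    let name = device_name.to_ascii_uppercase();
    PATTERNS
        .iter()
        .find(|(pattern, _)| name.contains(*pattern))
        .map(|&(_, tier)| tier)
}
```

Swapping the "A100" and "A10" rows would make an A100 resolve to the A10 tier. In the PR a miss becomes ProbeError::UnknownGpuDevice; this sketch returns None instead to stay self-contained.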

Validation

  • cargo test --features metal,accelerate -p continuum-core --lib cognition::host_capability_probe: 6/6 pass
  • npx tsx scripts/build-with-loud-failure.ts: TypeScript clean
  • normal precommit + prepush passed (no --no-verify)

Out of scope (followup PRs)

  • Wire detect_host_capability() into server boot so HostCapability becomes a runtime singleton
  • Re-detect on hardware-change events (battery, thermal throttle)
  • Memory-share heuristic tuning (currently total / 2; right number needs adaptive_throughput lease coordination)
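
For reference, the current memory-share heuristic amounts to the arithmetic below. This is a sketch under assumptions: the function name and the use of bytes as the unit are hypothetical, and only the divide-by-two rule comes from this PR.

```rust
/// Current heuristic as described in this PR: half of total memory.
/// The name and the byte unit are assumptions for illustration.
pub fn memory_share_bytes(total_memory_bytes: u64) -> u64 {
    total_memory_bytes / 2
}
```

On a host with 16 GiB of total memory this yields 8 GiB. A tuned version would presumably take lease state from adaptive_throughput as an input rather than use a fixed divisor.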

Stack ordering

Logically depends on #1074 (uses HostCapability/HwCapabilityTier types). HostCapability itself shipped earlier in #1066, so this PR compiles against canary today. When #1074 lands, standard_persona(detect_host_capability(...).unwrap()) becomes the natural production call.

🤖 Generated with Claude Code

Position 1 PR #1074 shipped the typed primitive (standard_persona(host)).
Without a probe, every caller has to construct HostCapability by hand —
the resolver is callable but not used. This is the production probe.

cognition/host_capability_probe.rs (pure, single file, ~270 lines):
- detect_host_capability(gpu_monitor: &dyn GpuMonitor, system_info: &System)
  -> Result<HostCapability, ProbeError>
- Maps GpuMonitor::platform to TargetSilicon and dispatches device-name
  pattern-matching:
  * metal → UnifiedMemory + Apple-Silicon tier (M1Uma8Gb, M1Uma16Gb,
    M2UmaProMax, M3UmaProMax) from CPU brand + total memory bucket
  * cuda → Gpu + Sm70..Sm120 tier from device-name (RTX 5090 → Sm120,
    H100 → Sm90, A100 → Sm80, T4/RTX 20xx → Sm75, V100 → Sm70, etc.)
  * vulkan → Gpu + VulkanAmd
  * mock → M1Uma16Gb (test fixture)
- ProbeError variants:
  * UnknownGpuDevice{platform, device_name} — pattern-match miss; loud
    fail per Joel's NO COMPROMISE rule (no silent CpuOnly fallback)
  * UnsupportedPlatform{platform} — fires when GpuMonitor reports an
    unrecognized platform string

Pattern-ordering is load-bearing in nvidia_sm_tier(): A100 must be
checked before A10/A40 because "A10" is a substring of "A100" — the
tests cover this regression vector explicitly. Comment in the source
calls it out.

Tests: 6/6 cognition::host_capability_probe pass:
- mock_platform_returns_test_fixture
- unsupported_platform_errors_loudly
- nvidia_pattern_match_resolves_known_skus (9 device fixtures)
- nvidia_unknown_sku_errors_no_silent_fallback
- apple_silicon_tier_mapping
- export_bindings_probeerror

Validation:
- cargo test --features metal,accelerate -p continuum-core --lib
  cognition::host_capability_probe: 6/6
- npx tsx scripts/build-with-loud-failure.ts: TypeScript clean

Out of scope (separate followups):
- Wiring detect_host_capability() into the actual server boot path so
  HostCapability becomes a runtime singleton callers can read
- Re-detect on hardware-change events (battery, thermal throttle)
- Memory-share heuristic (currently total_mem / 2; the right number
  needs adaptive_throughput integration to coordinate with leases)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@joelteply joelteply marked this pull request as ready for review May 11, 2026 17:34
@joelteply joelteply merged commit e93024d into canary May 11, 2026
2 checks passed
@joelteply joelteply deleted the feat/host-capability-probe branch May 11, 2026 21:11
joelteply added a commit that referenced this pull request May 29, 2026
…re, no node for core, persona migration as L0

Reframes the 37-item roadmap against the architectural ground rules
laid down 2026-05-29:

1. Rust core; Node.js is web only (browser UI, config-load at boot,
   human UX). Anything routing / persisting / dispatching / reasoning
   lives in Rust.
2. AI persona under Rust domain — PersonaUser was CPU-killing.
3. GPU or fail for inference + training.
4. No `dyn Any` / `as_any` patterns — debt when a trait requires them.
5. ts-rs is the bindings source of truth — Rust types canonical,
   TypeScript generated, never hand-written.
6. Inference through llama.cpp; never ollama; candle for training only.

Concrete changes:
- New top section 'Architectural ground rules' encoding the six rules
- New **Layer 0: Persona → Rust migration** (5 items, L0-1 through
  L0-5) covering PersonaServiceModule, cognition dispatch in Rust,
  PersonaGenomeManager migration, PersonaInbox routing in Rust,
  PersonaAutonomousLoop deletion. L0 is parallel to L1, independent.
  Overall item count: 37 → 42.
- Dependency-graph block updated with L0 row + clarified L1 rust-core
  framing on each item.
- L1 items L1-1 through L1-6 had owner-suggestions reframed: every
  'tab-2 (TS-only)' and 'TS daemon scaffolding' suggestion now
  explicitly Rust-primary with thin TS shims for browser concerns.
  Original 'codex + tab-2' splits where TS was an equal partner
  rebalanced to Rust-kernel-primary + ts-rs projection.
- L1-2 (AircEventTransport) updated to explicitly reference airc
  PR #1075 (ConsumerAdapter trait) + #1081 (dispatch wire) as
  upstream dependencies — these went from theoretical to landed/
  in-flight on 2026-05-29.

Per Joel: 'we can update or just merge it in' — this is the update
path. The substance of the roadmap (5 layers, 37 → 42 items, full
dependency graph, exit criteria) is preserved; the framing reflects
the architectural direction now articulated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
joelteply added a commit that referenced this pull request May 29, 2026
…st (#1442)

* docs(grid): GRID-MIGRATION-ROADMAP — 37-item phased migration checklist

Sibling doc to GRID-BUS-ARCHITECTURE.md (#1439) + MULTI-PEER-COMMANDS.md.
Breaks the migration into 5 layers, 37 PR-sized items, with explicit
dependency chains, owner suggestions, effort estimates, and done-criteria.

Layers:
  L1 Foundation (substrate) — 6 items — hard prereq for L2-L5
  L2 Chat migration — 5 items — finishes chat-out-of-ORM work
  L3 Alloy refactor — 3 items (per FORGE-ALLOY-DOMAIN-EXTENSIBILITY.md WI 0-5)
  L4 Per-command opt-in — 18 items across Phases A-G
  L5 Patch deletion — 5 items, interleaved with L2-L4 as upstreams complete

Per Joel's instruction: PR descriptions reference this roadmap by item
ID (L#-N format); mergers check off [x] + append merge metadata
(yyyy-mm-dd PR#). Status table at the top auto-summarizes by counting
checkboxes.

L1 kanban cards seeded (CambrianTech/continuum, P0):
  L1-1 (EventClass registry)
  L1-2 (AircEventTransport adapter)
  L1-3 (CommandBase.naturalScope + CommandParams.scope)
  L1-4 (presence:peer-manifest + capability index)
  L1-5 (grid-router-daemon + bid loop)
  L1-6 (contract event chain + ed25519 signatures)

L2-L5 cards NOT pre-populated — created as upstream items unblock, so
the cards reflect the reality the design encountered rather than the
pre-implementation guess.

Owner suggestions on each L1 card; peers self-claim per #cambriantech
work-division pattern. Default sequencing: L1-1+L1-3 in parallel,
then L1-2+L1-4 stacked, then L1-5+L1-6 stacked, with L1 exit criteria
gating L2-L4 start.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(grid): align roadmap with Joel's 2026-05-29 directives — rust core, no node for core, persona migration as L0


---------

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>