Add host capability probe so resolver actually runs in production #1075
Merged
Position 1 PR #1074 shipped the typed primitive (standard_persona(host)). Without a probe, every caller has to construct HostCapability by hand: the resolver is callable but not used. This is the production probe.

cognition/host_capability_probe.rs (pure, single file, ~270 lines):
- detect_host_capability(gpu_monitor: &dyn GpuMonitor, system_info: &System) -> Result<HostCapability, ProbeError>
- Maps GpuMonitor::platform to TargetSilicon and dispatches device-name pattern-matching:
  * metal → UnifiedMemory + Apple-Silicon tier (M1Uma8Gb, M1Uma16Gb, M2UmaProMax, M3UmaProMax) from CPU brand + total memory bucket
  * cuda → Gpu + Sm70..Sm120 tier from device-name (RTX 5090 → Sm120, H100 → Sm90, A100 → Sm80, T4/RTX 20xx → Sm75, V100 → Sm70, etc.)
  * vulkan → Gpu + VulkanAmd
  * mock → M1Uma16Gb (test fixture)
- ProbeError variants:
  * UnknownGpuDevice{platform, device_name}: pattern-match miss; loud fail per Joel's NO COMPROMISE rule (no silent CpuOnly fallback)
  * UnsupportedPlatform{platform}: fires when GpuMonitor reports an unrecognized platform string

Pattern-ordering is load-bearing in nvidia_sm_tier(): A100 must be checked before A10/A40 because "A10" is a substring of "A100". The tests cover this regression vector explicitly, and a comment in the source calls it out.

Tests: 6/6 cognition::host_capability_probe pass:
- mock_platform_returns_test_fixture
- unsupported_platform_errors_loudly
- nvidia_pattern_match_resolves_known_skus (9 device fixtures)
- nvidia_unknown_sku_errors_no_silent_fallback
- apple_silicon_tier_mapping
- export_bindings_probeerror

Validation:
- cargo test --features metal,accelerate -p continuum-core --lib cognition::host_capability_probe: 6/6
- npx tsx scripts/build-with-loud-failure.ts: TypeScript clean

Out of scope (separate followups):
- Wiring detect_host_capability() into the actual server boot path so HostCapability becomes a runtime singleton callers can read
- Re-detect on hardware-change events (battery, thermal throttle)
- Memory-share heuristic (currently total_mem / 2; the right number needs adaptive_throughput integration to coordinate with leases)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
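The pattern-ordering constraint described in this commit message can be illustrated with a minimal sketch. It is not the code in `nvidia_sm_tier()`: the names `SmTierSketch`, `PATTERNS`, and `sm_tier_sketch` are hypothetical, the `Sm86` variant is an assumption (A10 and A40 are compute capability 8.6 parts, but the commit does not say which tier the probe assigns them), and it returns `Option` where the real probe returns a typed `ProbeError`.

```rust
/// Hypothetical subset of the SM tiers named in this PR. `Sm86` is an
/// assumption, added so that A10/A40 resolve differently from A100.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SmTierSketch {
    Sm70,
    Sm75,
    Sm80,
    Sm86,
    Sm90,
    Sm100,
    Sm120,
}

/// Ordered (substring, tier) table. The first matching row wins, so a
/// longer pattern must appear before any shorter pattern it contains.
const PATTERNS: &[(&str, SmTierSketch)] = &[
    ("RTX 5090", SmTierSketch::Sm120),
    ("B100", SmTierSketch::Sm100),
    ("H100", SmTierSketch::Sm90),
    ("A100", SmTierSketch::Sm80), // must stay above "A10"
    ("A10", SmTierSketch::Sm86),
    ("A40", SmTierSketch::Sm86),
    ("RTX 20", SmTierSketch::Sm75),
    ("T4", SmTierSketch::Sm75),
    ("V100", SmTierSketch::Sm70),
];

/// Returns the tier of the first pattern found in `device_name`,
/// or `None` when no pattern matches.
pub fn sm_tier_sketch(device_name: &str) -> Option<SmTierSketch> {
    PATTERNS
        .iter()
        .find(|(needle, _)| device_name.contains(*needle))
        .map(|(_, tier)| *tier)
}
```

If the `"A10"` row were moved above the `"A100"` row, a device name containing `A100` would resolve to the A10 tier, which is the regression the fixture test is said to pin.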
joelteply added a commit that referenced this pull request May 29, 2026
…re, no node for core, persona migration as L0

Reframes the 37-item roadmap against the architectural ground rules laid down 2026-05-29:
1. Rust core; Node.js is web only (browser UI, config-load at boot, human UX). Anything routing / persisting / dispatching / reasoning lives in Rust.
2. AI persona under Rust domain: PersonaUser was CPU-killing.
3. GPU or fail for inference + training.
4. No `dyn Any` / `as_any` patterns: debt when a trait requires them.
5. ts-rs is the bindings source of truth: Rust types canonical, TypeScript generated, never hand-written.
6. Inference through llama.cpp; never ollama; candle for training only.

Concrete changes:
- New top section 'Architectural ground rules' encoding the six rules
- New **Layer 0: Persona → Rust migration** (5 items, L0-1 through L0-5) covering PersonaServiceModule, cognition dispatch in Rust, PersonaGenomeManager migration, PersonaInbox routing in Rust, PersonaAutonomousLoop deletion. L0 is parallel to L1, independent. Overall item count: 37 → 42.
- Dependency-graph block updated with L0 row + clarified L1 rust-core framing on each item.
- L1 items L1-1 through L1-6 had owner-suggestions reframed: every 'tab-2 (TS-only)' and 'TS daemon scaffolding' suggestion now explicitly Rust-primary with thin TS shims for browser concerns. Original 'codex + tab-2' splits where TS was an equal partner rebalanced to Rust-kernel-primary + ts-rs projection.
- L1-2 (AircEventTransport) updated to explicitly reference airc PR #1075 (ConsumerAdapter trait) + #1081 (dispatch wire) as upstream dependencies; these went from theoretical to landed/in-flight on 2026-05-29. Per Joel: 'we can update or just merge it in'; this is the update path.

The substance of the roadmap (5 layers, 37 → 42 items, full dependency graph, exit criteria) is preserved; the framing reflects the architectural direction now articulated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
joelteply added a commit that referenced this pull request May 29, 2026
…st (#1442)

* docs(grid): GRID-MIGRATION-ROADMAP, 37-item phased migration checklist

Sibling doc to GRID-BUS-ARCHITECTURE.md (#1439) + MULTI-PEER-COMMANDS.md. Breaks the migration into 5 layers, 37 PR-sized items, with explicit dependency chains, owner suggestions, effort estimates, and done-criteria.

Layers:
  L1 Foundation (substrate): 6 items, hard prereq for L2-L5
  L2 Chat migration: 5 items, finishes chat-out-of-ORM work
  L3 Alloy refactor: 3 items (per FORGE-ALLOY-DOMAIN-EXTENSIBILITY.md WI 0-5)
  L4 Per-command opt-in: 18 items across Phases A-G
  L5 Patch deletion: 5 items, interleaved with L2-L4 as upstreams complete

Per Joel's instruction: PR descriptions reference this roadmap by item ID (L#-N format); mergers check off [x] + append merge metadata (yyyy-mm-dd PR#). Status table at the top auto-summarizes by counting checkboxes.

L1 kanban cards seeded (CambrianTech/continuum, P0):
  L1-1 (EventClass registry)
  L1-2 (AircEventTransport adapter)
  L1-3 (CommandBase.naturalScope + CommandParams.scope)
  L1-4 (presence:peer-manifest + capability index)
  L1-5 (grid-router-daemon + bid loop)
  L1-6 (contract event chain + ed25519 signatures)

L2-L5 cards NOT pre-populated: they are created as upstream items unblock, so the cards reflect the reality the design encountered rather than the pre-implementation guess.

Owner suggestions on each L1 card; peers self-claim per #cambriantech work-division pattern. Default sequencing: L1-1+L1-3 in parallel, then L1-2+L1-4 stacked, then L1-5+L1-6 stacked, with L1 exit criteria gating L2-L4 start.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(grid): align roadmap with Joel's 2026-05-29 directives: rust core, no node for core, persona migration as L0

Reframes the 37-item roadmap against the architectural ground rules laid down 2026-05-29:
1. Rust core; Node.js is web only (browser UI, config-load at boot, human UX). Anything routing / persisting / dispatching / reasoning lives in Rust.
2. AI persona under Rust domain: PersonaUser was CPU-killing.
3. GPU or fail for inference + training.
4. No `dyn Any` / `as_any` patterns: debt when a trait requires them.
5. ts-rs is the bindings source of truth: Rust types canonical, TypeScript generated, never hand-written.
6. Inference through llama.cpp; never ollama; candle for training only.

Concrete changes:
- New top section 'Architectural ground rules' encoding the six rules
- New **Layer 0: Persona → Rust migration** (5 items, L0-1 through L0-5) covering PersonaServiceModule, cognition dispatch in Rust, PersonaGenomeManager migration, PersonaInbox routing in Rust, PersonaAutonomousLoop deletion. L0 is parallel to L1, independent. Overall item count: 37 → 42.
- Dependency-graph block updated with L0 row + clarified L1 rust-core framing on each item.
- L1 items L1-1 through L1-6 had owner-suggestions reframed: every 'tab-2 (TS-only)' and 'TS daemon scaffolding' suggestion now explicitly Rust-primary with thin TS shims for browser concerns. Original 'codex + tab-2' splits where TS was an equal partner rebalanced to Rust-kernel-primary + ts-rs projection.
- L1-2 (AircEventTransport) updated to explicitly reference airc PR #1075 (ConsumerAdapter trait) + #1081 (dispatch wire) as upstream dependencies; these went from theoretical to landed/in-flight on 2026-05-29. Per Joel: 'we can update or just merge it in'; this is the update path.

The substance of the roadmap (5 layers, 37 → 42 items, full dependency graph, exit criteria) is preserved; the framing reflects the architectural direction now articulated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Position 1 PR-2 — production HostCapability detection
Built on top of #1074. Without this, every caller of `ModelRequirement::standard_persona(host)` constructs `HostCapability` by hand: the resolver is callable but not used in production.

Scope
`cognition::host_capability_probe::detect_host_capability(gpu_monitor, system_info)` returns `HostCapability` or a typed `ProbeError`:
- metal → `UnifiedMemory` + Apple-Silicon tier from CPU brand + memory bucket
- cuda → `Gpu` + Sm70..Sm120 tier from device-name (RTX 5090 → Sm120, H100 → Sm90, A100 → Sm80, T4/RTX 20xx → Sm75, V100 → Sm70, B100 → Sm100)
- vulkan → `Gpu` + `VulkanAmd`
- mock → `M1Uma16Gb` (tests)
- unrecognized platform → `ProbeError::UnsupportedPlatform`
- unrecognized device name → `ProbeError::UnknownGpuDevice` (no silent CpuOnly fallback, per the NO COMPROMISE bar of #1074, "Add sensory-bar requirement profile to model resolver (Position 1, PR #1072)")

Pattern-ordering note: A100 must precede A10/A40 in the match chain, because `"A10"` is a substring of `"A100"`. Test `nvidia_pattern_match_resolves_known_skus` pins this regression.

Validation
- `cargo test --features metal,accelerate -p continuum-core --lib cognition::host_capability_probe`: 6/6 pass
- `npx tsx scripts/build-with-loud-failure.ts`: TypeScript clean
- […] `--no-verify`)

Out of scope (followup PRs)
- Wire `detect_host_capability()` into server boot so `HostCapability` becomes a runtime singleton
- Memory-share heuristic (currently `total / 2`; right number needs adaptive_throughput lease coordination)

Stack ordering
Logically depends on #1074 (uses `HostCapability` / `HwCapabilityTier` types). `HostCapability` itself shipped earlier in #1066, so this PR compiles against canary today. When #1074 lands, `standard_persona(detect_host_capability(...).unwrap())` becomes the natural production call.

🤖 Generated with Claude Code
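The platform dispatch and the two loud-failure paths listed under Scope could look roughly like the sketch below. Everything in it is a stand-in rather than the PR's code: `ProbeErrorSketch`, `SiliconSketch`, and `classify_sketch` are hypothetical names, the `String` field types are assumed, the CUDA allow-list is deliberately tiny, and treating `mock` as unified memory and accepting every Vulkan device are assumptions drawn from the `M1Uma16Gb` fixture tier and the single `VulkanAmd` tier.

```rust
use std::fmt;

/// Hypothetical stand-in for the PR's `ProbeError`; field types are assumed.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ProbeErrorSketch {
    UnknownGpuDevice { platform: String, device_name: String },
    UnsupportedPlatform { platform: String },
}

impl fmt::Display for ProbeErrorSketch {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::UnknownGpuDevice { platform, device_name } => {
                write!(f, "unknown GPU device '{device_name}' on platform '{platform}'")
            }
            Self::UnsupportedPlatform { platform } => {
                write!(f, "unsupported platform '{platform}'")
            }
        }
    }
}

impl std::error::Error for ProbeErrorSketch {}

/// Hypothetical stand-in for the silicon class chosen per platform.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SiliconSketch {
    UnifiedMemory,
    Gpu,
}

/// Dispatches on the platform string and fails loudly on anything unknown.
/// There is deliberately no CPU-only fallback arm.
pub fn classify_sketch(
    platform: &str,
    device_name: &str,
) -> Result<SiliconSketch, ProbeErrorSketch> {
    match platform {
        // Assumption: `mock` is unified memory, since its fixture tier is M1Uma16Gb.
        "metal" | "mock" => Ok(SiliconSketch::UnifiedMemory),
        // Assumption: every Vulkan device is accepted.
        "vulkan" => Ok(SiliconSketch::Gpu),
        "cuda" => {
            // Tiny illustrative allow-list, far shorter than a real SKU table.
            let known = ["H100", "A100", "V100"];
            if known.iter().any(|sku| device_name.contains(*sku)) {
                Ok(SiliconSketch::Gpu)
            } else {
                Err(ProbeErrorSketch::UnknownGpuDevice {
                    platform: platform.to_string(),
                    device_name: device_name.to_string(),
                })
            }
        }
        other => Err(ProbeErrorSketch::UnsupportedPlatform {
            platform: other.to_string(),
        }),
    }
}
```

Because both failure modes are ordinary enum variants, a caller that prefers not to `unwrap()` can match on them or propagate them with `?`.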