feat: Add system subclusters and kernel facet service#803
Open
f58f8c0 to
079a10a
Compare
Member
Author
|
@cursor review |
Contributor
Member
Author
|
@cursor review |
2d79ecf to
0a66993
Compare
Member
Author
|
@cursor review |
rekmarks
commented
Feb 5, 2026
Comment on lines
+23
to
+28
// TODO: Remove this define block and add a process shim to VatSupervisor
// workerEndowments instead. This injects into ALL bundles but is only needed
// for libraries like immer that check process.env.NODE_ENV.
define: {
  'process.env.NODE_ENV': JSON.stringify('production'),
},
Member
Author
Going to address this in a follow-up. It requires changes to how we bundle vats with Vite, which are best left out of this PR.
rekmarks
commented
Feb 6, 2026
Comment on lines
+124
to
+127
if (isConsoleForwardMessage(message)) {
  handleConsoleForwardMessage(message);
Member
Author
We accidentally broke omnium when we introduced console forwarding because we forgot to instrument its background and offscreen for it, causing the stream handler to blow up.
Contributor
Sounds like we need some omnium e2e tests
rekmarks
commented
Feb 6, 2026
Comment on lines
+298
to
+305
// Map of allowed global names to their values
const allowedGlobals: Record<string, unknown> = {
  Date: globalThis.Date,
};
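The allowlist in this snippet implies a lookup step when a vat's `globals` config is turned into endowments. A minimal sketch of that step, assuming a hypothetical `resolveGlobals` helper and an assumed policy of throwing on unknown names (the PR's actual handling may differ):

```typescript
// Map of allowed global names to their values (mirrors the reviewed snippet).
const allowedGlobals: Record<string, unknown> = {
  Date: globalThis.Date,
};

// Hypothetical helper: turn requested global names into endowments and
// reject any name that is not on the allowlist.
function resolveGlobals(requested: string[]): Record<string, unknown> {
  const endowments: Record<string, unknown> = {};
  for (const name of requested) {
    if (!Object.prototype.hasOwnProperty.call(allowedGlobals, name)) {
      throw new Error(`Global "${name}" is not allowed`);
    }
    endowments[name] = allowedGlobals[name];
  }
  return endowments;
}
```

Checking with `hasOwnProperty` rather than a plain property read keeps inherited names such as `toString` from slipping through the allowlist.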
Member
Author
Stack
Managed by gh-stack |
Implement system vats that are launched at kernel initialization and have access to privileged kernel services. Key changes:
- Add SystemVatConfig type and getSystemVatRoot method to Kernel
- Launch system vats after queue starts to avoid deadlock
- Terminate and relaunch existing system vat subclusters on restart
- Add bootstrap-vat.js for Omnium system services with CapletController
- Add baggage-backed storage adapter for vat persistence
- Pass systemVats config via URL params from offscreen to kernel worker
- Update background.ts to use system vat for caplet operations
- Add process.env.NODE_ENV replacement in vat bundler for SES compatibility
- Simplify kernel-facet.ts by removing SystemVatManager
- Add duplicate name check in KernelServiceManager.registerKernelServiceObject
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Rename bootstrap-vat.js to bootstrap-vat.ts with full type annotations
- Export Baggage type from baggage-adapter.ts
- Make logger optional throughout controller hierarchy
- Simplify defineMethods to take array of method names instead of object map
- Update background.ts to use simplified method names (install, uninstall, etc.)
- Update package.json build script to reference .ts file
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add proper TypeScript types for KernelFacet, BootstrapServices, VatParameters
- Use types from @metamask/ocap-kernel (Baggage, ClusterConfig, etc.)
- Remove JSDoc type annotations in favor of TypeScript types
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Throw errors instead of silently recovering when a persisted system subcluster has an empty vats array or missing root object. These conditions indicate database corruption and should fail fast. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…estart The controller vat created a new PromiseKit on every initialization but only resolved it in bootstrap(). Since bootstrap() is not called during resuscitation (kernel restart), all caplet methods would hang. Fix by restoring kernelFacet from baggage and initializing the CapletController immediately in buildRootObject when available. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
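The resuscitation fix in this commit can be sketched as follows. This is an illustration only: baggage is modeled as a plain `Map`, and `buildRootObjectSketch`, `isReady`, and the `'kernelFacet'` key are hypothetical names, not the PR's code.

```typescript
// Assumption: baggage is modeled as a Map; the real Baggage type differs.
type BaggageSketch = Map<string, unknown>;

function buildRootObjectSketch(baggage: BaggageSketch) {
  let controllerReady = false;

  // Persist the facet so a later incarnation can find it, then mark ready.
  const initController = (kernelFacet: unknown): void => {
    baggage.set('kernelFacet', kernelFacet);
    controllerReady = true;
  };

  // Resuscitation path: bootstrap() is not called again after a kernel
  // restart, so initialize immediately when the facet is already stored.
  if (baggage.has('kernelFacet')) {
    initController(baggage.get('kernelFacet'));
  }

  return {
    bootstrap: (kernelFacet: unknown): void => initController(kernelFacet),
    isReady: (): boolean => controllerReady,
  };
}
```

In the first incarnation the controller becomes ready only once `bootstrap()` runs; a second call with the same baggage is ready as soon as the root object is built.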
Add a TODO comment noting that the define block should be replaced with a process shim in VatSupervisor workerEndowments. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix baggage adapter to use actual delete() instead of null tombstones
- Rename root to rootObject in KernelFacetLaunchResult for clarity
- Add subclusterId format validation in Kernel.getSubcluster()
- Add duplicate system subcluster name detection at kernel init
- Clarify comment in controller-vat resuscitation path
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
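To illustrate the first item, here is a rough sketch of a storage adapter that deletes entries outright rather than writing `null` tombstones. Baggage is again modeled as a `Map`, and `makeStorageAdapterSketch` is a hypothetical name.

```typescript
// Assumption: baggage is modeled as a Map; the real Baggage type differs.
function makeStorageAdapterSketch(baggage: Map<string, unknown>) {
  return {
    get: (key: string): unknown => baggage.get(key),
    set: (key: string, value: unknown): void => {
      baggage.set(key, value);
    },
    has: (key: string): boolean => baggage.has(key),
    // Remove the entry itself. With a null tombstone, has() would still
    // report the key and key iteration would still visit it.
    delete: (key: string): void => {
      baggage.delete(key);
    },
  };
}
```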
Async kernel service invocations can cause multiple concurrent connection attempts when processing many messages, which triggers the default rate limiter. Increase maxConnectionAttemptsPerMinute to avoid interference with the queue limit test. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace the KernelFacade type and makeKernelFacade factory (from kernel-browser-runtime) with KernelFacet and makeKernelFacet (from ocap-kernel). The kernel facet is now a thin delegate layer over the kernel, with the only additions being ping() and getVatRoot(). Key changes:
- Add missing methods to KernelFacet (ping, pingVat, getSystemSubclusterRoot, reset, queueMessage)
- Add Kernel.provideFacet() for idempotent facet creation, replacing the boolean flag and #ensureKernelFacetRegistered()
- Move throw-on-missing logic for getSystemSubclusterRoot into Kernel
- Rename bootstrapRootKref to rootKref in SubclusterLaunchResult
- Remove KernelFacade type, makeKernelFacade, KernelFacetLaunchResult, and LaunchResult from kernel-browser-runtime
- Update all consumers (omnium-gatherum, extension) to use KernelFacet
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add Kernel.getPresence(kref, iface = 'Kernel Object') as a public method that wraps kslot(). Remove getVatRoot from KernelFacet and replace it with getPresence, which is now a delegated dependency rather than a standalone kslot call. Update controller-vat.ts to call E(kernelFacet).getPresence(kref, 'vatRoot') instead of E(kernelFacet).getVatRoot(kref). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace individual method declarations with a spread of the deps
object. Since every method except ping() is a direct delegate, the
facet is now just `makeDefaultExo('kernelFacet', { ...deps, ping })`.
Simplify tests accordingly — use plain functions instead of vi.fn()
mocks (which get frozen by harden()).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
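A rough sketch of the "spread of the deps object" shape this commit describes. `makeExoSketch` is a stand-in that only freezes a shallow copy (the real `makeDefaultExo` does more), the method signatures are simplified assumptions, and returning `'pong'` from `ping()` is assumed.

```typescript
// Method names come from the PR text; signatures are simplified assumptions.
type KernelFacetDeps = {
  pingVat: (vatId: string) => string;
  queueMessage: (target: string, method: string) => string;
};

// Stand-in for makeDefaultExo: freeze a shallow copy of the methods.
function makeExoSketch<T extends object>(_name: string, methods: T): Readonly<T> {
  return Object.freeze({ ...methods });
}

function makeKernelFacetSketch(deps: KernelFacetDeps) {
  // ping() is the only method added on top of the direct delegates.
  const ping = (): string => 'pong';
  return makeExoSketch('kernelFacet', { ...deps, ping });
}
```

Because every delegate is passed through unchanged, a test can supply plain functions as `deps` and assert on their return values directly.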
…string, VatId> Replace the array-based vat storage with a name-keyed record, making the vat name→ID relationship explicit and eliminating the fragile index-based bootstrap vat lookup in Kernel.ts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
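A small sketch of why the name-keyed record helps. The `SubclusterSketch` shape and the `bootstrap` field are assumptions made for illustration; only the `Record<string, VatId>` idea comes from the commit.

```typescript
type VatId = string;

// Assumed shape: `bootstrap` names the bootstrap vat, `vats` maps name -> ID.
type SubclusterSketch = {
  bootstrap: string;
  vats: Record<string, VatId>;
};

// Look the bootstrap vat up by name instead of relying on array position.
function getBootstrapVatId(subcluster: SubclusterSketch): VatId {
  const vatId: VatId | undefined = subcluster.vats[subcluster.bootstrap];
  if (vatId === undefined) {
    throw new Error(`Subcluster has no bootstrap vat "${subcluster.bootstrap}"`);
  }
  return vatId;
}
```

With an array, the same lookup depends on the bootstrap vat keeping a fixed index, which breaks silently if launch order changes.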
… CapTP integration test Delegate each vi.fn() mock through a wrapper function before passing to makeKernelFacet, so harden() freezes the wrappers instead of the original mock instances, keeping vitest call tracking intact. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
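The wrapper trick can be modeled without vitest. In this sketch `deepFreeze` is a simplified stand-in for `harden()` (it only walks enumerable own properties), and `makeMockSketch` is a hand-rolled mock that records calls in a `calls` array; real `vi.fn()` internals differ.

```typescript
// Simplified stand-in for harden(): recursively freeze enumerable own values.
function deepFreeze<T>(value: T): T {
  const isObjectLike =
    value !== null && (typeof value === 'object' || typeof value === 'function');
  if (isObjectLike && !Object.isFrozen(value)) {
    Object.freeze(value);
    Object.values(value as object).forEach((child) => deepFreeze(child));
  }
  return value;
}

// Hand-rolled mock that tracks calls on a property of the function itself.
type MockFn = ((...args: unknown[]) => void) & { calls: unknown[][] };
function makeMockSketch(): MockFn {
  const calls: unknown[][] = [];
  const fn = (...args: unknown[]): void => {
    calls.push(args);
  };
  return Object.assign(fn, { calls });
}

// Wrapper: the frozen object holds this closure, so the mock and its
// call log are never reached by the freeze.
function delegate(mock: MockFn): (...args: unknown[]) => void {
  return (...args) => mock(...args);
}
```

Freezing `{ run: mock }` directly also freezes `mock.calls`, so recording a call throws; freezing `{ run: delegate(mock) }` leaves the call log mutable.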
reloadSubcluster() creates a new subcluster with a new ID, but was not updating #systemSubclusterRoots or the persisted systemSubcluster.* mappings. This left stale mappings that caused 'has no bootstrap vat' errors on subsequent kernel restarts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…o SubclusterManager System subcluster state and logic (persist/restore/cleanup mappings, launch new named subclusters, track roots) belongs in SubclusterManager which already owns subcluster CRUD, termination, and reload. This moves ~140 lines out of Kernel.ts into SubclusterManager, keeping Kernel as a thin orchestration layer that delegates to its managers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…registration The kernelFacet kernel service now takes ko3, shifting all vat root ko IDs by 1. Update hardcoded ko references in control-panel, object-registry, and remote-comms e2e tests accordingly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…usters reloadAllSubclusters bypasses reloadSubcluster and has its own loop that calls addSubcluster + launchVatsForSubcluster directly, so it never updated the in-memory systemSubclusterRoots map or persisted mappings. After a reload-all, getSystemSubclusterRoot() would return stale krefs pointing to deleted objects. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace local Baggage type definition with the one exported from ocap-kernel, which includes keys() for native iteration. This eliminates the manual __storage_keys__ tracking in the baggage adapter. Also replace local LaunchResult type with SubclusterLaunchResult from ocap-kernel, and remove dead resuscitation guard in controller-vat bootstrap. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Construct a fallback Logger in controller-vat when vatPowers.logger is not provided, ensuring a real Logger is always passed downstream. This makes the logger property non-optional in ControllerConfig, Controller, ControllerStorage, and CapletController, eliminating optional chaining on logger calls throughout. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…nally Instead of the caller manually binding each method, makeKernelFacet now takes the kernel instance directly and iterates over a const array of method names to bind them. This reduces the call site in Kernel.ts from 12 lines to 1. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
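One way to picture the internal binding, as a sketch under stated assumptions: `KernelSketch`, its two methods, and `makeBoundFacet` are hypothetical, and the real method list is longer.

```typescript
class KernelSketch {
  private status = 'running';

  getStatus(): string {
    return this.status;
  }

  pingVat(vatId: string): string {
    return `pong from ${vatId} (${this.status})`;
  }
}

// Const array of the method names exposed on the facet.
const facetMethodNames = ['getStatus', 'pingVat'] as const;
type FacetMethodName = (typeof facetMethodNames)[number];
type BoundFacet = Pick<KernelSketch, FacetMethodName> & { ping: () => string };

function makeBoundFacet(kernel: KernelSketch): BoundFacet {
  const facet: Record<string, unknown> = { ping: () => 'pong' };
  for (const name of facetMethodNames) {
    // Bind so the method keeps its `this` when called through the facet.
    const method: (...args: never[]) => unknown = kernel[name];
    facet[name] = method.bind(kernel);
  }
  return Object.freeze(facet) as unknown as BoundFacet;
}
```

The call site then shrinks to a single `makeBoundFacet(kernel)`, and adding a delegate means adding one name to the array.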
Replace the positional (resetStorage, mnemonicOrOptions) parameters with a single options object. resetStorage defaults to true since nearly every call site uses that value. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
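A minimal sketch of the options-object pattern with the `true` default. The function name `resolveKernelOptions` and the `mnemonic` field are assumptions; the commit does not name the function or list the other options.

```typescript
// Hypothetical options shape; only resetStorage and its default come from
// the commit message.
type KernelOptionsSketch = {
  resetStorage?: boolean;
  mnemonic?: string;
};

function resolveKernelOptions(
  { resetStorage = true, mnemonic }: KernelOptionsSketch = {},
): { resetStorage: boolean; mnemonic: string | undefined } {
  return { resetStorage, mnemonic };
}
```

Call sites that previously passed `true` positionally can now pass nothing, and only the exceptions spell out `{ resetStorage: false }`.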
Apply vitest eslint config to `**/test/**/*` in addition to `**/*.test.ts` files, so non-test-named files under test directories also get the right rules. Remove now-unnecessary eslint-disable comments in system-vat.ts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The omnium.caplet type was declared as Promisified<CapletControllerFacet> but the implementation routes through queueMessage, returning raw CapData instead of deserialized values. Replace with explicit method signatures using QueueMessageResult, and add the missing callCapletMethod and getCapletRoot methods. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
On restart with an empty systemSubclusters array, the kernel facet was never registered because provideFacet() was guarded by configs.length > 0. Persisted run queue items targeting the kernel facet kref would cause invokeKernelService to throw, crashing the kernel queue. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
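The fix amounts to registering the facet unconditionally. A sketch of that shape, where `KernelInitSketch`, `services`, and `launched` are hypothetical names used only for illustration:

```typescript
type FacetSketch = { ping: () => string };

class KernelInitSketch {
  private facet: FacetSketch | undefined;

  readonly services = new Map<string, FacetSketch>();

  readonly launched: string[] = [];

  // Idempotent: the first call registers the facet, later calls reuse it.
  provideFacet(): FacetSketch {
    if (this.facet === undefined) {
      this.facet = Object.freeze({ ping: () => 'pong' });
      this.services.set('kernelFacet', this.facet);
    }
    return this.facet;
  }

  init(systemSubclusters: string[]): void {
    // No `length > 0` guard: persisted run-queue items may target the facet
    // even when no system subclusters are configured.
    this.provideFacet();
    for (const name of systemSubclusters) {
      this.launched.push(name);
    }
  }
}
```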
48ff5d1 to
31868d8
Compare
Adds support for "system subclusters" - statically declared subclusters that are launched at kernel initialization and persist across kernel restarts. System subclusters can receive powerful kernel services not available to normal vats via the `KernelFacet`. In summary:
- `globals` config to allow vats to receive specific globals (like `Date`) in their SES Compartment
- `name -> id` mapping for subcluster vats, which facilitates identifying the bootstrap vat of a launched subcluster

Note
High Risk
Touches core kernel initialization, persistence, service invocation, and CapTP bootstrap semantics; mistakes could break subcluster restore/cleanup, message routing, or introduce deadlocks/race conditions during kernel service calls.
Overview
Adds system subclusters to `@metamask/ocap-kernel`: a new `systemSubclusters` init option persists a name→subcluster mapping in the store, restores or deletes orphaned system subclusters on boot (without starting their vats), launches new ones after the run-queue starts, exposes `getSystemSubclusterRoot`, and clears this state on `reset`.

Replaces the CapTP-exposed `KernelFacade` with a `KernelFacet` kernel service (`makeKernelFacet`), wiring CapTP bootstrap to `kernel.provideFacet()` and making kernel-service invocation non-blocking (promise-chained) to avoid crank deadlocks; this cascades into API changes like `SubclusterLaunchResult.rootKref` (renamed from `bootstrapRootKref`) and a subcluster `vats` record keyed by vat name.

Expands vat configuration with an allowlisted `globals` array (e.g., `Date`) that injects selected globals into worker endowments, and adds a Vite `define` for `process.env.NODE_ENV` when bundling vats.

Updates Omnium Gatherum to run controllers via a kernel-launched controller vat/system subcluster and route caplet operations through `queueMessage`, adjusts extension/runtime typings and E2E expectations for shifted krefs, and adds NodeJS E2E coverage for system subcluster lifecycle/persistence/reload behavior.

Written by Cursor Bugbot for commit 31868d8. This will update automatically on new commits.
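The boot-time handling of system subclusters described in the overview (restore, delete orphans, launch new ones later) can be sketched as a pure planning step. `planSystemSubclusters` and `BootPlan` are hypothetical names, and the persisted mapping is modeled as a `Map` from name to subcluster ID.

```typescript
type BootPlan = {
  restored: Map<string, string>; // name -> persisted subcluster ID to keep
  orphaned: string[]; // persisted subcluster IDs no longer configured
  toLaunch: string[]; // configured names with no persisted entry
};

function planSystemSubclusters(
  persisted: Map<string, string>,
  configuredNames: string[],
): BootPlan {
  const configured = new Set(configuredNames);
  if (configured.size !== configuredNames.length) {
    throw new Error('Duplicate system subcluster name in config');
  }

  const restored = new Map<string, string>();
  const orphaned: string[] = [];
  persisted.forEach((subclusterId, name) => {
    if (configured.has(name)) {
      restored.set(name, subclusterId);
    } else {
      orphaned.push(subclusterId);
    }
  });

  // New subclusters are only identified here; launching them would happen
  // after the run queue starts.
  const toLaunch = configuredNames.filter((name) => !persisted.has(name));
  return { restored, orphaned, toLaunch };
}
```

Keeping this step free of side effects makes the restore/delete/launch split easy to test separately from the kernel's queue.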