feat(messaging): migrate enrollment to manifest hooks#4248
Conversation
Signed-off-by: San Dang <sdang@nvidia.com>
Signed-off-by: San Dang <sdang@nvidia.com>
Signed-off-by: San Dang <sdang@nvidia.com>
# Conflicts: # src/lib/messaging/manifest/types.test.ts
## Summary Adds the phase-1 messaging manifest compiler that converts channel manifests into a serializable sandbox messaging plan. The compiler resolves channel inputs through env keys and interactive enrollment hooks, then delegates credential, policy, render, build-step, state-update, and health-check planning to small pure engines. ## Related Issue Fixes #3994 ## Changes - Add `ManifestCompiler` with interactive enrollment-hook input resolution and env-key input initialization. - Add compiler plan engines for credential bindings, network policy, agent render fragments, build steps, state updates, and health checks. - Expand `SandboxMessagingPlan` and related manifest plan types to the top-level plan shape required by #3994. - Add coverage for built-in Telegram/Discord/Slack/WeChat/WhatsApp plans, Hermes WeChat policy aliasing, non-interactive env input behavior, secret-free JSON plans, disabled channels, and a synthetic non-built-in channel. ## Type of Change - [x] Code change (feature, bug fix, or refactor) - [ ] Code change with doc updates - [ ] Doc only (prose changes, no code sample modifications) - [ ] Doc only (includes code sample changes) ## Verification - [ ] `npx prek run --all-files` passes - [ ] `npm test` passes - [x] Tests added or updated for new or changed behavior - [x] No secrets, API keys, or credentials committed - [ ] Docs updated for user-facing behavior changes - [ ] `make docs` builds without warnings (doc changes only) - [ ] Doc pages follow the [style guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md) (doc changes only) - [ ] New doc pages include SPDX header and frontmatter (new pages only) Additional verification performed: - `npm test -- --project cli src/lib/messaging` passes. - `npm run typecheck:cli` passes. - `npm run lint -- src/lib/messaging` passes with the existing unrelated warning in `src/lib/onboard/child-exit-tracker.test.ts`. - `git diff --check` passes. - `npm run source-shape:check` passes. - `npx prek run --all-files` and the normal pre-push hook were attempted and currently fail in unrelated full CLI doctor/debug/snapshot tests outside the messaging compiler changes. --- Signed-off-by: San Dang <sdang@nvidia.com> <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * Added manifest compilation system for messaging channels with support for multiple agents and workflows * Implemented credential binding and authentication management * Added network policy configuration and agent rendering capabilities * Introduced health check and build step planning * Added state persistence and hydration management * Implemented placeholder resolution for sandbox names and credentials * **Tests** * Added comprehensive test suite validating compilation behavior, credential handling, and plan serialization <!-- review_stack_entry_start --> [](https://app.coderabbit.ai/change-stack/NVIDIA/NemoClaw/pull/4069?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack) <!-- review_stack_entry_end --> <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: San Dang <sdang@nvidia.com>
Signed-off-by: San Dang <sdang@nvidia.com>
Signed-off-by: San Dang <sdang@nvidia.com>
Signed-off-by: San Dang <sdang@nvidia.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (6)
🚧 Files skipped from review as they are similar to previous changes (6)
📝 WalkthroughWalkthroughMigrates messaging enrollment to manifest-driven hooks and planner workflows: new manifest fields and utilities, shared config-prompt/token-paste hooks, Slack credential validation, WeChat host-QR runtime, compiler/planner changes, onboarding refactor to return SandboxMessagingPlan, policy-channel persistence, and broad test/e2e updates. ChangesMessaging Channel Enrollment Manifest Migration
Estimated code review effort 🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
|
…messaging-enrollment # Conflicts: # src/lib/messaging/applier/index.ts # src/lib/messaging/applier/setup-applier.test.ts # src/lib/messaging/channels/discord/manifest.ts # src/lib/messaging/channels/manifests.test.ts # src/lib/messaging/channels/slack/manifest.ts # src/lib/messaging/channels/telegram/hooks/get-me-reachability.test.ts # src/lib/messaging/channels/telegram/hooks/get-me-reachability.ts # src/lib/messaging/channels/telegram/hooks/index.ts # src/lib/messaging/channels/telegram/manifest.ts # src/lib/messaging/channels/wechat/hooks/ilink-login.ts # src/lib/messaging/channels/wechat/hooks/implementations.test.ts # src/lib/messaging/channels/wechat/hooks/index.ts # src/lib/messaging/channels/wechat/manifest.ts # src/lib/messaging/channels/whatsapp/manifest.ts # src/lib/messaging/compiler/manifest-compiler.test.ts # src/lib/messaging/compiler/manifest-compiler.ts # src/lib/messaging/compiler/workflow-planner.test.ts # src/lib/messaging/hooks/builtins.ts # src/lib/messaging/hooks/common/index.ts # src/lib/messaging/hooks/common/token-paste.test.ts # src/lib/messaging/hooks/common/token-paste.ts # src/lib/messaging/hooks/hook-runner.test.ts # src/lib/messaging/hooks/hook-runner.ts # src/lib/messaging/hooks/types.ts # src/lib/messaging/manifest/types.ts
Selective E2E Results — ✅ All requested jobs passedRun: 26767258521
|
…messaging-enrollment # Conflicts: # src/lib/actions/sandbox/policy-channel.ts # src/lib/onboard.ts # src/lib/onboard/messaging-channel-setup.test.ts # src/lib/onboard/messaging-channel-setup.ts # test/channels-add-preset.test.ts # test/onboard-messaging.test.ts
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/lib/onboard.ts (1)
1122-1153:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRe-check the PID before escalating the gateway kill.
This path only tests
isPidAlive(pid)afterSIGTERM, and the initial guard here is weaker thanisDockerDriverGatewayProcessAlive()because it skipsrequireDockerDriverEnv. A recycled PID can therefore be mistaken for the recorded Docker-driver gateway and getSIGKILLed.🛡️ Suggested guard
function terminateDockerDriverGatewayProcess(pid: number): boolean { - if (!isPidAlive(pid)) { + const isExpectedGateway = () => + isDockerDriverGatewayProcess(pid, resolveOpenShellGatewayBinary(), { + requireDockerDriverEnv: shouldRequireDockerDriverEnv(), + }); + if (!isPidAlive(pid) || !isExpectedGateway()) { return false; } try { process.kill(pid, "SIGTERM"); for (let i = 0; i < 10; i += 1) { - if (!isPidAlive(pid)) break; + if (!isPidAlive(pid) || !isExpectedGateway()) break; sleepSeconds(1); } - if (isPidAlive(pid)) process.kill(pid, "SIGKILL"); + if (isPidAlive(pid) && isExpectedGateway()) { + process.kill(pid, "SIGKILL"); + } return true; } catch { return false; } } function stopDockerDriverGatewayProcess(): boolean { const pid = getDockerDriverGatewayPid(); if (pid === null || !isPidAlive(pid)) { clearDockerDriverGatewayRuntimeFiles(); return false; } - if (!isDockerDriverGatewayProcess(pid, resolveOpenShellGatewayBinary())) { + if (!isDockerDriverGatewayProcess(pid, resolveOpenShellGatewayBinary(), { + requireDockerDriverEnv: shouldRequireDockerDriverEnv(), + })) { clearDockerDriverGatewayRuntimeFiles(); return false; }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/lib/onboard.ts` around lines 1122 - 1153, terminateDockerDriverGatewayProcess currently escalates from SIGTERM to SIGKILL based only on isPidAlive; update the logic so stopDockerDriverGatewayProcess re-validates the PID is the gateway before escalation by calling isDockerDriverGatewayProcessAlive (or isDockerDriverGatewayProcess with requireDockerDriverEnv) right after the SIGTERM wait loop and before sending SIGKILL; locate terminateDockerDriverGatewayProcess and stopDockerDriverGatewayProcess and change the post-term check to verify the PID matches the recorded Docker-driver gateway (using isDockerDriverGatewayProcessAlive/isDockerDriverGatewayProcess and resolveOpenShellGatewayBinary) and only send SIGKILL when that check confirms it is still the gateway, while retaining getDockerDriverGatewayPid and clearDockerDriverGatewayRuntimeFiles behavior.src/lib/messaging/hooks/common/token-paste.ts (1)
109-109:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winNormalize the credential-store fallback in token paste
The env value is normalized, but the
readCredentialfallback is used raw; this is also howgetMessagingToken()handles credentials (normalizeCredentialValue(process.env[envKey]) || getCredential(envKey)), so stored tokens containing trailing whitespace/\rcould fail thefield.formatvalidation and be skipped.🛡️ Proposed fix
- let token = normalizeCredentialValue(env[field.envKey]) || readCredential(field.envKey); + let token = + normalizeCredentialValue(env[field.envKey]) || + normalizeCredentialValue(readCredential(field.envKey));🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/lib/messaging/hooks/common/token-paste.ts` at line 109, The token fallback uses a raw readCredential value which can contain trailing whitespace/CR and fail validation; update the token assignment (the token variable in token-paste.ts) to normalize both sources (env[field.envKey] and readCredential(field.envKey)) — i.e., use normalizeCredentialValue(...) for the env lookup and also call normalizeCredentialValue(readCredential(field.envKey)) as the fallback so stored credentials are trimmed/massaged before applying field.format validation (consistent with getMessagingToken behavior).
🧹 Nitpick comments (1)
src/lib/messaging/compiler/manifest-compiler.ts (1)
358-374: ⚡ Quick winConsider documenting the enrollment hook execution logic.
shouldRunEnrollmentHookimplements nuanced conditional logic:
- Always runs
tokenPastehandlers (line 362)- Always runs hooks with no outputs (line 365)
- For hooks with required outputs, runs if any required output is unavailable (lines 367-370)
- Always runs hooks with only config outputs (line 372)
- Otherwise runs if any output is unavailable (line 373)
This logic is sound but non-obvious. A brief inline comment explaining the rationale would help maintainability.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/lib/messaging/compiler/manifest-compiler.ts` around lines 358 - 374, Add a concise inline comment above the shouldRunEnrollmentHook function explaining its enrollment-execution rules: always run handlers ending with ".tokenPaste", run when there are no outputs, when any required output (from hook.outputs) is missing, always run if all outputs are config-only, otherwise run when any non-config output is missing; reference the helper isHookOutputAvailable and the types ChannelHookSpec and SandboxMessagingInputReference to clarify the checks and intent for future maintainers.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@src/lib/messaging/hooks/common/token-paste.ts`:
- Line 109: The token fallback uses a raw readCredential value which can contain
trailing whitespace/CR and fail validation; update the token assignment (the
token variable in token-paste.ts) to normalize both sources (env[field.envKey]
and readCredential(field.envKey)) — i.e., use normalizeCredentialValue(...) for
the env lookup and also call
normalizeCredentialValue(readCredential(field.envKey)) as the fallback so stored
credentials are trimmed/massaged before applying field.format validation
(consistent with getMessagingToken behavior).
In `@src/lib/onboard.ts`:
- Around line 1122-1153: terminateDockerDriverGatewayProcess currently escalates
from SIGTERM to SIGKILL based only on isPidAlive; update the logic so
stopDockerDriverGatewayProcess re-validates the PID is the gateway before
escalation by calling isDockerDriverGatewayProcessAlive (or
isDockerDriverGatewayProcess with requireDockerDriverEnv) right after the
SIGTERM wait loop and before sending SIGKILL; locate
terminateDockerDriverGatewayProcess and stopDockerDriverGatewayProcess and
change the post-term check to verify the PID matches the recorded Docker-driver
gateway (using isDockerDriverGatewayProcessAlive/isDockerDriverGatewayProcess
and resolveOpenShellGatewayBinary) and only send SIGKILL when that check
confirms it is still the gateway, while retaining getDockerDriverGatewayPid and
clearDockerDriverGatewayRuntimeFiles behavior.
---
Nitpick comments:
In `@src/lib/messaging/compiler/manifest-compiler.ts`:
- Around line 358-374: Add a concise inline comment above the
shouldRunEnrollmentHook function explaining its enrollment-execution rules:
always run handlers ending with ".tokenPaste", run when there are no outputs,
when any required output (from hook.outputs) is missing, always run if all
outputs are config-only, otherwise run when any non-config output is missing;
reference the helper isHookOutputAvailable and the types ChannelHookSpec and
SandboxMessagingInputReference to clarify the checks and intent for future
maintainers.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 54568311-e710-4e76-b416-1115a5725df0
📒 Files selected for processing (25)
scripts/install.shsrc/lib/actions/sandbox/policy-channel.tssrc/lib/messaging/applier/setup-applier.test.tssrc/lib/messaging/channels/manifests.test.tssrc/lib/messaging/channels/slack/hooks/credential-validation.test.tssrc/lib/messaging/channels/slack/hooks/credential-validation.tssrc/lib/messaging/channels/slack/hooks/index.tssrc/lib/messaging/channels/slack/hooks/validate-credentials.test.tssrc/lib/messaging/channels/slack/hooks/validate-credentials.tssrc/lib/messaging/channels/slack/manifest.tssrc/lib/messaging/compiler/manifest-compiler.test.tssrc/lib/messaging/compiler/manifest-compiler.tssrc/lib/messaging/compiler/workflow-planner.test.tssrc/lib/messaging/hooks/builtins.tssrc/lib/messaging/hooks/common/token-paste.test.tssrc/lib/messaging/hooks/common/token-paste.tssrc/lib/messaging/hooks/hook-runner.test.tssrc/lib/onboard.tssrc/lib/onboard/messaging-channel-setup.test.tssrc/lib/sandbox/channels.tstest/channels-add-preset.test.tstest/e2e/test-channels-stop-start.shtest/e2e/test-messaging-providers.shtest/install-docker-group-reexec.test.tstest/onboard-messaging.test.ts
💤 Files with no reviewable changes (7)
- test/install-docker-group-reexec.test.ts
- test/e2e/test-messaging-providers.sh
- test/e2e/test-channels-stop-start.sh
- src/lib/sandbox/channels.ts
- test/onboard-messaging.test.ts
- test/channels-add-preset.test.ts
- src/lib/onboard/messaging-channel-setup.test.ts
🚧 Files skipped from review as they are similar to previous changes (3)
- scripts/install.sh
- src/lib/messaging/channels/manifests.test.ts
- src/lib/actions/sandbox/policy-channel.ts
Selective E2E Results — ❌ Some jobs failedRun: 26816072600
|
Selective E2E Results — ❌ Some jobs failedRun: 26816072600
|
Signed-off-by: San Dang <sdang@nvidia.com>
Selective E2E Results — ✅ All requested jobs passedRun: 26931388430
|
## Summary Persist manifest messaging plans through channel lifecycle operations so add, stop, start, remove, and rebuild can carry the new architecture state in `SandboxEntry` while legacy registry fields continue to work. ## Related Issue Fixes #4535 Refs #3896 ## Changes - Added `MessagingWorkflowPlanner` helpers that merge a compiled add-channel plan into a stored sandbox plan and mutate stored plans for stop/start/remove/rebuild. - Updated `channels add`, `channels stop`, `channels start`, and `channels remove` to write `SandboxEntry.messaging.plan` without removing legacy registry updates. - Staged stored manifest plans during rebuild through the existing messaging plan env path. - Added planner tests for add merge, stop/start mutation, remove pruning, rebuild staging from stored plans, and no-compile behavior when no stored plan exists. ## Type of Change - [x] Code change (feature, bug fix, or refactor) - [ ] Code change with doc updates - [ ] Doc only (prose changes, no code sample modifications) - [ ] Doc only (includes code sample changes) ## Verification - [ ] `npx prek run --all-files` passes - [ ] `npm test` passes - [x] Tests added or updated for new or changed behavior - [x] No secrets, API keys, or credentials committed - [ ] Docs updated for user-facing behavior changes - [ ] `npm run docs` builds without warnings (doc changes only) - [ ] Doc pages follow the [style guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md) (doc changes only) - [ ] New doc pages include SPDX header and frontmatter (new pages only) --- Signed-off-by: San Dang <sdang@nvidia.com> <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes * **New Features** * Improved messaging channel management with manifest-driven configuration for Discord, Telegram, Slack, WeChat, and WhatsApp * Support for multiple authentication modes including token-based and QR code enrollment * Enhanced channel validation and reachability checks during setup * **Bug Fixes** * More reliable credential handling and credential binding resolution * Better error messaging and validation for channel configuration * **Tests** * Expanded test coverage for channel enrollment workflows and manifest validation <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: San Dang <sdang@nvidia.com>
Signed-off-by: San Dang <sdang@nvidia.com>
Selective E2E Results — ✅ All requested jobs passedRun: 26944524013
|
Selective E2E Results — ✅ All requested jobs passedRun: 26944897258
|
Selective E2E Results — ❌ Some jobs failedRun: 26957346522
|
2 similar comments
Selective E2E Results — ❌ Some jobs failedRun: 26957346522
|
Selective E2E Results — ❌ Some jobs failedRun: 26957346522
|
Summary
Migrates messaging enrollment from the legacy ad hoc onboard implementation to the manifest-driven workflow compiler path.
The PR keeps the existing operator-facing enrollment UX, but moves the source of truth for channel setup into channel manifests and registered messaging hooks. Telegram,
Discord, Slack, WeChat, and WhatsApp now declare their enrollment prompts, config inputs, credential bindings, policy needs, render targets, and setup hooks through the
manifest system.
3 key changes
messaging-channel-setup.ts, with shared manifest-style channelsTest results
OpenClaw + Hermes
Discord
Telegram
WeChat
Slack
Related Issue
Closes #4247 #4535
Changes
messaging-channel-setup.tsand shared helpers intomessaging/utils.ts.MessagingWorkflowPlannerresolves hooks from the manifest registry.Type of Change
Verification
npx prek run --all-filespassesnpm testpassesmake docsbuilds without warnings (doc changes only)Signed-off-by: San Dang sdang@nvidia.com
Summary by CodeRabbit
New Features
Improvements