feat(bridge): add Slack Socket Mode integration#1129

Open
caco26i wants to merge 13 commits into NVIDIA:main from caco26i:feat/slack-bridge

Conversation

@caco26i

@caco26i caco26i commented Mar 30, 2026

Summary

This PR adds support for a Slack Socket Mode integration, mirroring the existing Telegram bridge functionality. It allows users to interact with their OpenClaw sandbox directly from Slack.

Changes

  • Added scripts/slack-bridge.js to handle Slack Socket Mode connections and forward messages to the sandbox.
  • Updated scripts/start-services.sh to start and stop the Slack bridge alongside the Telegram bridge when SLACK_BOT_TOKEN and SLACK_APP_TOKEN are present.
  • Updated bin/nemoclaw.js to load credentials from ~/.nemoclaw/credentials.json before starting services.
  • Stopped passing Slack tokens to the sandbox as environment variables in bin/lib/onboard.js to prevent credential exposure.
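
The credential-loading step in bin/nemoclaw.js could look roughly like the sketch below. It assumes credentials.json is a flat JSON object mapping environment variable names to string values (the file layout is not shown in this PR) and that values already present in the environment take precedence. The helper name `loadPersistedCredentials` is hypothetical.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: copies persisted credentials into `env` without
// overwriting values that are already set. Assumes a flat JSON object
// such as {"SLACK_BOT_TOKEN": "xoxb-..."}; the real file layout may differ.
function loadPersistedCredentials(
  file = path.join(os.homedir(), ".nemoclaw", "credentials.json"),
  env = process.env,
) {
  let stored;
  try {
    stored = JSON.parse(fs.readFileSync(file, "utf-8"));
  } catch {
    return []; // Missing or unreadable file: nothing to load.
  }
  if (stored === null || typeof stored !== "object" || Array.isArray(stored)) {
    return [];
  }
  const loaded = [];
  for (const [key, value] of Object.entries(stored)) {
    // Only fill gaps; never replace a value the caller exported explicitly.
    if (typeof value === "string" && value !== "" && !env[key]) {
      env[key] = value;
      loaded.push(key);
    }
  }
  return loaded;
}
```

Returning the list of loaded keys (never the values) makes it possible to log what was picked up without printing secrets.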

Type of Change

  • Code change for a new feature, bug fix, or refactor.
  • Code change with doc updates.
  • Doc only. Prose changes without code sample modifications.
  • Doc only. Includes code sample changes.

Testing

  • npx prek run --all-files passes (or equivalently make check).
  • npm test passes.
  • make docs builds without warnings. (for doc-only changes)

Checklist

General

Code Changes

  • Formatters applied — npx prek run --all-files auto-fixes formatting (or make format for targeted runs).
  • Tests added or updated for new or changed behavior.
  • No secrets, API keys, or credentials committed.
  • Doc pages updated for any user-facing behavior changes (new commands, changed defaults, new features, bug fixes that contradict existing docs).

Summary by CodeRabbit

  • New Features

    • Slack bridge: live Slack messages forward to the agent and threaded responses are posted back (chunking, session reset, per-channel cooldown).
  • Improvements

    • Slack bridge shown in service status and stopped/started with services; reports “running” or “not started (missing tokens)”.
    • Persisted credentials are loaded into the environment and Slack credentials can be injected into sandbox runs when available.

Signed-off-by: NemoClaw Bot <bot@example.com>
Made-with: Cursor
@coderabbitai
Contributor

coderabbitai bot commented Mar 30, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds a Slack Socket Mode bridge service, injects persisted Slack credentials into sandbox environments and process.env, and updates service start/stop/status flows to conditionally manage and report the new slack-bridge based on available Slack tokens and validated sandbox names.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Credential & env handling: bin/lib/onboard.js, bin/nemoclaw.js | onboard now conditionally injects SLACK_APP_TOKEN, SLACK_BOT_TOKEN, and NEMOCLAW_OPENCLAW_SLACK_GATEWAY into sandbox env when present. bin/nemoclaw.js calls ensureApiKey(), loads persisted credentials into process.env if missing, and conditionally prefixes SANDBOX_NAME=<defaultSandbox> for start-services.sh --stop/--status when the sandbox name matches /^[a-zA-Z0-9._-]+$/. |
| Slack bridge script: scripts/slack-bridge.js | New executable Socket Mode bridge that validates SLACK_APP_TOKEN/SLACK_BOT_TOKEN/NVIDIA_API_KEY, enforces per-channel cooldown and busy gate, spawns SSH using a temporary ssh-config to run nemoclaw-start openclaw agent (derived --session-id), applies a 120s timeout, filters agent output, and posts threaded responses split into 3000-char chunks. |
| Service orchestration: scripts/start-services.sh | Start/stop/status updated to include slack-bridge; it is started only when both SLACK_BOT_TOKEN and SLACK_APP_TOKEN are set. Status reports Slack bridge as “bridge running” or “not started (missing tokens)”; stop explicitly stops slack-bridge when applicable. |
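
The per-channel cooldown and busy gate mentioned for scripts/slack-bridge.js are not shown in this conversation. One way such a gate could be structured is sketched below; the 5-second cooldown, the `createChannelGate` name, and the return values are all assumptions made for illustration.

```javascript
// Hypothetical gate: at most one in-flight agent call per channel, plus a
// minimum gap between accepted messages. The 5-second value is assumed.
const COOLDOWN_MS = 5000;

function createChannelGate(cooldownMs = COOLDOWN_MS, now = Date.now) {
  const lastAccepted = new Map(); // channelId -> time of last accepted message
  const busy = new Set(); // channelIds with an agent call in flight

  return {
    // Returns "ok" and marks the channel busy, or the reason for rejecting.
    tryAcquire(channelId) {
      if (busy.has(channelId)) return "busy";
      const last = lastAccepted.get(channelId);
      if (last !== undefined && now() - last < cooldownMs) return "cooldown";
      lastAccepted.set(channelId, now());
      busy.add(channelId);
      return "ok";
    },
    // Call from a finally block once the agent call settles.
    release(channelId) {
      busy.delete(channelId);
    },
  };
}
```

Injecting `now` keeps the gate testable with a fake clock; in the bridge itself the default `Date.now` would be used.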

Sequence Diagram

sequenceDiagram
    participant User as Slack User
    participant Slack as Slack API
    participant Bridge as slack-bridge.js
    participant SSH as SSH Client
    participant Sandbox as Nemoclaw Sandbox
    participant Agent as OpenClaw Agent

    User->>Slack: send message / mention
    Slack->>Bridge: Socket Mode event
    Bridge->>Bridge: validate tokens, sandbox, cooldown/gate
    Bridge->>SSH: write temp ssh-config & spawn ssh
    SSH->>Sandbox: run `nemoclaw-start openclaw agent --session-id ...`
    Sandbox->>Agent: deliver message input
    Agent-->>Sandbox: produce stdout/stderr
    Sandbox-->>SSH: return process output
    SSH-->>Bridge: relay stdout/stderr
    Bridge->>Bridge: filter/format, split into 3000-char chunks
    Bridge->>Slack: chat.postMessage (threaded, chunked)
    Slack->>User: display response
    Bridge->>SSH: cleanup temp config & close

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I nibble keys and hop the stream,

A tiny bridge unites Slack and dream.
Tokens tucked in sandbox nests so neat,
Agents answer, threaded and complete,
I clean configs, then twitch my feet.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main feature addition: Slack Socket Mode integration via a new bridge, which is the primary objective of this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (5)
scripts/slack-bridge.js (5)

225-228: Reconnection logic lacks exponential backoff.

The fixed 3-second delay on WebSocket close could cause rapid reconnection attempts if there's an authentication or server issue, potentially triggering rate limits.

♻️ Suggested improvement with exponential backoff
+let reconnectAttempts = 0;
+const MAX_RECONNECT_DELAY = 60000;

 ws.addEventListener("close", () => {
   console.log("Socket Mode connection closed. Reconnecting...");
-  setTimeout(connectSocketMode, 3000);
+  const delay = Math.min(3000 * Math.pow(2, reconnectAttempts), MAX_RECONNECT_DELAY);
+  reconnectAttempts++;
+  setTimeout(connectSocketMode, delay);
 });

+// Reset reconnect attempts on successful connection
+ws.addEventListener("open", () => {
+  reconnectAttempts = 0;
   console.log("Connected to Slack Socket Mode.");
 });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/slack-bridge.js` around lines 225 - 228, The reconnection on
WebSocket close currently uses a fixed 3s delay (ws.addEventListener("close"
...) calling setTimeout(connectSocketMode, 3000)), which can cause rapid retries
and rate limits; replace the fixed timeout with an exponential backoff strategy
inside the close handler that uses a backoff counter (e.g., reconnectAttempt or
backoffMs) keyed to connectSocketMode, doubling the delay on each failure up to
a maximum (and applying jitter), reset the counter on a successful connection
(e.g., in the ws.open or successful handshake path), and ensure you cap the
delay and optionally log attempts to avoid infinite tight-loop reconnects.
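
The prompt above also asks for jitter, which the suggested diff does not include. A minimal sketch of a capped backoff with "equal jitter" follows; the `reconnectDelay` helper and the choice of jitter scheme are assumptions, while the 3000 ms base and 60000 ms cap come from the diff.

```javascript
const BASE_DELAY_MS = 3000; // matches the original fixed delay
const MAX_RECONNECT_DELAY = 60000; // cap taken from the suggested diff

// Hypothetical helper: doubles the ceiling per attempt up to the cap, then
// picks a delay in the upper half of that range so that several clients
// do not retry in lockstep.
function reconnectDelay(attempt, random = Math.random) {
  const ceiling = Math.min(BASE_DELAY_MS * 2 ** attempt, MAX_RECONNECT_DELAY);
  return Math.round(ceiling / 2 + random() * (ceiling / 2));
}
```

In the close handler this would be used as `setTimeout(connectSocketMode, reconnectDelay(reconnectAttempts++))`, with `reconnectAttempts` reset to 0 in the open handler.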

95-97: Inline require("fs") calls — import once at the top.

The file system module is required inline multiple times. Import it once at the top for consistency and minor performance improvement.

♻️ Proposed fix
 const https = require("https");
+const fs = require("fs");
 const { execFileSync, spawn } = require("child_process");

 // Then replace inline requires:
-    const confDir = require("fs").mkdtempSync("/tmp/nemoclaw-slack-ssh-");
+    const confDir = fs.mkdtempSync("/tmp/nemoclaw-slack-ssh-");
     const confPath = `${confDir}/config`;
-    require("fs").writeFileSync(confPath, sshConfig, { mode: 0o600 });
+    fs.writeFileSync(confPath, sshConfig, { mode: 0o600 });

     // ...

-      try { require("fs").unlinkSync(confPath); require("fs").rmdirSync(confDir); } catch { /* ignored */ }
+      try { fs.unlinkSync(confPath); fs.rmdirSync(confDir); } catch { /* ignored */ }

Also applies to: 114-114

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/slack-bridge.js` around lines 95 - 97, Replace inline require("fs")
calls by importing fs once at the top and using that reference throughout: add a
single top-level binding (e.g., const fs = require('fs')) and update usages of
require("fs").mkdtempSync and require("fs").writeFileSync to fs.mkdtempSync and
fs.writeFileSync respectively (also update the other occurrence around the
ssh-bridge code, e.g., the call referenced at line ~114). This keeps the unique
symbols confDir and confPath unchanged while consolidating the fs import for
consistency and minor performance improvement.

39-42: activeSessions Map is unused — message history is not being tracked.

The comment on line 39 states this is for channelId → message history, but the Map is only cleared on reset command (line 191). No message history is ever stored, so multi-turn context is not preserved across agent calls.

Either implement the session tracking or remove the dead code.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/slack-bridge.js` around lines 39 - 42, The activeSessions Map
declared as activeSessions is never used so multi-turn context isn't preserved;
either implement session tracking by storing and updating message history into
activeSessions keyed by channelId whenever a message is received or sent (and
include that history when calling the agent), and keep the existing reset logic
that clears the Map, or remove the activeSessions declaration and related
comment entirely; locate places where messages are handled (e.g., the message
handler function that sends/receives to the agent and the reset command logic)
and either push each incoming/outgoing message into
activeSessions.get(channelId) (initializing an array if absent) and read that
array when constructing agent calls, or delete activeSessions and its comment if
session state is intentionally unsupported.
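
If the "implement session tracking" option were chosen, the bookkeeping could be as small as the sketch below. The function names and the `MAX_TURNS` bound are hypothetical, and how the stored history would be passed to the agent is left open because the agent's input format is not shown here.

```javascript
// Hypothetical per-channel history store. MAX_TURNS is an assumed bound so
// that a long-lived channel cannot grow the Map without limit.
const MAX_TURNS = 20;
const activeSessions = new Map(); // channelId -> [{ role, text }]

function recordTurn(channelId, role, text) {
  const history = activeSessions.get(channelId) ?? [];
  history.push({ role, text });
  // Keep only the most recent turns.
  activeSessions.set(channelId, history.slice(-MAX_TURNS));
}

function getHistory(channelId) {
  return activeSessions.get(channelId) ?? [];
}

// Called by the reset command.
function resetSession(channelId) {
  activeSessions.delete(channelId);
}
```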

91-97: execFileSync can throw — wrap in try/catch for better error handling.

If openshell sandbox ssh-config fails (e.g., sandbox doesn't exist), the unhandled exception will crash the promise chain without a meaningful error message to the user.

🛡️ Proposed fix
 function runAgentInSandbox(message, sessionId) {
   return new Promise((resolve) => {
-    const sshConfig = execFileSync(OPENSHELL, ["sandbox", "ssh-config", SANDBOX], { encoding: "utf-8" });
+    let sshConfig;
+    try {
+      sshConfig = execFileSync(OPENSHELL, ["sandbox", "ssh-config", SANDBOX], { encoding: "utf-8" });
+    } catch (err) {
+      resolve(`Failed to get SSH config for sandbox '${SANDBOX}': ${err.message}`);
+      return;
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/slack-bridge.js` around lines 91 - 97, The call to execFileSync in
runAgentInSandbox can throw and is currently unhandled; change the Promise
constructor to new Promise((resolve, reject)) and wrap the
execFileSync(OPENSHELL, ["sandbox", "ssh-config", SANDBOX], ...) call (and the
subsequent mkdtempSync/writeFileSync sequence) in a try/catch so any error is
caught and the promise is rejected with the caught error (or a wrapped error
with context). Update error handling to include the thrown error message and
ensure any partially created temp files/dirs are cleaned up in the catch block;
reference function runAgentInSandbox and the execFileSync/ mkdtempSync/
writeFileSync calls to locate the code to modify.
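
The cleanup requirement in the prompt above (remove partially created temp files when a step fails) can be sketched as follows. `writeTempSshConfig` is a hypothetical helper, and `getConfig` stands in for the `execFileSync(OPENSHELL, ["sandbox", "ssh-config", SANDBOX], ...)` call so the example runs without openshell installed.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: writes the SSH config into a fresh private temp
// directory and removes that directory again if any step fails.
function writeTempSshConfig(getConfig) {
  const confDir = fs.mkdtempSync(path.join(os.tmpdir(), "nemoclaw-slack-ssh-"));
  try {
    const confPath = path.join(confDir, "config");
    fs.writeFileSync(confPath, getConfig(), { mode: 0o600 });
    return { confDir, confPath };
  } catch (err) {
    // Do not leave an empty or half-written directory behind.
    fs.rmSync(confDir, { recursive: true, force: true });
    throw new Error(`Failed to prepare SSH config: ${err.message}`);
  }
}
```

The caller can then decide whether to reject the promise or resolve with a user-facing message, as the suggested diff does.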

251-251: Hardcoded model name in banner may be misleading.

The banner displays nvidia/nemotron-3-super-120b-a12b regardless of the actual model configured for the sandbox. This could confuse users about which model is actually serving their requests.

Consider either removing the model line or querying the actual configured model dynamically.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/slack-bridge.js` at line 251, The banner contains a hardcoded model
line (the console.log that prints "  │  Model:   
nvidia/nemotron-3-super-120b-a12b       │") which can be misleading; replace
that static string with a dynamic value from the configured model (e.g., read
the model name from the runtime config, env var, or the variable used to
initialize the sandbox like modelName/config.model) or remove the line entirely;
locate the console.log call in the banner-printing code (search for the exact
hardcoded string) and change it to interpolate the actual model value with a
sensible fallback such as "unknown" if not set.
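
A small sketch of the dynamic banner line, reading `process.env.NEMOCLAW_MODEL` (the variable the author reports using later in this thread) with "unknown" as the fallback. The `bannerLine` helper and the inner width of 50 characters are assumptions; the real banner width may differ.

```javascript
// Sketch: pad or truncate the value to a fixed inner width so the box
// border stays aligned. INNER_WIDTH is an assumed value, not the script's.
const INNER_WIDTH = 50;

function bannerLine(label, value) {
  const text = `  ${label.padEnd(10)}${value}`;
  return `  │${text.slice(0, INNER_WIDTH).padEnd(INNER_WIDTH)}│`;
}

const model = process.env.NEMOCLAW_MODEL || "unknown";
console.log(bannerLine("Model:", model));
```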
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/slack-bridge.js`:
- Around line 102-105: The child process created with spawn("ssh", ...) uses a
non-functional timeout option — modify the code around the spawn call to
implement a manual timeout: after creating proc, start a setTimeout that calls
proc.kill() (and logs an error) after the desired ms, and store the timer id;
also attach listeners on proc ('exit' or 'close' and 'error') to clear the timer
when the process ends and to handle cleanup; ensure you do not rely on the spawn
option "timeout" anymore and update any references to proc to handle termination
and potential partial stdout/stderr.
- Around line 172-173: The inner declaration const event = msg.payload.event
shadows the addEventListener callback's event parameter; rename that inner
variable (e.g., payloadEvent or slackEvent) and update all subsequent references
in the block to use the new name so the callback parameter remains unshadowed
and code clarity is preserved (locate the occurrence where msg.payload.event is
assigned inside the addEventListener handler and replace the identifier and its
uses).
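
The manual timeout described in the first inline comment can be sketched as below. Newer Node.js releases document a `timeout` option for `child_process.spawn`, so whether the option is ignored may depend on the runtime version; the manual timer behaves the same either way. `runWithTimeout` is a hypothetical wrapper, not the function used in the PR.

```javascript
const { spawn } = require("child_process");

// Hypothetical wrapper: runs a command, sends SIGTERM after `timeoutMs`,
// and resolves with whatever output was collected up to that point.
function runWithTimeout(cmd, args, timeoutMs) {
  return new Promise((resolve, reject) => {
    const proc = spawn(cmd, args, { stdio: ["ignore", "pipe", "pipe"] });
    let stdout = "";
    let stderr = "";
    let timedOut = false;

    const timer = setTimeout(() => {
      timedOut = true;
      proc.kill("SIGTERM");
    }, timeoutMs);

    proc.stdout.on("data", (d) => { stdout += d; });
    proc.stderr.on("data", (d) => { stderr += d; });
    // Spawn failures (for example, command not found) arrive here.
    proc.on("error", (err) => { clearTimeout(timer); reject(err); });
    // "close" fires after the process ended and its streams were flushed.
    proc.on("close", (code, signal) => {
      clearTimeout(timer);
      resolve({ code, signal, stdout, stderr, timedOut });
    });
  });
}
```

For the bridge this would be called along the lines of `runWithTimeout("ssh", sshArgs, 120000)`, where `sshArgs` is a placeholder for the actual argument list, and the caller checks `timedOut` before posting a reply.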


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a62adebc-50f8-47ce-9dda-57fa0bde1f6b

📥 Commits

Reviewing files that changed from the base of the PR and between 2cb6ed8 and 0d1d44a.

📒 Files selected for processing (4)
  • bin/lib/onboard.js
  • bin/nemoclaw.js
  • scripts/slack-bridge.js
  • scripts/start-services.sh

@caco26i
Author

caco26i commented Mar 31, 2026

I have reviewed and applied all of CodeRabbit's comments:

  1. Reconnection logic lacks exponential backoff: Implemented exponential backoff (up to 60 seconds) instead of a fixed 3-second delay when attempting to reconnect Socket Mode.
  2. Inline require("fs") calls: Imported fs at the top of the file and used the reference throughout the script instead of inline calls.
  3. activeSessions Map is unused: Removed the activeSessions map and its deletion logic, as it wasn't being used to maintain message history.
  4. execFileSync can throw: Wrapped the execFileSync call in a try/catch block to properly handle errors if fetching the sandbox SSH config fails.
  5. Hardcoded model name in banner: Modified the banner to display the actual model configured in process.env.NEMOCLAW_MODEL or "unknown" if not defined.
  6. spawn does not support the timeout option: Removed the timeout option from spawn and implemented a manual timeout using setTimeout and proc.kill("SIGTERM").
  7. Variable shadowing: Renamed the internal event variable to slackEvent to avoid shadowing the event parameter from the addEventListener callback.

The changes have been pushed to the PR branch.

Contributor

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (4)
scripts/slack-bridge.js (4)

259-285: Consider adding an unhandled rejection handler for production robustness.

The main() call on line 285 handles its own errors via process.exit(1), but unexpected rejections elsewhere (e.g., from the WebSocket message handler) could crash the process silently.

♻️ Proposed addition before main()
+process.on("unhandledRejection", (err) => {
+  console.error("Unhandled rejection:", err);
+});
+
 main();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/slack-bridge.js` around lines 259 - 285, Add a global unhandled
rejection handler to ensure unexpected Promise rejections are logged and
handled: register process.on('unhandledRejection', ...) (and optionally
process.on('uncaughtException', ...)) near the top-level before calling main(),
log the error with context and either exit(1) or perform graceful cleanup; this
will cover async failures coming from connectSocketMode(), main(), or any other
async handlers that would otherwise crash silently.

176-179: reconnectAttempts is referenced before its lexical definition.

The variable reconnectAttempts is used on line 177 but declared on line 242. This works because the open callback executes after line 242, but the source ordering is confusing and could lead to bugs if refactored.

♻️ Proposed fix: move declaration before event listeners
   const ws = new WebSocket(res.url);
+
+  let reconnectAttempts = 0;
+  const MAX_RECONNECT_DELAY = 60000;

   ws.addEventListener("open", () => {
     reconnectAttempts = 0;
     console.log("Connected to Slack Socket Mode.");
   });
   // ... rest of event listeners ...

-  let reconnectAttempts = 0;
-  const MAX_RECONNECT_DELAY = 60000;
-
   ws.addEventListener("close", () => {

Also applies to: 242-250

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/slack-bridge.js` around lines 176 - 179, The variable
reconnectAttempts is referenced in the ws.addEventListener("open") callback
before its lexical declaration; move the reconnectAttempts declaration (and any
related initialization) above the WebSocket event listener registrations (e.g.,
before ws.addEventListener("open") and other ws.addEventListener calls) so that
reconnectAttempts is declared in scope prior to being used; update the file's
top-of-scope area where reconnectAttempts (and its related reconnect/backoff
variables around the current declaration in the reconnect logic) are defined to
ensure event handlers like the "open" callback, "close", and reconnection logic
read/write a declared variable.

75-87: Silent failures on message sending may leave users without responses.

If slackApi("chat.postMessage", ...) fails (e.g., rate limited, network error), the error is swallowed and the user may not see part or all of the response. Consider logging failures or throwing to allow the caller to handle them.

♻️ Proposed improvement
 async function sendMessage(channel, text, thread_ts) {
   const chunks = [];
   for (let i = 0; i < text.length; i += 3000) {
     chunks.push(text.slice(i, i + 3000));
   }
   for (const chunk of chunks) {
-    await slackApi("chat.postMessage", {
+    const res = await slackApi("chat.postMessage", {
       channel,
       text: chunk,
       thread_ts,
     }, BOT_TOKEN);
+    if (!res.ok) {
+      console.error(`Failed to send message to ${channel}: ${res.error}`);
+    }
   }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/slack-bridge.js` around lines 75 - 87, The sendMessage function
currently awaits slackApi("chat.postMessage", ...) for each chunk but does not
handle failures; wrap each slackApi call in a try/catch inside sendMessage, log
the error with context (include channel, thread_ts and the chunk or chunk index)
using the existing logger or console.error, and rethrow the error (or throw a
new Error that preserves the original) so the caller can handle it; reference
sendMessage, slackApi and BOT_TOKEN when locating the change.
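
A sketch of the variant described in the prompt above, where failures are surfaced to the caller with the chunk index attached. To keep it runnable without a network, `slackApi` is passed in as a parameter here, which differs from the PR's module-level function; it is assumed to resolve to Slack's JSON response body, which carries `ok` and, on failure, `error`.

```javascript
// Sketch: send text in 3000-character chunks and stop at the first failure.
async function sendMessage(slackApi, channel, text, thread_ts) {
  for (let i = 0, n = 0; i < text.length; i += 3000, n++) {
    let res;
    try {
      res = await slackApi("chat.postMessage", {
        channel,
        text: text.slice(i, i + 3000),
        thread_ts,
      });
    } catch (err) {
      // Network-level failure: add context and let the caller decide.
      throw new Error(`chunk ${n} to ${channel} failed: ${err.message}`);
    }
    if (!res.ok) {
      // Slack answered, but rejected the request (for example, rate limited).
      throw new Error(`chunk ${n} to ${channel} rejected: ${res.error}`);
    }
  }
}
```

Stopping at the first failed chunk avoids posting a reply with a gap in the middle; whether to retry is left to the caller.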

181-183: JSON.parse may throw on malformed WebSocket messages.

If Slack sends a malformed message (unlikely but possible during protocol changes or network corruption), the unguarded JSON.parse will throw and the event won't be acknowledged, potentially causing message replay.

♻️ Proposed fix
   ws.addEventListener("message", async (event) => {
-    const msg = JSON.parse(event.data);
+    let msg;
+    try {
+      msg = JSON.parse(event.data);
+    } catch (err) {
+      console.error("Failed to parse WebSocket message:", err.message);
+      return;
+    }

     if (msg.type === "hello") return;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/slack-bridge.js` around lines 181 - 183, Wrap the unguarded
JSON.parse(event.data) inside a try/catch in the ws.addEventListener("message",
async (event) => { ... }) handler so malformed WebSocket messages don't throw;
on catch, log the parse error (include event.data), optionally send any required
nack/ack over ws if the Slack protocol expects it, and return early so
downstream code that uses msg is not executed. Ensure you reference the parsed
variable msg and any existing logger (or console.error) when recording the
error.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a026fb97-ff23-4424-b6a1-a4c35515a34c

📥 Commits

Reviewing files that changed from the base of the PR and between 0d1d44a and f44fdce.

📒 Files selected for processing (1)
  • scripts/slack-bridge.js

@caco26i caco26i force-pushed the feat/slack-bridge branch from 8cc28e4 to b9a305e Compare March 31, 2026 00:18
@caco26i
Author

caco26i commented Mar 31, 2026

I have reviewed and applied the new nitpicks from CodeRabbit:

  • Unhandled Rejection Handler: Added process.on("unhandledRejection", ...) before calling main() to gracefully handle any unexpected Promise rejections.
  • Variable Declaration Order: Moved the declaration of reconnectAttempts and MAX_RECONNECT_DELAY above the WebSocket event listeners to ensure they are defined before being referenced in the open callback.
  • Message Sending Error Handling: Wrapped the slackApi("chat.postMessage", ...) call in a try/catch block within sendMessage to log and throw errors if sending a message chunk fails.
  • WebSocket Message Parsing: Wrapped JSON.parse(event.data) in a try/catch block inside the WebSocket message event listener to prevent malformed messages from crashing the handler.

The changes have been pushed to the PR branch.

@caco26i
Author

caco26i commented Mar 31, 2026

For context, here is an analysis of other open PRs that are related to this Slack integration:

  • #1127 (Added discord support and openAI gpt-5-mini): This PR introduces another messaging bridge. Since both PRs likely modify scripts/start-services.sh to orchestrate their respective bridge scripts alongside the Telegram bridge, there might be minor merge conflicts depending on which PR gets merged first. The architectural approach (using a bridge script that communicates via SSH to the sandbox) seems conceptually similar.
  • #1084 (fix(presets): add missing binaries to communication presets): This PR addresses missing binaries in existing communication presets. Currently, our slack.yaml preset only specifies /usr/local/bin/node under binaries. Depending on the outcome of #1084, we may need to revisit slack.yaml to ensure it includes any standard binaries deemed necessary for all communication bridges to function reliably inside the sandbox.

@caco26i
Author

caco26i commented Mar 31, 2026

I noticed there are a couple of open issues related to credential handling in the bridge scripts that also apply to this Slack integration:

Currently, this Slack bridge mirrors the existing Telegram implementation, meaning it also passes the NVIDIA_API_KEY via SSH and relies on environment variables for tokens.

Should we adapt this PR to use the new provider system (perhaps aligning with the work in draft PR #1081), or is it preferable to merge this as-is to establish feature parity with Telegram and address the credential isolation refactoring globally in a separate PR?

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
scripts/start-services.sh (1)

209-213: Slack banner fallback message is slightly misleading.

Line 212 always says “missing tokens,” but a non-running bridge can also be caused by missing NVIDIA_API_KEY or startup failure. Consider broadening the text to avoid false diagnosis.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/start-services.sh` around lines 209 - 213, The Slack bridge fallback
message is too specific; update the else branch that prints the Slack status
(the block using is_running slack-bridge) to use a broader, non-diagnostic
message so it doesn't incorrectly claim "missing tokens." Replace the echo in
the else branch with something like "Slack: not running (check tokens,
NVIDIA_API_KEY, or startup logs)" or similar phrasing to cover missing env vars
or startup failures; keep the is_running slack-bridge check intact and only
change the printed fallback text.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/start-services.sh`:
- Around line 161-165: The Slack bridge startup condition only checks
SLACK_BOT_TOKEN and SLACK_APP_TOKEN but not NVIDIA_API_KEY, which
scripts/slack-bridge.js requires; update the if condition that wraps the
start_service slack-bridge invocation to also require a non-empty NVIDIA_API_KEY
(e.g. add [ -n "${NVIDIA_API_KEY:-}" ] to the && chain) so the bridge is only
started when all three env vars are present.
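
The gate described above lives in Bash, but the same three-variable check can be sketched in JavaScript for illustration. This is a minimal sketch, not the code in scripts/slack-bridge.js: the helper name `missingSlackEnv` and the constant `REQUIRED_SLACK_ENV` are hypothetical, and only the three variable names come from this review.

```javascript
// Variables this review says the Slack bridge needs before it can start.
const REQUIRED_SLACK_ENV = ["SLACK_BOT_TOKEN", "SLACK_APP_TOKEN", "NVIDIA_API_KEY"];

// Hypothetical helper: returns the names of required variables that are
// unset or empty in the given environment object.
function missingSlackEnv(env = process.env) {
  return REQUIRED_SLACK_ENV.filter((name) => !env[name]);
}

// Example: both Slack tokens are set, but the API key is not.
const missing = missingSlackEnv({ SLACK_BOT_TOKEN: "xoxb-example", SLACK_APP_TOKEN: "xapp-example" });
console.log(missing.length === 0 ? "ok to start" : `missing: ${missing.join(", ")}`);
```

Returning the list of missing names, rather than a boolean, would also let a status banner report the actual cause instead of a fixed "missing tokens" message.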


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 93002c13-3c00-48cd-9269-61f2e3e5f51f

📥 Commits

Reviewing files that changed from the base of the PR and between b9a305e and 015bc70.

📒 Files selected for processing (3)
  • bin/lib/onboard.js
  • bin/nemoclaw.js
  • scripts/start-services.sh
✅ Files skipped from review due to trivial changes (2)
  • bin/lib/onboard.js
  • bin/nemoclaw.js

@wscurran wscurran added the enhancement: feature and Integration: Slack labels Apr 1, 2026
@wscurran
Contributor

wscurran commented Apr 1, 2026

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
bin/lib/onboard.js (1)

2171-2174: Add explicit test assertion for SLACK_APP_TOKEN coverage.

The injection of SLACK_APP_TOKEN correctly follows the established pattern for DISCORD_BOT_TOKEN and SLACK_BOT_TOKEN. However, the credential-exposure test (test/credential-exposure.test.js, line 71) asserts on DISCORD_BOT_TOKEN and SLACK_BOT_TOKEN but not on SLACK_APP_TOKEN. The code adds the token to sandboxEnv, which keeps it out of the CLI arguments, so this is not a leak, but an explicit assertion would complete the coverage:

expect(src).not.toMatch(/envArgs\.push\(formatEnvAssignment\("SLACK_APP_TOKEN"/);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/onboard.js` around lines 2171 - 2174, The test suite is missing an
assertion to verify SLACK_APP_TOKEN isn't exposed; update the
credential-exposure test to add an assertion that the generated source does not
push SLACK_APP_TOKEN into envArgs, e.g. add
expect(src).not.toMatch(/envArgs\.push\(formatEnvAssignment\("SLACK_APP_TOKEN"/);
inside the same test that checks DISCORD_BOT_TOKEN and SLACK_BOT_TOKEN so the
credential coverage matches the new sandboxEnv injection for SLACK_APP_TOKEN.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bin/lib/onboard.js`:
- Around line 2175-2178: The sandboxEnv assignment for
NEMOCLAW_OPENCLAW_SLACK_GATEWAY is dead code; either remove the
retrieval/assignment (remove the getCredential call and the
sandboxEnv.NEMOCLAW_OPENCLAW_SLACK_GATEWAY = ... lines) or implement a consumer
that actually uses this variable (e.g., in slack-bridge.js read
process.env.NEMOCLAW_OPENCLAW_SLACK_GATEWAY or call
getCredential("NEMOCLAW_OPENCLAW_SLACK_GATEWAY") inside the SlackBridge
initialization and wire it into the SlackBridge constructor or config); update
or delete the unused constant slackGateway and ensure any config consumers
reference sandboxEnv.NEMOCLAW_OPENCLAW_SLACK_GATEWAY consistently.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c86ee815-b792-4b98-b5a7-8c89384589d4

📥 Commits

Reviewing files that changed from the base of the PR and between 015bc70 and 0f4d32e.

📒 Files selected for processing (2)
  • bin/lib/onboard.js
  • bin/nemoclaw.js
🚧 Files skipped from review as they are similar to previous changes (1)
  • bin/nemoclaw.js

@caco26i
Author

caco26i commented Apr 1, 2026

cc @wscurran @snapydziuba

I have refactored the API key validation logic based on the recent feedback:

  1. Centralized API Keys: Moved the list of supported API keys (NVIDIA, OpenAI, Anthropic, Gemini, etc.) to a single source of truth in bin/lib/credentials.js.
  2. Dynamic Sandbox Injection: Both slack-bridge.js and telegram-bridge.js now dynamically export all available supported API keys to the sandbox, rather than hardcoding NVIDIA_API_KEY.
  3. Delegated Validation: Removed the API key validation from the Bash script (start-services.sh). The Bash script now only checks for the bot tokens (Slack/Telegram) before starting the Node.js processes. The bridge scripts themselves now handle validating that at least one supported API key is present, which aligns with the approach taken in other recent PRs and keeps the provider logic out of Bash.
  4. Updated Tests: Adjusted service-env.test.js and cli.test.js to reflect the new generalized API key handling and the removal of the Bash-level API key gate.

These changes have been pushed and all tests are passing.
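
For readers following along, the first three points can be sketched roughly as below. This is an illustration under stated assumptions, not the code in bin/lib/credentials.js or the bridge scripts: the names `SUPPORTED_API_KEYS`, `collectApiKeys`, and `requireAnyApiKey` are hypothetical, and every environment variable name other than NVIDIA_API_KEY is an assumed spelling for the providers listed above.

```javascript
// Hypothetical single source of truth. The real list lives in
// bin/lib/credentials.js; apart from NVIDIA_API_KEY, these names are assumed.
const SUPPORTED_API_KEYS = [
  "NVIDIA_API_KEY",
  "OPENAI_API_KEY",
  "ANTHROPIC_API_KEY",
  "GEMINI_API_KEY",
];

// Collect every supported key that is set to a non-empty value, so a bridge
// can forward all of them instead of hardcoding one provider.
function collectApiKeys(env = process.env) {
  const found = {};
  for (const name of SUPPORTED_API_KEYS) {
    const value = env[name];
    if (typeof value === "string" && value.trim() !== "") {
      found[name] = value;
    }
  }
  return found;
}

// Bridge-side validation: fail fast when no supported key is present.
function requireAnyApiKey(env = process.env) {
  const keys = collectApiKeys(env);
  if (Object.keys(keys).length === 0) {
    throw new Error(`No supported API key found. Set one of: ${SUPPORTED_API_KEYS.join(", ")}`);
  }
  return keys;
}
```

With this shape, the Bash script only needs to check the bot tokens, and adding a provider means touching one list rather than each bridge.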

@caco26i caco26i requested a review from snapydziuba April 2, 2026 00:02