Don't use SCCACHE_MEMCACHED_ENDPOINT if it is not valid #1406

Open
wendell-hom wants to merge 6 commits into main from whom/check-memcached-endpoint

@wendell-hom
Contributor

@wendell-hom wendell-hom commented Feb 15, 2026

Don't pass the sccache memcached endpoint environment variable if the server is not reachable; otherwise, compilation will fail.

Summary by CodeRabbit

  • Bug Fixes
    • Added runtime validation for the cache endpoint so unreachable endpoints are no longer forwarded to the environment, preventing misconfiguration.
    • Emits a warning and falls back to local caching when the endpoint cannot be reached, improving reliability and startup behavior.

Copilot AI review requested due to automatic review settings February 15, 2026 03:13
@wendell-hom wendell-hom requested review from agirault and wyli and removed request for Copilot February 15, 2026 03:13
@greptile-apps
Contributor

greptile-apps Bot commented Feb 15, 2026

Greptile Summary

This PR adds runtime validation of the SCCACHE_MEMCACHED_ENDPOINT environment variable before it is forwarded to the Docker container. Without this fix, an unreachable or misconfigured memcached endpoint would be passed as-is into the container, causing sccache — and therefore the entire compilation — to fail.

Key changes:

  • Adds import socket to support TCP connectivity checking.
  • Introduces is_valid_endpoint(), which parses the host:port from SCCACHE_MEMCACHED_ENDPOINT, attempts a 5-second TCP connection, and returns True only if it succeeds. Failures (parse errors, timeouts, connection refused) are caught broadly and emit a warning.
  • Updates the sccache_keys forwarding loop: SCCACHE_MEMCACHED_ENDPOINT is only included in the Docker -e arguments if is_valid_endpoint() returns True; SCCACHE_DIR is still always excluded (overridden by a fixed container path); all other SCCACHE_* keys are forwarded unconditionally.
  • The boolean condition correctly uses short-circuit evaluation so is_valid_endpoint() is called at most once per invocation (only when SCCACHE_MEMCACHED_ENDPOINT is present in the environment).

The overall approach is sound. One minor observation: is_valid_endpoint() does not use any instance state and could be a @staticmethod, but this is purely stylistic and does not affect correctness.
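
The short-circuit behaviour described above can be illustrated with a small standalone sketch. This is not the code from utilities/cli/container.py: `forwarded_keys` and the `probe` callable are hypothetical names that stand in for the loop in get_environment_args() and for is_valid_endpoint(), and the call counter exists only to show how often the probe runs.

```python
from typing import Callable, Iterable, List


def forwarded_keys(sccache_keys: Iterable[str], probe: Callable[[], bool]) -> List[str]:
    """Return the SCCACHE_* keys that would be passed to Docker with -e.

    Hypothetical stand-in for the loop in get_environment_args(); `probe`
    plays the role of is_valid_endpoint().
    """
    selected = []
    for key in sccache_keys:
        # `and` skips the right-hand side entirely for SCCACHE_DIR, and
        # `or` only calls probe() when key IS the memcached endpoint.
        if key != "SCCACHE_DIR" and (key != "SCCACHE_MEMCACHED_ENDPOINT" or probe()):
            selected.append(key)
    return selected


calls = 0


def unreachable_probe() -> bool:
    """Fake probe that counts its calls and reports the endpoint as down."""
    global calls
    calls += 1
    return False


keys = ["SCCACHE_DIR", "SCCACHE_BUCKET", "SCCACHE_MEMCACHED_ENDPOINT"]
result = forwarded_keys(keys, unreachable_probe)
print(result, calls)
```

With the fake probe reporting failure, only the unrelated key is kept and the probe runs a single time, which matches the "at most once per invocation" claim as long as the key list contains the endpoint name once.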

Confidence Score: 4/5

  • This PR is safe to merge — it fixes a real compilation-breaking bug with a correct and well-contained implementation.
  • The boolean condition is logically correct (short-circuit evaluation ensures is_valid_endpoint() is called at most once and only for the right key), import socket is present, info/warn are already imported, and the broad except Exception safely covers all parse and network failure paths. Previously-flagged issues (missing import, operator precedence, parsing robustness) have all been addressed in this PR.
  • No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| utilities/cli/container.py | Adds is_valid_endpoint() method to verify TCP reachability of SCCACHE_MEMCACHED_ENDPOINT before forwarding it into the container, and updates the loop condition in get_environment_args() to skip the variable when the endpoint is unreachable. Includes the necessary import socket. Logic and operator precedence are correct. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[get_environment_args called] --> B[Build sccache_keys from os.environ]
    B --> C{enable_sccache?}
    C -- No --> D[Warn if sccache keys present]
    C -- Yes --> E[Forward HOLOHUB_ENABLE_SCCACHE + SCCACHE_DIR override]
    E --> F[Loop over sccache_keys]
    F --> G{k == SCCACHE_DIR?}
    G -- Yes --> H[Skip always]
    G -- No --> I{k == SCCACHE_MEMCACHED_ENDPOINT?}
    I -- No --> J[Forward key unconditionally]
    I -- Yes --> K[Call is_valid_endpoint]
    K --> L[Parse host:port via rsplit]
    L --> M{Parse OK?}
    M -- No --> N[except Exception: warn + return False]
    M -- Yes --> O[socket.create_connection timeout=5s]
    O --> P{Connection OK?}
    P -- Yes --> Q[info log + return True]
    P -- No --> N
    Q --> J
    N --> H

Last reviewed commit: 996e865

Contributor

@greptile-apps greptile-apps Bot left a comment

1 file reviewed, 3 comments

Comment thread utilities/cli/container.py Outdated
Comment thread utilities/cli/container.py
@greptile-apps
Contributor

greptile-apps Bot commented Feb 15, 2026

Additional Comments (1)

utilities/cli/container.py
missing socket import

import argparse
import glob
import os
import re
import shlex
import shutil
import socket
import stat
import subprocess
import sys
import tempfile
from pathlib import Path

@coderabbitai
Contributor

coderabbitai Bot commented Feb 15, 2026

Walkthrough

Added is_valid_endpoint() to HoloHubContainer to validate SCCACHE_MEMCACHED_ENDPOINT via a socket connection; get_environment_args() now forwards SCCACHE_MEMCACHED_ENDPOINT only when that check succeeds.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Endpoint validation & env forwarding (utilities/cli/container.py) | Added is_valid_endpoint(self) -> bool which attempts a socket connection to SCCACHE_MEMCACHED_ENDPOINT and logs success/warning. Updated get_environment_args() to forward SCCACHE_MEMCACHED_ENDPOINT only if the endpoint is reachable. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 2
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly and clearly summarizes the main change: preventing use of SCCACHE_MEMCACHED_ENDPOINT when invalid, which matches the core objective of the PR. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@utilities/cli/container.py`:
- Around line 729-748: The is_valid_endpoint method references
socket.create_connection but the socket module is not imported; add "import
socket" to the top-level stdlib imports to fix the F821 lint error, and while
here, tighten the broad except in is_valid_endpoint (referencing
is_valid_endpoint, endpoint, host, port_str) to except (ValueError, OSError) so
only malformed port and connection errors are caught instead of swallowing all
exceptions.
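
A minimal sketch of what the narrowed handler could look like, written as a free function so it runs on its own. `endpoint_reachable` is a hypothetical name, not the method in the PR, and the explicit host and port range check is an added assumption so that out-of-range ports surface as ValueError instead of reaching the socket layer.

```python
import socket


def endpoint_reachable(endpoint: str, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to 'host:port' succeeds.

    Hypothetical free-function variant of is_valid_endpoint(); only the
    expected failure modes are caught, anything else propagates.
    """
    try:
        # ValueError: no ":" to unpack into two names, or a non-numeric port.
        host, port_str = endpoint.rsplit(":", 1)
        port = int(port_str)
        if not host or not 0 < port < 65536:
            raise ValueError(f"bad host or port in {endpoint!r}")
        # OSError covers refused connections, timeouts and name-resolution
        # failures (ConnectionRefusedError, socket.timeout, socket.gaierror).
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except (ValueError, OSError):
        return False


if __name__ == "__main__":
    # Demo against a throwaway listener on the loopback interface.
    with socket.socket() as server:
        server.bind(("127.0.0.1", 0))
        server.listen(1)
        listening_port = server.getsockname()[1]
        print(endpoint_reachable(f"127.0.0.1:{listening_port}", timeout=1.0))
    print(endpoint_reachable("not-an-endpoint"))
```

Because every socket-level error raised by the connection attempt derives from OSError, the two-exception tuple keeps the existing fallback behaviour while letting programming errors such as a NameError surface.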

Copilot AI review requested due to automatic review settings February 15, 2026 03:23
@wendell-hom wendell-hom force-pushed the whom/check-memcached-endpoint branch from 99b53dc to 87595f5 Compare February 15, 2026 03:23
Contributor

@greptile-apps greptile-apps Bot left a comment

1 file reviewed, 1 comment

Comment thread utilities/cli/container.py
Contributor

Copilot AI left a comment

Pull request overview

This pull request adds validation for the SCCACHE_MEMCACHED_ENDPOINT environment variable before passing it to containers. The change prevents build failures when the memcached endpoint is configured but not reachable by checking endpoint connectivity before forwarding the environment variable.

Changes:

  • Added is_valid_endpoint() method to validate memcached endpoint reachability via socket connection
  • Modified environment variable forwarding logic to conditionally include SCCACHE_MEMCACHED_ENDPOINT based on validation

Comment thread utilities/cli/container.py
Comment thread utilities/cli/container.py
Comment thread utilities/cli/container.py Outdated
Comment thread utilities/cli/container.py Outdated
Comment thread utilities/cli/container.py
@wendell-hom wendell-hom force-pushed the whom/check-memcached-endpoint branch from 87595f5 to 0874a50 Compare February 15, 2026 03:27
Contributor

@greptile-apps greptile-apps Bot left a comment

1 file reviewed, no comments

Contributor

@greptile-apps greptile-apps Bot left a comment

1 file reviewed, no comments

@wendell-hom wendell-hom force-pushed the whom/check-memcached-endpoint branch from 0a97e30 to e873ba3 Compare February 15, 2026 03:38
Comment thread utilities/cli/container.py Outdated
             # Forward other SCCACHE_* environment variables present on host
             for k in sccache_keys:
-                if k != "SCCACHE_DIR":
+                if (k != "SCCACHE_DIR") and (k != "SCCACHE_MEMCACHED_ENDPOINT" or self.is_valid_endpoint()):
Contributor

@wyli wyli Feb 16, 2026

looks good to me, minor suggestions:

  • add the env var to the printing util
    "HOLOHUB_ENABLE_SCCACHE",
  • add usage to readme:
    - **`HOLOHUB_ENABLE_SCCACHE`**: Defaults to `false`. Set to `true` to enable rapids-sccache for the build. You can configure sccache with `SCCACHE_*` environment variables per the [sccache documentation](https://github.com/rapidsai/sccache/tree/rapids/docs). Use `--extra-scripts sccache` to install sccache in the container image (e.g., `./holohub build-container --extra-scripts sccache`).

@bhashemian bhashemian added the Action Required by Author An action is required by author to proceed with review and approval. label Feb 17, 2026
@bhashemian
Member

@wendell-hom what do you think about Wenqi's comment? Could you please respond? Thanks
#1406 (comment)

Contributor

@greptile-apps greptile-apps Bot left a comment

1 file reviewed, no comments

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@utilities/cli/container.py`:
- Line 782: The long conditional is exceeding Black's 88-char limit; split the
expression across lines and/or parenthesize it to wrap before 88 chars (e.g.,
break after the first comparison and place the second comparison on the next
line), keeping the same logic that checks k != "SCCACHE_DIR" and (k !=
"SCCACHE_MEMCACHED_ENDPOINT" or self.is_valid_endpoint()); update the line in
utilities/cli/container.py where that condition appears so it is
Black-compliant, then run the formatter (black or ./holohub lint --fix) to
ensure styling passes.
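
One way to wrap the condition so that no line exceeds 88 characters is sketched below. `should_forward` and its `memcached_ok` parameter are hypothetical names chosen for this illustration; in the PR the expression sits inline in the loop and calls self.is_valid_endpoint() directly.

```python
def should_forward(key: str, memcached_ok: bool) -> bool:
    """Hypothetical helper mirroring the condition discussed above.

    SCCACHE_DIR is never forwarded; SCCACHE_MEMCACHED_ENDPOINT is forwarded
    only when the endpoint check passed; every other key is forwarded.
    """
    return key != "SCCACHE_DIR" and (
        key != "SCCACHE_MEMCACHED_ENDPOINT" or memcached_ok
    )


for key in ("SCCACHE_DIR", "SCCACHE_MEMCACHED_ENDPOINT", "SCCACHE_BUCKET"):
    print(key, should_forward(key, memcached_ok=False))
```

Breaking inside the parentheses keeps the logic identical to the single-line version while giving the formatter a natural split point.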

---

Duplicate comments:
In `@utilities/cli/container.py`:
- Around line 730-749: The is_valid_endpoint method uses a broad except
Exception; narrow it to catch the realistic failure modes (catch ValueError for
bad split/port parsing and OSError for connection failures) and handle them
similarly to the current behavior; update the except block(s) in
is_valid_endpoint to catch ValueError and OSError (using an exception variable
if you want to include details in the warn message) and leave other exceptions
to propagate.

Comment thread utilities/cli/container.py Outdated
                host, port_str = endpoint.rsplit(":", 1)
                port = int(port_str)

                with socket.create_connection((host, port), timeout=5):
Contributor

Hardcoded 5-second blocking timeout

The 5-second timeout is hardcoded and applied synchronously, so a container launch with an unresponsive SCCACHE_MEMCACHED_ENDPOINT can block for up to 5 seconds per connection attempt before showing the warning (a refused connection fails immediately; the full wait occurs when the host does not answer). Consider making the timeout configurable or reducing it (e.g., 1–2 seconds), since a user who misconfigured the endpoint will experience a noticeable pause on every invocation.

Suggested change
-                with socket.create_connection((host, port), timeout=5):
+                with socket.create_connection((host, port), timeout=2):
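
If the timeout were made configurable instead of only shortened, one possible shape is sketched below. The environment variable name `HOLOHUB_MEMCACHED_PROBE_TIMEOUT` and the helper `probe_timeout` are hypothetical and do not exist in the PR; the 2-second default follows the suggestion above.

```python
import math
import os
from typing import Mapping, Optional

# Hypothetical variable name, used only for this illustration.
TIMEOUT_ENV_VAR = "HOLOHUB_MEMCACHED_PROBE_TIMEOUT"
DEFAULT_TIMEOUT_SECONDS = 2.0


def probe_timeout(env: Optional[Mapping[str, str]] = None) -> float:
    """Read the probe timeout in seconds, falling back to the default."""
    env = os.environ if env is None else env
    raw = env.get(TIMEOUT_ENV_VAR)
    if raw is None:
        return DEFAULT_TIMEOUT_SECONDS
    try:
        value = float(raw)
    except ValueError:
        # Not a number at all: ignore the override.
        return DEFAULT_TIMEOUT_SECONDS
    # Reject zero, negative, NaN and infinite values.
    if not math.isfinite(value) or value <= 0:
        return DEFAULT_TIMEOUT_SECONDS
    return value


print(probe_timeout({TIMEOUT_ENV_VAR: "0.5"}))
```

The returned value could then be passed as the `timeout` argument of socket.create_connection(); invalid overrides silently fall back to the default so a typo in the variable cannot break the launch.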

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
utilities/cli/container.py (1)

782-783: Lift the memcached validation out of this predicate.

Line 782 hides self.is_valid_endpoint() inside what looks like simple env-var filtering. Precomputing it once before the loop keeps the control flow explicit and avoids burying I/O in a boolean expression.

♻️ Refactor sketch
         if enable_sccache:
             # Forward HOLOHUB_ENABLE_SCCACHE to enable launcher before cmake build
             args.extend(["-e", "HOLOHUB_ENABLE_SCCACHE"])
             # Always set SCCACHE_DIR inside container to mounted path
             args.extend(["-e", f"SCCACHE_DIR={SCCACHE_CONTAINER_DIR}"])
+            memcached_endpoint_valid = (
+                self.is_valid_endpoint()
+                if "SCCACHE_MEMCACHED_ENDPOINT" in sccache_keys
+                else False
+            )
             # Forward other SCCACHE_* environment variables present on host
             for k in sccache_keys:
-                if (k != "SCCACHE_DIR") and (k != "SCCACHE_MEMCACHED_ENDPOINT" or self.is_valid_endpoint()):
+                if k == "SCCACHE_DIR":
+                    continue
+                if k == "SCCACHE_MEMCACHED_ENDPOINT" and not memcached_endpoint_valid:
+                    continue
                 args.extend(["-e", k])

As per coding guidelines, "All code must adhere to Holoscan SDK coding standards for style compliance, descriptive naming, minimal abbreviations, inline documentation, and error handling".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@utilities/cli/container.py` around lines 782 - 783, The current env-var
filtering hides the call to self.is_valid_endpoint() inside the loop predicate,
so network I/O happens as a side effect of a boolean expression (short-circuit
evaluation limits the call to the SCCACHE_MEMCACHED_ENDPOINT iteration);
refactor by calling self.is_valid_endpoint() once before the loop (e.g.,
memcached_valid = self.is_valid_endpoint()) and then replace the compound
condition with a simple check using memcached_valid so the loop reads: if k !=
"SCCACHE_DIR" and (k != "SCCACHE_MEMCACHED_ENDPOINT" or memcached_valid):
args.extend(["-e", k]); this moves the I/O out of the boolean expression and
makes control flow explicit.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 70b3f248-89f4-40a2-b5e4-c1f4d3599a4e

📥 Commits

Reviewing files that changed from the base of the PR and between baf0282 and 0c9f6bc.

📒 Files selected for processing (1)
  • utilities/cli/container.py

Comment on lines +724 to +743
    def is_valid_endpoint(self) -> bool:
        """Check if SCCACHE_MEMCACHED_ENDPOINT is valid"""
        endpoint = os.environ.get("SCCACHE_MEMCACHED_ENDPOINT")

        if endpoint:
            try:
                host, port_str = endpoint.rsplit(":", 1)
                port = int(port_str)

                with socket.create_connection((host, port), timeout=5):
                    info(f" > Using memcached endpoint {endpoint}")
                    return True

            except Exception:
                warn(
                    f" > Memcached endpoint {endpoint} is not reachable, "
                    "falling back to local caching."
                )

        return False
Contributor

⚠️ Potential issue | 🟠 Major

Don't probe the memcached server in dry-run mode.

Line 733 opens a real TCP connection during argument construction. That makes dry-run mode depend on host reachability and can add a 5-second stall just to print the Docker command.

💡 Suggested direction
     def is_valid_endpoint(self) -> bool:
         """Check if SCCACHE_MEMCACHED_ENDPOINT is valid"""
         endpoint = os.environ.get("SCCACHE_MEMCACHED_ENDPOINT")
+        if self.dryrun:
+            return bool(endpoint)
 
         if endpoint:
             try:
                 host, port_str = endpoint.rsplit(":", 1)
                 port = int(port_str)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
     def is_valid_endpoint(self) -> bool:
         """Check if SCCACHE_MEMCACHED_ENDPOINT is valid"""
         endpoint = os.environ.get("SCCACHE_MEMCACHED_ENDPOINT")
+        if self.dryrun:
+            return bool(endpoint)
         if endpoint:
             try:
                 host, port_str = endpoint.rsplit(":", 1)
                 port = int(port_str)
                 with socket.create_connection((host, port), timeout=5):
                     info(f" > Using memcached endpoint {endpoint}")
                     return True
             except Exception:
                 warn(
                     f" > Memcached endpoint {endpoint} is not reachable, "
                     "falling back to local caching."
                 )
         return False
🧰 Tools
🪛 Ruff (0.15.4)

[warning] 737-737: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@utilities/cli/container.py` around lines 724 - 743, The is_valid_endpoint
function performs a real TCP probe (socket.create_connection) which must be
skipped in dry-run; change is_valid_endpoint to accept a dry_run flag (e.g., def
is_valid_endpoint(self, dry_run: bool = False)) and if dry_run is True, avoid
creating a socket and instead validate only the endpoint format (split on ":"
and int(port)) and log or info that probing was skipped; update all call sites
that construct the Docker/CLI command to pass the current dry-run state (e.g.,
self.args.dry_run or parser.dry_run) into is_valid_endpoint so dry-run printing
never does network I/O.

Contributor

@coderabbitai coderabbitai Bot left a comment

♻️ Duplicate comments (1)
utilities/cli/container.py (1)

724-743: ⚠️ Potential issue | 🟠 Major

Keep dry-run side-effect free.

run() still reaches this helper while building the command, so --dry-run opens a real TCP connection here and can stall for 5 seconds just to print Docker args. Please thread the dry-run state into is_valid_endpoint() and skip the network probe in dry-run mode, while still validating the host:port format locally.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@utilities/cli/container.py` around lines 724 - 743, The is_valid_endpoint
function currently performs a real network probe which causes side effects
during --dry-run; change its signature (or add an optional parameter) to accept
a dry_run boolean from run() and when dry_run is True only validate the
SCCACHE_MEMCACHED_ENDPOINT format (split on ":" and ensure port is an int and
host is non-empty) but do not call socket.create_connection; when dry_run is
False keep the existing reachability probe and logging behavior. Update callers
(e.g., run()) to pass the dry-run state into is_valid_endpoint so building the
command remains side-effect free in dry-run mode.
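
The behaviour requested here (format check always, network probe only outside dry-run) might look roughly like the following. This is a sketch under assumptions: `parse_endpoint` and `endpoint_usable` are hypothetical free functions, and how the dry-run flag reaches the real method (attribute or parameter) is left open by the thread.

```python
import socket
from typing import Tuple


def parse_endpoint(endpoint: str) -> Tuple[str, int]:
    """Split 'host:port'; raise ValueError if the format is wrong."""
    host, port_str = endpoint.rsplit(":", 1)
    port = int(port_str)
    if not host or not 0 < port < 65536:
        raise ValueError(f"invalid memcached endpoint: {endpoint!r}")
    return host, port


def endpoint_usable(endpoint: str, dry_run: bool = False, timeout: float = 5.0) -> bool:
    """Check the format always; open a TCP connection only when not in dry-run."""
    try:
        host, port = parse_endpoint(endpoint)
        if dry_run:
            # Dry-run: no socket is created, the format check is the only test.
            return True
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except (ValueError, OSError):
        return False


print(endpoint_usable("cache.example.invalid:11211", dry_run=True))
print(endpoint_usable("missing-port", dry_run=True))
```

In dry-run mode a well-formed endpoint is accepted even though the placeholder host cannot be resolved, so printing the Docker command never waits on the network.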

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1dfb4c7f-971c-49c8-81f2-7646f6dada88

📥 Commits

Reviewing files that changed from the base of the PR and between 0c9f6bc and 3b542c4.

📒 Files selected for processing (1)
  • utilities/cli/container.py

@bhashemian
Member

@wendell-hom Could you please address the comments on this PR? Thanks.

Labels

Action Required by Author An action is required by author to proceed with review and approval.

5 participants