
feat: isolation serialization, CUDA IPC hardening, and CUDA wheel resolution (v0.9.2)#7

Merged
pollockjj merged 4 commits into Comfy-Org:main from pollockjj:pr/0.9.2-isolation-support on Mar 12, 2026

Conversation


pollockjj (Collaborator) commented Mar 12, 2026

Summary

  • CUDA wheel resolution: Resolves runtime-matched CUDA wheels from custom indexes, ensuring isolated venvs get the correct torch build for the host GPU
  • Isolation serialization improvements: Extended serialization registry to support custom node types (File3D, PLY, NPZ, VIDEO) required for ComfyUI custom node isolation
  • CUDA IPC hardening: Fixed SIGABRT on process exit caused by purge_orphan_sender_shm_files racing the C++ RefcountedMapAllocator destructor — SIGKILL'd child peers leave the IPC refcount file at count > 1, so the host's C++ close() decrements but does not unlink; purge_orphan then unlinks the file while the C++ mmap is still open, causing a double-unlink → ENOENT → SIGABRT
  • Unit test coverage: Serialization registry, transport limits, and deserialization guard
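
For illustration, a sketch of the shape an extension config might take to opt into CUDA wheel resolution. The field names mirror the unit tests in this PR; the authoritative schema is CUDAWheelConfig in pyisolate/config.py, so treat this as a hedged example rather than the canonical API:

config = {
    "dependencies": ["flash-attn>=1.0"],
    "share_torch": True,
    "cuda_wheels": {
        "index_url": "https://example.invalid/cuda-wheels/",  # custom simple index
        "packages": ["flash-attn"],  # only these names are rewritten to wheel URLs
        "package_map": {},  # assumed: optional requirement-name overrides
    },
}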

Test plan

  • All unit tests pass: pytest tests/test_*.py --ignore=tests/test_sandbox_integration.py
  • CUDA IPC integration tests pass with no abort: pytest tests/integration_v2/test_tensors.py
  • Quality gates: ruff check, ruff format --check, mypy pyisolate
  • Full CI matrix validated (38/38 pass)

…dexes

Add opt-in CUDA wheel resolution for isolated extensions based on the host
Python/torch/CUDA tuple, rewrite selected requirements to direct wheel URLs,
and include resolver inputs in dependency cache invalidation.

Also add resolver tests and clean up benchmark files so repo checks pass.
Copilot AI review requested due to automatic review settings March 12, 2026 02:10
@socket-security

Review the following changes in direct dependencies.

Diff  | Package                  | Supply Chain Security | Vulnerability | Quality | Maintenance | License
Added | packaging@26.0           | 99                    | 100           | 100     | 100         | 100
Added | typing-extensions@4.15.0 | 100                   | 100           | 100     | 100         | 100



coderabbitai bot commented Mar 12, 2026

📝 Walkthrough

This pull request introduces a CUDA wheel resolution subsystem for custom PyTorch installations, refactors the benchmarking system (async database methods and standardized timing collection), adds a data-type flag mechanism to serialization handling, raises the RPC message limit to 2GB with size warnings, and adds test coverage across the new and modified functionality.

Changes

Benchmarking System (benchmarks/benchmark.py, benchmarks/benchmark_harness.py, benchmarks/memory_benchmark.py, benchmarks/simple_benchmark.py):
Refactored torch availability detection to use importlib.util.find_spec instead of try/except; added async set_value/get_value methods to DatabaseSingleton for in-process persistence; standardized timing collection and result aggregation with mean/stdev/min/max computation; updated main() return type to int; introduced MemoryBenchmarkRunner class; improved extension loading with explicit ExtensionConfig construction.

CUDA Wheel Resolution System (pyisolate/_internal/cuda_wheels.py, pyisolate/config.py, pyisolate/_internal/environment.py, tests/test_cuda_wheels.py):
Added a CUDA wheel resolution subsystem for discovering and selecting compatible wheels from custom indices; introduced CUDAWheelConfig with index URL, package lists, and package mapping; integrated resolution into the installation flow via resolve_cuda_wheel_requirements; exposed public APIs for runtime detection and wheel URL resolution; includes 317 lines of test coverage for resolution logic, error handling, and caching behavior.

Serialization Registry Enhancement (pyisolate/_internal/serialization_registry.py, pyisolate/interfaces.py, tests/test_serialization_registry.py):
Extended register() with a keyword-only data_type flag to mark types as pure data payloads; added an is_data_type() inquiry method to SerializerRegistryProtocol; tracks data types in a private _data_types set; includes test coverage for flag behavior and protocol compliance.

Model Serialization & RPC Protocol (pyisolate/_internal/model_serialization.py, pyisolate/_internal/rpc_protocol.py, tests/test_model_serialization.py):
Added an early-return guard in deserialize_from_isolation to prevent automatic resolution of RemoteObjectHandle when the serializer registry has a handler; restricted the deserializer-by-type-name check to dict inputs only; refactored RPC response preparation into a local variable; includes 121 lines of test coverage for deserialization guard behavior with various object types.

RPC Transport & Tensor Management (pyisolate/_internal/rpc_transports.py, pyisolate/_internal/tensor_serializer.py, tests/test_rpc_transports.py):
Increased the maximum RPC message length from 100MB to 2GB, with warning logs when messages exceed 100MB; refactored tensor keeper flush to suppress exceptions and defer cleanup to the C++ destructor, eliminating purge_orphan_sender_shm_files calls to avoid double-unlink races; added 178 lines of test coverage for message size boundaries and edge cases.

Project Metadata (pyproject.toml):
Bumped version from 0.9.1 to 0.9.2; added John Pollock as maintainer; added typing_extensions>=4.0 (for Python <3.11) and packaging>=23.0 dependencies.
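
A minimal sketch of the data_type flag described in the Serialization Registry Enhancement cohort above. The registration key, the serializer/deserializer callables, and the exact register() signature are assumptions for illustration:

from pyisolate._internal.serialization_registry import SerializerRegistry


def ply_to_bytes(obj) -> bytes:  # hypothetical serializer
    raise NotImplementedError


def bytes_to_ply(payload: bytes):  # hypothetical deserializer
    raise NotImplementedError


registry = SerializerRegistry()
registry.register(
    "PLY",                    # assumed registration key for the PLY payload type
    serialize=ply_to_bytes,
    deserialize=bytes_to_ply,
    data_type=True,           # keyword-only flag added in this PR
)
assert registry.is_data_type("PLY")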

Sequence Diagram(s)

sequenceDiagram
    participant Client as Resolver
    participant Config as Config Parser
    participant Runtime as Runtime Detector
    participant Index as Wheel Index
    participant Parser as Candidate Parser
    participant Selector as Wheel Selector
    participant Handler as Result Handler

    Client->>Config: load CUDAWheelConfig
    Client->>Runtime: get_cuda_wheel_runtime()
    Runtime-->>Client: torch_version, cuda_version, python_tags
    
    Client->>Client: resolve_cuda_wheel_requirements()
    loop for each requirement
        Client->>Index: fetch index HTML
        Index-->>Client: wheel links & metadata
        Client->>Parser: parse wheel filenames & tags
        Parser-->>Client: candidate wheels
        Client->>Selector: filter by compatibility<br/>(CUDA, torch, python)
        Selector-->>Client: matching wheels
        Client->>Selector: select highest version<br/>& best platform
        Selector-->>Client: chosen wheel URL
    end
    Client->>Handler: resolve_cuda_wheel_url()
    Handler-->>Client: concrete wheel URL or error


Copilot AI left a comment


Pull request overview

This PR prepares the v0.9.2 release by improving isolation-time serialization support, hardening RPC/CUDA IPC behavior, and adding a CUDA wheel resolution mechanism for isolated venv installs.

Changes:

  • Add CUDA wheel resolution utilities plus dependency-install integration and unit tests.
  • Extend SerializerRegistry/protocols with a data_type concept and add deserialization guard tests.
  • Harden RPC transport limits/logging and adjust CUDA IPC cleanup to avoid exit-time aborts.

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 4 comments.

Summary per file:

  • pyproject.toml: Bumps version to 0.9.2 and adds typing_extensions/packaging deps.
  • pyisolate/interfaces.py: Extends registry protocol with data_type and is_data_type().
  • pyisolate/config.py: Adds CUDAWheelConfig and ExtensionConfig.cuda_wheels.
  • pyisolate/_internal/cuda_wheels.py: New CUDA wheel runtime probing + simple-index wheel selection logic.
  • pyisolate/_internal/environment.py: Integrates CUDA wheel resolution into dependency installation and cache keying.
  • pyisolate/_internal/serialization_registry.py: Adds _data_types tracking and is_data_type().
  • pyisolate/_internal/model_serialization.py: Adds dict-only guard for adapter deserializers and tweaks handle resolution.
  • pyisolate/_internal/rpc_transports.py: Adds 2GB hard recv limit and >100MB warning logging.
  • pyisolate/_internal/rpc_protocol.py: Minor refactor to store prepared response before sending.
  • pyisolate/_internal/tensor_serializer.py: Removes exit-time SHM purge to avoid destructor race/SIGABRT.
  • tests/test_serialization_registry.py: Expands registry tests (data_type semantics + protocol checks).
  • tests/test_model_serialization.py: New tests for dict-guard deserialization behavior.
  • tests/test_rpc_transports.py: New tests for JSON socket recv size limits and connection errors.
  • tests/test_cuda_wheels.py: New unit tests for CUDA wheel resolution and dependency cache invalidation.
  • benchmarks/*: Formatting/import hygiene and benchmark harness loading implementation.


Comment on lines +24 to +28
def _make_transport() -> JSONSocketTransport:
    a, b = socket.socketpair()
    b.close()
    return JSONSocketTransport(a)


Copilot AI Mar 12, 2026


_make_transport() creates a socketpair and returns a JSONSocketTransport, but the transport/socket is never closed in the size-limit and connection-error tests that call it. This can leak file descriptors across the test run; consider making _make_transport() a contextmanager/fixture that yields the transport and closes it in finally, or ensure each test calls transport.close().

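One possible shape for that fix, sketched as a context manager. It assumes JSONSocketTransport exposes a close() that releases the underlying socket:

import contextlib
import socket

from pyisolate._internal.rpc_transports import JSONSocketTransport


@contextlib.contextmanager
def _make_transport():
    # Yield a transport over one end of a socketpair and always close it,
    # so the size-limit and connection-error tests stop leaking descriptors.
    a, b = socket.socketpair()
    b.close()
    transport = JSONSocketTransport(a)
    try:
        yield transport
    finally:
        transport.close()  # assumed API: close() tears down the socket

Call sites would then read `with _make_transport() as transport:` rather than holding the return value.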
        logger.debug("Dependency cache read failed: %s", exc)

    resolved_deps = safe_deps
    if cuda_wheels_config:

Copilot AI Mar 12, 2026


resolve_cuda_wheel_requirements(...) is called whenever cuda_wheels_config is set, even if none of the current safe_deps match the configured CUDA wheel packages (you already computed needs_cuda_probe above). Consider only running resolution when a match was detected (e.g., gate on needs_cuda_probe / cuda_wheel_runtime is not None) to avoid unnecessary host probing and unexpected failures.

Suggested change:
-    if cuda_wheels_config:
+    if cuda_wheels_config and cuda_wheel_runtime is not None:

Comment on lines +234 to +238
def resolve_cuda_wheel_requirements(requirements: list[str], config: CUDAWheelConfig) -> list[str]:
    normalized_config = _normalize_cuda_wheel_config(config)
    configured_packages = set(normalized_config["packages"])
    environment = cast(dict[str, str], default_environment())
    runtime = get_cuda_wheel_runtime()

Copilot AI Mar 12, 2026


resolve_cuda_wheel_requirements() probes get_cuda_wheel_runtime() unconditionally before checking whether any requirement actually needs CUDA wheel resolution. This can raise (missing host torch / non-CUDA torch) even when no configured packages are present or when markers skip them. Consider lazily probing runtime only once you encounter a configured package, or accept an optional precomputed runtime similar to resolve_cuda_wheel_url.

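A sketch of the lazy-probe variant suggested here. Requirement parsing via packaging and the runtime= parameter on resolve_cuda_wheel_url are assumptions, and the original function's marker evaluation is elided:

from packaging.requirements import Requirement
from packaging.utils import canonicalize_name


def resolve_cuda_wheel_requirements(requirements: list[str], config: "CUDAWheelConfig") -> list[str]:
    normalized_config = _normalize_cuda_wheel_config(config)
    configured_packages = set(normalized_config["packages"])
    runtime = None  # defer host probing until a configured package is actually seen

    resolved: list[str] = []
    for raw in requirements:
        name = canonicalize_name(Requirement(raw).name)
        if name not in configured_packages:
            resolved.append(raw)  # not a CUDA wheel package: pass through untouched
            continue
        if runtime is None:
            # Probe at most once, and only when something needs it, so a
            # missing or non-CUDA host torch cannot fail unrelated installs.
            runtime = get_cuda_wheel_runtime()
        resolved.append(resolve_cuda_wheel_url(raw, runtime=runtime))  # runtime= is assumed
    return resolved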
Comment on lines 87 to +91
    if isinstance(data, RemoteObjectHandle):
        if _nested or extension is None:
            return data
        if registry.has_handler(data.type_name):
            return data

Copilot AI Mar 12, 2026


The docstring says top-level RemoteObjectHandle values are resolved when an extension proxy is available, but this new registry.has_handler(data.type_name) branch skips resolution for some top-level handles. If this is intentional, update the docstring/behavior contract; if not, consider using a more specific signal than has_handler (e.g., is_data_type) so serializer registration alone doesn’t change handle-resolution semantics.

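If the narrower signal is preferred, the guard could use the is_data_type() inquiry added in this PR. A sketch of that alternative, not the merged code:

    if isinstance(data, RemoteObjectHandle):
        if _nested or extension is None:
            return data
        # Skip proxy resolution only for types explicitly flagged as pure
        # data, so registering a serializer alone cannot change handle
        # resolution semantics.
        if registry.is_data_type(data.type_name):
            return data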

coderabbitai bot left a comment


Actionable comments posted: 5


📥 Commits

Reviewing files that changed from the base of the PR and between 357171e and 7e407d0.

📒 Files selected for processing (18)
  • benchmarks/benchmark.py
  • benchmarks/benchmark_harness.py
  • benchmarks/memory_benchmark.py
  • benchmarks/simple_benchmark.py
  • pyisolate/_internal/cuda_wheels.py
  • pyisolate/_internal/environment.py
  • pyisolate/_internal/model_serialization.py
  • pyisolate/_internal/rpc_protocol.py
  • pyisolate/_internal/rpc_transports.py
  • pyisolate/_internal/serialization_registry.py
  • pyisolate/_internal/tensor_serializer.py
  • pyisolate/config.py
  • pyisolate/interfaces.py
  • pyproject.toml
  • tests/test_cuda_wheels.py
  • tests/test_model_serialization.py
  • tests/test_rpc_transports.py
  • tests/test_serialization_registry.py

Comment on lines +349 to +354
    if find_spec("numpy") is None or find_spec("psutil") is None:
        print("Please install dependencies: pip install numpy psutil tabulate")
        return 1

    asyncio.run(run_benchmarks(args.quick, args.no_torch, args.no_gpu, args.torch_mode))
    return 0


⚠️ Potential issue | 🟡 Minor

Error message lists optional dependency; run_benchmarks return value is ignored.

Two minor issues:

  1. Line 350 mentions tabulate in the required dependencies message, but tabulate is optional (handled gracefully via TABULATE_AVAILABLE). Consider updating the message to only list truly required dependencies.

  2. Line 353 calls run_benchmarks() which returns 0, but this return value is ignored. Line 354 unconditionally returns 0. If run_benchmarks is intended to signal success/failure via its return value, propagate it.

Proposed fix
     if find_spec("numpy") is None or find_spec("psutil") is None:
-        print("Please install dependencies: pip install numpy psutil tabulate")
+        print("Please install dependencies: pip install numpy psutil")
+        print("Optional: pip install tabulate (for formatted output)")
         return 1

-    asyncio.run(run_benchmarks(args.quick, args.no_torch, args.no_gpu, args.torch_mode))
-    return 0
+    return asyncio.run(run_benchmarks(args.quick, args.no_torch, args.no_gpu, args.torch_mode))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@benchmarks/benchmark.py` around lines 349 - 354, Update the dependency check
message to only list truly required packages (remove "tabulate" from the printed
string) and propagate the exit code returned by run_benchmarks: call result =
asyncio.run(run_benchmarks(args.quick, args.no_torch, args.no_gpu,
args.torch_mode)) and return that result instead of unconditionally returning 0.
Locate the check using find_spec and the run_benchmarks invocation to make these
changes; ensure run_benchmarks indeed returns an int and keep TABULATE_AVAILABLE
handling unchanged.

Comment on lines +143 to +149
def _fetch_index_html(url: str) -> str | None:
    try:
        with urlopen(url, timeout=30) as response:  # noqa: S310 - URL is explicit extension config
            content: bytes = response.read()
            return content.decode("utf-8")
    except (HTTPError, URLError, FileNotFoundError):
        return None


🧹 Nitpick | 🔵 Trivial

Network timeout is appropriate; consider logging fetch failures.

The 30-second timeout is reasonable for index fetches. The function silently returns None on failure, which is handled by callers. Consider whether debug-level logging would help troubleshoot network issues.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyisolate/_internal/cuda_wheels.py` around lines 143 - 149, The
_fetch_index_html function currently swallows network errors and returns None;
update it to log debug-level failure details before returning None. Catch the
exceptions as e (e.g., except (HTTPError, URLError, FileNotFoundError) as e:)
and emit a debug log via the module logger (logging.getLogger(__name__) or the
existing module-level logger) that includes the URL, timeout, and exception
message/traceback to aid troubleshooting, then return None as before.
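
Applied to the function above, the debug-logging variant might look like this; the logger name and message wording are illustrative:

import logging
from urllib.error import HTTPError, URLError
from urllib.request import urlopen

logger = logging.getLogger(__name__)


def _fetch_index_html(url: str) -> str | None:
    try:
        with urlopen(url, timeout=30) as response:  # noqa: S310 - URL is explicit extension config
            content: bytes = response.read()
            return content.decode("utf-8")
    except (HTTPError, URLError, FileNotFoundError) as exc:
        # Keep the silent-None contract for callers, but leave a trace so
        # index fetch problems can be diagnosed with debug logging enabled.
        logger.debug("CUDA wheel index fetch failed (url=%s, timeout=30s): %s", url, exc)
        return None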

Comment on lines +148 to +154
            if msg_len > 2 * 1024 * 1024 * 1024:  # 2GB hard limit
                raise ValueError(f"Message too large: {msg_len} bytes")
            if msg_len > 100 * 1024 * 1024:  # 100MB — flag large payloads
                logger.warning(
                    "Large RPC message: %.1fMB — consider SHM-backed transfer for this type",
                    msg_len / (1024 * 1024),
                )


🧹 Nitpick | 🔵 Trivial

LGTM! Consider extracting named constants for clarity.

The limit increase to 2GB and the warning threshold at 100MB are sensible. The boundary tests in test_rpc_transports.py properly verify these limits.

Minor suggestion: extract magic numbers as module-level constants for self-documentation:

♻️ Optional refactor
+_MSG_SIZE_WARN_THRESHOLD = 100 * 1024 * 1024   # 100MB
+_MSG_SIZE_HARD_LIMIT = 2 * 1024 * 1024 * 1024  # 2GB
+
 # In recv():
-            if msg_len > 2 * 1024 * 1024 * 1024:  # 2GB hard limit
+            if msg_len > _MSG_SIZE_HARD_LIMIT:
                 raise ValueError(f"Message too large: {msg_len} bytes")
-            if msg_len > 100 * 1024 * 1024:  # 100MB — flag large payloads
+            if msg_len > _MSG_SIZE_WARN_THRESHOLD:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyisolate/_internal/rpc_transports.py` around lines 148 - 154, Replace the
magic numeric literals with descriptive module-level constants: define e.g.
MAX_RPC_MESSAGE_SIZE = 2 * 1024 * 1024 * 1024 and
LARGE_MESSAGE_WARNING_THRESHOLD = 100 * 1024 * 1024 at the top of the module,
then update the checks that currently use the literals (the comparisons against
msg_len in the block that raises ValueError and calls logger.warning) to use
these constants, and update the logger.warning message to reference the named
constant where helpful; ensure references to msg_len, logger.warning, and the
ValueError remain unchanged except for using the constants.

Comment on lines +199 to 207
    # purge_orphan_sender_shm_files is intentionally NOT called here.
    # Calling it with force=True races with the C++ RefcountedMapAllocator
    # destructor: SIGKILL'd child peers leave the refcount file at count > 1,
    # so the host's C++ close() decrements but does NOT unlink (refcount still > 0).
    # purge_orphan would then unlink the file while the C++ mmap is still open,
    # causing a double-unlink → ENOENT → SIGABRT at process exit.
    # Files are cleaned up by the C++ destructor when the last consumer closes.
    with contextlib.suppress(Exception):
        flush_tensor_keeper()


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Internal call sites for purge_orphan_sender_shm_files =="
rg -n -C2 '\bpurge_orphan_sender_shm_files\s*\('

echo
echo "== Cleanup-related hooks and env flags =="
rg -n -C2 'PYISOLATE_PURGE_SENDER_SHM|PYISOLATE_SIGNAL_CLEANUP|atexit\.register|signal\.signal'

echo
echo "== User-facing docs mentioning manual orphan SHM cleanup =="
fd -t f 'README(\..*)?$|.*\.md$' . | xargs -r rg -n -C2 'purge_orphan_sender_shm_files|PYISOLATE_PURGE_SENDER_SHM|orphan.*shm|shared memory'

Repository: Comfy-Org/pyisolate



Restore a safe orphan-SHM cleanup owner, or explicitly document manual invocation.

This removes the only automatic cleanup path for sender-side torch_<pid>_* SHM files. The SIGABRT fix (avoiding the double-unlink race) is necessary, but removes the safeguard against unbounded /dev/shm leakage in long-lived hosts when children terminate abnormally. Cleanup is now manual—either by setting the opt-in environment variable PYISOLATE_PURGE_SENDER_SHM=1 (disabled by default) or by explicitly calling purge_orphan_sender_shm_files() after abnormal termination—but user-facing documentation does not surface this trade-off. Either move orphan cleanup to a safe internal owner (for example a context manager or periodic janitor) or clearly document that applications must invoke purge_orphan_sender_shm_files() themselves or set PYISOLATE_PURGE_SENDER_SHM=1 when children may be killed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyisolate/_internal/tensor_serializer.py` around lines 199 - 207, The current
change removed the only automatic cleanup of sender-side SHM and left orphan
files unmanaged; restore a safe automatic owner or make the opt-in behavior
explicit: either (A) reinstate a guarded cleanup call to
purge_orphan_sender_shm_files() executed by a safe internal janitor (e.g., a
background thread/process or context manager) that runs only if
os.getenv('PYISOLATE_PURGE_SENDER_SHM') == '1' to avoid the double-unlink race
with the C++ allocator, or (B) keep automatic suppression but add explicit
documentation and runtime guidance and add a clear opt-in check here (near
flush_tensor_keeper) so that when PYISOLATE_PURGE_SENDER_SHM is enabled the code
will call purge_orphan_sender_shm_files() (with appropriate
exception-suppression and locking to avoid races); reference
purge_orphan_sender_shm_files, flush_tensor_keeper and the env var
PYISOLATE_PURGE_SENDER_SHM in the change.
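
An explicit opt-in along those lines might look like the sketch below; the hook name is hypothetical, and the env var name comes from this review thread rather than shipped code:

import contextlib
import os


def _shutdown_tensor_ipc() -> None:  # hypothetical exit-time hook
    with contextlib.suppress(Exception):
        flush_tensor_keeper()
    # Disabled by default: purging can race the C++ RefcountedMapAllocator
    # destructor while its mmap is still open (double-unlink -> SIGABRT).
    # Hosts that SIGKILL children and accept that risk can opt in.
    if os.getenv("PYISOLATE_PURGE_SENDER_SHM") == "1":
        with contextlib.suppress(Exception):
            purge_orphan_sender_shm_files()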

Comment on lines +256 to +317
def test_install_dependencies_cache_invalidation_tracks_cuda_runtime(monkeypatch, tmp_path):
    import os

    venv_path = tmp_path / "venv"
    python_exe = venv_path / "Scripts" / "python.exe" if os.name == "nt" else venv_path / "bin" / "python"
    python_exe.parent.mkdir(parents=True, exist_ok=True)
    python_exe.write_text("#!/usr/bin/env python\n", encoding="utf-8")

    monkeypatch.setattr(environment.shutil, "which", lambda binary: "/usr/bin/uv")
    monkeypatch.setattr(
        environment,
        "exclude_satisfied_requirements",
        lambda config, requirements, python_exe: requirements,
    )
    monkeypatch.setattr(
        environment,
        "resolve_cuda_wheel_requirements",
        lambda requirements, config: ["https://example.invalid/flash_attn.whl"],
    )

    current_runtime = {"value": {"torch": "2.8", "cuda": "12.8", "python_tags": ["cp312"]}}
    monkeypatch.setattr(
        environment,
        "get_cuda_wheel_runtime_descriptor",
        lambda: current_runtime["value"],
    )

    popen_calls: list[list[str]] = []

    class MockPopen:
        def __init__(self, cmd, **kwargs):
            popen_calls.append(cmd)
            self.stdout = io.StringIO("installed\n")

        def wait(self):
            return 0

        def __enter__(self):
            return self

        def __exit__(self, exc_type, exc, tb):
            return False

    monkeypatch.setattr(environment.subprocess, "Popen", MockPopen)

    config = {
        "dependencies": ["flash-attn>=1.0"],
        "share_torch": True,
        "cuda_wheels": {
            "index_url": "https://example.invalid/cuda-wheels/",
            "packages": ["flash-attn"],
            "package_map": {},
        },
    }

    environment.install_dependencies(venv_path, config, "demo")
    environment.install_dependencies(venv_path, config, "demo")

    current_runtime["value"] = {"torch": "2.8", "cuda": "12.9", "python_tags": ["cp312"]}
    environment.install_dependencies(venv_path, config, "demo")

    assert len(popen_calls) == 2


🧹 Nitpick | 🔵 Trivial

Consider importing os at module level for consistency.

The os import on line 257 could be moved to the module level with other imports. This is a minor style concern but would improve consistency.

The test logic itself is excellent - it properly validates cache invalidation by:

  1. Running install twice with same runtime (should only call Popen once)
  2. Changing CUDA version and running again (should trigger second Popen call)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_cuda_wheels.py` around lines 256 - 317, Move the local import of
os in the test_install_dependencies_cache_invalidation_tracks_cuda_runtime
function to the top-level imports for the test module; update the module imports
so os is imported alongside other test imports and remove the in-function
import, ensuring the test function still references os for path logic (venv_path
/ "Scripts" / "python.exe" branch) without changing any test behavior.

pollockjj merged commit 34db809 into Comfy-Org:main on Mar 12, 2026
91 checks passed
pollockjj deleted the pr/0.9.2-isolation-support branch on March 15, 2026 07:23