
feat: isolation serialization, CUDA IPC hardening, and CUDA wheel resolution (v0.9.2) #6

Closed
pollockjj wants to merge 4 commits into Comfy-Org:main from pollockjj:pr/0.9.2-isolation-support

Conversation

@pollockjj
Collaborator

Summary

  • CUDA wheel resolution: Resolves runtime-matched CUDA wheels from custom indexes, ensuring isolated venvs get the correct torch build for the host GPU
  • Isolation serialization improvements: Extended serialization registry to support custom node types (File3D, PLY, NPZ, VIDEO) required for ComfyUI custom node isolation
  • CUDA IPC hardening: Fixed a SIGABRT on process exit caused by purge_orphan_sender_shm_files racing the C++ RefcountedMapAllocator destructor. SIGKILL'd child peers leave the IPC refcount file at count > 1, so the host's C++ close() decrements but does not unlink; purge_orphan then unlinks the file while the C++ mmap is still open, causing a double unlink, ENOENT, and SIGABRT
  • Unit test coverage: Serialization registry, transport limits, and deserialization guard
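
The cleanup change behind the CUDA IPC bullet can be sketched roughly as follows. This is an illustration, not the PR's actual code: `_flush_tensor_keeper_stub` stands in for the real flush_tensor_keeper, and the surrounding structure is an assumption.

```python
import contextlib

def _flush_tensor_keeper_stub() -> None:
    # Hypothetical stand-in for the real flush_tensor_keeper(); at interpreter
    # exit it may fail because the shm files are already being torn down.
    raise OSError("shm segment already unlinked")

def shutdown_cleanup() -> None:
    # Suppress flush errors rather than letting them propagate at exit.
    with contextlib.suppress(Exception):
        _flush_tensor_keeper_stub()
    # Deliberately no forced purge of orphan sender shm files here: if a
    # SIGKILL'd peer left the refcount file at count > 1, unlinking it while
    # the host's C++ RefcountedMapAllocator still holds the mmap triggers a
    # double unlink, ENOENT, and SIGABRT in the allocator destructor.
```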

Test plan

  • All unit tests pass: pytest tests/test_*.py --ignore=tests/test_sandbox_integration.py
  • CUDA IPC integration tests pass with no abort: pytest tests/integration_v2/test_tensors.py
  • Quality gates: ruff check, ruff format --check, mypy pyisolate

Copilot AI review requested due to automatic review settings March 12, 2026 00:44
@coderabbitai

coderabbitai bot commented Mar 12, 2026

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

Adds CUDA wheel resolution and integration into dependency installation, expands benchmark harness and runners with torch-aware modes and async extension APIs, extends serialization registry with data_type semantics, raises RPC JSON transport limit to 2GB with warnings, and updates tests and project metadata (v0.9.2).

Changes

Cohort / File(s) Summary
Benchmarking
benchmarks/benchmark.py, benchmarks/benchmark_harness.py, benchmarks/memory_benchmark.py, benchmarks/simple_benchmark.py
Reworked torch availability checks and extension loading; added async host/child DatabaseSingleton methods, BenchmarkExtensionWrapper.on_module_loaded validation, new MemoryBenchmarkRunner, updated run_benchmarks/main signatures and wiring, and guarded torch multiprocessing setup.
CUDA wheel resolver
pyisolate/_internal/cuda_wheels.py
New comprehensive module to detect runtime (torch/CUDA/Python tags), normalize config, crawl index pages, evaluate wheel candidates, and resolve best wheel URLs/requirements; introduces CUDAWheelRuntime and CUDAWheelResolutionError.
CUDA integration into env
pyisolate/_internal/environment.py, pyisolate/config.py
Added CUDAWheelConfig TypedDict and cuda_wheels field on ExtensionConfig; resolve_cuda_wheel_requirements integrated into dependency resolution and installation flow with extra logging and wheel-runtime propagation.
Serialization registry & adapters
pyisolate/_internal/serialization_registry.py, pyisolate/_internal/model_serialization.py, pyisolate/interfaces.py
Added data_type tracking to SerializerRegistry (register(data_type=True), is_data_type), init/clear changes; model deserialization now guards adapter deserializers to only apply to dict-shaped serialized data; protocol updated accordingly.
RPC control & transports
pyisolate/_internal/rpc_protocol.py, pyisolate/_internal/rpc_transports.py
Minor request dispatch refactor (introduces prepared var) and JSONSocketTransport recv: increased hard message-size limit to 2GB with a warning above 100MB and ValueError for >2GB.
Tensor serialization cleanup
pyisolate/_internal/tensor_serializer.py
Simplified shutdown cleanup: suppress exceptions from flush_tensor_keeper and removed forced orphan SHM purge to avoid race/double-unlink issues.
Tests
tests/test_cuda_wheels.py, tests/test_model_serialization.py, tests/test_rpc_transports.py, tests/test_serialization_registry.py
New and expanded tests for CUDA wheel resolution behavior and runtime errors, model deserialization dict-guard semantics, JSONSocketTransport size limits and warnings, and SerializerRegistry data_type behavior and protocol compliance.
Project metadata
pyproject.toml, requirements.txt
Bumped version to 0.9.2, added John Pollock as maintainer, added packaging and typing_extensions dependencies where appropriate.
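
A hedged sketch of what the `cuda_wheels` configuration described in the table above might look like. Only the `packages` key is confirmed by the diff; `CUDAWheelConfigSketch` and `index_url` are illustrative names, not the PR's actual schema.

```python
from typing import TypedDict

try:  # NotRequired lives in typing on Python >= 3.11
    from typing import NotRequired
except ImportError:
    from typing_extensions import NotRequired

class CUDAWheelConfigSketch(TypedDict):
    # "packages" appears in resolve_cuda_wheel_requirements in the diff;
    # "index_url" is an assumed name for the simple-index base URL crawled
    # by the resolver.
    packages: list[str]
    index_url: NotRequired[str]

cfg: CUDAWheelConfigSketch = {
    "packages": ["torch"],
    "index_url": "https://download.pytorch.org/whl/cu121",
}
```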

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client Code
    participant Resolver as resolve_cuda_wheel_url()
    participant Runtime as get_cuda_wheel_runtime()
    participant Config as Config Normalizer
    participant Fetcher as _fetch_index_html()
    participant Parser as _parse_index_links()
    participant Evaluator as Wheel Candidate Evaluator

    Client->>Resolver: request resolution(requirement, config)
    Resolver->>Runtime: query torch/CUDA/python runtime
    Runtime-->>Resolver: CUDAWheelRuntime
    Resolver->>Config: normalize/validate cuda_wheels config
    Config-->>Resolver: normalized config
    Resolver->>Fetcher: fetch index HTML
    Fetcher-->>Resolver: HTML content
    Resolver->>Parser: extract wheel links
    Parser-->>Resolver: wheel URLs
    Resolver->>Evaluator: filter/rank candidates by runtime, tags, version
    Evaluator-->>Resolver: best wheel URL
    Resolver-->>Client: resolved wheel URL


@socket-security

socket-security bot commented Mar 12, 2026

Review the following changes in direct dependencies.

Diff   Package                    Supply Chain Security  Vulnerability  Quality  Maintenance  License
Added  packaging@26.0             99                     100            100      100          100
Added  typing-extensions@4.15.0   100                    100            100      100          100



Copilot AI left a comment


Pull request overview

This PR extends PyIsolate’s serialization/RPC infrastructure and dependency installation to better handle large payloads and CUDA-specific wheel resolution, with accompanying unit tests and a version bump.

Changes:

  • Add custom “CUDA wheels” resolution via a simple-index HTML fetch/parse flow, and thread CUDA-runtime info into dependency-cache invalidation.
  • Add data_type support to the serializer registry/protocol and add guardrails to avoid double-deserialization in deserialize_from_isolation.
  • Enforce a 2GB hard receive limit (with warnings above 100MB) for JSON socket RPC messages, and add targeted tests for these behaviors.
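
The receive-limit behavior in the last bullet can be sketched as follows. The constants come from the description above; the function shape and logger name are assumptions, not the PR's implementation.

```python
import logging

logger = logging.getLogger("pyisolate.rpc")

WARN_THRESHOLD = 100 * 1024 * 1024       # 100 MB: accept, but warn
HARD_LIMIT = 2 * 1024 * 1024 * 1024      # 2 GB: refuse outright

def check_frame_size(size: int) -> None:
    """Apply the recv limits described above to a framed message length."""
    if size > HARD_LIMIT:
        raise ValueError(f"RPC message of {size} bytes exceeds the 2GB hard limit")
    if size > WARN_THRESHOLD:
        logger.warning("RPC message of %d bytes exceeds 100MB", size)
```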

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
tests/test_serialization_registry.py Expands coverage for new data_type registry behavior and protocol surface.
tests/test_rpc_transports.py New tests covering JSONSocketTransport recv size limits, warnings, and connection errors.
tests/test_model_serialization.py New tests validating dict-only deserializer application and passthrough behavior.
tests/test_cuda_wheels.py New tests for CUDA wheel URL resolution and dependency-cache invalidation behavior.
pyproject.toml Bumps version to 0.9.2 and adds packaging dependency.
pyisolate/interfaces.py Extends SerializerRegistryProtocol.register with kw-only data_type and adds is_data_type.
pyisolate/config.py Introduces CUDAWheelConfig + cuda_wheels extension config wiring.
pyisolate/_internal/tensor_serializer.py Adjusts atexit cleanup behavior to avoid shutdown race/double-unlink issues.
pyisolate/_internal/serialization_registry.py Implements data_type tracking and is_data_type() with clear/reset behavior.
pyisolate/_internal/rpc_transports.py Adds 2GB hard limit and 100MB warning threshold to recv path.
pyisolate/_internal/rpc_protocol.py Minor refactor to store prepared response before send.
pyisolate/_internal/model_serialization.py Adds dict-only guard for adapter deserializers and tweaks RemoteObjectHandle handling.
pyisolate/_internal/environment.py Integrates CUDA wheel resolution into dependency install flow + cache descriptor.
pyisolate/_internal/cuda_wheels.py New module implementing runtime detection and wheel URL resolution from a simple index.
benchmarks/simple_benchmark.py Minor formatting change.
benchmarks/memory_benchmark.py Import cleanup/typing tweaks and runner class addition.
benchmarks/benchmark.py Formatting/refactor improvements and safer dependency presence checks.
benchmarks/benchmark_harness.py Refactor harness loading and torch multiprocessing setup handling.


@@ -28,6 +29,7 @@ keywords = ["virtual environment", "venv", "development"]
dependencies = [
"uv>=0.1.0",
"tomli>=2.0.1; python_version < '3.11'",

Copilot AI Mar 12, 2026


pyisolate/config.py now imports typing_extensions.NotRequired, but typing-extensions is not listed in the project dependencies. This will raise ModuleNotFoundError for users on Python 3.10 (and any envs without the transitive dep). Add an explicit typing-extensions dependency (optionally environment-marked for Python < 3.11).

Suggested change
"tomli>=2.0.1; python_version < '3.11'",
"tomli>=2.0.1; python_version < '3.11'",
"typing-extensions>=4.0.0; python_version < '3.11'",

Comment on lines +143 to +149
def _fetch_index_html(url: str) -> str | None:
try:
with urlopen(url) as response: # noqa: S310 - URL is explicit extension config
content: bytes = response.read()
return content.decode("utf-8")
except (HTTPError, URLError, FileNotFoundError):
return None

Copilot AI Mar 12, 2026


urlopen() is called without a timeout, which can cause install_dependencies() to hang indefinitely if the wheel index is slow/unresponsive. Add a reasonable default timeout (and consider restricting schemes to http/https if this config can come from untrusted sources).



def _make_transport() -> JSONSocketTransport:
a, _ = socket.socketpair()

Copilot AI Mar 12, 2026


_make_transport() creates a socketpair() but drops the peer socket without closing it, leaking file descriptors across the test run. Close the unused peer (or return both ends and ensure both transports/sockets are closed).

Suggested change
a, _ = socket.socketpair()
a, b = socket.socketpair()
b.close()

@pytest.fixture()
def socket_pair() -> tuple[JSONSocketTransport, JSONSocketTransport]:
a, b = socket.socketpair()
return JSONSocketTransport(a), JSONSocketTransport(b)

Copilot AI Mar 12, 2026


socket_pair yields two transports but never closes them, which can leak sockets/FDs and make the test suite flaky under low ulimit settings. Convert this to a yielding fixture with a teardown that calls close() on both transports (and/or closes the underlying sockets).

Suggested change
return JSONSocketTransport(a), JSONSocketTransport(b)
transport_a = JSONSocketTransport(a)
transport_b = JSONSocketTransport(b)
try:
yield transport_a, transport_b
finally:
transport_a.close()
transport_b.close()

@pollockjj pollockjj force-pushed the pr/0.9.2-isolation-support branch from a484333 to 9350f9f Compare March 12, 2026 00:51
@codecov

codecov bot commented Mar 12, 2026

Welcome to Codecov 🎉

Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests.

ℹ️ You can also turn on project coverage checks and project coverage reporting on Pull Request comment

Thanks for integrating Codecov - We've got you covered ☂️


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
benchmarks/memory_benchmark.py (2)

374-377: ⚠️ Potential issue | 🟠 Major

Always clean up extensions and tensors in a finally block.

Any exception before the tail cleanup path leaves isolated child processes and CUDA/IPC state alive, which can skew later measurements in the same run and leak resources after a failed benchmark iteration.

Suggested pattern
             manager = ExtensionManager(
                 MemoryBenchmarkExtensionBase,
                 ExtensionManagerConfig(venv_root_path=str(self.test_base.test_root / "extension-venvs")),
             )
+            test_tensor = None

-            # Clean up and reset baseline before measuring
-            gc.collect()
-            if CUDA_AVAILABLE:
-                torch.cuda.empty_cache()
-                torch.cuda.synchronize()  # Ensure all operations complete
+            try:
+                # Clean up and reset baseline before measuring
+                gc.collect()
+                if CUDA_AVAILABLE:
+                    torch.cuda.empty_cache()
+                    torch.cuda.synchronize()  # Ensure all operations complete

-            # Reset GPU memory baseline for this test
-            self.memory_tracker.reset_baseline()
+                # Reset GPU memory baseline for this test
+                self.memory_tracker.reset_baseline()

-            # Wait a moment for memory to settle
-            await asyncio.sleep(1)
+                # Wait a moment for memory to settle
+                await asyncio.sleep(1)

-            ...
+                ...
-            manager.stop_all_extensions()
-            del test_tensor
-            gc.collect()
-            if CUDA_AVAILABLE:
-                torch.cuda.empty_cache()
-                torch.cuda.synchronize()
-
-            # Wait for cleanup
-            await asyncio.sleep(2)
+            finally:
+                manager.stop_all_extensions()
+                if test_tensor is not None:
+                    del test_tensor
+                gc.collect()
+                if CUDA_AVAILABLE:
+                    torch.cuda.empty_cache()
+                    torch.cuda.synchronize()
+                await asyncio.sleep(2)

Apply the same structure in run_large_tensor_sharing_test for large_tensor.

Also applies to: 397-550, 584-689

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@benchmarks/memory_benchmark.py` around lines 374 - 377, The ExtensionManager
instantiation and large tensor allocation in run_large_tensor_sharing_test (and
similar blocks) must be wrapped in try/finally so resources are always released;
specifically, ensure the ExtensionManager (created from
MemoryBenchmarkExtensionBase and ExtensionManagerConfig) is shut down in a
finally block and that any allocated large_tensor is freed/closed in the same
finally, mirroring the suggested pattern referenced in the comment so child
processes and CUDA/IPC state are always cleaned up even on exceptions.

367-368: ⚠️ Potential issue | 🟡 Minor

Reject num_extensions <= 0 before computing per-extension metrics.

run_scaling_test() divides by num_extensions, so --counts 0 currently crashes with ZeroDivisionError. The large-tensor path already guards the equivalent calculation, so this method should validate or guard too.

Minimal fix
         for num_extensions in num_extensions_list:
+            if num_extensions <= 0:
+                raise ValueError("num_extensions must be greater than 0")
+
             print(f"\n{'=' * 60}")
             print(f"Testing with {num_extensions} extensions (share_torch={share_torch})")
             print("=" * 60)

Also applies to: 488-499

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@benchmarks/memory_benchmark.py` around lines 367 - 368, run_scaling_test()
currently divides by num_extensions when computing per-extension metrics and
crashes on num_extensions <= 0; before doing the per-extension calculations (the
loop over num_extensions_list and the computations later around the
per-extension metrics block near the other guarded large-tensor path), validate
num_extensions and either raise a clear ValueError for num_extensions <= 0 or
skip/continue that iteration (e.g., if num_extensions == 0, set per-extension
metrics to None or skip printing) so you never perform a division-by-zero;
update the checks where per-extension metrics are computed (refer to
run_scaling_test and the per-extension metric calculation block) to include this
guard.
benchmarks/simple_benchmark.py (1)

39-52: 🧹 Nitpick | 🔵 Trivial

Duplicate import statement.

os is imported twice within measure_rpc_overhead: once at line 39 and again at line 52. The second import is redundant.

🧹 Proposed fix
     import os
+    from typing import TypedDict, cast
 
     if sys.platform == "linux" and os.environ.get("TMPDIR") != "/dev/shm":
         print("WARNING: TMPDIR is not set to /dev/shm on Linux.")
@@ -49,8 +50,6 @@
     print("Setting up extensions (this may take a moment)...")
 
     # Use the same setup as the example
     try:
-        import os
-        from typing import TypedDict, cast
 
         import yaml
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@benchmarks/simple_benchmark.py` around lines 39 - 52, The function
measure_rpc_overhead contains a duplicate import of the os module (import os
appears at the top of the function and again later); remove the redundant second
import (the one inside the try block) so os is only imported once (prefer
top-level imports), and run lint/tests to confirm no other duplicate imports
remain — references: measure_rpc_overhead, the redundant "import os" inside the
try block.
pyisolate/_internal/tensor_serializer.py (1)

168-195: ⚠️ Potential issue | 🟡 Minor

Improve docstring to document the force parameter's race condition risk.

purge_orphan_sender_shm_files is intentionally exported for external/manual invocation. However, the docstring lacks critical safety information: using force=True races with the C++ RefcountedMapAllocator destructor and can cause ENOENT → SIGABRT at process exit if orphan files are unlinked while the C++ mmap is still open. Add this warning to the docstring so users understand when it's safe to call with force=True.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyisolate/_internal/tensor_serializer.py` around lines 168 - 195, Update the
docstring for purge_orphan_sender_shm_files to document that using force=True
can race with the C++ RefcountedMapAllocator destructor and may cause an
ENOENT→SIGABRT at process exit if the underlying mmap is still open; explicitly
state when it is safe to use force (e.g., only when you can guarantee the C++
allocator/mmap is no longer active or from a supervisory process after the
originating process has exited) and advise preferring the default guarded
behavior (PYISOLATE_PURGE_SENDER_SHM or force=False) otherwise so callers
understand the risk and safe usage patterns.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@benchmarks/benchmark_harness.py`:
- Around line 68-76: The ExtensionConfig TypedDict is missing required keys
(share_cuda_ipc, sandbox, sandbox_mode, env) when constructed before calling
self.manager.load_extension; add those fields with sensible defaults (e.g.,
share_cuda_ipc=False, sandbox=False, sandbox_mode="default" or the module's
default, and env={} or None as appropriate) in the config passed to
ExtensionConfig so the TypedDict contract is satisfied (alternatively update
ExtensionConfig to mark those keys NotRequired if you prefer), and ensure this
aligns with how Extension.__init__ consumes .get() defaults.

In `@pyisolate/_internal/cuda_wheels.py`:
- Around line 234-238: resolve_cuda_wheel_requirements currently calls
get_cuda_wheel_runtime() eagerly which raises on CPU-only hosts even when no
requirements need rewriting; instead defer probing until a specific requirement
actually requires CUDA handling. Remove the early runtime =
get_cuda_wheel_runtime() call and call get_cuda_wheel_runtime() lazily inside
the loop that iterates over requirements (only when req.package in
configured_packages and its marker does not skip it), cache the runtime result
for subsequent rewrites, and ensure any exception is only raised when a rewrite
is actually attempted; update the same pattern in the related helper code
referenced around the later block handling cuda_wheels.packages.

In `@pyisolate/_internal/environment.py`:
- Around line 313-317: The code eagerly calls
get_cuda_wheel_runtime_descriptor() whenever cuda_wheels_config is present,
causing unnecessary host CUDA probes; change the logic so cuda_wheel_runtime
remains None unless the current install set contains at least one
resolver-targeted dependency that could be rewritten for CUDA (i.e., detect
resolver-targeted deps in the install set before probing). Specifically, keep
the existing cuda_wheels_config and cuda_wheel_runtime variables but defer
invoking get_cuda_wheel_runtime_descriptor() until after checking the install
set for resolver-targeted dependencies; only then assign cuda_wheel_runtime =
get_cuda_wheel_runtime_descriptor().
- Around line 366-373: The logs currently print raw dependency URLs (e.g.,
original_dep, resolved_dep and any install target lists or uv output) which may
contain credentials; before calling logger.info in the CUDA_WHEEL_RESOLVED site
(the loop over safe_deps/resolved_deps using zip), and the other two logging
sites you flagged, sanitize values by parsing any string that looks like a URL
and replacing it with only stable metadata (host/netloc and wheel filename/path
basename) while stripping query strings and userinfo; fallback to the original
string when parsing fails. Update the logger.info calls that reference
original_dep, resolved_dep, install target lists or raw uv output to pass the
sanitized versions instead so only host and filename (no credentials/query)
appear in logs.

In `@pyisolate/config.py`:
- Line 6: The import of NotRequired from typing_extensions in config.py causes
runtime failures because typing_extensions is not declared as a dependency;
either add "typing_extensions" to pyproject.toml dependencies, or change the
import in config.py to a safe conditional import: try to import NotRequired from
typing (for Python versions that include it) and fall back to importing from
typing_extensions, ensuring the symbol NotRequired is always defined at runtime.

In `@pyproject.toml`:
- Around line 29-33: The dependency list is missing typing_extensions for older
Python versions which pyisolate/config.py relies on (it imports NotRequired);
update the dependencies array in pyproject.toml to include typing_extensions
(e.g., "typing_extensions>=4.0; python_version < '3.11'") alongside tomli and
packaging so NotRequired is available on Python <3.11.

In `@tests/test_cuda_wheels.py`:
- Around line 250-253: The test always creates venv/bin/python but the
environment lookup (pyisolate/_internal/environment.py search for
Scripts/python.exe) expects the Windows layout; update the fixture that sets
venv_path/python_exe to create the platform-specific venv layout instead of
hard-coding "bin/python": detect the platform (e.g., sys.platform or os.name)
and create "Scripts/python.exe" on Windows (with appropriate executable content
and extension) and "bin/python" on POSIX, or create both paths for reliability;
reference venv_path and python_exe when making the path change so the test
mirrors the real environment layout used by the code under test.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 1aef4ea9-b241-4ccb-9008-a02deeb2d35d

📥 Commits

Reviewing files that changed from the base of the PR and between 357171e and a484333.

📒 Files selected for processing (18)
  • benchmarks/benchmark.py
  • benchmarks/benchmark_harness.py
  • benchmarks/memory_benchmark.py
  • benchmarks/simple_benchmark.py
  • pyisolate/_internal/cuda_wheels.py
  • pyisolate/_internal/environment.py
  • pyisolate/_internal/model_serialization.py
  • pyisolate/_internal/rpc_protocol.py
  • pyisolate/_internal/rpc_transports.py
  • pyisolate/_internal/serialization_registry.py
  • pyisolate/_internal/tensor_serializer.py
  • pyisolate/config.py
  • pyisolate/interfaces.py
  • pyproject.toml
  • tests/test_cuda_wheels.py
  • tests/test_model_serialization.py
  • tests/test_rpc_transports.py
  • tests/test_serialization_registry.py

Comment on lines +234 to +238
def resolve_cuda_wheel_requirements(requirements: list[str], config: CUDAWheelConfig) -> list[str]:
normalized_config = _normalize_cuda_wheel_config(config)
configured_packages = set(normalized_config["packages"])
environment = cast(dict[str, str], default_environment())
runtime = get_cuda_wheel_runtime()


⚠️ Potential issue | 🟠 Major

Defer CUDA runtime probing until a dependency actually needs rewriting.

get_cuda_wheel_runtime() runs before the loop, so this helper raises on CPU-only hosts even when none of the input requirements are in cuda_wheels.packages or every matching entry is skipped by its marker. That turns a no-op rewrite pass into an unnecessary install blocker.

Also applies to: 256-270
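
The deferral this comment asks for could look roughly like this. The helper names mirror the review prompt, the requirement parsing is deliberately simplified, and `_probe_runtime` is a stand-in for get_cuda_wheel_runtime() on a CPU-only host.

```python
from functools import lru_cache

class CUDAWheelResolutionError(RuntimeError):
    """Raised when no usable torch/CUDA runtime is found on the host."""

@lru_cache(maxsize=1)
def _probe_runtime() -> str:
    # Stand-in for get_cuda_wheel_runtime(); on a CPU-only host it raises.
    # Successful probes would be cached by lru_cache for later rewrites.
    raise CUDAWheelResolutionError("no CUDA runtime detected")

def resolve_requirements(requirements: "list[str]", configured: "set[str]") -> "list[str]":
    resolved = []
    for req in requirements:
        name = req.split("==", 1)[0].strip().lower()
        if name not in configured:
            # Not resolver-targeted: pass through without probing CUDA.
            resolved.append(req)
            continue
        runtime = _probe_runtime()  # probe only when a rewrite is attempted
        resolved.append(f"{req}  # rewritten for {runtime}")
    return resolved
```

With this shape, a CPU-only host installing only non-targeted packages never triggers the probe, so the no-op rewrite pass stops being an install blocker.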


@pollockjj pollockjj force-pushed the pr/0.9.2-isolation-support branch from 9350f9f to 94a29d4 Compare March 12, 2026 01:02
…dexes

Add opt-in CUDA wheel resolution for isolated extensions based on the host
Python/torch/CUDA tuple, rewrite selected requirements to direct wheel URLs,
and include resolver inputs in dependency cache invalidation.

Also add resolver tests and clean up benchmark files so repo checks pass.
@pollockjj pollockjj force-pushed the pr/0.9.2-isolation-support branch from 94a29d4 to 7e407d0 Compare March 12, 2026 01:25
@pollockjj pollockjj closed this Mar 12, 2026
@pollockjj pollockjj deleted the pr/0.9.2-isolation-support branch March 15, 2026 07:23
