
Use cuda-pathfinder in build-system.requires#1817

Merged
leofang merged 16 commits into NVIDIA:main from rwgk:pathfinder_in_build-system_requires
Apr 9, 2026

Conversation

@rwgk
Collaborator

@rwgk rwgk commented Mar 25, 2026

Follow-up to #1801. Closes the build-time integration gap identified in #1803.

The main point of this PR is to firmly establish cuda-pathfinder as a tool usable at build time, not just at runtime.

The build hooks in cuda_bindings and cuda_core now use cuda.pathfinder.get_cuda_path_or_home() to resolve CUDA_PATH/CUDA_HOME, replacing the previous inline os.environ.get() fallback.

Unfortunately, a helper function (_import_get_cuda_path_or_home()) is needed because PEP 517 isolated build environments shadow the cuda namespace package, preventing a normal import cuda.pathfinder. Discussion #1887 serves as the platform for advertising the helper function; its use in cuda_bindings and cuda_core essentially acts as a unit test ensuring the helper works as intended. Discussion #1887 will be kept up to date as the Python ecosystem evolves, and it is also the place for users to report problems.


Technical details

  • Pin cuda-pathfinder>=1.5 in build-system.requires for cuda_bindings and cuda_core.
  • Namespace fix for PEP 517 isolated builds: The _import_get_cuda_path_or_home() helper detects namespace shadowing and uses importlib.metadata to precisely locate the installed cuda-pathfinder, then fixes cuda.__path__ so the import succeeds.
  • Test updates: cuda_core/tests/test_build_hooks.py adds get_cuda_path_or_home.cache_clear() calls so monkeypatched env vars take effect.

rwgk added 2 commits March 24, 2026 17:00
Pin cuda-pathfinder>=1.5 in both build-system.requires and
project.dependencies.

Made-with: Cursor
The previous 1.3.4a0 was a stale value that could confuse someone.

Made-with: Cursor
@copy-pr-bot
Contributor

copy-pr-bot bot commented Mar 25, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@rwgk
Collaborator Author

rwgk commented Mar 25, 2026

/ok to test


@rwgk rwgk marked this pull request as ready for review March 25, 2026 03:44
@rwgk rwgk self-assigned this Mar 25, 2026
@rwgk rwgk added enhancement Any code-related improvements P1 Medium priority - Should do cuda.bindings Everything related to the cuda.bindings module cuda.core Everything related to the cuda.core module labels Mar 25, 2026
@rwgk rwgk added this to the cuda.core v0.7.0 milestone Mar 25, 2026
@rwgk rwgk added P1 Medium priority - Should do and removed P1 Medium priority - Should do labels Mar 25, 2026
@rwgk rwgk requested a review from cpcloud March 25, 2026 04:40
COMPILE_FOR_COVERAGE = bool(int(os.environ.get("CUDA_PYTHON_COVERAGE", "0")))


# Please keep in sync with the copy in cuda_bindings/build_hooks.py.
Collaborator

We should avoid duplicating implementations. Could we not hoist this into a separate file and import from both?

Collaborator Author

We should avoid duplicating implementations.

Agreed

Could we not hoist this into a separate file and import from both?

No 😭


In the words of Cursor Opus 4.6 1M Thinking:

There's no reasonable way to share code between the two build_hooks.py files:

  • Each is a PEP 517 build backend loaded via backend-path=["."] from its own package directory (cuda_bindings/ or cuda_core/). They can't import from the monorepo root or from each other.
  • The function exists because cuda.pathfinder can't be imported normally in this context — so it can't live inside cuda-pathfinder either.
  • A shared helper package in build-system.requires would be massive overkill for ~15 lines of stable code.

The only alternatives are worse: a new PyPI package just for this, importlib file-path hacks, or copying a shared file into both directories at CI time.

The "please keep in sync" comments are the pragmatic solution here.

Collaborator Author

We have other existing cases of "please keep in sync" that are impractical to avoid.

It just crossed my mind: a pre-commit check to enforce that they stay in sync is probably pretty easy to implement.
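A minimal version of such a check could just compare marked regions of the two files and fail on drift. Everything here (the marker comments, the file paths, the script itself) is a hypothetical sketch; the repository may organize this differently:

```python
#!/usr/bin/env python3
"""Pre-commit check: fail if duplicated keep-in-sync blocks have drifted apart.

Marker names and file paths are illustrative assumptions, not the actual layout.
"""
import sys
from pathlib import Path

FILES = ["cuda_bindings/build_hooks.py", "cuda_core/build_hooks.py"]
BEGIN, END = "# BEGIN keep-in-sync", "# END keep-in-sync"


def extract(path: str) -> str:
    # Return the text between the BEGIN and END markers (raises if absent).
    text = Path(path).read_text()
    return text.split(BEGIN, 1)[1].split(END, 1)[0]


def main() -> int:
    blocks = {extract(f) for f in FILES}
    if len(blocks) != 1:
        print("keep-in-sync blocks differ:", *FILES, file=sys.stderr)
        return 1
    return 0


if __name__ == "__main__":
    sys.exit(main())
```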

if os.path.isdir(os.path.join(sp_cuda, "pathfinder")):
    cuda.__path__ = list(cuda.__path__) + [sp_cuda]
    break
import cuda.pathfinder
Collaborator

Should we include some message explaining that we iterated through sys.path looking for pathfinder and couldn't find it? That seems to be important information that would be missing from just throwing we couldn't import pathfinder module.

Collaborator

for p in sys.path:
    sp_cuda = os.path.join(p, "cuda")
    if os.path.isdir(os.path.join(sp_cuda, "pathfinder")):
        cuda.__path__ = list(cuda.__path__) + [sp_cuda]
        break
else:
    raise ModuleNotFoundError(
        "cuda-pathfinder is not installed in the build environment. "
        "Ensure 'cuda-pathfinder>=1.5' is in build-system.requires."
    )
import cuda.pathfinder

Collaborator Author

Done: commit 3d422de

Contributor

@cpcloud cpcloud left a comment

The path hacking needs some more justification, with explicit examples of the common scenarios in which a plain import isn't going to work.

Comment on lines +51 to +55
for p in sys.path:
    sp_cuda = os.path.join(p, "cuda")
    if os.path.isdir(os.path.join(sp_cuda, "pathfinder")):
        cuda.__path__ = list(cuda.__path__) + [sp_cuda]
        break
Contributor

Path hacking of any sort like this is a code smell. At the very least the docstring is missing a concrete example of why this is needed.

isolated build environments

is doing a ton of work there. Isolated how? What is backend-path and where does that come from? Where are all these nouns like backend-path and build-env coming from?

In what real-world scenarios does this actually happen where there's not a viable alternative?

I'm really skeptical that path hacking like this is necessary.

Collaborator Author

@rwgk rwgk Mar 25, 2026

It's fully documented now here: #1824 (that in turn is pointing to #1803, where you can see the original development history for the workaround)

The comment here was updated:

commit 6892ae0

Contributor

@cpcloud cpcloud left a comment

Main concern: this workaround solves one specific PEP 517 shadowing scenario by mutating global namespace-package state (cuda.__path__). That is risky in a build backend because the rest of the process now sees a different import model depending on whether this helper has already run.

The two concrete issues I see are:

  1. except ModuleNotFoundError is broad enough to catch real import failures inside cuda.pathfinder, so a broken dependency/import in that package can be misdiagnosed as namespace shadowing.
  2. Replacing importlib's _NamespacePath with a plain list and appending the first cuda/pathfinder hit from sys.path makes the selected package "sticky" for the rest of the process and depends on path ordering heuristics.

I'm commenting on the cuda_bindings copy only to avoid duplication, but the same concerns apply to the mirrored helper in cuda_core.

"""
try:
    import cuda.pathfinder
except ModuleNotFoundError:
Contributor

@cpcloud cpcloud Mar 30, 2026

This except ModuleNotFoundError is too broad: it will also catch ModuleNotFoundErrors raised from inside cuda.pathfinder itself. If cuda.pathfinder is installed but one of its own imports is broken, we silently reinterpret that as the namespace-shadowing case and take the workaround path.

I'd narrow the fallback to the exact failure we expect here:

try:
    from cuda.pathfinder import get_cuda_path_or_home
    return get_cuda_path_or_home
except ModuleNotFoundError as exc:
    if exc.name != "cuda.pathfinder":
        raise

That preserves the real traceback for genuine cuda.pathfinder import bugs and keeps the workaround limited to the cuda.pathfinder-not-found case.

Collaborator Author

Thanks for catching this (no pun intended)!

Done: commit b779d0d

for p in sys.path:
    sp_cuda = os.path.join(p, "cuda")
    if os.path.isdir(os.path.join(sp_cuda, "pathfinder")):
        cuda.__path__ = list(cuda.__path__) + [sp_cuda]
Contributor

@cpcloud cpcloud Mar 30, 2026

Mutating cuda.__path__ here permanently changes namespace resolution for the rest of the backend process, and it selects whichever cuda/pathfinder directory happens to appear first on sys.path. That's pretty fragile.

If we keep this overall approach, I think the safer version is to resolve the installed cuda-pathfinder distribution explicitly and append only that exact cuda/ directory, instead of scanning sys.path:

from importlib import metadata
from pathlib import Path

# after narrowing the ModuleNotFoundError as above
import cuda

dist = metadata.distribution("cuda-pathfinder")
site_cuda = str(dist.locate_file(Path("cuda")))
cuda_paths = list(cuda.__path__)
if site_cuda not in cuda_paths:
    cuda.__path__ = cuda_paths + [site_cuda]

from cuda.pathfinder import get_cuda_path_or_home
return get_cuda_path_or_home

That still uses the namespace workaround, but it removes the sys.path guesswork.

If you'd rather avoid cuda.__path__ mutation entirely, another option is to load cuda/pathfinder/_utils/env_vars.py directly from the installed distribution under a private module name and call get_cuda_path_or_home() from there. That file is stdlib-only today, so it avoids rewriting namespace-package state.
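For concreteness, that file-loading alternative might look roughly like the sketch below. The module path cuda/pathfinder/_utils/env_vars.py is taken from the comment above; the private module name is an arbitrary assumption. Note that the CI diagnostics later in this thread show metadata.distribution("cuda-pathfinder") can resolve to the project's own in-tree egg-info, which would break this variant for the same reason it broke the metadata-based __path__ fix:

```python
import importlib.util
import sys
from importlib import metadata
from pathlib import Path


def _load_get_cuda_path_or_home_privately():
    # Locate the installed cuda-pathfinder distribution, then load one
    # stdlib-only file under a private module name, without mutating
    # cuda.__path__ or any other namespace-package state.
    dist = metadata.distribution("cuda-pathfinder")
    env_vars_py = str(
        dist.locate_file(Path("cuda") / "pathfinder" / "_utils" / "env_vars.py")
    )
    spec = importlib.util.spec_from_file_location(
        "_cuda_pathfinder_env_vars", env_vars_py
    )
    mod = importlib.util.module_from_spec(spec)
    sys.modules[spec.name] = mod
    spec.loader.exec_module(mod)
    return mod.get_cuda_path_or_home
```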

Collaborator Author

I think the safer version is to resolve the installed cuda-pathfinder distribution explicitly

Yes, much better, thanks!

Done: commit 6facaa5

another option is to load cuda/pathfinder/_utils/env_vars.py directly

The answer to this goes deeper, I rewrote the PR description to make this clearer. Quoting from there:

The main point of this PR is to firmly establish cuda-pathfinder as a tool usable at build time, not just at runtime.

We don't want to advertise a helper function that reaches into pathfinder internals.

Collaborator Author

Unfortunately it looks like we have to backtrack to the sys.path scan. Here is a Cursor writeup, based on information generated via commit 3cb0313:


We tried replacing the sys.path scan with importlib.metadata per your suggestion. It works locally but fails in CI (cibuildwheel / manylinux). Here's why.

Diagnostic output (linux-64, py3.12)

dist._path: /project/cuda_bindings/cuda_bindings.egg-info
dist._path.parent: /project/cuda_bindings
locate_file('cuda'): /project/cuda_bindings/cuda
locate_file exists: True
locate_file/pathfinder exists: False
cuda.__path__ (before): _NamespacePath(['/project/cuda_bindings/cuda'])
sys.path:
  /tmp/build-env-_bna8wjb/lib/python3.12/site-packages  ->  cuda/pathfinder exists: True
cuda.__path__ (after): _NamespacePath(['/project/cuda_bindings/cuda'])

What happens

  1. distribution("cuda-pathfinder") finds the project's own cuda_bindings.egg-info (created by the setuptools metadata step), not the cuda-pathfinder package installed in the build env under /tmp/build-env-…/lib/python3.12/site-packages/.

  2. So dist.locate_file(Path("cuda")) returns /project/cuda_bindings/cuda — the shadowed directory itself. Adding it to cuda.__path__ is a no-op.

  3. Separately, cuda.__path__ remained a _NamespacePath after our plain-list assignment. The _NamespacePath descriptor recalculates on access, discarding the mutation. (The old working code assigned cuda.__path__ = list(cuda.__path__) + [sp_cuda] without a preceding import cuda that could re-trigger recalculation.)

Conclusion

importlib.metadata is unreliable here because the in-tree egg-info shadows the installed distribution. The sys.path scan is the correct approach — it directly verifies that cuda/pathfinder/ exists on disk and is not fooled by metadata from the wrong package.

CI run with diagnostics: https://github.com/NVIDIA/cuda-python/actions/runs/23809723643

Collaborator Author

Another Cursor writeup, based on the diagnostic results:


Regarding the two concerns about the sys.path scan:

"First hit wins" ordering: In this context there is exactly one cuda-pathfinder installed in the build environment — pip deduplicates. So there is exactly one sys.path entry with cuda/pathfinder/ on disk. "First hit" finds the only hit.

Sticky cuda.__path__ mutation: The build backend is an ephemeral subprocess that builds one wheel and exits. The mutation corrects an already-wrong _NamespacePath (which only points to the project's cuda/ directory) to what it should have been (both the project's and site-packages' cuda/ directories). "Permanent" means the remaining ~seconds of that subprocess's lifetime.

And as shown in the diagnostic run, importlib.metadata cannot be used as an alternative because the in-tree cuda_bindings.egg-info shadows the installed cuda-pathfinder distribution — distribution("cuda-pathfinder") resolves to the project's own metadata, not the build-env's site-packages copy.

@rwgk rwgk marked this pull request as draft March 31, 2026 16:45
@rwgk
Collaborator Author

rwgk commented Mar 31, 2026

Converting back to draft, until I figure out why all CI builds failed.

@rwgk
Collaborator Author

rwgk commented Mar 31, 2026

/ok to test

@rwgk
Collaborator Author

rwgk commented Mar 31, 2026

/ok to test

@rwgk rwgk marked this pull request as ready for review March 31, 2026 19:50
@rwgk
Collaborator Author

rwgk commented Mar 31, 2026

@cpcloud this is ready for review again.

Member

@leofang leofang left a comment

FWIW, this PR would unblock #692. But we don't need to do everything at once.

@leofang
Member

leofang commented Apr 7, 2026

Discussed offline, let's push this to the next milestone.

rwgk added 3 commits April 8, 2026 15:24
Use exclusion-style constraints in cuda_bindings, cuda_core, and cuda_pathfinder so CI avoids cuda-toolkit 12.9.2 and 13.0.3 while still allowing newer good patch releases.

Made-with: Cursor
@rwgk
Collaborator Author

rwgk commented Apr 9, 2026

Cursor GPT-5.4 Extra High Fast

First attempt for commit 937ff7f

https://github.com/NVIDIA/cuda-python/actions/runs/24172970743/attempts/1?pr=1817


I analyzed all five failing job logs. Check job status is just the aggregate failure gate; the five real failures break down into three classes.

  • Test win-64 / py3.12, 13.0.2, local, a100 (TCC), Test win-64 / py3.12, 13.2.0, local, a100 (TCC), and Test win-64 / py3.14t, 13.2.0, wheels, a100 (MCDM) all failed before checkout. Each log shows repeated failures downloading actions/checkout from api.github.com, followed by a final timeout on api.github.com:443. No repo code ran in those jobs, so these are Windows runner/network infra failures.

  • Test linux-64 / py3.10, 12.9.1, local, v100 also failed before tests. This job takes the backport-artifact path and first tries to install gh from cli.github.com. In this run, the wget for the GitHub CLI keyring hit Network is unreachable, so gh never got installed, and the next command failed with gh: command not found. In the last successful main run, the same job completed that exact step and installed gh 2.89.0, so this is another CI/network issue, not a code regression.

  • Test linux-64 / py3.12, 13.2.0, wheels, rtx4090, wsl is the only one that looks like a real job-content failure, but it is still not in the area we changed. The tests passed through the bindings cython phase, then the job failed in the benchmark step python run_pyperf.py --fast --loops 1 --min-time 0. pyperf aborted with ValueError: benchmark function returned zero. There is only one bindings benchmark in that step, bindings.pointer_attributes.pointer_get_attribute, and it just times a very short loop around one cuPointerGetAttribute call:

def bench_pointer_get_attribute(loops: int) -> float:
    # Local references to avoid global lookups in the hot loop
    _cuPointerGetAttribute = cuda.cuPointerGetAttribute
    _attr = ATTRIBUTE
    _ptr = PTR

    t0 = time.perf_counter()
    for _ in range(loops):
        _cuPointerGetAttribute(_attr, _ptr)
    return time.perf_counter() - t0

That makes a zero-duration reading plausible as a timing flake, especially with --loops 1 --min-time 0. The last successful main run on the same WSL lane passed this exact benchmark step and reported about 15.8 us, so this does not look like a new dependency break.

Bottom line: the cuda-toolkit 12.9.2 / 13.0.3 pathfinder breakage appears fixed. The remaining failures are three Windows infra timeouts, one Linux infra/network failure while installing gh, and one likely flaky WSL benchmark.

Natural next steps are:

  • rerun the three Windows jobs and the Linux py3.10 / 12.9.1 / local / v100 job
  • treat the WSL benchmark separately, since it may need either a retry or a less fragile pyperf configuration
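One way to make that benchmark less fragile is to batch many calls per timed loop, so time.perf_counter() never reads zero even with --loops 1 --min-time 0. A sketch with a hypothetical stand-in for the real cuPointerGetAttribute binding (the INNER batch size and the stub are assumptions, not the repository's code):

```python
import time

# Hypothetical stand-ins for the real bindings objects in run_pyperf.py.
def _cuPointerGetAttribute(attr, ptr):
    return attr, ptr

_attr, _ptr = 0, 0

INNER = 1000  # batch size: enough work per loop that the timer never reads zero


def bench_pointer_get_attribute(loops: int) -> float:
    t0 = time.perf_counter()
    for _ in range(loops):
        for _ in range(INNER):
            _cuPointerGetAttribute(_attr, _ptr)
    # Dividing by INNER keeps the reported scale comparable to the original
    # single-call-per-loop version while guaranteeing a non-zero reading.
    return (time.perf_counter() - t0) / INNER
```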

@rwgk
Collaborator Author

rwgk commented Apr 9, 2026

Retrying after seeing that cuda-toolkit 12.9.2 and 13.0.3 were yanked on PyPI:

https://pypi.org/project/cuda-toolkit/#history

@rwgk rwgk marked this pull request as ready for review April 9, 2026 17:09
@rwgk
Collaborator Author

rwgk commented Apr 9, 2026

@mdboom I added you as a reviewer because I'm interested in your opinion: the awkward helper function (#1824) is needed because cuda is a namespace package. It seems to me, the helper is ultimately a workaround for a bug in pip. Is that a fair way to look at the problem? Should we raise this as a formal pip issue or similar?

@leofang leofang merged commit f983b9d into NVIDIA:main Apr 9, 2026
91 checks passed
@github-actions

github-actions bot commented Apr 9, 2026

Doc Preview CI
Preview removed because the pull request was closed or merged.

@mdboom
Contributor

mdboom commented Apr 9, 2026

Having looked at this, I think we are being bitten (in part) by not having a src layout.

As far as I can tell, pip is indeed not isolating us from the cuda subdirectory of cuda_bindings during the build. This is not happening on purpose; it comes from inheriting Python's default behavior. It also seems that uv pip install does not have this problem.

We would have a much easier time of it by moving to a src layout for all of the subprojects. I'm having my bot work on that refactor so we can test it. It's not something to do lightly, since it may make merging some of our open PRs more difficult.
