Commit aa35a36

Merge remote-tracking branch 'origin/main' into explicit-graph-construction
Made-with: Cursor

# Conflicts:
#	cuda_pathfinder/cuda/pathfinder/_dynamic_libs/descriptor_catalog.py
2 parents b55782a + ec99964 commit aa35a36

100 files changed: +5116 -492 lines changed

.pre-commit-config.yaml

Lines changed: 8 additions & 2 deletions
@@ -30,7 +30,7 @@ repos:
         language: python
         additional_dependencies:
           - https://files.pythonhosted.org/packages/cc/20/ff623b09d963f88bfde16306a54e12ee5ea43e9b597108672ff3a408aad6/pathspec-0.12.1-py3-none-any.whl
-        exclude: '(.*pixi\.lock)|(\.git_archival\.txt)'
+        exclude: '(.*pixi\.lock)|(\.git_archival\.txt)|(.*\.patch$)'
         args: ["--fix"]
 
       - id: no-markdown-in-docs-source
@@ -59,7 +59,13 @@ repos:
         exclude: &gen_exclude '^(?:cuda_python/README\.md|cuda_bindings/cuda/bindings/.*\.in?|cuda_bindings/docs/source/module/.*\.rst?)$'
       - id: mixed-line-ending
       - id: trailing-whitespace
-        exclude: *gen_exclude
+        exclude: |
+          (?x)^(?:
+            cuda_python/README\.md|
+            cuda_bindings/cuda/bindings/.*\.in?|
+            cuda_bindings/docs/source/module/.*\.rst?|
+            .*\.patch$
+          )$
 
   # Checking for common mistakes
   - repo: https://github.com/pre-commit/pygrep-hooks
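
The second hunk replaces the shared `*gen_exclude` alias with a multi-line verbose regex for the `trailing-whitespace` hook. As a rough way to sanity-check such a pattern outside pre-commit, the sketch below compiles the same text with Python's `re` module and tries it against a few paths. The helper name `is_excluded` and the sample file names are hypothetical and used only for illustration; the pattern itself is copied from the diff.

```python
import re

# Pattern text copied from the new `exclude:` block scalar. The leading (?x)
# switches on verbose mode, so the newlines and indentation are ignored.
TRAILING_WS_EXCLUDE = re.compile(
    r"""(?x)^(?:
      cuda_python/README\.md|
      cuda_bindings/cuda/bindings/.*\.in?|
      cuda_bindings/docs/source/module/.*\.rst?|
      .*\.patch$
    )$"""
)


def is_excluded(path: str) -> bool:
    """Return True if `path` matches the exclude pattern (hypothetical helper)."""
    return TRAILING_WS_EXCLUDE.search(path) is not None


# Hypothetical paths, chosen only to exercise the alternatives.
for path in [
    "cuda_python/README.md",
    "cuda_bindings/cuda/bindings/driver.pyx.in",
    "toolshed/example.patch",
    "cuda_core/cuda/core/_graph.py",
]:
    print(f"{path}: {'excluded' if is_excluded(path) else 'checked'}")
```

Only the last path falls through to the hook. Any file ending in `.patch` is skipped regardless of directory.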

AGENTS.md

Lines changed: 263 additions & 77 deletions
Large diffs are not rendered by default.

CLAUDE.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+AGENTS.md

ci/versions.yml

Lines changed: 1 addition & 1 deletion
@@ -5,6 +5,6 @@ backport_branch: "12.9.x" # keep in sync with target-branch in .github/dependab
 
 cuda:
   build:
-    version: "13.1.1"
+    version: "13.2.0"
   prev_build:
     version: "12.9.1"

cuda_bindings/AGENTS.md

Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
+This file describes `cuda_bindings`, the low-level CUDA host API bindings
+subpackage in the `cuda-python` monorepo.
+
+## Scope and principles
+
+- **Role**: provide low-level, close-to-CUDA interfaces under
+  `cuda.bindings.*` with broad API coverage.
+- **Style**: prioritize correctness and API compatibility over convenience
+  wrappers. High-level ergonomics belong in `cuda_core`, not here.
+- **Cross-platform**: preserve Linux and Windows behavior unless a change is
+  intentionally platform-specific.
+
+## Package architecture
+
+- **Public module layer**: Cython modules under `cuda/bindings/` expose user
+  APIs (`driver`, `runtime`, `nvrtc`, `nvjitlink`, `nvvm`, `cufile`, etc.).
+- **Internal binding layer**: `cuda/bindings/_bindings/` provides lower-level
+  glue and loader helpers used by public modules.
+- **Platform internals**: `cuda/bindings/_internal/` contains
+  platform-specific implementation files and support code.
+- **Build/codegen backend**: `build_hooks.py` drives header parsing, template
+  expansion, extension configuration, and Cythonization.
+
+## Generated-source workflow
+
+- **Do not hand-edit generated binding files**: many files under
+  `cuda/bindings/` (including `*.pyx`, `*.pxd`, `*.pyx.in`, and `*.pxd.in`)
+  are generated artifacts.
+- **Generated files are synchronized from another repository**: changes to these
+  files in this repo are expected to be overwritten by the next sync.
+- **If generated output must change**: make the change at the generation source
+  and sync the updated artifacts back here, rather than patching generated files
+  directly in this repo.
+- **Header-driven generation**: parser behavior and required CUDA headers are
+  defined in `build_hooks.py`; update those rules when introducing new symbols.
+- **Platform split files**: keep `_linux.pyx` and `_windows.pyx` variants
+  aligned when behavior should be equivalent.
+
+## Testing expectations
+
+- **Primary tests**: `pytest tests/`
+- **Cython tests**:
+  - build: `tests/cython/build_tests.sh` (or platform equivalent)
+  - run: `pytest tests/cython/`
+- **Examples**: example coverage is pytest-based under `examples/`.
+- **Benchmarks**: run with `pytest --benchmark-only benchmarks/` when needed.
+- **Orchestrated run**: from repo root, `scripts/run_tests.sh bindings`.
+
+## Build and environment notes
+
+- `CUDA_HOME` or `CUDA_PATH` must point to a valid CUDA Toolkit for source
+  builds that parse headers.
+- `CUDA_PYTHON_PARALLEL_LEVEL` controls build parallelism.
+- `CUDA_PYTHON_PARSER_CACHING` controls parser-cache behavior during generation.
+- Runtime behavior is affected by
+  `CUDA_PYTHON_CUDA_PER_THREAD_DEFAULT_STREAM` and
+  `CUDA_PYTHON_DISABLE_MAJOR_VERSION_WARNING`.
+
+## Editing guidance
+
+- Keep CUDA return/error semantics explicit and avoid broad fallback behavior.
+- Reuse existing helper layers (`_bindings`, `_internal`, `_lib`) before adding
+  new one-off utilities.
+- If you add or change exported APIs, update relevant docs under
+  `docs/source/module/` and tests in `tests/`.
+- Prefer changes that are easy to regenerate/rebuild rather than patching
+  generated output directly.
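
The build notes in the file above say that `CUDA_HOME` or `CUDA_PATH` must point to a valid CUDA Toolkit for header-parsing source builds. A pre-flight check along those lines might look like the sketch below. It is not taken from `build_hooks.py`: the helper name `find_cuda_toolkit_root`, the lookup order (`CUDA_HOME` before `CUDA_PATH`), and the use of `include/cuda.h` as the validity test are all assumptions made for illustration.

```python
import os
from pathlib import Path


def find_cuda_toolkit_root(env=None):
    """Return the first CUDA_HOME/CUDA_PATH value that contains include/cuda.h.

    Hypothetical helper: the precedence and the cuda.h probe are assumptions,
    not the behavior of the real build backend.
    """
    env = os.environ if env is None else env
    for var in ("CUDA_HOME", "CUDA_PATH"):
        value = env.get(var)
        if not value:
            continue  # variable unset or empty, try the next one
        root = Path(value)
        if (root / "include" / "cuda.h").is_file():
            return root
    return None


if __name__ == "__main__":
    root = find_cuda_toolkit_root()
    if root is None:
        print("no usable CUDA Toolkit found via CUDA_HOME or CUDA_PATH")
    else:
        print(f"using CUDA Toolkit at {root}")
```

Passing an explicit mapping instead of relying on `os.environ` keeps the check easy to exercise in tests without touching the real environment.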

cuda_bindings/CLAUDE.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+AGENTS.md

cuda_bindings/cuda/bindings/_bindings/cydriver.pxd.in

Lines changed: 61 additions & 1 deletion
@@ -1,7 +1,7 @@
 # SPDX-FileCopyrightText: Copyright (c) 2021-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 # SPDX-License-Identifier: LicenseRef-NVIDIA-SOFTWARE-LICENSE
 
-# This code was automatically generated with version 13.1.0, generator version 49a8141. Do not modify it directly.
+# This code was automatically generated with version 13.2.0, generator version fa58871. Do not modify it directly.
 from cuda.bindings.cydriver cimport *
 
 {{if 'cuGetErrorString' in found_functions}}
@@ -439,6 +439,11 @@ cdef CUresult _cuKernelGetName(const char** name, CUkernel hfunc) except ?CUDA_E
 cdef CUresult _cuKernelGetParamInfo(CUkernel kernel, size_t paramIndex, size_t* paramOffset, size_t* paramSize) except ?CUDA_ERROR_NOT_FOUND nogil
 {{endif}}
 
+{{if 'cuKernelGetParamCount' in found_functions}}
+
+cdef CUresult _cuKernelGetParamCount(CUkernel kernel, size_t* paramCount) except ?CUDA_ERROR_NOT_FOUND nogil
+{{endif}}
+
 {{if 'cuMemGetInfo_v2' in found_functions}}
 
 cdef CUresult _cuMemGetInfo_v2(size_t* free, size_t* total) except ?CUDA_ERROR_NOT_FOUND nogil
@@ -679,6 +684,16 @@ cdef CUresult _cuMemcpyBatchAsync_v2(CUdeviceptr* dsts, CUdeviceptr* srcs, size_
 cdef CUresult _cuMemcpy3DBatchAsync_v2(size_t numOps, CUDA_MEMCPY3D_BATCH_OP* opList, unsigned long long flags, CUstream hStream) except ?CUDA_ERROR_NOT_FOUND nogil
 {{endif}}
 
+{{if 'cuMemcpyWithAttributesAsync' in found_functions}}
+
+cdef CUresult _cuMemcpyWithAttributesAsync(CUdeviceptr dst, CUdeviceptr src, size_t size, CUmemcpyAttributes* attr, CUstream hStream) except ?CUDA_ERROR_NOT_FOUND nogil
+{{endif}}
+
+{{if 'cuMemcpy3DWithAttributesAsync' in found_functions}}
+
+cdef CUresult _cuMemcpy3DWithAttributesAsync(CUDA_MEMCPY3D_BATCH_OP* op, unsigned long long flags, CUstream hStream) except ?CUDA_ERROR_NOT_FOUND nogil
+{{endif}}
+
 {{if 'cuMemsetD8_v2' in found_functions}}
 
 cdef CUresult _cuMemsetD8_v2(CUdeviceptr dstDevice, unsigned char uc, size_t N) except ?CUDA_ERROR_NOT_FOUND nogil
@@ -1069,6 +1084,16 @@ cdef CUresult _cuStreamCreate(CUstream* phStream, unsigned int Flags) except ?CU
 cdef CUresult _cuStreamCreateWithPriority(CUstream* phStream, unsigned int flags, int priority) except ?CUDA_ERROR_NOT_FOUND nogil
 {{endif}}
 
+{{if 'cuStreamBeginCaptureToCig' in found_functions}}
+
+cdef CUresult _cuStreamBeginCaptureToCig(CUstream hStream, CUstreamCigCaptureParams* streamCigCaptureParams) except ?CUDA_ERROR_NOT_FOUND nogil
+{{endif}}
+
+{{if 'cuStreamEndCaptureToCig' in found_functions}}
+
+cdef CUresult _cuStreamEndCaptureToCig(CUstream hStream) except ?CUDA_ERROR_NOT_FOUND nogil
+{{endif}}
+
 {{if 'cuStreamGetPriority' in found_functions}}
 
 cdef CUresult _cuStreamGetPriority(CUstream hStream, int* priority) except ?CUDA_ERROR_NOT_FOUND nogil
@@ -1309,6 +1334,11 @@ cdef CUresult _cuFuncGetName(const char** name, CUfunction hfunc) except ?CUDA_E
 cdef CUresult _cuFuncGetParamInfo(CUfunction func, size_t paramIndex, size_t* paramOffset, size_t* paramSize) except ?CUDA_ERROR_NOT_FOUND nogil
 {{endif}}
 
+{{if 'cuFuncGetParamCount' in found_functions}}
+
+cdef CUresult _cuFuncGetParamCount(CUfunction func, size_t* paramCount) except ?CUDA_ERROR_NOT_FOUND nogil
+{{endif}}
+
 {{if 'cuFuncIsLoaded' in found_functions}}
 
 cdef CUresult _cuFuncIsLoaded(CUfunctionLoadingState* state, CUfunction function) except ?CUDA_ERROR_NOT_FOUND nogil
@@ -1344,6 +1374,11 @@ cdef CUresult _cuLaunchCooperativeKernelMultiDevice(CUDA_LAUNCH_PARAMS* launchPa
 cdef CUresult _cuLaunchHostFunc(CUstream hStream, CUhostFn fn, void* userData) except ?CUDA_ERROR_NOT_FOUND nogil
 {{endif}}
 
+{{if 'cuLaunchHostFunc_v2' in found_functions}}
+
+cdef CUresult _cuLaunchHostFunc_v2(CUstream hStream, CUhostFn fn, void* userData, unsigned int syncMode) except ?CUDA_ERROR_NOT_FOUND nogil
+{{endif}}
+
 {{if 'cuFuncSetBlockShape' in found_functions}}
 
 cdef CUresult _cuFuncSetBlockShape(CUfunction hfunc, int x, int y, int z) except ?CUDA_ERROR_NOT_FOUND nogil
@@ -1824,6 +1859,11 @@ cdef CUresult _cuGraphAddNode_v2(CUgraphNode* phGraphNode, CUgraph hGraph, const
 cdef CUresult _cuGraphNodeSetParams(CUgraphNode hNode, CUgraphNodeParams* nodeParams) except ?CUDA_ERROR_NOT_FOUND nogil
 {{endif}}
 
+{{if 'cuGraphNodeGetParams' in found_functions}}
+
+cdef CUresult _cuGraphNodeGetParams(CUgraphNode hNode, CUgraphNodeParams* nodeParams) except ?CUDA_ERROR_NOT_FOUND nogil
+{{endif}}
+
 {{if 'cuGraphExecNodeSetParams' in found_functions}}
 
 cdef CUresult _cuGraphExecNodeSetParams(CUgraphExec hGraphExec, CUgraphNode hNode, CUgraphNodeParams* nodeParams) except ?CUDA_ERROR_NOT_FOUND nogil
@@ -2159,6 +2199,26 @@ cdef CUresult _cuCoredumpSetAttribute(CUcoredumpSettings attrib, void* value, si
 cdef CUresult _cuCoredumpSetAttributeGlobal(CUcoredumpSettings attrib, void* value, size_t* size) except ?CUDA_ERROR_NOT_FOUND nogil
 {{endif}}
 
+{{if 'cuCoredumpRegisterStartCallback' in found_functions}}
+
+cdef CUresult _cuCoredumpRegisterStartCallback(CUcoredumpStatusCallback callback, void* userData, CUcoredumpCallbackHandle* callbackOut) except ?CUDA_ERROR_NOT_FOUND nogil
+{{endif}}
+
+{{if 'cuCoredumpRegisterCompleteCallback' in found_functions}}
+
+cdef CUresult _cuCoredumpRegisterCompleteCallback(CUcoredumpStatusCallback callback, void* userData, CUcoredumpCallbackHandle* callbackOut) except ?CUDA_ERROR_NOT_FOUND nogil
+{{endif}}
+
+{{if 'cuCoredumpDeregisterStartCallback' in found_functions}}
+
+cdef CUresult _cuCoredumpDeregisterStartCallback(CUcoredumpCallbackHandle callback) except ?CUDA_ERROR_NOT_FOUND nogil
+{{endif}}
+
+{{if 'cuCoredumpDeregisterCompleteCallback' in found_functions}}
+
+cdef CUresult _cuCoredumpDeregisterCompleteCallback(CUcoredumpCallbackHandle callback) except ?CUDA_ERROR_NOT_FOUND nogil
+{{endif}}
+
 {{if 'cuGetExportTable' in found_functions}}
 
 cdef CUresult _cuGetExportTable(const void** ppExportTable, const CUuuid* pExportTableId) except ?CUDA_ERROR_NOT_FOUND nogil
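
Every declaration added to `cydriver.pxd.in` is wrapped in a `{{if '<symbol>' in found_functions}} ... {{endif}}` guard, so a wrapper is emitted only when the symbol was found in the parsed headers. To illustrate the effect of those guards, here is a deliberately small stand-in expander. It is not the template engine the real build uses: `expand` and `GUARD` are hypothetical names, and the regex handles only flat, non-nested guards of exactly this shape.

```python
import re

# Matches one non-nested guard of the form seen in cydriver.pxd.in.
GUARD = re.compile(
    r"\{\{if '(?P<name>\w+)' in found_functions\}\}\n"
    r"(?P<body>.*?)"
    r"\{\{endif\}\}\n?",
    re.DOTALL,
)


def expand(template: str, found_functions: set) -> str:
    """Keep each guarded body only when its symbol is in found_functions."""

    def keep_or_drop(match):
        return match.group("body") if match.group("name") in found_functions else ""

    return GUARD.sub(keep_or_drop, template)


# Two guarded declarations copied from the diff above.
TEMPLATE = """\
{{if 'cuFuncGetParamInfo' in found_functions}}

cdef CUresult _cuFuncGetParamInfo(CUfunction func, size_t paramIndex, size_t* paramOffset, size_t* paramSize) except ?CUDA_ERROR_NOT_FOUND nogil
{{endif}}

{{if 'cuFuncGetParamCount' in found_functions}}

cdef CUresult _cuFuncGetParamCount(CUfunction func, size_t* paramCount) except ?CUDA_ERROR_NOT_FOUND nogil
{{endif}}
"""

# Illustrative symbol sets: pretend one set of headers lacks the new function.
without_new_symbol = {"cuFuncGetParamInfo"}
with_new_symbol = without_new_symbol | {"cuFuncGetParamCount"}

print("_cuFuncGetParamCount" in expand(TEMPLATE, without_new_symbol))
print("_cuFuncGetParamCount" in expand(TEMPLATE, with_new_symbol))
```

With the smaller symbol set the `_cuFuncGetParamCount` declaration is dropped from the output, while `_cuFuncGetParamInfo` survives in both cases. That is why new declarations can be added to the template without breaking builds against headers that do not provide them.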
