Releases: NVIDIA/cuda-python

CUDA Python 12.9.1

07 Aug 03:52
v12.9.1
d1a166a

cuda-pathfinder v1.1.0

07 Aug 01:46
cuda-pathfinder-v1.1.0
382f49b

  • CTK 13.0.0 compatibility
  • Bug fix: load libnvJitLink.so.12 from conda instead of /usr/local/cuda (PR #767)

cuda-pathfinder v1.0.0

16 Jul 21:54
ed12c83

First release of cuda-pathfinder as a stand-alone module.

cuda.pathfinder replaces cuda.bindings.path_finder, which was released with cuda-bindings 12.9.0 and is now deprecated.

Note that cuda-pathfinder is a noarch package and has no dependencies (other than a Python 3.9+ interpreter).

Please see cuda/pathfinder/README for more information.
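
As a rough illustration of how the stand-alone module might be used, the sketch below tries to locate and load nvJitLink through cuda.pathfinder and degrades gracefully when the package or the library is missing. The helper name `load_nvjitlink` is hypothetical, and the sketch assumes that `load_nvidia_dynamic_lib` accepts the short library name "nvJitLink" and returns an object exposing `abs_path`; check the README for the authoritative API.

```python
def load_nvjitlink():
    """Return the loaded-library record for nvJitLink, or None if unavailable.

    Hypothetical helper: the import is done lazily so that this module can
    still be imported on machines where cuda-pathfinder is not installed.
    """
    try:
        from cuda.pathfinder import load_nvidia_dynamic_lib
    except ImportError:
        # cuda-pathfinder is not installed in this environment.
        return None
    try:
        # Assumption: "nvJitLink" is one of the supported short library names.
        return load_nvidia_dynamic_lib("nvJitLink")
    except Exception:
        # The library could not be found or loaded on this machine.
        return None


if __name__ == "__main__":
    loaded = load_nvjitlink()
    if loaded is None:
        print("nvJitLink not available")
    else:
        print("nvJitLink loaded from", loaded.abs_path)
```

Code that still imports cuda.bindings.path_finder would, under the same assumptions, switch its import to cuda.pathfinder.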

cuda.core v0.3.1

02 Jul 20:05
a8550cf

cuda.core v0.3.1 release announcement

Release note

All functionality is currently hosted under the cuda.core.experimental namespace. Once features become stable, they will be moved out of experimental.

Documentation

Sample codes

What's Changed

  • Bump github/codeql-action from 3.28.19 to 3.29.0 by @dependabot in #710
  • Fix Windows build CI by @leofang in #713
  • Bump pypa/cibuildwheel from 2.23.3 to 3.0.0 by @dependabot in #711
  • Ensure correct handling of buffers allocated with LegacyPinnedMemoryResource.allocate as kernel parameters by @shwina in #717
  • Fix nvbugpro 5348750 by @oleksandr-pavlyk in #725
  • Add a "Getting Started" page to the documentation by @shwina in #720
  • Bump korthout/backport-action from 3.2.0 to 3.2.1 by @dependabot in #738
  • Bump github/codeql-action from 3.29.0 to 3.29.2 by @dependabot in #737
  • cuda_core/tests/test_event.py::test_timing_success WSL compatibility by @rwgk in #740
  • Restore option to run testing without cupy installed. by @rwgk in #741
  • Cythonize away some perf hot spots by @leofang in #709
  • cuda_core forward compatibility changes. by @rwgk in #722
  • Update docs for v0.3.1 release by @leofang in #695

Full Changelog: cuda-core-v0.3.0...cuda-core-v0.3.1

cuda.core v0.3.0

12 Jun 16:35
a13a917

cuda.core v0.3.0 release announcement

Release note

All functionality is currently hosted under the cuda.core.experimental namespace. Once features become stable, they will be moved out of experimental.

Documentation

Sample codes

What's Changed

New Contributors

Full Changelog: cuda-core-v0.2.0...cuda-core-v0.3.0

CUDA Python 12.9.0

06 May 19:15
34ef825

CUDA Python 11.8.7

06 May 19:48
e290d52

cuda.core v0.2.0

17 Mar 20:44
111c713

cuda.core v0.2.0 release announcement

Release note

All functionality is currently hosted under the cuda.core.experimental namespace. Once features become stable, they will be moved out of experimental.

Key Features and Enhancements

  • Add ProgramOptions to facilitate the passing of runtime compile options to Program.
  • Add pythonic access to Device and Kernel attributes.

For full details please refer to the release note above.
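
The two features listed above can be sketched roughly as follows. This is a minimal illustration, not the project's own sample: the kernel source, `arch_flag`, and `compile_fill_kernel` are hypothetical names introduced here, and the GPU-dependent part assumes a CUDA-capable device with cuda.core installed.

```python
# Illustrative kernel source; not taken from the release notes.
KERNEL_SOURCE = r"""
extern "C" __global__ void fill_ones(float* out, size_t n) {
    size_t i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 1.0f;
}
"""


def arch_flag(compute_capability):
    """Turn a (major, minor) pair such as (9, 0) into an 'sm_90' style flag."""
    major, minor = compute_capability
    return f"sm_{major}{minor}"


def compile_fill_kernel():
    """Compile KERNEL_SOURCE for the current device (requires a GPU)."""
    # Imported lazily so the module loads on machines without cuda.core.
    from cuda.core.experimental import Device, Program, ProgramOptions

    dev = Device()
    dev.set_current()

    # Pythonic attribute access on the device.
    print("compute capability:", dev.compute_capability)
    print("SM count:", dev.properties.multiprocessor_count)

    # Compile options are collected in a ProgramOptions instance and
    # handed to the Program constructor.
    options = ProgramOptions(std="c++17", arch=arch_flag(dev.compute_capability))
    program = Program(KERNEL_SOURCE, code_type="c++", options=options)
    module = program.compile("cubin")
    return module.get_kernel("fill_ones")
```

Only `arch_flag` runs without a GPU; `compile_fill_kernel` is meant to be called on a machine where a device is present.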

Breaking Changes

  • The stream attribute is removed from LaunchConfig. Instead, the Stream object should now be directly passed to launch as an argument.
  • The signature of launch has changed: the positional arguments are reordered, and the new signature is (stream, config, kernel, *kernel_args).
  • Change __cuda_stream__ from attribute to method.
  • The Program.compile method no longer accepts the options argument. Instead, you can optionally pass an instance of ProgramOptions to the constructor of Program.
  • Device.properties now provides attribute getters instead of a dictionary interface.
  • The .handle attribute of various cuda.core objects now returns the underlying Python object instead of a (type-erased) Python integer.
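
A minimal sketch of what code written against the changes above might look like. `ForeignStream` and `run_kernel` are hypothetical names introduced for illustration; the sketch also assumes that the `__cuda_stream__` method returns a `(version, address)` pair with version 0, and the launch part requires a GPU with cuda.core installed.

```python
class ForeignStream:
    """Hypothetical wrapper around a stream handle owned by another library."""

    def __init__(self, handle: int):
        self._handle = handle

    def __cuda_stream__(self):
        # Now a method rather than an attribute. Assumed return value:
        # (protocol version, stream address as a Python int).
        return (0, self._handle)


def run_kernel(kernel, n_threads, *kernel_args):
    """Launch `kernel` on a fresh stream using the new argument order."""
    # Imported lazily so the module loads on machines without cuda.core.
    from cuda.core.experimental import Device, LaunchConfig, launch

    dev = Device()
    dev.set_current()
    stream = dev.create_stream()

    # LaunchConfig no longer carries a stream; it only describes the launch shape.
    config = LaunchConfig(grid=1, block=n_threads)

    # New positional order: stream first, then config, kernel, and kernel arguments.
    launch(stream, config, kernel, *kernel_args)
    stream.sync()
```

Callers that previously set the stream on LaunchConfig would, under this sketch, pass it as the first argument to launch instead.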

New examples

  • jit_lto_fractal.py — Demonstrates just-in-time link-time optimization for fractal generation. (Device, LaunchConfig, Linker, LinkerOptions, Program, ProgramOptions) (#475)
  • simple_multi_gpu_example.py — Example of using multiple GPUs. (Device, Program, LaunchConfig) (#304)
  • show_device_properties.py — Displays detailed device properties. (Device) (#474)

Documentation

Sample codes

Test fixes

  • Clean up device initialization in some tests. (#507)

What's Changed

  • Add back CuPy as an optional test dependency + Fix an example bug by @leofang in #334
  • add warning to the nvjitlink ctor when falling back to cuLink by @ksimpson-work in #315
  • Set up a preliminary doc build/publish pipeline by @leofang in #325
  • Fix doc ci permissions by @leofang in #338
  • Add conda installation instructions by @leofang in #321
  • Add a dummy email address to the doc bot by @leofang in #343
  • multi gpu example by @amwi04 in #304
  • Change __cuda_stream__ from attribute to method by @NaderAlAwar in #389
  • Ensure deprecation warnings from cuda.bindings are swallowed by @ksimpson-work in #404
  • Add the options data class to program by @ksimpson-work in #237
  • Update to use the new NVKS runners by @leofang in #426
  • Disable notebook execution for cuda.bindings by @vzhurba01 in #424
  • Stop tracking cached files and Jupyter notebook hashes in doc builds by @vzhurba01 in #425
  • Documentation remove gpu dependency by @ksimpson-work in #398
  • handle CTK version specific options in the linker test by @ksimpson-work in #371
  • support ptx code type for program by @ksimpson-work in #317
  • Update release checklist to focus on subpackages by @vzhurba01 in #427
  • Kernel attributes by @ksimpson-work in #360
  • Use cpu runner to build docs by @leofang in #434
  • device properties by @ksimpson-work in #409
  • Update linker options sequence handling by @ksimpson-work in #436
  • Expose ObjectCode as public API + prune unnecessary input arguments by @ksimpson-work in #435
  • Pin Sphinx <8.2.0 to fix doc build by @leofang in #456
  • CI: Add Windows GPU runner for tests by @leofang in #444
  • Improve program checks by @ksimpson-work in #394
  • Various handle-related changes and improvements by @leofang in #463
  • Improve perf of accessing dev.compute_capability by @leofang in #459
  • Device properties example by @samaid in #474
  • Add ObjectCode ptx constructor by @brandon-b-miller in #470
  • Add a Linker example by @vzhurba01 in #475
  • Add error log producing test by @ksimpson-work in #423
  • Apply __new__ approach to disabling __init__ by @rwgk in #484
  • switch the launch argument order by @ksimpson-work in #316
  • NEW: Create an Event without recording to Stream by @carterbox in #487
  • Clearer error messages (cuda.core) by @rwgk in #458
  • Add event timing by @leofang in #481
  • Fix merge error by @leofang in #495
  • add public handle to object code by @ksimpson-work in #492
  • Bump cuda-core ver to v0.2.0 by @leofang in #494
  • Increase tolerance in test_timing() to avoid flaky tests. by @rwgk in #498
  • Fix test_timing flakiness under Windows by @rwgk in #508
  • NEW: Add Event to public API by @carterbox in #501
  • cuda.core: Change selected .decode() calls to .decode("utf-8", errors="backslashreplace") by @rwgk in #510
  • clean up device initialization in test by @ksimpson-work in #507
  • Add @functools.lru_cache decorator for get_binding_version() by @rwgk in #512
  • Fix dangling pointer problem in _linker.py by @rwgk in #516
  • cuda.core: release notes update by @rwgk in #519
  • Set release date by @vzhurba01 in #523

New Contributors

Full Changelog: cuda-core-v0.1.1...cuda-core-v0.2.0

CUDA Python 12.8.0

25 Jan 04:37
c04025d

CUDA Python 11.8.6

25 Jan 04:39
569edf4
