Documentation for rocRAND is available at https://rocm.docs.amd.com/projects/rocRAND/en/latest/
- Added `rocrand_create_generator_host`
  - The following generators are supported:
    - `ROCRAND_RNG_PSEUDO_MRG31K3P`
    - `ROCRAND_RNG_PSEUDO_MRG32K3A`
    - `ROCRAND_RNG_PSEUDO_PHILOX4_32_10`
    - `ROCRAND_RNG_PSEUDO_THREEFRY2_32_20`
    - `ROCRAND_RNG_PSEUDO_THREEFRY2_64_20`
    - `ROCRAND_RNG_PSEUDO_THREEFRY4_32_20`
    - `ROCRAND_RNG_PSEUDO_THREEFRY4_64_20`
    - `ROCRAND_RNG_PSEUDO_XORWOW`
    - `ROCRAND_RNG_QUASI_SCRAMBLED_SOBOL32`
    - `ROCRAND_RNG_QUASI_SCRAMBLED_SOBOL64`
    - `ROCRAND_RNG_QUASI_SOBOL32`
    - `ROCRAND_RNG_QUASI_SOBOL64`
  - The host-side generators support multi-core processing. On Linux, this requires the TBB (Threading Building Blocks) development package to be installed on the system when building rocRAND (`libtbb-dev` on Ubuntu and derivatives).
    - If TBB is not found when configuring rocRAND, the configuration still succeeds, and the host generators are executed on a single CPU thread.
- Added the option to create a host generator to the Python wrapper
- Added the option to create a host generator to the Fortran wrapper
- Added dynamic ordering. This ordering is free to rearrange the produced numbers, which can be specific to devices and distributions. It is implemented for:
  - XORWOW, MRG32K3A, MTGP32, Philox 4x32-10, MRG31K3P, LFSR113, and ThreeFry
- For the NVIDIA platform, compilation using clang as the host compiler is now supported.
- C++ wrapper:
  - `lfsr113_engine` now also supports being constructed with a seed of type `unsigned long long`, not only `uint4`.
  - Added optional order parameter to constructor of `mt19937_engine`
- Added the following functions for the `ROCRAND_RNG_PSEUDO_MTGP32` generator:
  - `rocrand_normal2`
  - `rocrand_normal_double2`
  - `rocrand_log_normal2`
  - `rocrand_log_normal_double2`
- Added `rocrand_create_generator_host_blocking` which dispatches without stream semantics.
- Added host-side generator for `ROCRAND_RNG_PSEUDO_MTGP32`.
- Added offset and skipahead functionality to LFSR113 generator.
- Added dynamic ordering for architecture `gfx1102`.
- Building rocRAND now requires a C++17 capable compiler, as the internal library sources now require it. However, consuming rocRAND is still possible from C++11, as the public headers don't make use of the new features.
- Building rocRAND should be faster on machines with multiple CPU cores, as the library has been split into multiple compilation units.
- C++ wrapper: the `min()` and `max()` member functions of the generators and distributions are now `static constexpr`.
- Renamed and unified the existing `ROCRAND_DETAIL_.*_BM_NOT_IN_STATE` to `ROCRAND_DETAIL_BM_NOT_IN_STATE`
- Static & dynamic library: moved all internal symbols to namespaces to avoid potential symbol name collisions when linking.
- Deprecated the following typedefs. Please use the unified `state_type` alias instead.
  - `rocrand_device::threefry2x32_20_engine::threefry2x32_20_state`
  - `rocrand_device::threefry2x64_20_engine::threefry2x64_20_state`
  - `rocrand_device::threefry4x32_20_engine::threefry4x32_20_state`
  - `rocrand_device::threefry4x64_20_engine::threefry4x64_20_state`
- Deprecated internal header: src/rng/distribution/distributions.hpp
- Deprecated internal header: src/rng/device_engines.hpp
- Removed references to and workarounds for deprecated hcc.
- Support for HIP-CPU
- SOBOL64 and SCRAMBLED_SOBOL64 generate Poisson-distributed `unsigned long long int` numbers instead of `unsigned int`. This will be fixed in the next major release.
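
The `static constexpr` change to `min()` and `max()` noted above means the output range of an engine can be queried without an instance and used in constant expressions. The following is a minimal sketch of that interface shape only; `toy_engine` and `to_unit_interval` are hypothetical names invented for illustration and are not part of rocRAND.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical toy engine (a 32-bit LCG); it only illustrates the interface shape
// and is not a rocRAND class.
class toy_engine
{
public:
    typedef std::uint32_t result_type;

    // Static and constexpr: no engine instance is needed to query the range.
    static constexpr result_type min() { return 0; }
    static constexpr result_type max() { return std::numeric_limits<result_type>::max(); }

    explicit toy_engine(result_type seed) : m_state(seed) {}

    // Full-period LCG modulo 2^32 (constants from Numerical Recipes).
    result_type operator()()
    {
        m_state = m_state * 1664525u + 1013904223u;
        return m_state;
    }

private:
    result_type m_state;
};

// The range is usable at compile time.
static_assert(toy_engine::min() == 0u, "lower bound is known at compile time");
static_assert(toy_engine::max() == 0xFFFFFFFFu, "upper bound is known at compile time");

// Maps an engine output to [0, 1) using only the compile-time range of the engine type.
template<class Engine>
double to_unit_interval(typename Engine::result_type value)
{
    constexpr double range = static_cast<double>(Engine::max() - Engine::min()) + 1.0;
    assert(value <= Engine::max());
    return static_cast<double>(value - Engine::min()) / range;
}
```

Generic code such as `to_unit_interval<toy_engine>(engine())` can then be written against the engine type alone, which is the same pattern standard-library distributions rely on.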
- Added `rocrand_create_generator_host` with initial support for `ROCRAND_RNG_PSEUDO_PHILOX4_32_10` and `ROCRAND_RNG_PSEUDO_MRG31K3P`.
- Added the option to create a host generator to the Python wrapper
- Added the option to create a host generator to the Fortran wrapper
- Generator classes from `rocrand.hpp` are no longer copyable (in previous versions these copies would copy internal references to the generators and would lead to double free or memory leak errors)
  - These types should be moved instead of copied; move constructors and operators are now defined
- Improved MT19937 initialization and generation performance
- Removed the hipRAND submodule from rocRAND; hipRAND is now only available as a separate package
- Removed references to, and workarounds for, the deprecated hcc
- `mt19937_engine` from `rocrand.hpp` is now move-constructible and move-assignable (the move constructor and move assignment operator were previously deleted for this class)
- Various fixes for the C++ wrapper header `rocrand.hpp`
  - The name of `mrg31k3p` is now correctly spelled (was incorrectly named `mrg31k3a` in previous versions)
  - Added the missing `order` setter method for `threefry4x64`
  - Fixed the default ordering parameter for `lfsr113`
- Build error when using Clang++ directly resulting from unsupported `amdgpu-target` references
- Added `hip::device` as dependency to `benchmark_rocrand_tuning` to make it compile with amdclang++.
- Minor entropy waste in the 64-bit Threefry function producing two log-normally-distributed doubles.
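
The move-only rule for the `rocrand.hpp` generator classes noted above can be illustrated with a small ownership sketch. `toy_handle` and `toy_generator` are hypothetical stand-ins invented for this example, not rocRAND types; the sketch assumes the wrapper owns a single raw handle, which is why a copy would free it twice.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for an opaque generator handle; not a rocRAND type.
struct toy_handle
{
    unsigned long long seed;
};

// Hypothetical move-only wrapper that owns exactly one handle.
class toy_generator
{
public:
    explicit toy_generator(unsigned long long seed) : m_handle(new toy_handle{seed}) {}

    // Copying would duplicate the raw pointer, so both copies would delete it.
    toy_generator(const toy_generator&)            = delete;
    toy_generator& operator=(const toy_generator&) = delete;

    // Moving transfers ownership and leaves the source empty.
    toy_generator(toy_generator&& other) noexcept
        : m_handle(std::exchange(other.m_handle, nullptr))
    {}

    toy_generator& operator=(toy_generator&& other) noexcept
    {
        if(this != &other)
        {
            delete m_handle;
            m_handle = std::exchange(other.m_handle, nullptr);
        }
        return *this;
    }

    ~toy_generator() { delete m_handle; }

    bool valid() const { return m_handle != nullptr; }

    unsigned long long seed() const
    {
        assert(m_handle != nullptr);
        return m_handle->seed;
    }

private:
    toy_handle* m_handle;
};

static_assert(!std::is_copy_constructible<toy_generator>::value, "copies are disabled");
static_assert(std::is_move_constructible<toy_generator>::value, "moves are enabled");
static_assert(std::is_move_assignable<toy_generator>::value, "move assignment is enabled");
```

With a type shaped like this, code that previously passed a generator by value would pass `std::move(generator)` or a reference instead.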
- MT19937 pseudo random number generator based on M. Matsumoto and T. Nishimura, 1998, Mersenne Twister: A 623-dimensionally equidistributed uniform pseudorandom number generator
- New benchmark APIs for Google Benchmark:
  - `benchmark_rocrand_device_api` replaces `benchmark_rocrand_kernel`
  - `benchmark_curand_host_api` replaces `benchmark_curand_generate`
  - `benchmark_curand_device_api` replaces `benchmark_curand_kernel`
- Experimental HIP-CPU feature
- ThreeFry pseudorandom number generator based on Salmon et al., 2011, Parallel random numbers: as easy as 1, 2, 3
- Accessor methods for SOBOL 32 and 64 direction vectors and constants:
  - Enum `rocrand_direction_vector_set` to select the direction vector set
  - `rocrand_get_direction_vectors32(...)` supersedes:
    - `rocrand_h_sobol32_direction_vectors`
    - `rocrand_h_scrambled_sobol32_direction_vectors`
  - `rocrand_get_direction_vectors64(...)` supersedes:
    - `rocrand_h_sobol64_direction_vectors`
    - `rocrand_h_scrambled_sobol64_direction_vectors`
  - `rocrand_get_scramble_constants32(...)` supersedes `h_scrambled_sobol32_constants`
  - `rocrand_get_scramble_constants64(...)` supersedes `h_scrambled_sobol64_constants`
- Python 2.7 is no longer officially supported
- MRG31K3P pseudorandom number generator based on L'Ecuyer and Touzin, 2000, Fast combined multiple recursive generators with multipliers of the form a = ±2^q ± 2^r
- LFSR113 pseudorandom number generator based on L'Ecuyer, 1999, Tables of maximally equidistributed combined LFSR generators
- `SCRAMBLED_SOBOL32` and `SCRAMBLED_SOBOL64` quasirandom number generators (scrambled Sobol sequences are generated by scrambling the output of a Sobol sequence)
- The `mrg_<distribution>_distribution` structures, which provide numbers based on MRG32K3A, have been replaced by `mrg_engine_<distribution>_distribution`, where `<distribution>` is `log_normal`, `normal`, `poisson`, or `uniform`
  - These structures provide numbers for MRG31K3P (with template type `rocrand_state_mrg31k3p`) and MRG32K3A (with template type `rocrand_state_mrg32k3a`)
- Sobol64 now returns 64-bit (instead of 32-bit) random numbers, which results in a performance regression for this generator
- Bug that prevented Windows code compilation in C++ mode (with a host compiler) when rocRAND headers were included
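
The scrambled Sobol generators noted above are described as scrambling the output of a Sobol sequence. As a rough illustration of that idea, the sketch below generates the first dimension of a 32-bit Sobol sequence in Gray-code order and scrambles each value with a bitwise XOR against a constant. Both the XOR scrambling and the function names `sobol32_first_dim` and `scramble32` are assumptions made for this example; the library's actual scrambling scheme and constants may differ.

```cpp
#include <cassert>
#include <cstdint>

// First dimension of a 32-bit Sobol sequence in Gray-code order.
// For this dimension the direction vectors are v_j = 2^(31 - j).
std::uint32_t sobol32_first_dim(std::uint32_t index)
{
    const std::uint32_t gray  = index ^ (index >> 1);
    std::uint32_t       value = 0;
    for(unsigned int bit = 0; bit < 32; ++bit)
    {
        if(gray & (std::uint32_t(1) << bit))
        {
            value ^= std::uint32_t(1) << (31 - bit);
        }
    }
    return value;
}

// Assumed scrambling step: XOR the unscrambled value with a per-dimension constant.
std::uint32_t scramble32(std::uint32_t sobol_value, std::uint32_t scramble_constant)
{
    return sobol_value ^ scramble_constant;
}

// Convenience wrapper combining both steps.
std::uint32_t scrambled_sobol32_first_dim(std::uint32_t index, std::uint32_t scramble_constant)
{
    const std::uint32_t value = scramble32(sobol32_first_dim(index), scramble_constant);
    // XOR scrambling is its own inverse, so the unscrambled value is recoverable.
    assert((value ^ scramble_constant) == sobol32_first_dim(index));
    return value;
}
```

Because XOR with a fixed constant is a bijection on 32-bit values, this kind of scrambling permutes the points without collapsing any two of them onto the same value.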
- New benchmark for the host API using Google Benchmark that replaces `benchmark_rocrand_generate`, which is deprecated
- Increased the number of warmup iterations for `rocrand_benchmark_generate` from 5 to 15 to eliminate corner cases that generate artificially high benchmark scores
- Backward compatibility for `#include <rocrand.h>` (deprecated) using wrapper header files
- Packages for test and benchmark executables on all supported operating systems using CPack
- Generating a random sequence of different sizes now produces the sequence without gaps, independent of how many values are generated per call
  - This applies only to XORWOW, MRG32K3A, PHILOX4X32_10, SOBOL32, and SOBOL64
  - This is only true if the size in each call is a divisor of the distribution's `output_width`, due to performance
  - The output pointer must be aligned with `output_width * sizeof(output_type)`
- hipRAND has been split into a separate package
- Header file installation location changed to match other libraries.
  - When using the `rocrand.h` header file, use `#include <rocrand/rocrand.h>` rather than `#include <rocrand.h>`
- rocRAND still includes hipRAND using a submodule
- The rocRAND package sets the provides field with hipRAND, so projects that require hipRAND can begin to specify it
- Offset behavior for XORWOW, MRG32K3A, and PHILOX4X32_10 generator
  - Setting offset now correctly generates the same sequence starting from the offset
  - Only uniform `int` and `float` will work, as these can be generated with a single call to the generator
- `kernel_xorwow` unit test is failing for certain GPU architectures
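
The gap-free generation and offset behavior noted above both describe one property: the value at a given position in the sequence should not depend on how the calls are split up or where generation starts. The sketch below shows that property with a hypothetical counter-based generator, `toy_counter_rng`, whose mixing function is a SplitMix64-style finalizer. It is an illustration only and does not reflect how the rocRAND engines are implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical counter-based toy generator (not a rocRAND engine): element i of the
// sequence depends only on the seed and the absolute position i.
class toy_counter_rng
{
public:
    explicit toy_counter_rng(std::uint64_t seed) : m_seed(seed), m_position(0) {}

    // Jump to an absolute position in the sequence.
    void set_offset(std::uint64_t offset) { m_position = offset; }

    // Append `count` values, continuing exactly where the previous call stopped.
    void generate(std::vector<std::uint32_t>& out, std::size_t count)
    {
        for(std::size_t i = 0; i < count; ++i)
        {
            out.push_back(mix(m_seed + m_position));
            ++m_position;
        }
    }

private:
    // SplitMix64-style finalizer used as a stateless mixing function.
    static std::uint32_t mix(std::uint64_t x)
    {
        x += 0x9E3779B97F4A7C15ULL;
        x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9ULL;
        x = (x ^ (x >> 27)) * 0x94D049BB133111EBULL;
        x ^= x >> 31;
        return static_cast<std::uint32_t>(x >> 32);
    }

    std::uint64_t m_seed;
    std::uint64_t m_position;
};

// Generates `count` values starting at `offset`, for comparing different call patterns.
std::vector<std::uint32_t>
    generate_from(std::uint64_t seed, std::uint64_t offset, std::size_t count)
{
    toy_counter_rng            rng(seed);
    std::vector<std::uint32_t> out;
    rng.set_offset(offset);
    rng.generate(out, count);
    assert(out.size() == count);
    return out;
}
```

Generating 12 values in one call, or 4 values followed by 8, yields the same 12 values; starting at offset 5 yields the tail of that same sequence.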
There are no updates for this ROCm release.
- Initial HIP on Windows support
- Packaging has been split into a runtime package (`rocrand`) and a development package (`rocrand-devel`): The development package depends on the runtime package. When installing the runtime package, the package manager will suggest the installation of the development package to aid users transitioning from the previous version's combined package. This suggestion by the package manager is for all supported operating systems (except CentOS 7) to aid in the transition. The `suggestion` feature in the runtime package is introduced as a deprecated feature and will be removed in a future ROCm release.
- `mrg_uniform_distribution_double` is no longer generating an incorrect range of values
- Order of state calls for `log_normal`, `normal`, and `uniform`
- `kernel_xorwow` test is failing for certain GPU architectures
- Sobol64 support
- Benchmark time measurement improvement
- AddressSanitizer build option
- NVCC backend fix
- Fix ranges of MRG32k3a device functions
- gfx90a support
- gfx1030 support
- gfx803 support re-enabled
- Memory leaks in Poisson tests
- Memory leaks when generator is created, but setting seed/offset/dimensions throws an exception
- The rocRAND benchmark performance drop for `xorwow` has been fixed for older ROCm builds
- Ability to force install dependencies with new `-d` flag in install script
- rocRAND package name has been updated to support newer versions of ROCm
- rocRAND benchmark performance drop
- Debug builds via the install script
There are no updates for this ROCm release.
There are no updates for this ROCm release.
There are no updates for this ROCm release.
There are no updates for this ROCm release.
- Package naming now reflects operating system name and architecture
There are no updates for this ROCm release.
- Static library build options were added in the beta; these are subject to change (build method and naming) in future releases
- HIP-Clang is now the default compiler
- HCC build