From 35965a31c8ae00b93bff05c74bdae76d8f33eee6 Mon Sep 17 00:00:00 2001
From: Scott Staniewicz
Date: Mon, 9 Feb 2026 16:03:17 -0500
Subject: [PATCH] `gpuResampSlc.cu`: fix out of bounds reading bug in `transformTile`
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

I was sporadically getting the following error when running the CUDA
versions of geo2rdr and gpuResampleSlc:

    Reading in image data for tile 38 of 39
    Interpolating tile 38 of 39
    terminate called after throwing an instance of 'isce3::cuda::except::CudaError'
      what(): Error in file .../cxx/isce3/cuda/core/gpuLUT1d.cu, line 77, function isce3::cuda::core::gpuLUT1d< >::~gpuLUT1d() [with T = double]: cudaError 700 (an illegal memory access was encountered)
    zsh: IOT instruction (core dumped)

The CPU version in cxx/isce3/image/ResampSlc.cpp:322-327 has the correct
bounds check:

    if ((iRowResampled < chipHalf) ||
            (iRowResampled >= (inLength - chipHalf)))
        continue;
    if ((iColResampled < chipHalf) ||
            (iColResampled >= (inWidth - chipHalf)))
        continue;

However, the GPU version uses `>` where the CPU version uses `>=`.

The chip reading loop iterates iChipRow from 0 to chipSize - 1 (0 to 8),
accessing row:

    iTileRow = iRowResamp + iChipRow - chipHalf

The maximum iTileRow is iRowResamp + 8 - 4 = iRowResamp + 4. With the
current bounds check (iRowResamp + 4 > inReadableLength → skip), the kernel
proceeds when iRowResamp = inReadableLength - 4. That gives:

    max iTileRow = (inReadableLength - 4) + 4 = inReadableLength  ← OUT OF BOUNDS

The valid row indices are 0 to inReadableLength - 1, so this reads one row
past the end of the tile buffer.
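
The arithmetic above can be checked on the host with a small sketch. This is
not the kernel code: it is plain C++ that only reproduces the row check and
the chip row formula quoted in this message. The helper names (`skippedOld`,
`skippedNew`, `maxTileRow`, `worstRowRead`) are hypothetical and exist only
for this illustration; chipSize = 9 and chipHalf = 4 are the values used in
the text above.

```cpp
// Chip geometry as described in the commit message: rows 0..8, half-width 4.
constexpr long long chipSize = 9;
constexpr long long chipHalf = chipSize / 2;

// Row check as it was before this patch (upper edge uses `>`).
// Returns true when the pixel is skipped.
constexpr bool skippedOld(long long iRowResamp, long long inReadableLength) {
    return (iRowResamp < chipHalf) ||
           (iRowResamp + chipHalf > inReadableLength);
}

// Row check after this patch (upper edge uses `>=`).
constexpr bool skippedNew(long long iRowResamp, long long inReadableLength) {
    return (iRowResamp < chipHalf) ||
           (iRowResamp + chipHalf >= inReadableLength);
}

// Largest tile row touched by the chip loop:
// iTileRow = iRowResamp + iChipRow - chipHalf, with iChipRow = chipSize - 1.
constexpr long long maxTileRow(long long iRowResamp) {
    return iRowResamp + (chipSize - 1) - chipHalf;
}

// Hypothetical helper: scan candidate iRowResamp values and return the
// largest tile row read by any pixel that is NOT skipped, or -1 if every
// pixel is skipped.
template <typename Skip>
long long worstRowRead(Skip skip, long long inReadableLength) {
    long long worst = -1;
    for (long long r = 0; r < inReadableLength + chipSize; ++r) {
        if (!skip(r, inReadableLength) && maxTileRow(r) > worst) {
            worst = maxTileRow(r);
        }
    }
    return worst;
}
```

For an arbitrary tile with inReadableLength = 100 (valid rows 0 to 99),
`worstRowRead(skippedOld, 100)` evaluates to 100, one row past the end, while
`worstRowRead(skippedNew, 100)` evaluates to 99, the last valid row.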
---
 cxx/isce3/cuda/image/Resample.cu     | 2 +-
 cxx/isce3/cuda/image/gpuResampSlc.cu | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/cxx/isce3/cuda/image/Resample.cu b/cxx/isce3/cuda/image/Resample.cu
index 87f8747b7..407b74c16 100644
--- a/cxx/isce3/cuda/image/Resample.cu
+++ b/cxx/isce3/cuda/image/Resample.cu
@@ -53,7 +53,7 @@ void _resampleToCoordsGlobal(
     // done with a multiple of the thrd_per_block pixels, but the output data size will
     // typically be smaller than this multiple. So, some calls to this function on
     // the device will be for non-existent pixels which must be discarded.
-    if (pixel_index > resampled_block_width * resampled_block_length) return;
+    if (pixel_index >= resampled_block_width * resampled_block_length) return;
 
     const auto chip_size = static_cast(SINC_ONE);
 
diff --git a/cxx/isce3/cuda/image/gpuResampSlc.cu b/cxx/isce3/cuda/image/gpuResampSlc.cu
index ebd55ab7c..198c65d04 100644
--- a/cxx/isce3/cuda/image/gpuResampSlc.cu
+++ b/cxx/isce3/cuda/image/gpuResampSlc.cu
@@ -125,7 +125,7 @@ void transformTile(thrust::complex *resampledSlc,
             // Check if resampling possible at the starting rows of a tile.
             (iRowResamp < chipHalf)
             // Check if resampling possible at the ending rows of a tile.
-            || (iRowResamp + chipHalf > inReadableLength);
+            || (iRowResamp + chipHalf >= inReadableLength);
 
     // Check if resampled column index is in bounds by checking if chip
     // used to resample can be populated.
@@ -133,7 +133,7 @@ void transformTile(thrust::complex *resampledSlc,
             // Check if resampling possible at starting columns of a tile.
             (iColResamp - chipHalf < 0)
             // Check if resampling possible at the ending columns of a tile.
-            || (iColResamp + chipHalf > inWidth);
+            || (iColResamp + chipHalf >= (long long int)inWidth);
 
     // Skip computations if indices out of bound or az/rng not in doppler
     if (rowOutOfBounds || colOutOfBounds || iRowResamp < 0 || iColResamp < 0)