diff --git a/docs/source/reference/geotiff.rst b/docs/source/reference/geotiff.rst
index d6511587a..1eab532c4 100644
--- a/docs/source/reference/geotiff.rst
+++ b/docs/source/reference/geotiff.rst
@@ -285,9 +285,9 @@ turn the process into a port scanner. The knobs are:
   and ``xrspatial/geotiff/tests/test_http_read_all_bounded_2051.py``.
 * ``XRSPATIAL_COG_MAX_TILE_BYTES``. Per-tile / per-strip compressed
   byte cap (default 256 MiB). Locked by
-  ``xrspatial/geotiff/tests/test_local_tile_byte_cap_1664.py``,
-  ``xrspatial/geotiff/tests/test_cloud_read_byte_limit_1928.py``, and
-  ``xrspatial/geotiff/tests/test_gpu_tile_byte_cap_2026_05_18.py``.
+  ``xrspatial/geotiff/tests/read/test_tiling.py`` (covers both the
+  local and GPU paths) and
+  ``xrspatial/geotiff/tests/test_cloud_read_byte_limit_1928.py``.
 * ``XRSPATIAL_GEOTIFF_HTTP_CONNECT_TIMEOUT`` and
   ``XRSPATIAL_GEOTIFF_HTTP_READ_TIMEOUT``. Per-request connect / read
   timeouts in seconds. Positive floats only; other values fall back
diff --git a/docs/source/reference/release_gate_geotiff.rst b/docs/source/reference/release_gate_geotiff.rst
index b616af390..58a45a0df 100644
--- a/docs/source/reference/release_gate_geotiff.rst
+++ b/docs/source/reference/release_gate_geotiff.rst
@@ -231,7 +231,7 @@ Local GeoTIFF read and write
      - stable
      - Lossless byte-for-byte round-trip on integer and float dtypes.
      - ``xrspatial/geotiff/tests/test_supported_features_tiers_2137.py``,
-       ``xrspatial/geotiff/tests/test_compression.py``
+       ``xrspatial/geotiff/tests/read/test_compression.py``
      - `#2340`_
    * - Stable codec round-trip (read / write / read)
      - stable
@@ -354,7 +354,7 @@ HTTP / fsspec reads
      - Tile or strip declared sizes exceeding ``XRSPATIAL_COG_MAX_TILE_BYTES``
        (default 256 MiB) raise ``ValueError``.
      - ``xrspatial/geotiff/tests/test_cloud_read_byte_limit_1928.py``,
-       ``xrspatial/geotiff/tests/test_gpu_tile_byte_cap_2026_05_18.py``
+       ``xrspatial/geotiff/tests/read/test_tiling.py``
      - `#2344`_
    * - ``max_cloud_bytes`` dispatcher pass-through
      - stable
@@ -665,7 +665,7 @@ GPU paths (experimental)
      - Integer and float nodata sentinels survive the GPU read / write
        round-trip.
      - ``xrspatial/geotiff/tests/test_gpu_nodata_1542.py``,
-       ``xrspatial/geotiff/tests/test_apply_nodata_mask_gpu_inplace_1934.py``
+       ``xrspatial/geotiff/tests/read/test_nodata.py``
      - `#2341`_
 
 Internal-only surfaces (not promised)
diff --git a/xrspatial/geotiff/tests/CLUSTER_AUDIT_PR8.md b/xrspatial/geotiff/tests/CLUSTER_AUDIT_PR8.md
new file mode 100644
index 000000000..02fa2fc6d
--- /dev/null
+++ b/xrspatial/geotiff/tests/CLUSTER_AUDIT_PR8.md
@@ -0,0 +1,193 @@
+# CLUSTER_AUDIT_PR8.md — Reader-path tests
+
+Temporary audit table mapping every old `file::test` to its new home in
+the `read/` cluster. This file is deleted in a follow-up commit on the
+same branch before merge, per the epic #2390 contract.
+
+## Cluster split
+
+PR 8 owns the reader-side cluster. The following eight files land under
+`xrspatial/geotiff/tests/read/`:
+
+- `read/test_basic.py` — minimal read paths, band validation.
+- `read/test_dtypes.py` — reader dtype handling (eager / dask / GPU).
+- `read/test_compression.py` — decompression-codec round-trips and
+  bomb caps (DEFLATE / LZW / ZSTD / PACKBITS / LZ4 / LERC / JPEG2000 /
+  JPEG).
+- `read/test_tiling.py` — tile / strip byte-count cap on CPU and GPU.
+- `read/test_endianness.py` — big-endian multi-byte read paths.
+- `read/test_nodata.py` — nodata propagation on read (GPU helper).
+- `read/test_coords.py` — descending / ascending coord round-trip.
+- `read/test_streaming.py` — streaming BigTIFF threshold (folds in
+  `xrspatial/tests/test_geotiff_streaming_bigtiff_threshold_1785.py`).
+
+PR 3's `read/test_crs.py` (rotated / dropped / missing CRS) is the
+parallel sibling and is left for that PR.
+
+## Folded files
+
+### `test_band_validation_1673.py` (deleted)
+
+| Old `file::test` | New `file::test_id` | Notes |
+|---|---|---|
+| `test_band_validation_1673.py::test_read_to_array_negative_band_rejected` | `read/test_basic.py::TestBandValidationLocal::test_negative_band_rejected` | Renamed, class-grouped. Same assertion. |
+| `test_band_validation_1673.py::test_read_to_array_band_equal_to_samples_rejected` | `read/test_basic.py::TestBandValidationLocal::test_band_equal_to_samples_rejected` | Same. |
+| `test_band_validation_1673.py::test_read_to_array_band_far_above_samples_rejected` | `read/test_basic.py::TestBandValidationLocal::test_band_far_above_samples_rejected` | Same. |
+| `test_band_validation_1673.py::test_read_to_array_valid_band_still_works` | `read/test_basic.py::TestBandValidationLocal::test_valid_band_still_works` | Same. |
+| `test_band_validation_1673.py::test_read_to_array_band_none_still_returns_all_bands` | `read/test_basic.py::TestBandValidationLocal::test_band_none_returns_all_bands` | Same. |
+| `test_band_validation_1673.py::test_backend_parity_negative_band` | `read/test_basic.py::TestBandValidationBackendParity::test_negative_band` | Class-grouped. |
+| `test_band_validation_1673.py::test_backend_parity_band_equal_to_samples` | `read/test_basic.py::TestBandValidationBackendParity::test_band_equal_to_samples` | Class-grouped. |
+| (fixture) `multiband_tiff_path` | same fixture in `read/test_basic.py` | Filename in tmp_path renamed `mb_1673.tif` -> `mb_band_validation.tif`. |
+
+### `test_dtype_read.py` (deleted)
+
+| Old `file::test` | New `file::test_id` | Notes |
+|---|---|---|
+| `test_dtype_read.py::TestDtypeEager::*` | `read/test_dtypes.py::TestDtypeEager::*` | Verbatim. Fixture filenames renamed `test_1083_*.tif` -> `dtype_*.tif`. |
+| `test_dtype_read.py::TestDtypeDask::*` | `read/test_dtypes.py::TestDtypeDask::*` | Same. |
+
+### `test_float16_read_1941.py` (deleted)
+
+| Old `file::test` | New `file::test_id` | Notes |
+|---|---|---|
+| `TestDtypeMap::*` | `read/test_dtypes.py::TestFloat16DtypeMap::*` | Renamed to disambiguate from generic dtype map tests. Body unchanged. |
+| `TestEagerFloat16Read::*` | `read/test_dtypes.py::TestEagerFloat16Read::*` | Verbatim. |
+| `TestPredictor3Float16::*` | `read/test_dtypes.py::TestPredictor3Float16::*` | Verbatim. |
+| `TestRegressionGuards::*` | `read/test_dtypes.py::TestFloat16RegressionGuards::*` | Class renamed to avoid name collisions with other regression-guard classes. Body unchanged. |
+
+### `test_float16_read_gpu_1941.py` (deleted)
+
+| Old `file::test` | New `file::test_id` | Notes |
+|---|---|---|
+| `TestEagerGPUReadFloat16::*` | `read/test_dtypes.py::TestEagerGPUReadFloat16::*` | Body unchanged. Module-level `pytestmark` skip replaced with per-method `@_gpu_only` since the consolidated file mixes GPU and non-GPU tests. |
+| `TestGPUWindowedFloat16::*` | `read/test_dtypes.py::TestGPUWindowedFloat16::*` | Same. |
+| `TestDaskGPUFloat16::*` | `read/test_dtypes.py::TestDaskGPUFloat16::*` | Same. |
+| `TestGDSPathGatedOffForFloat16::*` | `read/test_dtypes.py::TestGDSPathGatedOffForFloat16::*` | Same. |
+| `TestBackendParityFloat16::*` | `read/test_dtypes.py::TestBackendParityFloat16::*` | Same. |
+| `TestPredictor3Float16GPU::*` | `read/test_dtypes.py::TestPredictor3Float16GPU::*` | Same. |
+
+### `test_compression.py` (deleted)
+
+| Old `file::test` | New `file::test_id` | Notes |
+|---|---|---|
+| `TestDeflate::*` | `read/test_compression.py::TestDeflate::*` | Verbatim. |
+| `TestLZW::*` | `read/test_compression.py::TestLZW::*` | Verbatim. |
+| `TestPredictor::*` | `read/test_compression.py::TestPredictor::*` | Verbatim. |
+| `TestDispatch::*` | `read/test_compression.py::TestDispatch::*` | Verbatim. |
+
+### `test_decompression_caps.py` (deleted)
+
+| Old `file::test` | New `file::test_id` | Notes |
+|---|---|---|
+| `TestCodecDirect::*` | `read/test_compression.py::TestCodecDirect::*` | Verbatim. |
+| `TestZstdDirect::*` | `read/test_compression.py::TestZstdDirect::*` | Verbatim. |
+| `TestLz4Direct::*` | `read/test_compression.py::TestLz4Direct::*` | Verbatim. |
+| `test_deflate_bomb_rejected` | `read/test_compression.py::test_deflate_bomb_rejected` | Verbatim. |
+| `test_zstd_bomb_rejected` | `read/test_compression.py::test_zstd_bomb_rejected` | Verbatim. |
+| `test_lz4_bomb_rejected` | `read/test_compression.py::test_lz4_bomb_rejected` | Verbatim. |
+| `test_packbits_bomb_rejected` | `read/test_compression.py::test_packbits_bomb_rejected` | Verbatim. |
+| `test_legitimate_high_compression_passes` | `read/test_compression.py::test_legitimate_high_compression_passes` | Verbatim. |
+| `test_cap_includes_metadata_margin` | `read/test_compression.py::test_cap_includes_metadata_margin` | Verbatim. |
+| `TestLercDirect::*` | `read/test_compression.py::TestLercDirect::*` | Verbatim. |
+| `TestJpeg2000Direct::*` | `read/test_compression.py::TestJpeg2000Direct::*` | Verbatim. |
+| `TestJpegDirect::*` | `read/test_compression.py::TestJpegDirect::*` | Verbatim. |
+
+### `test_local_tile_byte_cap_1664.py` (deleted)
+
+| Old `file::test` | New `file::test_id` | Notes |
+|---|---|---|
+| `TestLocalTileByteCap::*` | `read/test_tiling.py::TestLocalTileByteCap::*` | Verbatim. Fixture filenames renamed `forged_local_*_1664.tif` -> `forged_*.tif`. |
+| `TestLocalStripByteCap::*` | `read/test_tiling.py::TestLocalStripByteCap::*` | Same. |
+| `test_max_tile_bytes_env_negative_falls_back` | `read/test_tiling.py::test_max_tile_bytes_env_negative_falls_back` | Verbatim. |
+| `test_max_tile_bytes_env_zero_falls_back` | `read/test_tiling.py::test_max_tile_bytes_env_zero_falls_back` | Verbatim. |
+| `test_max_tile_bytes_env_garbage_falls_back` | `read/test_tiling.py::test_max_tile_bytes_env_garbage_falls_back` | Verbatim. |
+| Import: `from ._helpers.tiff_surgery import ...` | `from .._helpers.tiff_surgery import ...` | One-level deeper under `read/`. |
+
+### `test_gpu_tile_byte_cap_2026_05_18.py` (deleted)
+
+| Old `file::test` | New `file::test_id` | Notes |
+|---|---|---|
+| `TestGpuTileByteCap::*` | `read/test_tiling.py::TestGpuTileByteCap::*` | Verbatim. Shares `_build_forged_tiled_cog` helper with the CPU class via a `basename` parameter so the two CPU-vs-GPU forged-tile groups do not collide on `tmp_path`. |
+| `TestGpuChunkedTileByteCap::*` | `read/test_tiling.py::TestGpuChunkedTileByteCap::*` | Verbatim. |
+
+### `test_gpu_byteswap_1508.py` (deleted)
+
+| Old `file::test` | New `file::test_id` | Notes |
+|---|---|---|
+| `test_read_geotiff_gpu_big_endian_multibyte[*]` | `read/test_endianness.py::test_read_geotiff_gpu_big_endian_multibyte[*]` | Verbatim. |
+| `test_read_geotiff_gpu_big_endian_uncompressed` | `read/test_endianness.py::test_read_geotiff_gpu_big_endian_uncompressed` | Verbatim. |
+| `test_xp_byteswap_preserves_dtype` | `read/test_endianness.py::test_xp_byteswap_preserves_dtype` | Verbatim. |
+| `test_xp_byteswap_uint8_passthrough` | `read/test_endianness.py::test_xp_byteswap_uint8_passthrough` | Verbatim. |
+
+### `test_apply_nodata_mask_gpu_inplace_1934.py` (deleted)
+
+| Old `file::test` | New `file::test_id` | Notes |
+|---|---|---|
+| `test_apply_nodata_mask_gpu_float_masks_sentinel_to_nan_1934` | `read/test_nodata.py::test_apply_nodata_mask_gpu_float_masks_sentinel_to_nan` | Issue number dropped from name. Body unchanged. |
+| `test_apply_nodata_mask_gpu_float_in_place_no_copy_1934` | `read/test_nodata.py::test_apply_nodata_mask_gpu_float_in_place_no_copy` | Same. |
+| `test_apply_nodata_mask_gpu_float_alloc_count_unchanged_1934` | `read/test_nodata.py::test_apply_nodata_mask_gpu_float_alloc_count_unchanged` | Same. |
+| `test_apply_nodata_mask_gpu_int_promotes_and_masks_1934` | `read/test_nodata.py::test_apply_nodata_mask_gpu_int_promotes_and_masks` | Same. |
+| `test_apply_nodata_mask_gpu_int_no_extra_buffer_after_astype_1934` | `read/test_nodata.py::test_apply_nodata_mask_gpu_int_no_extra_buffer_after_astype` | Same. |
+| `test_apply_nodata_mask_gpu_float_nan_sentinel_noop_1934` | `read/test_nodata.py::test_apply_nodata_mask_gpu_float_nan_sentinel_noop` | Same. |
+| `test_apply_nodata_mask_gpu_none_nodata_passthrough_1934` | `read/test_nodata.py::test_apply_nodata_mask_gpu_none_nodata_passthrough` | Same. |
+
+### `test_apply_nodata_mask_gpu_with_presence_removed_2208.py` (deleted)
+
+| Old `file::test` | New `file::test_id` | Notes |
+|---|---|---|
+| `test_apply_nodata_mask_gpu_with_presence_not_importable_2208` | `read/test_nodata.py::test_apply_nodata_mask_gpu_with_presence_not_importable` | Issue number dropped. Same `ImportError` assertion. |
+| `test_apply_nodata_mask_gpu_still_present_2208` | `read/test_nodata.py::test_apply_nodata_mask_gpu_still_present` | Same. |
+
+### `test_descending_coords_1716.py` (deleted)
+
+| Old `file::test` | New `file::test_id` | Notes |
+|---|---|---|
+| `test_descending_x_roundtrip` | `read/test_coords.py::TestDescendingCoordsRoundTrip::test_descending_x_roundtrip` | Class-grouped. tmp_path filenames renamed (`tmp_1716_desc_x.tif` -> `desc_x.tif`). |
+| `test_ascending_y_roundtrip` | `read/test_coords.py::TestDescendingCoordsRoundTrip::test_ascending_y_roundtrip` | Same. |
+| `test_descending_x_and_ascending_y_roundtrip` | `read/test_coords.py::TestDescendingCoordsRoundTrip::test_descending_x_and_ascending_y_roundtrip` | Same. |
+| `test_north_up_still_uses_pixel_scale_and_tiepoint` | `read/test_coords.py::TestOrientationTagSelection::test_north_up_uses_pixel_scale_and_tiepoint` | Class-grouped, name slimmed. |
+| `test_descending_x_uses_transformation_tag` | `read/test_coords.py::TestOrientationTagSelection::test_descending_x_uses_transformation_tag` | Same. |
+| `test_ascending_y_uses_transformation_tag` | `read/test_coords.py::TestOrientationTagSelection::test_ascending_y_uses_transformation_tag` | Same. |
+
+### `xrspatial/tests/test_geotiff_streaming_bigtiff_threshold_1785.py` (deleted — cross-directory move)
+
+| Old `file::test` | New `file::test_id` | Notes |
+|---|---|---|
+| `TestShouldUseBigTIFFStreaming::*` | `read/test_streaming.py::TestShouldUseBigTIFFStreaming::*` | Verbatim. |
+| `TestStreamingBigTIFFUserOverride::*` | `read/test_streaming.py::TestStreamingBigTIFFUserOverride::*` | Verbatim. Fixture filenames renamed `*_1785.tif` -> issue-number-free. |
+
+## Files NOT folded in (justified)
+
+Several files in the original "key examples" list turned out to be
+writer-side or unit-level on inspection and would conflict with another
+PR's surface. They are left in place for the PR that owns their cluster:
+
+| File | Reason left in place |
+|---|---|
+| `test_accuracy_1081.py` | Mixed read/write numerical accuracy with parity surface area; folding into `read/test_basic.py` would expand PR scope beyond the reader-only contract. Defer to PR 11 unit-cleanup. |
+| `test_ambiguous_metadata_hooks_1987.py` | Metadata contract / parity surface — overlaps with PR 4 (parity) and PR 5 (attrs contract). |
+| `test_assemble_layout_no_bytes_copy_1756.py` | Tests `_assemble_standard_layout`, `_assemble_cog_layout`, `_assemble_tiff` — writer internals. Belongs to PR 7. |
+| `test_bytesio_source.py` | Mixed BytesIO read/write; round-trip surface area is large and the file already groups its own concerns coherently. Defer to PR 11. |
+| `test_chunked_gpu_declared_dtype_1909.py` | Mixed dtype/dask coverage that overlaps with the parity matrix (PR 4). |
+| `test_compression_docstring_1644.py` | Tests `write_geotiff_gpu` docstring + GPU writer codec acceptance — writer-side. Belongs to PR 7. |
+| `test_compression_level.py` | Tests `to_geotiff(compression_level=...)` — writer-side. Belongs to PR 7. |
+| `test_conflicting_crs_write_1987.py` | Writer-side (CRS conflict on write). Belongs to PR 7. |
+| `test_coord_regularity_1720.py` | Tests `_coords_to_transform` validation on the writer path. Belongs to PR 7. |
+| `test_coords_1813.py` | Unit tests of `xrspatial.geotiff._coords` helpers — fits `unit/` (PR 11). |
+| `test_coords_to_transform_3d_1643.py` | Writer-side coord-to-transform. Belongs to PR 7. |
+| `test_predictor2_big_endian.py` / `test_predictor2_big_endian_gpu_1517.py` / `test_predictor3_big_endian.py` / `test_predictor3_int_dtype*` / `test_predictor_fp_write_*` | Predictor coverage overlaps with the writer codec matrix (PR 7) and the parity matrix (PR 4). Defer to a future endianness/predictor sub-cluster rather than risk colliding mid-PR. |
+
+## Verification
+
+- 134 tests collected in `xrspatial/geotiff/tests/read/` after PR 8 (8
+  modules; PR 3's `test_crs.py` adds a ninth once that PR lands).
+- Total `test_*.py` files removed across the PR: 13 (12 inside
+  `geotiff/tests/`, plus the one cross-directory move from
+  `xrspatial/tests/test_geotiff_streaming_bigtiff_threshold_1785.py`).
+- New `test_*.py` files added under `read/`: 8 (plus the empty
+  `__init__.py`).
+- Net delta inside `geotiff/tests/`: -12 + 8 = -4 `test_*.py` files
+  (`find xrspatial/geotiff/tests -name 'test_*.py' | wc -l` goes from
+  352 to 348).
+- Net delta inside `xrspatial/tests/`: -1 `test_*.py` file.
+- Total PR-wide `test_*.py` delta: -5.
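
The file-count arithmetic in the Verification list can be re-checked without a shell. The sketch below is a rough Python equivalent of the `find xrspatial/geotiff/tests -name 'test_*.py' | wc -l` command quoted there. `count_test_files` is a hypothetical helper name, not something in the repository, and the 352 and 348 totals come from the audit text rather than from running this code.

```python
from pathlib import Path


def count_test_files(root: str) -> int:
    """Count test_*.py files under root, recursively (hypothetical helper)."""
    # rglob descends into subdirectories, so read/test_*.py is included,
    # much like `find <root> -name 'test_*.py'` restricted to regular files.
    return sum(1 for p in Path(root).rglob("test_*.py") if p.is_file())
```

Called from the repository root as `count_test_files("xrspatial/geotiff/tests")` before and after the PR, the two results should differ by the net delta of -4 claimed above, assuming no other commits touch the directory in between.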
diff --git a/xrspatial/geotiff/tests/read/__init__.py b/xrspatial/geotiff/tests/read/__init__.py
new file mode 100644
index 000000000..e69de29bb
diff --git a/xrspatial/geotiff/tests/read/test_basic.py b/xrspatial/geotiff/tests/read/test_basic.py
new file mode 100644
index 000000000..c5ebe3795
--- /dev/null
+++ b/xrspatial/geotiff/tests/read/test_basic.py
@@ -0,0 +1,112 @@
+"""Minimal reader paths: band validation on the eager and dask backends.
+
+Consolidates the reader-side band-validation regression coverage
+(formerly ``test_band_validation_1673.py``). The contract is that every
+backend rejects out-of-range ``band`` arguments with the same typed
+``IndexError`` so callers see consistent diagnostics regardless of
+which path they pick.
+"""
+from __future__ import annotations
+
+import numpy as np
+import pytest
+import xarray as xr
+
+
+@pytest.fixture
+def multiband_tiff_path(tmp_path):
+    """4x6 three-band tiled tiff for band-validation tests."""
+    from xrspatial.geotiff import to_geotiff
+
+    arr = np.arange(72, dtype=np.float32).reshape(4, 6, 3)
+    da = xr.DataArray(
+        arr,
+        dims=['y', 'x', 'band'],
+        coords={
+            'y': np.array([0.5, 1.5, 2.5, 3.5]),
+            'x': np.array([0.5, 1.5, 2.5, 3.5, 4.5, 5.5]),
+            'band': [0, 1, 2],
+        },
+        attrs={'crs': 4326},
+    )
+    p = tmp_path / 'mb_band_validation.tif'
+    to_geotiff(da, str(p), tile_size=16)
+    return str(p), arr
+
+
+class TestBandValidationLocal:
+    """``read_to_array`` rejects out-of-range band indices."""
+
+    def test_negative_band_rejected(self, multiband_tiff_path):
+        """``band=-1`` no longer silently selects the last channel."""
+        from xrspatial.geotiff._reader import read_to_array
+
+        path, _ = multiband_tiff_path
+        with pytest.raises(IndexError, match="band=-1 out of range"):
+            read_to_array(path, band=-1)
+
+    def test_band_equal_to_samples_rejected(self, multiband_tiff_path):
+        """``band=samples_per_pixel`` (off-by-one) raises a typed error."""
+        from xrspatial.geotiff._reader import read_to_array
+
+        path, _ = multiband_tiff_path
+        with pytest.raises(IndexError, match="band=3 out of range"):
+            read_to_array(path, band=3)
+
+    def test_band_far_above_samples_rejected(self, multiband_tiff_path):
+        """A wildly out-of-range band index gives the same typed error."""
+        from xrspatial.geotiff._reader import read_to_array
+
+        path, _ = multiband_tiff_path
+        with pytest.raises(IndexError, match="band=103 out of range"):
+            read_to_array(path, band=103)
+
+    def test_valid_band_still_works(self, multiband_tiff_path):
+        """Valid band indices keep working after the validation guard."""
+        from xrspatial.geotiff._reader import read_to_array
+
+        path, arr = multiband_tiff_path
+        out, _ = read_to_array(path, band=1)
+        np.testing.assert_array_equal(out, arr[:, :, 1])
+
+    def test_band_none_returns_all_bands(self, multiband_tiff_path):
+        """``band=None`` still returns the full multi-band array."""
+        from xrspatial.geotiff._reader import read_to_array
+
+        path, arr = multiband_tiff_path
+        out, _ = read_to_array(path)
+        np.testing.assert_array_equal(out, arr)
+
+
+class TestBandValidationBackendParity:
+    """Local eager and dask paths agree on the rejection contract."""
+
+    def test_negative_band(self, multiband_tiff_path):
+        """Both paths raise the same error for ``band=-1``."""
+        from xrspatial.geotiff import read_geotiff_dask
+        from xrspatial.geotiff._reader import read_to_array
+
+        path, _ = multiband_tiff_path
+
+        with pytest.raises(IndexError) as eager_exc:
+            read_to_array(path, band=-1)
+        with pytest.raises(IndexError) as dask_exc:
+            read_geotiff_dask(path, chunks=4, band=-1)
+
+        assert "band=-1 out of range" in str(eager_exc.value)
+        assert "band=-1 out of range" in str(dask_exc.value)
+
+    def test_band_equal_to_samples(self, multiband_tiff_path):
+        """Both paths agree on the off-by-one rejection."""
+        from xrspatial.geotiff import read_geotiff_dask
+        from xrspatial.geotiff._reader import read_to_array
+
+        path, _ = multiband_tiff_path
+
+        with pytest.raises(IndexError) as eager_exc:
+            read_to_array(path, band=3)
+        with pytest.raises(IndexError) as dask_exc:
+            read_geotiff_dask(path, chunks=4, band=3)
+
+        assert "band=3 out of range" in str(eager_exc.value)
+        assert "band=3 out of range" in str(dask_exc.value)
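
The tests in `read/test_basic.py` pin down the rejection contract without showing the guard itself. A minimal sketch of such a guard follows, assuming a hypothetical `validate_band` helper; the real check lives somewhere in `xrspatial.geotiff._reader` and may be structured differently. Only the `band=<n> out of range` prefix is asserted by the tests, so the rest of the message here is an assumption.

```python
def validate_band(band, samples_per_pixel):
    """Reject band indices outside [0, samples_per_pixel).

    Hypothetical helper for illustration; not the library's implementation.
    """
    if band is None:
        # None means "return all bands", as in test_band_none_returns_all_bands.
        return None
    if not 0 <= band < samples_per_pixel:
        # Negative indices are rejected rather than wrapped, so band=-1
        # cannot silently select the last channel.
        raise IndexError(
            f"band={band} out of range for {samples_per_pixel} band(s)"
        )
    return band
```

Raising one exception type with one message prefix from a shared guard is what lets the eager and dask paths agree in `TestBandValidationBackendParity`.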
diff --git a/xrspatial/geotiff/tests/test_decompression_caps.py b/xrspatial/geotiff/tests/read/test_compression.py
similarity index 70%
rename from xrspatial/geotiff/tests/test_decompression_caps.py
rename to xrspatial/geotiff/tests/read/test_compression.py
index 5954c8e43..19e42d583 100644
--- a/xrspatial/geotiff/tests/test_decompression_caps.py
+++ b/xrspatial/geotiff/tests/read/test_compression.py
@@ -1,17 +1,22 @@
-"""Tests for decompression-bomb defenses (security finding S1).
-
-Each codec used by the TIFF reader (deflate, zstd, lz4, packbits) accepts an
-``expected_size`` argument and refuses to produce more than ~5% above that
-size before raising ``ValueError``.  Without these caps a small malicious
-TIFF could expand to many GB during decode and OOM the reader before the
-post-decode size check ran.
-
-Each end-to-end test here builds a minimal TIFF that declares a 1024x1024
-uint8 image (1 MiB of legitimate pixel data) and feeds in a strip whose
-decoded size is several MiB. That ratio is enough to trip the cap (~1.05
-MiB) without forcing the test process to allocate a multi-gigabyte
-payload host-side -- the audit's original 1024:1 framing was symbolic;
-what we actually verify is "compressed size << decoded size > cap".
+"""Reader compression-codec coverage.
+
+Consolidates:
+
+* ``test_compression.py`` -- codec round-trip unit tests for the
+  reader's decompression entry points (deflate, LZW, predictor encode /
+  decode, the dispatcher).
+* ``test_decompression_caps.py`` -- decompression-bomb defenses
+  (security finding S1) across deflate, ZSTD, LZ4, PackBits, LERC,
+  JPEG 2000, and JPEG.
+
+Several classes here (``TestDeflate``, ``TestLZW``, ``TestPredictor``,
+``TestDispatch``, the ``Test*Direct`` codec bomb classes, and the JPEG
+SOF cap class) call the codec functions directly rather than going
+through ``open_geotiff`` / ``read_to_array``. They live under ``read/``
+because the reader is the only consumer of those decode entry points;
+the writer side of the same codecs is exercised from PR 7's writer
+cluster. Future maintainers scanning ``read/`` should treat these as
+reader-internal codec coverage rather than end-to-end read paths.
 """
 from __future__ import annotations
 
@@ -22,8 +27,11 @@
 import numpy as np
 import pytest
 
-from xrspatial.geotiff._compression import (deflate_decompress, lz4_decompress, packbits_decompress,
-                                            zstd_decompress)
+from xrspatial.geotiff._compression import (COMPRESSION_DEFLATE, COMPRESSION_LZW, COMPRESSION_NONE,
+                                            compress, decompress, deflate_compress,
+                                            deflate_decompress, lz4_decompress, lzw_compress,
+                                            lzw_decompress, packbits_decompress,
+                                            predictor_decode, predictor_encode, zstd_decompress)
 from xrspatial.geotiff._reader import read_to_array
 
 
@@ -50,9 +58,123 @@ def _module_available(name: str) -> bool:
 
 
 # ---------------------------------------------------------------------------
-# Helpers
+# Codec round-trips (formerly test_compression.py)
 # ---------------------------------------------------------------------------
 
+
+class TestDeflate:
+    def test_round_trip(self):
+        data = b'hello world! ' * 100
+        compressed = deflate_compress(data)
+        assert compressed != data
+        assert deflate_decompress(compressed) == data
+
+    def test_empty(self):
+        compressed = deflate_compress(b'')
+        assert deflate_decompress(compressed) == b''
+
+    def test_binary_data(self):
+        data = bytes(range(256)) * 10
+        compressed = deflate_compress(data)
+        assert deflate_decompress(compressed) == data
+
+
+class TestLZW:
+    def test_round_trip_simple(self):
+        data = b'ABCABCABCABC'
+        compressed = lzw_compress(data)
+        decompressed = lzw_decompress(compressed, len(data))
+        assert decompressed.tobytes() == data
+
+    def test_round_trip_repetitive(self):
+        data = b'\x00' * 1000
+        compressed = lzw_compress(data)
+        decompressed = lzw_decompress(compressed, len(data))
+        assert decompressed.tobytes() == data
+
+    def test_round_trip_sequential(self):
+        data = bytes(range(256))
+        compressed = lzw_compress(data)
+        decompressed = lzw_decompress(compressed, len(data))
+        assert decompressed.tobytes() == data
+
+    def test_round_trip_random(self):
+        rng = np.random.RandomState(42)
+        data = bytes(rng.randint(0, 256, size=500, dtype=np.uint8))
+        compressed = lzw_compress(data)
+        decompressed = lzw_decompress(compressed, len(data))
+        assert decompressed.tobytes() == data
+
+    def test_round_trip_large(self):
+        rng = np.random.RandomState(123)
+        data = bytes(rng.randint(0, 256, size=10000, dtype=np.uint8))
+        compressed = lzw_compress(data)
+        decompressed = lzw_decompress(compressed, len(data))
+        assert decompressed.tobytes() == data
+
+    def test_empty(self):
+        compressed = lzw_compress(b'')
+        decompressed = lzw_decompress(compressed, 0)
+        assert decompressed.tobytes() == b''
+
+
+class TestPredictor:
+    def test_round_trip_uint8(self):
+        # 4x4 image, 1 byte per sample
+        data = np.array([10, 20, 30, 40, 50, 60, 70, 80,
+                         90, 100, 110, 120, 130, 140, 150, 160],
+                        dtype=np.uint8)
+        encoded = predictor_encode(data.copy(), 4, 4, 1)
+        decoded = predictor_decode(encoded.copy(), 4, 4, 1)
+        np.testing.assert_array_equal(decoded, data)
+
+    def test_round_trip_float32(self):
+        # 2x3 image, 4 bytes per sample
+        arr = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], dtype=np.float32)
+        raw = np.frombuffer(arr.tobytes(), dtype=np.uint8).copy()
+        encoded = predictor_encode(raw.copy(), 3, 2, 4)
+        decoded = predictor_decode(encoded.copy(), 3, 2, 4)
+        np.testing.assert_array_equal(decoded, raw)
+
+    def test_predictor_encode_differences(self):
+        # First pixel unchanged, rest are differences
+        data = np.array([10, 20, 30, 40], dtype=np.uint8)
+        encoded = predictor_encode(data.copy(), 4, 1, 1)
+        assert encoded[0] == 10
+        assert encoded[1] == 10  # 20 - 10
+        assert encoded[2] == 10  # 30 - 20
+        assert encoded[3] == 10  # 40 - 30
+
+
+class TestDispatch:
+    def test_none(self):
+        data = b'hello'
+        assert decompress(data, COMPRESSION_NONE).tobytes() == data
+        assert compress(data, COMPRESSION_NONE) == data
+
+    def test_deflate(self):
+        data = b'test data ' * 50
+        compressed = compress(data, COMPRESSION_DEFLATE)
+        assert decompress(compressed, COMPRESSION_DEFLATE).tobytes() == data
+
+    def test_lzw(self):
+        data = b'ABCABC' * 20
+        compressed = compress(data, COMPRESSION_LZW)
+        decompressed = decompress(compressed, COMPRESSION_LZW, len(data))
+        assert decompressed.tobytes() == data
+
+    def test_unsupported(self):
+        with pytest.raises(ValueError, match="Unsupported compression"):
+            decompress(b'', 99)
+        with pytest.raises(ValueError, match="Unsupported compression"):
+            compress(b'', 99)
+
+
+# ---------------------------------------------------------------------------
+# Decompression-bomb caps (formerly test_decompression_caps.py)
+# ---------------------------------------------------------------------------
+
+
 def _build_tiff_with_strip(strip_bytes: bytes, *, compression: int,
                            width: int, height: int) -> bytes:
     """Build a minimal little-endian uint8 TIFF with one strip of opaque bytes.
@@ -64,9 +186,6 @@ def _build_tiff_with_strip(strip_bytes: bytes, *, compression: int,
     """
     bo = '<'
 
-    # Tags: (tag_id, type_id, count, value_or_offset_bytes_4)
-    # We keep every tag's value inline so the TIFF body order is:
-    #   header(8) | IFD | strip
     tags = []
 
     def add_short(tag, val):
@@ -88,11 +207,10 @@ def add_long(tag, val):
 
     tags.sort(key=lambda t: t[0])
     num_entries = len(tags)
-    ifd_size = 2 + 12 * num_entries + 4  # count + entries + next-IFD
+    ifd_size = 2 + 12 * num_entries + 4
     ifd_start = 8
     strip_offset = ifd_start + ifd_size
 
-    # Patch StripOffsets (tag 273) with the real strip location.
     patched = []
     for tag_id, typ, count, raw in tags:
         if tag_id == 273:
@@ -101,23 +219,18 @@ def add_long(tag, val):
     tags = patched
 
     out = bytearray()
-    out += b'II'                                      # little-endian
-    out += struct.pack(f'{bo}H', 42)                  # magic
-    out += struct.pack(f'{bo}I', ifd_start)           # offset to IFD0
-    out += struct.pack(f'{bo}H', num_entries)         # IFD entry count
+    out += b'II'
+    out += struct.pack(f'{bo}H', 42)
+    out += struct.pack(f'{bo}I', ifd_start)
+    out += struct.pack(f'{bo}H', num_entries)
     for tag_id, typ, count, raw in tags:
         out += struct.pack(f'{bo}HHI', tag_id, typ, count)
-        # Pad raw to 4 bytes (all our tags fit inline).
         out += raw.ljust(4, b'\x00')
-    out += struct.pack(f'{bo}I', 0)                    # no next IFD
+    out += struct.pack(f'{bo}I', 0)
     out += strip_bytes
     return bytes(out)
 
 
-# ---------------------------------------------------------------------------
-# Codec-level direct tests
-# ---------------------------------------------------------------------------
-
 # Direct-codec bomb size: 4 MiB of zeros, well above the 1 KiB cap used
 # in those tests but small enough to keep CI host allocations under
 # control. 4 MiB / 1 KiB ~ 4000:1 is still bomb-shaped territory; the
@@ -141,14 +254,8 @@ def test_deflate_legitimate_passes(self):
         assert out == data
 
     def test_deflate_no_cap_when_expected_size_zero(self):
-        """Backward-compat: ``expected_size=0`` (default) disables the cap.
-
-        Callers that haven't been updated to supply a size must keep
-        getting the unbounded library decode -- otherwise a default
-        cap would silently break them. Round-tripping data larger than
-        any plausible cap proves the disable path is intact.
-        """
-        data = b'A' * (256 * 1024)  # 256 KiB, well above any cap default
+        """Backward-compat: ``expected_size=0`` (default) disables the cap."""
+        data = b'A' * (256 * 1024)
         comp = zlib.compress(data, 9)
         out = deflate_decompress(comp)  # no expected_size kwarg
         assert out == data
@@ -168,8 +275,6 @@ def test_packbits_legitimate_passes(self):
         assert out == b'ABCD'
 
     def test_packbits_no_cap_when_expected_size_zero(self):
-        # Same literal-run pattern, no expected_size argument: the
-        # backward-compat path must skip the cap and decode in full.
         data = b'\x03ABCD' * 1024
         out = packbits_decompress(data)
         assert out == b'ABCD' * 1024
@@ -224,15 +329,13 @@ def test_lz4_no_cap_when_expected_size_zero(self):
 
 
 # ---------------------------------------------------------------------------
-# End-to-end TIFF tests (audit reproducer shape)
+# End-to-end TIFF bomb tests (audit reproducer shape)
 # ---------------------------------------------------------------------------
 
 # 1024 x 1024 uint8 = 1 MiB declared image, cap is ~1.05 MiB. We feed a
 # strip whose decoded size is 8 MiB. The cap is exceeded by ~7x, which is
 # enough to prove the codec rejects rather than silently truncates, while
 # keeping the test's host-side allocation small enough for any CI runner.
-# (The audit's original 1024:1 framing was symbolic; the defense fires the
-# moment decoded > cap, not at any specific ratio.)
 _DECLARED_W = 1024
 _DECLARED_H = 1024
 _DECLARED_BYTES = _DECLARED_W * _DECLARED_H  # 1 MiB
@@ -271,24 +374,16 @@ def test_lz4_bomb_rejected(tmp_path):
     import lz4.frame
     payload = b'\x00' * _BOMB_BYTES
     strip = lz4.frame.compress(payload)
-    # LZ4 has a higher floor than deflate/zstd for runs of zeros, but
-    # still well below the bomb size.
     assert len(strip) < _BOMB_BYTES // 4
     tiff = _build_tiff_with_strip(strip, compression=50004,
                                   width=_DECLARED_W, height=_DECLARED_H)
     path = tmp_path / "lz4_bomb.tif"
     path.write_bytes(tiff)
-    # LZ4 is the Experimental read tier (PR 4 of epic #2340); pass the
-    # opt-in so the test exercises the bomb cap rather than the codec
-    # gate.
     with pytest.raises(ValueError, match="exceed"):
         read_to_array(str(path), allow_experimental_codecs=True)
 
 
 def test_packbits_bomb_rejected(tmp_path):
-    # Packbits "repeat next byte 128 times" header is 0x81 0x00 (2 bytes).
-    # We declare 1024x1024=1 MiB image but supply a 2 MiB strip that
-    # decodes to 128 MiB.  The cap should fire long before allocation.
     strip = b'\x81\x00' * (1024 * 1024)
     tiff = _build_tiff_with_strip(strip, compression=32773,
                                   width=_DECLARED_W, height=_DECLARED_H)
@@ -302,11 +397,11 @@ def test_packbits_bomb_rejected(tmp_path):
 # Negative tests: legitimate high-ratio compression must still pass
 # ---------------------------------------------------------------------------
 
+
 def test_legitimate_high_compression_passes(tmp_path):
     """All-zero array compresses to a fraction of declared size — must pass."""
     arr = np.zeros((_DECLARED_H, _DECLARED_W), dtype=np.uint8)
     strip = zlib.compress(arr.tobytes(), 9)
-    # Confirm we actually have a high ratio (not a degenerate test).
     assert len(strip) < _DECLARED_BYTES // 50
     tiff = _build_tiff_with_strip(strip, compression=8,
                                   width=_DECLARED_W, height=_DECLARED_H)
@@ -319,14 +414,8 @@ def test_legitimate_high_compression_passes(tmp_path):
 
 
 def test_cap_includes_metadata_margin():
-    """The cap allows ~5% of legitimate codec metadata above expected size.
-
-    Some encoders emit small framing or trailing bytes; the cap must not
-    reject them.  We feed a payload exactly at expected_size + a few bytes
-    and confirm it decodes.
-    """
+    """The cap allows ~5% of legitimate codec metadata above expected size."""
     expected = 1000
-    # Decompressed size: expected + 30 (3% over).  Within the 5% margin.
     data = b'A' * (expected + 30)
     comp = zlib.compress(data, 9)
     out = deflate_decompress(comp, expected_size=expected)
@@ -334,31 +423,19 @@ def test_cap_includes_metadata_margin():
 
 
 # ---------------------------------------------------------------------------
-# LERC and JPEG 2000 codec-level bomb tests (issue #1625)
+# LERC and JPEG 2000 codec-level bomb tests
 # ---------------------------------------------------------------------------
-#
-# LERC and JPEG 2000 use external libraries (lerc / glymur) that materialise
-# the full decoded buffer before returning, so the existing post-decode
-# size check in ``_decode_strip_or_tile`` fires only after the bomb is
-# already in memory. The wrappers in ``_compression.py`` now query each
-# codestream's declared dimensions (LERC via ``getLercBlobInfo``, JPEG 2000
-# via ``Jp2k.shape``) and raise before invoking the underlying decoder.
+
 
 @pytest.mark.skipif(not _HAS_LERC, reason="lerc not installed")
 class TestLercDirect:
     def test_lerc_bomb_raises(self):
-        """A LERC blob whose declared dimensions exceed the cap must raise.
-
-        Constant-value rasters compress at >700,000:1 in LERC, so a
-        4096x4096 float32 (64 MiB) encodes to ~94 bytes. The cap is set to
-        1 KiB, well below the declared 64 MiB, and the wrapper must reject
-        the blob before ``lerc.decode`` allocates the output buffer.
-        """
+        """A LERC blob whose declared dimensions exceed the cap must raise."""
         import lerc
         arr = np.zeros((4096, 4096), dtype=np.float32)
         encoded = lerc.encode(arr, 1, False, None, 0.0, 1)
         blob = bytes(encoded[2])
-        assert len(blob) < 1024  # confirm this is a real high-ratio blob
+        assert len(blob) < 1024
         from xrspatial.geotiff._compression import lerc_decompress
         with pytest.raises(ValueError, match="exceed"):
             lerc_decompress(blob, expected_size=1024)
@@ -381,7 +458,7 @@ def test_lerc_no_cap_when_expected_size_zero(self):
         encoded = lerc.encode(arr, 1, False, None, 0.0, 1)
         blob = bytes(encoded[2])
         from xrspatial.geotiff._compression import lerc_decompress
-        out = lerc_decompress(blob)  # no expected_size
+        out = lerc_decompress(blob)
         decoded = np.frombuffer(out, dtype=np.float32).reshape(128, 128)
         assert decoded.shape == arr.shape
 
@@ -389,24 +466,16 @@ def test_lerc_no_cap_when_expected_size_zero(self):
 @pytest.mark.skipif(not _HAS_GLYMUR, reason="glymur not installed")
 class TestJpeg2000Direct:
     def test_jpeg2000_bomb_raises(self, tmp_path):
-        """A JPEG 2000 codestream whose declared shape exceeds the cap raises.
-
-        Glymur reports ``Jp2k(file).shape`` from the SIZ marker without
-        triggering pixel decoding, so the wrapper validates the declared
-        ``H * W * dtype_bytes`` against the bomb cap before calling
-        ``jp2[:]``.
-        """
+        """A JPEG 2000 codestream whose declared shape exceeds the cap raises."""
         import glymur
 
-        # Build a real 2000x2000 uint8 codestream (~150 bytes for zeros).
         arr = np.zeros((2000, 2000), dtype=np.uint8)
         tmp = tmp_path / "src.j2k"
         glymur.Jp2k(str(tmp), data=arr)
         blob = tmp.read_bytes()
-        assert len(blob) < 10_000  # confirm high ratio
+        assert len(blob) < 10_000
         from xrspatial.geotiff._compression import jpeg2000_decompress
         with pytest.raises(ValueError, match="exceed"):
-            # declared output 4 MiB, cap 1 KiB
             jpeg2000_decompress(
                 blob, width=2000, height=2000, samples=1,
                 expected_size=1024)
@@ -415,8 +484,6 @@ def test_jpeg2000_legitimate_passes(self, tmp_path):
         """A JPEG 2000 blob whose declared output matches expected_size passes."""
         import glymur
 
-        # Use a 64x64 raster: large enough for the default 6-resolution
-        # OpenJPEG pyramid without tripping its min-tile-size check.
         arr = (np.arange(64 * 64, dtype=np.uint8) % 200).reshape(64, 64)
         tmp = tmp_path / "legit.j2k"
         glymur.Jp2k(str(tmp), data=arr)
@@ -442,14 +509,7 @@ def test_jpeg2000_no_cap_when_expected_size_zero(self, tmp_path):
 
     def test_jpeg2000_unreadable_shape_fails_closed(
             self, tmp_path, monkeypatch):
-        """If the SIZ marker is unreadable, refuse to call ``jp2[:]``.
-
-        Earlier the wrapper silently disabled the cap on
-        ``Jp2k.shape``/``dtype`` failure, which would let an attacker
-        bypass the bomb guard with a malformed-but-decodable
-        codestream.  The current behaviour is fail-closed: raise
-        ``ValueError`` before any pixel-decoding work runs.
-        """
+        """If the SIZ marker is unreadable, refuse to call ``jp2[:]``."""
         import glymur
         arr = np.zeros((64, 64), dtype=np.uint8)
         tmp = tmp_path / "broken.j2k"
@@ -479,31 +539,15 @@ def __getitem__(self, _):
 
 
 # ---------------------------------------------------------------------------
-# JPEG (issue #1792)
+# JPEG SOF cap
 # ---------------------------------------------------------------------------
-#
-# Pillow has its own DecompressionBombError, but it fires only at
-# ~178M pixels (~500 MB RGB). A malicious TIFF can declare a small tile
-# (e.g. 256x256 RGB, ~196 KiB expected) while shipping a JPEG payload
-# whose SOF marker declares a much larger image; that lets ~500 MB
-# allocate per tile before the downstream chunk.size != expected
-# reshape check fires. ``jpeg_decompress`` now parses the JPEG SOF
-# marker and raises before handing the blob to Pillow when the
-# declared output exceeds ``expected * 1.05 + 1`` bytes. See
-# https://github.com/xarray-contrib/xarray-spatial/issues/1792 .
+
 
 def _forge_jpeg_with_sof_dimensions(real_h: int, real_w: int,
                                     real_c: int,
                                     declared_h: int,
                                     declared_w: int) -> bytes:
-    """Build a real JPEG and rewrite the SOF marker's H/W fields.
-
-    Pillow encodes a small valid image so the bytestream is a complete
-    JPEG; we then overwrite the height/width fields in the SOF segment
-    so the SOF claims a much larger image than the payload can decode.
-    The decoder never gets the chance to fail on the mismatch because
-    the pre-decode cap fires first -- which is the property under test.
-    """
+    """Build a real JPEG and rewrite the SOF marker's H/W fields."""
     import io
 
     from PIL import Image
@@ -512,7 +556,6 @@ def _forge_jpeg_with_sof_dimensions(real_h: int, real_w: int,
     buf = io.BytesIO()
     img.save(buf, format='JPEG', quality=75)
     data = bytearray(buf.getvalue())
-    # Find SOF0..SOF3,SOF5..SOF7,SOF9..SOF11,SOF13..SOF15 marker.
     sof_codes = {
         0xC0, 0xC1, 0xC2, 0xC3, 0xC5, 0xC6, 0xC7,
         0xC9, 0xCA, 0xCB, 0xCD, 0xCE, 0xCF,
@@ -524,7 +567,6 @@ def _forge_jpeg_with_sof_dimensions(real_h: int, real_w: int,
             raise AssertionError("forged JPEG lost marker alignment")
         marker = data[i + 1]
         if marker in sof_codes:
-            # SOF: 0xFF Cx | len(2) | precision(1) | H(2) | W(2) | components(1)
             data[i + 5] = (declared_h >> 8) & 0xFF
             data[i + 6] = declared_h & 0xFF
             data[i + 7] = (declared_w >> 8) & 0xFF
@@ -541,24 +583,13 @@ def _forge_jpeg_with_sof_dimensions(real_h: int, real_w: int,
 @pytest.mark.skipif(not _HAS_PILLOW, reason="Pillow not installed")
 class TestJpegDirect:
     def test_jpeg_bomb_raises(self):
-        """A JPEG whose SOF dimensions exceed the per-tile cap must raise.
-
-        The forged JPEG payload itself is small (a real 16x16 image with
-        the SOF marker rewritten to declare 8000x8000x3). The wrapper
-        is given a per-tile expected size of 32x32x3 = 3072 bytes; the
-        declared output 8000*8000*3 = 192_000_000 bytes is well above
-        the ``3072 * 1.05 + 1`` cap and the decode must be refused.
-        """
+        """A JPEG whose SOF dimensions exceed the per-tile cap must raise."""
         blob = _forge_jpeg_with_sof_dimensions(
             real_h=16, real_w=16, real_c=3,
             declared_h=8000, declared_w=8000,
         )
         from xrspatial.geotiff._compression import jpeg_decompress
 
-        # Match the full diagnostic so a regression that swaps in a
-        # different error path (e.g. Pillow's own DecompressionBombError
-        # with a different wording, or a numeric overflow before the
-        # explicit guard) fails the test instead of silently passing.
         with pytest.raises(
             ValueError,
             match=r"jpeg decode would exceed.*Likely a decompression bomb",
@@ -579,20 +610,11 @@ def test_jpeg_legitimate_passes(self):
         assert arr.shape == (32, 32, 3)
 
     def test_jpeg_no_cap_when_size_kwargs_default(self):
-        """Backward-compat: omitting size kwargs falls back to Pillow's guard.
-
-        Direct callers and round-trip tests pass no dimensions; the
-        pre-check must be a no-op so those keep working.
-        """
+        """Backward-compat: omitting size kwargs falls back to Pillow's guard."""
         blob = _forge_jpeg_with_sof_dimensions(
             real_h=16, real_w=16, real_c=3,
             declared_h=64, declared_w=64,
         )
-        # With no dimension kwargs, the cap is disabled. The forged JPEG
-        # declares 64x64 but encodes only 16x16 of payload -- libjpeg
-        # raises on the truncation; the bomb cap is what we're checking
-        # is *not* the source of any exception here. Catch whatever
-        # Pillow raises and assert it isn't our bomb message.
         from PIL import Image as _Img  # noqa: F401
 
         from xrspatial.geotiff._compression import jpeg_decompress
@@ -601,42 +623,20 @@ def test_jpeg_no_cap_when_size_kwargs_default(self):
         except ValueError as exc:
             assert "Likely a decompression bomb" not in str(exc)
         except Exception:
-            # libjpeg/Pillow errors are acceptable -- we only care that
-            # the bomb cap did not fire.
             pass
 
     def test_jpeg_malformed_falls_through_to_pillow(self):
-        """A JPEG without a parseable SOF defers to Pillow's own guard.
-
-        We don't want the pre-check to misclassify weird-but-valid
-        streams; if the helper can't read the SOF it should return
-        ``None`` and let Pillow raise its own error.
-        """
-        # SOI followed by EOI -- a syntactically valid but empty stream
-        # with no SOF marker.
+        """A JPEG without a parseable SOF defers to Pillow's own guard."""
         blob = bytes([0xFF, 0xD8, 0xFF, 0xD9])
         from xrspatial.geotiff._compression import jpeg_decompress
 
-        # No SOF -> bomb cap returns None -> Pillow raises on the empty
-        # stream.
         with pytest.raises(Exception):
             jpeg_decompress(blob, width=32, height=32, samples=3)
 
     def test_jpeg_sof_with_truncated_segment_length_returns_none(self):
-        """A SOF segment whose declared length runs past EOF returns None.
-
-        Without segment-length validation, ``_read_jpeg_sof`` would happily
-        read height/width/components at fixed offsets even when those
-        offsets pointed past the segment. The pre-check now demands
-        ``seg_len >= 8`` and ``i + 2 + seg_len <= n`` before reading;
-        truncated SOFs are treated as "unknown size" and the bomb cap
-        defers to Pillow.
-        """
+        """A SOF segment whose declared length runs past EOF returns None."""
         from xrspatial.geotiff._compression import _read_jpeg_sof
 
-        # SOI | SOF0 | seg_len=64 (advertises 64 bytes of segment, but
-        # the buffer ends after only 10 bytes of segment payload).
-        # Truncation -> _read_jpeg_sof must return None.
         truncated = bytes([
             0xFF, 0xD8,                  # SOI
             0xFF, 0xC0,                  # SOF0
diff --git a/xrspatial/geotiff/tests/read/test_coords.py b/xrspatial/geotiff/tests/read/test_coords.py
new file mode 100644
index 000000000..0fe2f32b4
--- /dev/null
+++ b/xrspatial/geotiff/tests/read/test_coords.py
@@ -0,0 +1,129 @@
+"""Coordinate / geotransform reconstruction on read.
+
+Consolidates the descending / ascending coord round-trip coverage
+formerly in ``test_descending_coords_1716.py``. The reader has to
+reconstruct the original axis direction from the file's
+``ModelTransformationTag`` (34264) when the writer chose a non-standard
+orientation, so the round-trip check pins both halves of the contract.
+"""
+from __future__ import annotations
+
+import numpy as np
+import xarray as xr
+
+from xrspatial.geotiff import open_geotiff, to_geotiff
+from xrspatial.geotiff._geotags import (TAG_MODEL_PIXEL_SCALE, TAG_MODEL_TIEPOINT,
+                                        TAG_MODEL_TRANSFORMATION)
+from xrspatial.geotiff._header import parse_all_ifds, parse_header
+
+
+def _ifd_tag_ids(path: str) -> set[int]:
+    with open(path, 'rb') as fh:
+        data = fh.read()
+    header = parse_header(data)
+    ifds = parse_all_ifds(data, header)
+    return set(ifds[0].entries.keys())
+
+
+def _make_da(x_coords: np.ndarray, y_coords: np.ndarray) -> xr.DataArray:
+    arr = np.arange(len(y_coords) * len(x_coords), dtype=np.float32)
+    arr = arr.reshape(len(y_coords), len(x_coords))
+    return xr.DataArray(
+        arr,
+        dims=('y', 'x'),
+        coords={'y': y_coords, 'x': x_coords},
+    )
+
+
+class TestDescendingCoordsRoundTrip:
+    """Round-trip read of non-standard-orientation rasters."""
+
+    def test_descending_x_roundtrip(self, tmp_path):
+        """Descending x coords survive the round trip."""
+        # x decreases left-to-right (unusual but valid)
+        x = np.array([200.0, 190.0, 180.0, 170.0, 160.0], dtype=np.float64)
+        y = np.array([50.0, 40.0, 30.0, 20.0], dtype=np.float64)  # north-up
+        da = _make_da(x, y)
+
+        out = tmp_path / 'desc_x.tif'
+        to_geotiff(da, str(out), crs=4326)
+
+        loaded = open_geotiff(str(out))
+        np.testing.assert_allclose(loaded.coords['x'].values, x)
+        np.testing.assert_allclose(loaded.coords['y'].values, y)
+        np.testing.assert_array_equal(loaded.values, da.values)
+
+    def test_ascending_y_roundtrip(self, tmp_path):
+        """Ascending y coords survive the round trip."""
+        x = np.array([160.0, 170.0, 180.0, 190.0, 200.0], dtype=np.float64)
+        # y increases top-to-bottom (south-up)
+        y = np.array([20.0, 30.0, 40.0, 50.0], dtype=np.float64)
+        da = _make_da(x, y)
+
+        out = tmp_path / 'asc_y.tif'
+        to_geotiff(da, str(out), crs=4326)
+
+        loaded = open_geotiff(str(out))
+        np.testing.assert_allclose(loaded.coords['x'].values, x)
+        np.testing.assert_allclose(loaded.coords['y'].values, y)
+        np.testing.assert_array_equal(loaded.values, da.values)
+
+    def test_descending_x_and_ascending_y_roundtrip(self, tmp_path):
+        """Both axes flipped relative to north-up."""
+        x = np.array([200.0, 190.0, 180.0, 170.0, 160.0], dtype=np.float64)
+        y = np.array([20.0, 30.0, 40.0, 50.0], dtype=np.float64)
+        da = _make_da(x, y)
+
+        out = tmp_path / 'desc_x_asc_y.tif'
+        to_geotiff(da, str(out), crs=4326)
+
+        loaded = open_geotiff(str(out))
+        np.testing.assert_allclose(loaded.coords['x'].values, x)
+        np.testing.assert_allclose(loaded.coords['y'].values, y)
+        np.testing.assert_array_equal(loaded.values, da.values)
+
+
+class TestOrientationTagSelection:
+    """The writer picks the right tags for the orientation; the reader
+    has to be able to read either flavour."""
+
+    def test_north_up_uses_pixel_scale_and_tiepoint(self, tmp_path):
+        """North-up keeps ModelPixelScale + ModelTiepoint (no transformation)."""
+        x = np.array([160.0, 170.0, 180.0, 190.0, 200.0], dtype=np.float64)
+        y = np.array([50.0, 40.0, 30.0, 20.0], dtype=np.float64)
+        da = _make_da(x, y)
+
+        out = tmp_path / 'north_up.tif'
+        to_geotiff(da, str(out), crs=4326)
+
+        tag_ids = _ifd_tag_ids(str(out))
+        assert TAG_MODEL_PIXEL_SCALE in tag_ids
+        assert TAG_MODEL_TIEPOINT in tag_ids
+        assert TAG_MODEL_TRANSFORMATION not in tag_ids
+
+    def test_descending_x_uses_transformation_tag(self, tmp_path):
+        """Non-standard orientation emits ModelTransformationTag."""
+        x = np.array([200.0, 190.0, 180.0, 170.0, 160.0], dtype=np.float64)
+        y = np.array([50.0, 40.0, 30.0, 20.0], dtype=np.float64)
+        da = _make_da(x, y)
+
+        out = tmp_path / 'desc_x_tags.tif'
+        to_geotiff(da, str(out), crs=4326)
+
+        tag_ids = _ifd_tag_ids(str(out))
+        assert TAG_MODEL_TRANSFORMATION in tag_ids
+        assert TAG_MODEL_PIXEL_SCALE not in tag_ids
+        assert TAG_MODEL_TIEPOINT not in tag_ids
+
+    def test_ascending_y_uses_transformation_tag(self, tmp_path):
+        x = np.array([160.0, 170.0, 180.0, 190.0, 200.0], dtype=np.float64)
+        y = np.array([20.0, 30.0, 40.0, 50.0], dtype=np.float64)
+        da = _make_da(x, y)
+
+        out = tmp_path / 'asc_y_tags.tif'
+        to_geotiff(da, str(out), crs=4326)
+
+        tag_ids = _ifd_tag_ids(str(out))
+        assert TAG_MODEL_TRANSFORMATION in tag_ids
+        assert TAG_MODEL_PIXEL_SCALE not in tag_ids
+        assert TAG_MODEL_TIEPOINT not in tag_ids
diff --git a/xrspatial/geotiff/tests/read/test_dtypes.py b/xrspatial/geotiff/tests/read/test_dtypes.py
new file mode 100644
index 000000000..3de820ee9
--- /dev/null
+++ b/xrspatial/geotiff/tests/read/test_dtypes.py
@@ -0,0 +1,530 @@
+"""Reader dtype handling.
+
+Consolidates:
+
+* ``test_dtype_read.py`` -- ``dtype=`` kwarg on ``open_geotiff`` (eager
+  + dask, float -> float / int -> int casts, float -> int rejection).
+* ``test_float16_read_1941.py`` -- IEEE half-precision auto-promotion to
+  float32 on read (eager + dask).
+* ``test_float16_read_gpu_1941.py`` -- the same float16 promotion on
+  ``read_geotiff_gpu`` and ``open_geotiff(gpu=True)``.
+"""
+from __future__ import annotations
+
+import numpy as np
+import pytest
+import xarray as xr
+
+from xrspatial.geotiff import open_geotiff, read_geotiff_dask, to_geotiff
+from xrspatial.geotiff._dtypes import (SAMPLE_FORMAT_FLOAT, SAMPLE_FORMAT_INT, SAMPLE_FORMAT_UINT,
+                                       tiff_dtype_to_numpy, tiff_storage_dtype)
+
+from .._helpers.markers import requires_gpu as _gpu_only
+
+
+# ---------------------------------------------------------------------------
+# Fixtures
+# ---------------------------------------------------------------------------
+
+
+@pytest.fixture
+def float64_tif(tmp_path):
+    """Write a float64 GeoTIFF for dtype cast tests."""
+    arr = np.random.default_rng(99).random((80, 80)).astype(np.float64)
+    y = np.linspace(40.0, 41.0, 80)
+    x = np.linspace(-105.0, -104.0, 80)
+    da = xr.DataArray(arr, dims=['y', 'x'],
+                      coords={'y': y, 'x': x},
+                      attrs={'crs': 4326})
+    path = str(tmp_path / 'dtype_f64.tif')
+    to_geotiff(da, path, compression='none')
+    return path, arr
+
+
+@pytest.fixture
+def uint16_tif(tmp_path):
+    """Write a uint16 GeoTIFF for dtype cast tests."""
+    arr = np.random.default_rng(77).integers(0, 10000, (60, 60),
+                                             dtype=np.uint16)
+    y = np.linspace(40.0, 41.0, 60)
+    x = np.linspace(-105.0, -104.0, 60)
+    da = xr.DataArray(arr, dims=['y', 'x'],
+                      coords={'y': y, 'x': x},
+                      attrs={'crs': 4326})
+    path = str(tmp_path / 'dtype_u16.tif')
+    to_geotiff(da, path, compression='none')
+    return path, arr
+
+
+@pytest.fixture
+def float16_tif(tmp_path):
+    """Write a small float16 GeoTIFF using tifffile.
+
+    tifffile encodes numpy float16 with ``BitsPerSample=16`` and
+    ``SampleFormat=3``, which is what an external rasterio / GDAL caller
+    would produce.
+    """
+    tifffile = pytest.importorskip("tifffile")
+    arr = np.array(
+        [[0.0, 1.0, 2.0, 3.0],
+         [-1.0, -2.0, -3.0, -4.0],
+         [0.5, 1.5, 2.5, 3.5],
+         [100.0, 200.0, 300.0, 400.0]],
+        dtype=np.float16,
+    )
+    path = tmp_path / "f16.tif"
+    tifffile.imwrite(str(path), arr, compression=None)
+    return path, arr
+
+
+@pytest.fixture
+def float16_stripped_tif(tmp_path):
+    """Stripped float16 GeoTIFF: triggers the bps_mismatch CPU fallback."""
+    tifffile = pytest.importorskip("tifffile")
+    arr = np.array(
+        [[0.0, 1.0, 2.0, 3.0],
+         [-1.0, -2.0, -3.0, -4.0],
+         [0.5, 1.5, 2.5, 3.5],
+         [100.0, 200.0, 300.0, 400.0]],
+        dtype=np.float16,
+    )
+    path = tmp_path / "f16_stripped.tif"
+    tifffile.imwrite(str(path), arr, compression=None)
+    return path, arr
+
+
+@pytest.fixture
+def float16_tiled_tif(tmp_path):
+    """Multi-tile float16 GeoTIFF: 32x32 image, 16x16 tiles (2x2 grid)."""
+    tifffile = pytest.importorskip("tifffile")
+    arr = np.arange(1024, dtype=np.float16).reshape(32, 32)
+    path = tmp_path / "f16_tiled.tif"
+    tifffile.imwrite(
+        str(path), arr, compression="deflate", tile=(16, 16))
+    return path, arr
+
+
+@pytest.fixture
+def float16_tiled_uncompressed_tif(tmp_path):
+    """Tiled uncompressed float16 GeoTIFF."""
+    tifffile = pytest.importorskip("tifffile")
+    arr = np.arange(256, dtype=np.float16).reshape(16, 16)
+    path = tmp_path / "f16_tiled_none.tif"
+    tifffile.imwrite(
+        str(path), arr, compression=None, tile=(16, 16))
+    return path, arr
+
+
+# ---------------------------------------------------------------------------
+# dtype= kwarg on open_geotiff (eager)
+# ---------------------------------------------------------------------------
+
+
+class TestDtypeEager:
+    def test_float64_to_float32(self, float64_tif):
+        path, orig = float64_tif
+        result = open_geotiff(path, dtype='float32')
+        assert result.dtype == np.float32
+        np.testing.assert_array_almost_equal(
+            result.values, orig.astype(np.float32), decimal=6)
+
+    def test_float64_to_float16(self, float64_tif):
+        path, _ = float64_tif
+        result = open_geotiff(path, dtype=np.float16)
+        assert result.dtype == np.float16
+
+    def test_uint16_to_int32(self, uint16_tif):
+        path, orig = uint16_tif
+        result = open_geotiff(path, dtype='int32')
+        assert result.dtype == np.int32
+        np.testing.assert_array_equal(result.values, orig.astype(np.int32))
+
+    def test_uint16_to_uint8(self, uint16_tif):
+        path, _ = uint16_tif
+        result = open_geotiff(path, dtype='uint8')
+        assert result.dtype == np.uint8
+
+    def test_float_to_int_raises(self, float64_tif):
+        path, _ = float64_tif
+        with pytest.raises(ValueError, match='float.*int'):
+            open_geotiff(path, dtype='int32')
+
+    def test_dtype_none_preserves_native(self, float64_tif):
+        path, _ = float64_tif
+        result = open_geotiff(path, dtype=None)
+        assert result.dtype == np.float64
+
+    def test_int_with_nodata_float_to_int_raises(self, tmp_path):
+        """uint16 file with nodata: nodata masking promotes to float64, so float->int validation fires."""  # noqa: E501
+        arr = np.array([[1, 2], [3, 9999]], dtype=np.uint16)
+        y = np.linspace(40.0, 41.0, 2)
+        x = np.linspace(-105.0, -104.0, 2)
+        da = xr.DataArray(arr, dims=['y', 'x'],
+                          coords={'y': y, 'x': x},
+                          attrs={'crs': 4326, 'nodata': 9999.0})
+        path = str(tmp_path / 'dtype_nodata_int_eager.tif')
+        to_geotiff(da, path, compression='none')
+        with pytest.raises(ValueError, match='float.*int'):
+            open_geotiff(path, dtype='int32')
+
+
+# ---------------------------------------------------------------------------
+# dtype= kwarg on open_geotiff (dask)
+# ---------------------------------------------------------------------------
+
+
+class TestDtypeDask:
+    def test_float64_to_float32_dask(self, float64_tif):
+        path, orig = float64_tif
+        result = open_geotiff(path, dtype='float32', chunks=40)
+        assert result.dtype == np.float32
+        computed = result.values
+        np.testing.assert_array_almost_equal(
+            computed, orig.astype(np.float32), decimal=6)
+
+    def test_chunks_are_target_dtype(self, float64_tif):
+        path, _ = float64_tif
+        result = open_geotiff(path, dtype='float32', chunks=40)
+        assert result.data.dtype == np.float32
+
+    def test_float_to_int_raises_dask(self, float64_tif):
+        path, _ = float64_tif
+        with pytest.raises(ValueError, match='float.*int'):
+            open_geotiff(path, dtype='int32', chunks=40)
+
+    def test_int_with_nodata_float_to_int_raises_dask(self, tmp_path):
+        """uint16 file with nodata: nodata masking promotes to float64, so float->int validation fires."""  # noqa: E501
+        arr = np.array([[1, 2], [3, 9999]], dtype=np.uint16)
+        y = np.linspace(40.0, 41.0, 2)
+        x = np.linspace(-105.0, -104.0, 2)
+        da = xr.DataArray(arr, dims=['y', 'x'],
+                          coords={'y': y, 'x': x},
+                          attrs={'crs': 4326, 'nodata': 9999.0})
+        path = str(tmp_path / 'dtype_nodata_int_dask.tif')
+        to_geotiff(da, path, compression='none')
+        with pytest.raises(ValueError, match='float.*int'):
+            open_geotiff(path, dtype='int32', chunks=2)
+
+
+# ---------------------------------------------------------------------------
+# Float16 dtype-map: auto-promotion on read
+# ---------------------------------------------------------------------------
+
+
+class TestFloat16DtypeMap:
+    """The dtype map auto-promotes float16 on read."""
+
+    def test_tiff_dtype_to_numpy_float16(self):
+        assert tiff_dtype_to_numpy(16, SAMPLE_FORMAT_FLOAT) == np.float32
+
+    def test_tiff_storage_dtype_float16(self):
+        assert tiff_storage_dtype(16, SAMPLE_FORMAT_FLOAT) == np.float16
+
+    def test_tiff_storage_dtype_delegates_for_non_promoted(self):
+        # Keys that are not promoted have the same storage and read dtype.
+        for bps, sf in [
+            (8, SAMPLE_FORMAT_UINT),
+            (16, SAMPLE_FORMAT_UINT),
+            (16, SAMPLE_FORMAT_INT),
+            (32, SAMPLE_FORMAT_FLOAT),
+            (64, SAMPLE_FORMAT_FLOAT),
+        ]:
+            assert tiff_storage_dtype(bps, sf) == tiff_dtype_to_numpy(bps, sf)
+
+
+# ---------------------------------------------------------------------------
+# Float16 eager + dask reads
+# ---------------------------------------------------------------------------
+
+
+class TestEagerFloat16Read:
+    """``open_geotiff`` decodes an external float16 file to float32."""
+
+    def test_open_geotiff_returns_float32(self, float16_tif):
+        path, arr = float16_tif
+        result = open_geotiff(str(path))
+        assert result.dtype == np.float32
+        # Every float16 is exactly representable in float32, so exact equality holds.
+        np.testing.assert_array_equal(result.values, arr.astype(np.float32))
+
+    def test_open_geotiff_dask_returns_float32(self, float16_tif):
+        path, arr = float16_tif
+        result = read_geotiff_dask(str(path), chunks=2)
+        assert result.dtype == np.float32
+        np.testing.assert_array_equal(
+            result.compute().values, arr.astype(np.float32))
+
+
+class TestPredictor3Float16:
+    """Predictor=3 + float16 on disk also decodes correctly."""
+
+    def test_predictor3_float16_round_trip(self, tmp_path):
+        tifffile = pytest.importorskip("tifffile")
+        pytest.importorskip("imagecodecs")  # required for predictor=3
+        arr = np.linspace(-1.0, 1.0, 16).astype(np.float16).reshape(4, 4)
+        path = tmp_path / "pred3_f16.tif"
+        tifffile.imwrite(
+            str(path), arr, predictor=3, compression="deflate")
+
+        result = open_geotiff(str(path))
+        assert result.dtype == np.float32
+        np.testing.assert_array_equal(
+            result.values, arr.astype(np.float32))
+
+
+class TestFloat16RegressionGuards:
+    """The float16 promotion did not change non-float16 behaviour."""
+
+    def test_float32_still_float32(self, tmp_path):
+        tifffile = pytest.importorskip("tifffile")
+        arr = np.arange(16, dtype=np.float32).reshape(4, 4)
+        path = tmp_path / "f32.tif"
+        tifffile.imwrite(str(path), arr)
+
+        result = open_geotiff(str(path))
+        assert result.dtype == np.float32
+        np.testing.assert_array_equal(result.values, arr)
+
+    def test_float64_still_float64(self, tmp_path):
+        tifffile = pytest.importorskip("tifffile")
+        arr = np.arange(16, dtype=np.float64).reshape(4, 4)
+        path = tmp_path / "f64.tif"
+        tifffile.imwrite(str(path), arr)
+
+        result = open_geotiff(str(path))
+        assert result.dtype == np.float64
+        np.testing.assert_array_equal(result.values, arr)
+
+    def test_uint16_still_uint16(self, tmp_path):
+        tifffile = pytest.importorskip("tifffile")
+        arr = np.arange(16, dtype=np.uint16).reshape(4, 4)
+        path = tmp_path / "u16.tif"
+        tifffile.imwrite(str(path), arr)
+
+        result = open_geotiff(str(path))
+        assert result.dtype == np.uint16
+        np.testing.assert_array_equal(result.values, arr)
+
+
+# ---------------------------------------------------------------------------
+# Float16 GPU read paths
+# ---------------------------------------------------------------------------
+
+
+class TestEagerGPUReadFloat16:
+    """Eager GPU reads return float32 for stripped and tiled float16 input."""
+
+    @_gpu_only
+    def test_read_geotiff_gpu_stripped_returns_float32(
+        self, float16_stripped_tif
+    ):
+        from xrspatial.geotiff import read_geotiff_gpu
+
+        path, arr = float16_stripped_tif
+        result = read_geotiff_gpu(str(path))
+        assert result.dtype == np.float32, (
+            f"GPU read of float16 must return float32, got {result.dtype}"
+        )
+        np.testing.assert_array_equal(
+            result.data.get(), arr.astype(np.float32))
+
+    @_gpu_only
+    def test_read_geotiff_gpu_tiled_returns_float32(
+        self, float16_tiled_tif
+    ):
+        from xrspatial.geotiff import read_geotiff_gpu
+
+        path, arr = float16_tiled_tif
+        result = read_geotiff_gpu(str(path))
+        assert result.dtype == np.float32
+        np.testing.assert_array_equal(
+            result.data.get(), arr.astype(np.float32))
+
+    @_gpu_only
+    def test_read_geotiff_gpu_tiled_uncompressed_returns_float32(
+        self, float16_tiled_uncompressed_tif
+    ):
+        from xrspatial.geotiff import read_geotiff_gpu
+
+        path, arr = float16_tiled_uncompressed_tif
+        result = read_geotiff_gpu(str(path))
+        assert result.dtype == np.float32
+        np.testing.assert_array_equal(
+            result.data.get(), arr.astype(np.float32))
+
+    @_gpu_only
+    def test_open_geotiff_gpu_dispatcher_float16(self, float16_tiled_tif):
+        """``open_geotiff(gpu=True)`` dispatches correctly for float16."""
+        path, arr = float16_tiled_tif
+        result = open_geotiff(str(path), gpu=True)
+        assert result.dtype == np.float32
+        np.testing.assert_array_equal(
+            result.data.get(), arr.astype(np.float32))
+
+
+class TestGPUWindowedFloat16:
+    """Windowed GPU reads honour the bps_mismatch fallback path."""
+
+    @_gpu_only
+    def test_read_geotiff_gpu_windowed_stripped(self, float16_stripped_tif):
+        from xrspatial.geotiff import read_geotiff_gpu
+
+        path, arr = float16_stripped_tif
+        result = read_geotiff_gpu(str(path), window=(0, 0, 2, 2))
+        assert result.dtype == np.float32
+        assert result.shape == (2, 2)
+        np.testing.assert_array_equal(
+            result.data.get(), arr[:2, :2].astype(np.float32))
+
+    @_gpu_only
+    def test_read_geotiff_gpu_windowed_tiled(self, float16_tiled_tif):
+        from xrspatial.geotiff import read_geotiff_gpu
+
+        path, arr = float16_tiled_tif
+        result = read_geotiff_gpu(str(path), window=(0, 0, 8, 8))
+        assert result.dtype == np.float32
+        assert result.shape == (8, 8)
+        np.testing.assert_array_equal(
+            result.data.get(), arr[:8, :8].astype(np.float32))
+
+
+class TestDaskGPUFloat16:
+    """``open_geotiff(chunks=, gpu=True)`` decodes float16 correctly."""
+
+    @_gpu_only
+    def test_dask_gpu_tiled_float16(self, float16_tiled_tif):
+        path, arr = float16_tiled_tif
+        result = open_geotiff(str(path), chunks=8, gpu=True)
+        assert result.dtype == np.float32, (
+            f"dask+GPU read of float16 must return float32, got {result.dtype}"
+        )
+        computed = result.compute()
+        np.testing.assert_array_equal(
+            computed.data.get(), arr.astype(np.float32))
+
+    @_gpu_only
+    def test_read_geotiff_gpu_chunks_kwarg_float16(self, float16_tiled_tif):
+        """``read_geotiff_gpu(chunks=)`` also routes correctly."""
+        from xrspatial.geotiff import read_geotiff_gpu
+
+        path, arr = float16_tiled_tif
+        result = read_geotiff_gpu(str(path), chunks=8)
+        assert result.dtype == np.float32
+        computed = result.compute()
+        np.testing.assert_array_equal(
+            computed.data.get(), arr.astype(np.float32))
+
+
+class TestGDSPathGatedOffForFloat16:
+    """``_gds_chunk_path_available`` returns False for (bps=16, sf=3)."""
+
+    @_gpu_only
+    def test_gds_path_gated_off_for_float16(self, float16_tiled_tif):
+        pytest.importorskip("kvikio", exc_type=ImportError)
+
+        from xrspatial.geotiff._backends.gpu import _gds_chunk_path_available
+        from xrspatial.geotiff._header import parse_all_ifds, parse_header
+
+        path, _ = float16_tiled_tif
+        with open(str(path), "rb") as f:
+            data = f.read()
+        header = parse_header(data)
+        ifds = parse_all_ifds(data, header)
+        ifd = ifds[0]
+
+        assert ifd.is_tiled, "fixture sanity: tiled layout expected"
+        bps_first = ifd.bits_per_sample
+        if isinstance(bps_first, tuple):
+            bps = bps_first[0] if bps_first else 0
+        else:
+            bps = bps_first
+        assert bps == 16, "fixture sanity: bps=16 expected"
+        assert ifd.sample_format == SAMPLE_FORMAT_FLOAT
+
+        result = _gds_chunk_path_available(
+            str(path), ifd, has_sparse_tile=False, orientation=1)
+        assert result is False, (
+            "_gds_chunk_path_available must return False for "
+            "(bps=16, sf=float) so the GDS chunked path does not "
+            "mis-decode half-precision tiles."
+        )
+
+    @_gpu_only
+    def test_gds_path_allowed_for_float32_tiled(self, tmp_path):
+        """Sanity: GDS path remains allowed for a float32 tiled file."""
+        tifffile = pytest.importorskip("tifffile")
+        pytest.importorskip("kvikio", exc_type=ImportError)
+
+        arr = np.arange(256, dtype=np.float32).reshape(16, 16)
+        path = tmp_path / "f32_tiled.tif"
+        tifffile.imwrite(
+            str(path), arr, compression="deflate", tile=(16, 16))
+
+        from xrspatial.geotiff._backends.gpu import _gds_chunk_path_available
+        from xrspatial.geotiff._header import parse_all_ifds, parse_header
+
+        with open(str(path), "rb") as f:
+            data = f.read()
+        header = parse_header(data)
+        ifds = parse_all_ifds(data, header)
+
+        result = _gds_chunk_path_available(
+            str(path), ifds[0], has_sparse_tile=False, orientation=1)
+        assert result is True, (
+            "_gds_chunk_path_available must remain True for "
+            "(bps=32, sf=float) tiled files so the kvikio GDS chunk "
+            "path still applies."
+        )
+
+
+class TestBackendParityFloat16:
+    """All four backends agree pixel-exact on float16 input."""
+
+    @_gpu_only
+    def test_eager_numpy_equals_gpu(self, float16_tiled_tif):
+        path, _ = float16_tiled_tif
+        cpu = open_geotiff(str(path))
+        gpu = open_geotiff(str(path), gpu=True)
+
+        assert cpu.dtype == gpu.dtype == np.float32
+        np.testing.assert_array_equal(np.asarray(cpu), gpu.data.get())
+
+    @_gpu_only
+    def test_eager_numpy_equals_dask_gpu(self, float16_tiled_tif):
+        path, _ = float16_tiled_tif
+        cpu = open_geotiff(str(path))
+        dask_gpu = open_geotiff(str(path), chunks=8, gpu=True).compute()
+
+        assert cpu.dtype == dask_gpu.dtype == np.float32
+        np.testing.assert_array_equal(
+            np.asarray(cpu), dask_gpu.data.get())
+
+    @_gpu_only
+    def test_dask_numpy_equals_dask_gpu(self, float16_tiled_tif):
+        path, _ = float16_tiled_tif
+        dask_cpu = read_geotiff_dask(str(path), chunks=8).compute()
+        dask_gpu = open_geotiff(str(path), chunks=8, gpu=True).compute()
+
+        np.testing.assert_array_equal(
+            np.asarray(dask_cpu), dask_gpu.data.get())
+
+
+class TestPredictor3Float16GPU:
+    """Predictor=3 + float16 on disk also decodes correctly on GPU."""
+
+    @_gpu_only
+    def test_predictor3_float16_gpu_round_trip(self, tmp_path):
+        tifffile = pytest.importorskip("tifffile")
+        pytest.importorskip("imagecodecs")  # required for predictor=3
+
+        from xrspatial.geotiff import read_geotiff_gpu
+
+        arr = np.linspace(-1.0, 1.0, 16).astype(np.float16).reshape(4, 4)
+        path = tmp_path / "pred3_f16.tif"
+        tifffile.imwrite(
+            str(path), arr, predictor=3, compression="deflate")
+
+        result = read_geotiff_gpu(str(path))
+        assert result.dtype == np.float32
+        np.testing.assert_array_equal(
+            result.data.get(), arr.astype(np.float32))
diff --git a/xrspatial/geotiff/tests/test_gpu_byteswap_1508.py b/xrspatial/geotiff/tests/read/test_endianness.py
similarity index 69%
rename from xrspatial/geotiff/tests/test_gpu_byteswap_1508.py
rename to xrspatial/geotiff/tests/read/test_endianness.py
index 4cde5cc40..152051d7c 100644
--- a/xrspatial/geotiff/tests/test_gpu_byteswap_1508.py
+++ b/xrspatial/geotiff/tests/read/test_endianness.py
@@ -1,16 +1,11 @@
-"""Regression test for issue #1508.
-
-Big-endian multi-byte TIFFs read via ``read_geotiff_gpu`` used to crash
-inside the GPU decode pipeline with::
-
-    AttributeError: 'ndarray' object has no attribute 'byteswap'
-
-because ``cupy.ndarray`` (as of cupy 13.x) does not expose ``byteswap()``.
-The dispatcher in ``read_geotiff_gpu`` caught the error and silently fell
-back to CPU, so results stayed correct but the GPU fast path was lost.
-
-These tests confirm the GPU path now decodes BE multi-byte data directly
-(result is a CuPy array, not a NumPy fallback) and matches the CPU read.
+"""Big-endian / little-endian GeoTIFF reader paths.
+
+Consolidates the GPU byteswap regression coverage formerly in
+``test_gpu_byteswap_1508.py``. Before the fix, big-endian multi-byte
+TIFFs read via ``read_geotiff_gpu`` crashed inside the GPU decode
+pipeline because ``cupy.ndarray`` does not expose ``byteswap()``. The
+dispatcher caught the error and silently fell back to CPU, so results
+stayed correct but the GPU fast path was lost.
 """
 from __future__ import annotations
 
@@ -19,22 +14,11 @@
 import numpy as np
 import pytest
 
+from .._helpers.markers import gpu_available
 
-def _gpu_available() -> bool:
-    """True if cupy is importable and CUDA is initialised."""
-    if importlib.util.find_spec("cupy") is None:
-        return False
-    try:
-        import cupy
-        return bool(cupy.cuda.is_available())
-    except Exception:
-        return False
-
-
-_HAS_GPU = _gpu_available()
 _HAS_TIFFFILE = importlib.util.find_spec("tifffile") is not None
 _gpu_only = pytest.mark.skipif(
-    not (_HAS_GPU and _HAS_TIFFFILE),
+    not (gpu_available() and _HAS_TIFFFILE),
     reason="cupy + CUDA + tifffile required",
 )
 
@@ -69,20 +53,12 @@ def test_read_geotiff_gpu_big_endian_multibyte(tmp_path, dtype):
 
     gpu_da = read_geotiff_gpu(str(path))
 
-    # The GPU path was actually exercised (no silent CPU fallback masking
-    # a crash inside gpu_decode_tiles_from_file).
     assert isinstance(gpu_da.data, cupy.ndarray), (
         "expected cupy-backed DataArray, got "
         f"{type(gpu_da.data).__name__} -- the GPU path likely fell back "
         "to CPU again"
     )
 
-    # The fix must preserve the native dtype contract. An earlier version
-    # used ``arr.view(arr.dtype.newbyteorder()).copy()`` which produced an
-    # array tagged with non-native byteorder (``>u2`` instead of ``<u2``).
-    # That is values-correct but breaks downstream consumers that expect
-    # native dtypes (numba ``@ngjit`` rejects non-native arrays -- this is
-    # the same class of bug PR #1507 fixed for predictor=2 BE).
     assert gpu_da.data.dtype == np.dtype(dtype), (
         f"GPU result dtype {gpu_da.data.dtype} drifted from native "
         f"{np.dtype(dtype)}"
diff --git a/xrspatial/geotiff/tests/test_apply_nodata_mask_gpu_inplace_1934.py b/xrspatial/geotiff/tests/read/test_nodata.py
similarity index 60%
rename from xrspatial/geotiff/tests/test_apply_nodata_mask_gpu_inplace_1934.py
rename to xrspatial/geotiff/tests/read/test_nodata.py
index 9b176eff0..36f27c5e9 100644
--- a/xrspatial/geotiff/tests/test_apply_nodata_mask_gpu_inplace_1934.py
+++ b/xrspatial/geotiff/tests/read/test_nodata.py
@@ -1,32 +1,24 @@
-"""Regression tests for issue #1934.
-
-``_apply_nodata_mask_gpu`` used to replace the sentinel pixels via
-``cupy.where(arr_gpu == sentinel, nan, arr_gpu)``, which allocates a fresh
-output buffer the same shape as the input. Every call site passes a
-freshly decoded GPU buffer that no caller-visible state aliases, so the
-fix writes NaN into the existing buffer with ``cupy.putmask`` and drops
-one chunk-sized device allocation per call.
-
-Two guards here:
-
-1. Correctness -- float and integer paths match the pre-fix behaviour on
-   a representative input (sentinel masked to NaN, non-sentinel pixels
-   preserved, integer dtype promoted to float64).
-2. In-place mutation -- on the float path the output array shares the
-   same device pointer as the input, confirming no fresh allocation. The
-   integer path still allocates via ``astype(float64)``; the test checks
-   the post-astype buffer is then mutated in place rather than copied
-   again by ``cupy.where``.
+"""Nodata propagation on read.
+
+Consolidates the GPU nodata-mask reader coverage:
+
+* ``test_apply_nodata_mask_gpu_inplace_1934.py`` -- in-place mask
+  semantics for ``_apply_nodata_mask_gpu`` (float and integer paths).
+* ``test_apply_nodata_mask_gpu_with_presence_removed_2208.py`` -- the
+  removed sibling helper stays gone after #2207 wired every GPU eager
+  site through ``_finalize_eager_read``.
 """
 from __future__ import annotations
 
 import numpy as np
+import pytest
 
+from xrspatial.geotiff._backends import _gpu_helpers
 from xrspatial.geotiff.tests.conftest import requires_gpu as _gpu_only
 
 
 @_gpu_only
-def test_apply_nodata_mask_gpu_float_masks_sentinel_to_nan_1934():
+def test_apply_nodata_mask_gpu_float_masks_sentinel_to_nan():
     """Float path masks the sentinel to NaN and leaves other pixels alone."""
     import cupy
 
@@ -45,13 +37,8 @@ def test_apply_nodata_mask_gpu_float_masks_sentinel_to_nan_1934():
 
 
 @_gpu_only
-def test_apply_nodata_mask_gpu_float_in_place_no_copy_1934():
-    """Float path mutates the input buffer in place.
-
-    Before the fix, ``cupy.where`` returned a fresh array, so ``out`` had
-    a different device pointer than ``arr_gpu``. After the fix the input
-    buffer is reused and the pointers match.
-    """
+def test_apply_nodata_mask_gpu_float_in_place_no_copy():
+    """Float path mutates the input buffer in place."""
     import cupy
 
     from xrspatial.geotiff import _apply_nodata_mask_gpu
@@ -65,14 +52,8 @@ def test_apply_nodata_mask_gpu_float_in_place_no_copy_1934():
 
 
 @_gpu_only
-def test_apply_nodata_mask_gpu_float_alloc_count_unchanged_1934():
-    """Float path does not pull a fresh chunk-sized buffer from the pool.
-
-    Uses an isolated ``MemoryPool`` and measures ``total_bytes`` (which
-    counts free blocks too) after a ``free_all_blocks`` so the pre-fix
-    ``cupy.where`` allocation cannot be masked by the input buffer being
-    refcount-freed back to the pool before the assertion.
-    """
+def test_apply_nodata_mask_gpu_float_alloc_count_unchanged():
+    """Float path does not pull a fresh chunk-sized buffer from the pool."""
     import cupy
 
     from xrspatial.geotiff import _apply_nodata_mask_gpu
@@ -81,7 +62,6 @@ def test_apply_nodata_mask_gpu_float_alloc_count_unchanged_1934():
     prev_allocator = cupy.cuda.get_allocator()
     cupy.cuda.set_allocator(isolated_pool.malloc)
     try:
-        # Large enough that an extra allocation would be visible.
         arr_gpu = cupy.full((512, 512), -9999.0, dtype=cupy.float32)
         arr_gpu[0, 0] = 1.0  # plant a non-sentinel pixel
 
@@ -94,17 +74,12 @@ def test_apply_nodata_mask_gpu_float_alloc_count_unchanged_1934():
         isolated_pool.free_all_blocks()
         total_after = isolated_pool.total_bytes()
 
-        # The mask is a transient bool array (1/4 the byte count of float32),
-        # so total_bytes can rise by the mask size but must not rise by the
-        # array's full byte count. Pre-fix would add at least one float32
-        # buffer of the same shape (512*512*4 = 1 MiB).
         array_bytes = arr_gpu.nbytes
         growth = total_after - total_before
         assert growth < array_bytes, (
             f"unexpected allocation growth {growth} bytes >= "
             f"array_bytes {array_bytes}; in-place mutation regressed"
         )
-        # And the returned buffer is the same one we passed in.
         assert out.data.ptr == arr_gpu.data.ptr
     finally:
         cupy.cuda.set_allocator(prev_allocator)
@@ -112,7 +87,7 @@ def test_apply_nodata_mask_gpu_float_alloc_count_unchanged_1934():
 
 
 @_gpu_only
-def test_apply_nodata_mask_gpu_int_promotes_and_masks_1934():
+def test_apply_nodata_mask_gpu_int_promotes_and_masks():
     """Integer path still promotes to float64 and masks the sentinel."""
     import cupy
 
@@ -131,13 +106,8 @@ def test_apply_nodata_mask_gpu_int_promotes_and_masks_1934():
 
 
 @_gpu_only
-def test_apply_nodata_mask_gpu_int_no_extra_buffer_after_astype_1934():
-    """Integer path: only the ``astype(float64)`` buffer is allocated.
-
-    Before the fix the trailing ``cupy.where`` allocated a second
-    chunk-sized float64 buffer. After the fix the ``astype`` buffer is
-    mutated in place.
-    """
+def test_apply_nodata_mask_gpu_int_no_extra_buffer_after_astype():
+    """Integer path: only the ``astype(float64)`` buffer is allocated."""
     import cupy
 
     from xrspatial.geotiff import _apply_nodata_mask_gpu
@@ -158,13 +128,8 @@ def test_apply_nodata_mask_gpu_int_no_extra_buffer_after_astype_1934():
         isolated_pool.free_all_blocks()
         total_after = isolated_pool.total_bytes()
 
-        # Required: one float64 buffer (512*512*8 = 2 MiB) from astype.
-        # Pre-fix would have allocated a second float64 buffer for
-        # cupy.where (another 2 MiB) on top of that.
         float64_bytes = out.nbytes
         growth = total_after - total_before
-        # Allow some slack for the bool mask + .any() scalar (well under
-        # one float64 buffer of slack).
         assert growth < 2 * float64_bytes, (
             f"unexpected allocation growth {growth} bytes >= "
             f"2 * float64_bytes {2 * float64_bytes}; pre-fix double-alloc"
@@ -175,7 +140,7 @@ def test_apply_nodata_mask_gpu_int_no_extra_buffer_after_astype_1934():
 
 
 @_gpu_only
-def test_apply_nodata_mask_gpu_float_nan_sentinel_noop_1934():
+def test_apply_nodata_mask_gpu_float_nan_sentinel_noop():
     """NaN nodata on a float array stays a no-op."""
     import cupy
 
@@ -186,13 +151,12 @@ def test_apply_nodata_mask_gpu_float_nan_sentinel_noop_1934():
     )
     input_ptr = arr_gpu.data.ptr
     out = _apply_nodata_mask_gpu(arr_gpu, float('nan'))
-    # Same buffer back, untouched.
     assert out.data.ptr == input_ptr
     np.testing.assert_array_equal(out.get(), [[1.0, 2.0], [3.0, 4.0]])
 
 
 @_gpu_only
-def test_apply_nodata_mask_gpu_none_nodata_passthrough_1934():
+def test_apply_nodata_mask_gpu_none_nodata_passthrough():
     """``nodata is None`` returns the input array untouched."""
     import cupy
 
@@ -203,3 +167,22 @@ def test_apply_nodata_mask_gpu_none_nodata_passthrough_1934():
     out = _apply_nodata_mask_gpu(arr_gpu, None)
     assert out.data.ptr == input_ptr
     assert out.dtype == cupy.int32
+
+
+# ---------------------------------------------------------------------------
+# Helper removal pin (#2208)
+# ---------------------------------------------------------------------------
+
+
+def test_apply_nodata_mask_gpu_with_presence_not_importable():
+    """The dead sibling helper stays removed after #2207."""
+    # Covers both module-attribute absence and the import-time surface.
+    with pytest.raises(ImportError):
+        from xrspatial.geotiff._backends._gpu_helpers import \
+            _apply_nodata_mask_gpu_with_presence  # noqa: F401
+
+
+def test_apply_nodata_mask_gpu_still_present():
+    """``_apply_nodata_mask_gpu`` (chunked GPU dask path) is still defined."""
+    assert hasattr(_gpu_helpers, '_apply_nodata_mask_gpu')
+    assert callable(_gpu_helpers._apply_nodata_mask_gpu)
diff --git a/xrspatial/tests/test_geotiff_streaming_bigtiff_threshold_1785.py b/xrspatial/geotiff/tests/read/test_streaming.py
similarity index 74%
rename from xrspatial/tests/test_geotiff_streaming_bigtiff_threshold_1785.py
rename to xrspatial/geotiff/tests/read/test_streaming.py
index d960d299f..565b8e4fa 100644
--- a/xrspatial/tests/test_geotiff_streaming_bigtiff_threshold_1785.py
+++ b/xrspatial/geotiff/tests/read/test_streaming.py
@@ -1,4 +1,11 @@
-"""Regression tests for issue #1785.
+"""Streaming / chunked read paths.
+
+Folds in the streaming-BigTIFF threshold tests from
+``xrspatial/tests/test_geotiff_streaming_bigtiff_threshold_1785.py``
+per the epic #2390 PR 8 directive. The cluster covers the
+streaming-decision helper that the chunked write/read pipeline uses to
+pick classic vs. BigTIFF, plus the integration check that the user's
+``bigtiff=`` override still wins on the streaming code path.
 
 The streaming writer's auto-BigTIFF decision used to compare only the
 uncompressed pixel-data size against ``UINT32_MAX``. For rasters just
@@ -6,16 +13,15 @@
 file past the classic-TIFF uint32 offset ceiling, and the write failed
 late with ``struct.error``.
 
-These tests pin the corrected decision:
+The pinned contract:
 
 * The helper takes an actual ``ifd_overhead_bytes`` value (computed from
   the real tag list via ``_compute_classic_ifd_overhead``) rather than a
   200-byte fudge constant; large ``gdal_metadata_xml`` or ``extra_tags``
-  payloads must not silently undercount overhead. See the Copilot review
-  on PR #1787.
+  payloads must not silently undercount overhead.
 * The comparison is ``> UINT32_MAX``, matching the eager
-  ``_assemble_tiff`` decision (``estimated_file_size > UINT32_MAX``). A
-  file that is exactly ``UINT32_MAX`` bytes still fits classic.
+  ``_assemble_tiff`` decision. A file that is exactly ``UINT32_MAX``
+  bytes still fits classic.
 * The explicit ``bigtiff=True``/``False`` user override still wins.
 """
 from __future__ import annotations
@@ -29,22 +35,12 @@
 
 from xrspatial.geotiff import to_geotiff
 from xrspatial.geotiff._dtypes import ASCII, LONG, SHORT
-from xrspatial.geotiff._header import (
-    TAG_BITS_PER_SAMPLE,
-    TAG_COMPRESSION,
-    TAG_GDAL_METADATA,
-    TAG_IMAGE_LENGTH,
-    TAG_IMAGE_WIDTH,
-    TAG_PHOTOMETRIC,
-    TAG_SAMPLE_FORMAT,
-    TAG_SAMPLES_PER_PIXEL,
-    TAG_STRIP_BYTE_COUNTS,
-    TAG_STRIP_OFFSETS,
-)
-from xrspatial.geotiff._writer import (
-    _compute_classic_ifd_overhead,
-    _should_use_bigtiff_streaming,
-)
+from xrspatial.geotiff._header import (TAG_BITS_PER_SAMPLE, TAG_COMPRESSION, TAG_GDAL_METADATA,
+                                       TAG_IMAGE_LENGTH, TAG_IMAGE_WIDTH, TAG_PHOTOMETRIC,
+                                       TAG_SAMPLE_FORMAT, TAG_SAMPLES_PER_PIXEL,
+                                       TAG_STRIP_BYTE_COUNTS, TAG_STRIP_OFFSETS)
+from xrspatial.geotiff._writer import (_compute_classic_ifd_overhead,
+                                       _should_use_bigtiff_streaming)
 
 
 UINT32_MAX = 0xFFFFFFFF
@@ -79,11 +75,7 @@ def _minimal_tag_list(n_entries: int, gdal_metadata_size: int = 0) -> list:
 
 class TestShouldUseBigTIFFStreaming:
     def test_just_under_uint32_max_promotes(self):
-        """uncompressed = UINT32_MAX - 50 with non-trivial overhead promotes.
-
-        Even ~50 bytes of slack disappears once IFD + strip-table overhead
-        is added, so this case must promote to BigTIFF.
-        """
+        """uncompressed = UINT32_MAX - 50 with non-trivial overhead promotes."""
         # 1024 entries: strip table contributes 8 * 1024 = 8 KiB.
         tags = _minimal_tag_list(n_entries=1024)
         overhead = _compute_classic_ifd_overhead(tags)
@@ -104,14 +96,7 @@ def test_half_uint32_max_stays_classic(self):
         ) is False
 
     def test_exactly_uint32_max_stays_classic(self):
-        """Boundary: total file size == UINT32_MAX bytes still fits classic.
-
-        Eager ``_assemble_tiff`` uses ``estimated_file_size > UINT32_MAX``;
-        the streaming helper must match. A file of exactly ``UINT32_MAX``
-        bytes has its last byte at offset ``UINT32_MAX - 1``, which is a
-        valid classic-TIFF offset.
-        """
-        # Construct uncompressed_bytes so total = exactly UINT32_MAX.
+        """Boundary: total file size == UINT32_MAX bytes still fits classic."""
         tags = _minimal_tag_list(n_entries=1)
         overhead = _compute_classic_ifd_overhead(tags)
         header = 8
@@ -139,15 +124,7 @@ def test_small_raster_no_overhead_stays_classic(self):
         ) is False
 
     def test_large_strip_table_alone_can_promote(self):
-        """Even a small pixel payload can need BigTIFF if n_entries is huge.
-
-        Documents the strip-table contribution: ~536 M entries puts the
-        table itself near 4 GiB and forces BigTIFF with no pixel data.
-        Driven through the ``n_entries`` parameter (8 bytes per entry)
-        to avoid allocating a 536 M-element Python list at test time;
-        the ``ifd_overhead_bytes`` path is exercised by
-        ``test_overhead_pushes_just_under_threshold_over``.
-        """
+        """Even a small pixel payload can need BigTIFF if n_entries is huge."""
         n_entries = (UINT32_MAX // 8) + 1
         assert _should_use_bigtiff_streaming(
             uncompressed_bytes=0,
@@ -156,13 +133,12 @@ def test_large_strip_table_alone_can_promote(self):
         ) is True
 
     def test_overhead_pushes_just_under_threshold_over(self):
-        """Regression: a payload that fits classic by raw bytes but not
-        once header + IFD + strip table is added must promote.
+        """A payload that fits classic by raw bytes, but not once the header,
+        IFD, and strip table are added, must promote.
         """
         n_entries = 100_000  # ~800 KB strip table
         tags = _minimal_tag_list(n_entries=n_entries)
         overhead = _compute_classic_ifd_overhead(tags)
-        # Choose uncompressed so the total equals exactly UINT32_MAX + 1.
         header = 8
         uncompressed = UINT32_MAX + 1 - header - overhead
         assert _should_use_bigtiff_streaming(
@@ -178,13 +154,7 @@ def test_overhead_pushes_just_under_threshold_over(self):
         ) is False
 
     def test_large_gdal_metadata_flips_decision(self):
-        """A 5000-byte gdal_metadata blob must flip a borderline case.
-
-        Under the old 200-byte fudge, ``uncompressed + 200 < UINT32_MAX``
-        could stay classic even when a multi-KB gdal_metadata overflow
-        pushed real overhead well past 200 bytes. With the actual
-        overhead computed from the tag list, the decision flips.
-        """
+        """A 5000-byte gdal_metadata blob must flip a borderline case."""
         n_entries = 1024
         big_blob = 5000  # ASCII overflow heap entry
         plain_tags = _minimal_tag_list(n_entries=n_entries)
@@ -193,21 +163,15 @@ def test_large_gdal_metadata_flips_decision(self):
 
         plain_overhead = _compute_classic_ifd_overhead(plain_tags)
         meta_overhead = _compute_classic_ifd_overhead(meta_tags)
-        # Metadata blob really does increase computed overhead.
         assert meta_overhead - plain_overhead >= big_blob
 
-        # Pick uncompressed so plain_overhead path stays classic but
-        # the metadata path tips over.
         header = 8
         uncompressed = UINT32_MAX - header - plain_overhead
-        # Plain: total == UINT32_MAX -> classic.
         assert _should_use_bigtiff_streaming(
             uncompressed_bytes=uncompressed,
             n_entries=0,
             ifd_overhead_bytes=plain_overhead,
         ) is False
-        # With the large metadata blob folded into the real overhead,
-        # the total now exceeds UINT32_MAX and we must promote.
         assert _should_use_bigtiff_streaming(
             uncompressed_bytes=uncompressed,
             n_entries=0,
@@ -217,6 +181,7 @@ def test_large_gdal_metadata_flips_decision(self):
 
 # -- Integration tests against the writer ------------------------------------
 
+
 def _read_tiff_magic(path: str) -> int:
     """Return the TIFF version field: 42 (0x002A) classic, 43 (0x002B) BigTIFF."""
     with open(path, 'rb') as f:
@@ -246,18 +211,18 @@ def small_dask_raster():
 class TestStreamingBigTIFFUserOverride:
     def test_bigtiff_true_forces_bigtiff_on_small_raster(
             self, small_dask_raster, tmp_path):
-        path = str(tmp_path / 'force_bigtiff_1785.tif')
+        path = str(tmp_path / 'force_bigtiff.tif')
         to_geotiff(small_dask_raster, path, bigtiff=True)
         assert _read_tiff_magic(path) == 43
 
     def test_bigtiff_false_forces_classic_on_small_raster(
             self, small_dask_raster, tmp_path):
-        path = str(tmp_path / 'force_classic_1785.tif')
+        path = str(tmp_path / 'force_classic.tif')
         to_geotiff(small_dask_raster, path, bigtiff=False)
         assert _read_tiff_magic(path) == 42
 
     def test_bigtiff_none_small_raster_stays_classic(
             self, small_dask_raster, tmp_path):
-        path = str(tmp_path / 'auto_classic_1785.tif')
+        path = str(tmp_path / 'auto_classic.tif')
         to_geotiff(small_dask_raster, path, bigtiff=None)
         assert _read_tiff_magic(path) == 42
diff --git a/xrspatial/geotiff/tests/test_local_tile_byte_cap_1664.py b/xrspatial/geotiff/tests/read/test_tiling.py
similarity index 59%
rename from xrspatial/geotiff/tests/test_local_tile_byte_cap_1664.py
rename to xrspatial/geotiff/tests/read/test_tiling.py
index 25216f539..bdff3631f 100644
--- a/xrspatial/geotiff/tests/test_local_tile_byte_cap_1664.py
+++ b/xrspatial/geotiff/tests/read/test_tiling.py
@@ -1,14 +1,11 @@
-"""Local-file tile/strip byte-count cap (issue #1664).
+"""Tiled-read paths, tile boundaries, byte caps.
 
-Before #1664, ``XRSPATIAL_COG_MAX_TILE_BYTES`` only fired in the HTTP
-fetch path. A crafted local TIFF with a huge ``TileByteCounts`` /
-``StripByteCounts`` could still feed an enormous slice into the
-decompressor, which can balloon into gigabytes of decoded output even
-when the underlying mmap slice is bounded by the file size.
+Consolidates:
 
-These tests fabricate small COGs / strip-TIFFs, rewrite their byte
-counts to oversized values, and check that the cap raises before the
-decoder runs.
+* ``test_local_tile_byte_cap_1664.py`` -- local-file ``TileByteCounts`` /
+  ``StripByteCounts`` cap and the env-driven override (CPU path).
+* ``test_gpu_tile_byte_cap_2026_05_18.py`` -- the matching GPU eager and
+  dask + GPU chunked paths.
 """
 from __future__ import annotations
 
@@ -17,20 +14,23 @@
 import xarray as xr
 
 from xrspatial.geotiff import _reader as _reader_mod
-from xrspatial.geotiff import open_geotiff, to_geotiff
+from xrspatial.geotiff import open_geotiff, read_geotiff_gpu, to_geotiff
+
+from .._helpers.markers import requires_gpu as _gpu_only
+from .._helpers.tiff_surgery import patch_byte_counts as _patch_byte_counts
 
-from ._helpers.tiff_surgery import patch_byte_counts as _patch_byte_counts  # noqa: E402
 
 # ---------------------------------------------------------------------------
-# Helpers -- patch in-place IFD entries for tile / strip byte counts
+# Helpers
 # ---------------------------------------------------------------------------
 
 
-def _build_forged_tiled_cog(tmp_path, byte_count_value: int) -> str:
+def _build_forged_tiled_cog(tmp_path, byte_count_value: int,
+                            *, basename: str = "forged_tiles") -> str:
     """Write a real tiled COG, patch every TileByteCounts entry, return path."""
     arr = np.arange(64 * 64, dtype=np.float32).reshape(64, 64)
     da = xr.DataArray(arr, dims=['y', 'x'])
-    path = str(tmp_path / "forged_local_tiles_1664.tif")
+    path = str(tmp_path / f"{basename}.tif")
     to_geotiff(da, path, tile_size=32, compression='deflate')
     with open(path, 'rb') as f:
         data = bytearray(f.read())
@@ -44,9 +44,7 @@ def _build_forged_stripped_tif(tmp_path, byte_count_value: int) -> str:
     """Write a strip-organized TIFF, patch every StripByteCounts entry."""
     arr = np.arange(64 * 64, dtype=np.float32).reshape(64, 64)
     da = xr.DataArray(arr, dims=['y', 'x'])
-    path = str(tmp_path / "forged_local_strips_1664.tif")
-    # tiled=False forces strip layout; deflate gets the decompressor on
-    # the hot path so a huge declared size matters.
+    path = str(tmp_path / "forged_strips.tif")
     to_geotiff(da, path, tiled=False, compression='deflate')
     with open(path, 'rb') as f:
         data = bytearray(f.read())
@@ -64,7 +62,6 @@ def _build_forged_stripped_tif(tmp_path, byte_count_value: int) -> str:
 class TestLocalTileByteCap:
     def test_huge_tile_byte_count_rejected(self, tmp_path, monkeypatch):
         """A local tile with a huge TileByteCount raises before decode."""
-        # 100 MB > the 1 MB cap we set below.
         path = _build_forged_tiled_cog(tmp_path, 100 * 1024 * 1024)
         monkeypatch.setenv('XRSPATIAL_COG_MAX_TILE_BYTES', str(1024 * 1024))
 
@@ -78,7 +75,6 @@ def test_error_message_names_value_and_cap(self, tmp_path, monkeypatch):
         with pytest.raises(ValueError) as excinfo:
             open_geotiff(path)
         msg = str(excinfo.value)
-        # The forged value (52,428,800) and the cap (1,024) both appear.
         assert "52,428,800" in msg or "52428800" in msg
         assert "1,024" in msg or "1024" in msg
         assert "denial-of-service" in msg.lower() or "malformed" in msg
@@ -87,7 +83,7 @@ def test_normal_local_cog_under_default_cap(self, tmp_path):
         """Legitimate local reads with the default cap still succeed."""
         arr = np.arange(64 * 64, dtype=np.float32).reshape(64, 64)
         da = xr.DataArray(arr, dims=['y', 'x'])
-        path = str(tmp_path / "normal_local_1664.tif")
+        path = str(tmp_path / "normal_local.tif")
         to_geotiff(da, path, tile_size=32, compression='deflate')
 
         result = open_geotiff(path)
@@ -95,27 +91,16 @@ def test_normal_local_cog_under_default_cap(self, tmp_path):
 
     def test_env_override_lifts_cap(self, tmp_path, monkeypatch):
         """A user with legitimate large tiles can lift the cap via env."""
-        # 50 MB declared. With cap=64 MB the read succeeds even though
-        # the underlying compressed slice is smaller (mmap truncates at
-        # EOF).
         path = _build_forged_tiled_cog(tmp_path, 50 * 1024 * 1024)
         monkeypatch.setenv(
             'XRSPATIAL_COG_MAX_TILE_BYTES', str(64 * 1024 * 1024))
 
-        # Read may raise inside the decompressor (the truncated mmap
-        # slice is garbage to deflate) but it must NOT raise the cap
-        # error. The thing we care about is that the cap check passes.
         try:
             open_geotiff(path)
         except ValueError as e:
             assert "exceeds the per-tile safety cap" not in str(e)
 
 
-# ---------------------------------------------------------------------------
-# Strip-organized local reads
-# ---------------------------------------------------------------------------
-
-
 class TestLocalStripByteCap:
     def test_huge_strip_byte_count_rejected(self, tmp_path, monkeypatch):
         path = _build_forged_stripped_tif(tmp_path, 100 * 1024 * 1024)
@@ -141,13 +126,7 @@ def test_strip_error_message_mentions_strip(self, tmp_path, monkeypatch):
 
 
 def test_max_tile_bytes_env_negative_falls_back(monkeypatch):
-    """Negative env value falls back to the default, not a 1-byte cap.
-
-    Earlier drafts clamped to ``max(1, val)`` which made a typo
-    (``XRSPATIAL_COG_MAX_TILE_BYTES=-1``) silently reject every tile.
-    The current policy matches ``_http_timeout_from_env``: any non-
-    positive integer is ignored.
-    """
+    """Negative env value falls back to the default, not a 1-byte cap."""
     monkeypatch.setenv('XRSPATIAL_COG_MAX_TILE_BYTES', '-5')
     assert (
         _reader_mod._max_tile_bytes_from_env()
@@ -170,3 +149,73 @@ def test_max_tile_bytes_env_garbage_falls_back(monkeypatch):
         _reader_mod._max_tile_bytes_from_env()
         == _reader_mod.MAX_TILE_BYTES_DEFAULT
     )
+
+
+# ---------------------------------------------------------------------------
+# GPU eager path: per-tile byte cap
+# ---------------------------------------------------------------------------
+
+
+class TestGpuTileByteCap:
+    @_gpu_only
+    def test_huge_tile_byte_count_rejected(self, tmp_path, monkeypatch):
+        """A local tile with a huge TileByteCount raises before GPU decode."""
+        path = _build_forged_tiled_cog(
+            tmp_path, 100 * 1024 * 1024, basename="forged_gpu_tiles")
+        monkeypatch.setenv("XRSPATIAL_COG_MAX_TILE_BYTES", str(1024 * 1024))
+
+        with pytest.raises(ValueError, match="TileByteCount"):
+            read_geotiff_gpu(path)
+
+    @_gpu_only
+    def test_error_message_names_value_and_cap(self, tmp_path, monkeypatch):
+        path = _build_forged_tiled_cog(
+            tmp_path, 50 * 1024 * 1024, basename="forged_gpu_tiles_msg")
+        monkeypatch.setenv("XRSPATIAL_COG_MAX_TILE_BYTES", str(1024))
+
+        with pytest.raises(ValueError) as excinfo:
+            read_geotiff_gpu(path)
+        msg = str(excinfo.value)
+        assert "52,428,800" in msg or "52428800" in msg
+        assert "1,024" in msg or "1024" in msg
+        assert "denial-of-service" in msg.lower() or "malformed" in msg
+
+    @_gpu_only
+    def test_normal_gpu_read_under_default_cap(self, tmp_path):
+        """Legitimate GPU reads with the default cap still succeed."""
+        arr = np.arange(64 * 64, dtype=np.float32).reshape(64, 64)
+        da = xr.DataArray(arr, dims=["y", "x"])
+        path = str(tmp_path / "normal_gpu.tif")
+        to_geotiff(da, path, tile_size=32, compression="deflate")
+
+        result = read_geotiff_gpu(path)
+        np.testing.assert_array_equal(result.data.get(), arr)
+
+    @_gpu_only
+    def test_env_override_lifts_cap(self, tmp_path, monkeypatch):
+        """A user with legitimate large tiles can lift the cap via env."""
+        path = _build_forged_tiled_cog(
+            tmp_path, 50 * 1024 * 1024, basename="forged_gpu_tiles_override")
+        monkeypatch.setenv(
+            "XRSPATIAL_COG_MAX_TILE_BYTES", str(64 * 1024 * 1024))
+
+        try:
+            read_geotiff_gpu(path)
+        except Exception as exc:
+            assert "exceeds the per-tile safety cap" not in str(exc), (
+                "cap loop fired despite the env override lifting the cap"
+            )
+
+
+class TestGpuChunkedTileByteCap:
+    @_gpu_only
+    def test_chunked_huge_tile_byte_count_rejected(
+            self, tmp_path, monkeypatch):
+        """The dask + GPU chunked path also rejects a huge TileByteCount."""
+        path = _build_forged_tiled_cog(
+            tmp_path, 100 * 1024 * 1024, basename="forged_gpu_chunked")
+        monkeypatch.setenv(
+            "XRSPATIAL_COG_MAX_TILE_BYTES", str(1024 * 1024))
+
+        with pytest.raises(ValueError, match="TileByteCount"):
+            read_geotiff_gpu(path, chunks=32)
diff --git a/xrspatial/geotiff/tests/test_apply_nodata_mask_gpu_with_presence_removed_2208.py b/xrspatial/geotiff/tests/test_apply_nodata_mask_gpu_with_presence_removed_2208.py
deleted file mode 100644
index 9a4b27207..000000000
--- a/xrspatial/geotiff/tests/test_apply_nodata_mask_gpu_with_presence_removed_2208.py
+++ /dev/null
@@ -1,34 +0,0 @@
-"""Regression test for issue #2208.
-
-After #2207 routed all three GPU eager sites through
-``_finalize_eager_read``, the sibling helper
-``_apply_nodata_mask_gpu_with_presence`` had no remaining callers. The
-helper was removed in this PR. This test pins the removal so a future
-PR cannot quietly re-introduce a dead callable.
-
-``_apply_nodata_mask_gpu`` is still alive on the chunked GPU dask path
-(``_backends/gpu.py`` ``_chunk_task``), so this test also asserts that
-helper is still importable as a sanity check that the removal was
-surgical.
-"""
-import pytest
-
-from xrspatial.geotiff._backends import _gpu_helpers
-
-
-def test_apply_nodata_mask_gpu_with_presence_not_importable_2208():
-    # Covers both module-attribute absence and the import-time surface.
-    # _apply_nodata_mask_gpu_with_presence was removed in #2208 after
-    # #2207 routed all GPU eager sites through _finalize_eager_read;
-    # the helper had zero remaining callers.
-    with pytest.raises(ImportError):
-        from xrspatial.geotiff._backends._gpu_helpers import \
-            _apply_nodata_mask_gpu_with_presence  # noqa: F401
-
-
-def test_apply_nodata_mask_gpu_still_present_2208():
-    # _apply_nodata_mask_gpu is still on the chunked GPU dask path
-    # (_chunk_task in _backends/gpu.py); removal in #2208 was scoped
-    # to the dead sibling only.
-    assert hasattr(_gpu_helpers, '_apply_nodata_mask_gpu')
-    assert callable(_gpu_helpers._apply_nodata_mask_gpu)
diff --git a/xrspatial/geotiff/tests/test_band_validation_1673.py b/xrspatial/geotiff/tests/test_band_validation_1673.py
deleted file mode 100644
index 36e4ede92..000000000
--- a/xrspatial/geotiff/tests/test_band_validation_1673.py
+++ /dev/null
@@ -1,125 +0,0 @@
-"""Regression tests for issue #1673.
-
-``read_to_array`` accepts a ``band`` argument and applies it to the
-decoded array via ``arr[:, :, band]`` without validating the index.
-Two failure modes follow:
-
-* ``band=-1`` silently selects the last channel via numpy negative
-  indexing. The public contract is "0-based non-negative index", so
-  this is a silent semantic shift, not an explicit selection.
-* ``band=N`` with ``N >= samples_per_pixel`` raises a raw numpy
-  ``IndexError`` whose message ("index N is out of bounds for axis
-  2 with size M") leaks the internal slice shape.
-
-The dask path (``read_geotiff_dask``) and the GPU path both validate
-``band`` up front and raise ``IndexError("band=N out of range for
-M-band file.")``. These tests pin the local eager path to the same
-contract so backend parity holds.
-"""
-from __future__ import annotations
-
-import numpy as np
-import pytest
-import xarray as xr
-
-
-@pytest.fixture
-def multiband_tiff_path(tmp_path):
-    """4x6 three-band tiled tiff for band-validation tests."""
-    from xrspatial.geotiff import to_geotiff
-
-    arr = np.arange(72, dtype=np.float32).reshape(4, 6, 3)
-    da = xr.DataArray(
-        arr,
-        dims=['y', 'x', 'band'],
-        coords={
-            'y': np.array([0.5, 1.5, 2.5, 3.5]),
-            'x': np.array([0.5, 1.5, 2.5, 3.5, 4.5, 5.5]),
-            'band': [0, 1, 2],
-        },
-        attrs={'crs': 4326},
-    )
-    p = tmp_path / 'mb_1673.tif'
-    to_geotiff(da, str(p), tile_size=16)
-    return str(p), arr
-
-
-def test_read_to_array_negative_band_rejected(multiband_tiff_path):
-    """``band=-1`` no longer silently selects the last channel."""
-    from xrspatial.geotiff._reader import read_to_array
-
-    path, _ = multiband_tiff_path
-    with pytest.raises(IndexError, match="band=-1 out of range"):
-        read_to_array(path, band=-1)
-
-
-def test_read_to_array_band_equal_to_samples_rejected(multiband_tiff_path):
-    """``band=samples_per_pixel`` (off-by-one) raises a typed error."""
-    from xrspatial.geotiff._reader import read_to_array
-
-    path, _ = multiband_tiff_path
-    # File has 3 bands; valid indices are 0, 1, 2.
-    with pytest.raises(IndexError, match="band=3 out of range"):
-        read_to_array(path, band=3)
-
-
-def test_read_to_array_band_far_above_samples_rejected(multiband_tiff_path):
-    """A wildly out-of-range band index gives the same typed error."""
-    from xrspatial.geotiff._reader import read_to_array
-
-    path, _ = multiband_tiff_path
-    with pytest.raises(IndexError, match="band=103 out of range"):
-        read_to_array(path, band=103)
-
-
-def test_read_to_array_valid_band_still_works(multiband_tiff_path):
-    """Valid band indices keep working after the validation guard."""
-    from xrspatial.geotiff._reader import read_to_array
-
-    path, arr = multiband_tiff_path
-    out, _ = read_to_array(path, band=1)
-    np.testing.assert_array_equal(out, arr[:, :, 1])
-
-
-def test_read_to_array_band_none_still_returns_all_bands(multiband_tiff_path):
-    """``band=None`` still returns the full multi-band array."""
-    from xrspatial.geotiff._reader import read_to_array
-
-    path, arr = multiband_tiff_path
-    out, _ = read_to_array(path)
-    np.testing.assert_array_equal(out, arr)
-
-
-def test_backend_parity_negative_band(multiband_tiff_path):
-    """Local eager and dask paths raise the same error for ``band=-1``."""
-    from xrspatial.geotiff import read_geotiff_dask
-    from xrspatial.geotiff._reader import read_to_array
-
-    path, _ = multiband_tiff_path
-
-    with pytest.raises(IndexError) as eager_exc:
-        read_to_array(path, band=-1)
-    with pytest.raises(IndexError) as dask_exc:
-        read_geotiff_dask(path, chunks=4, band=-1)
-
-    # Same error type and same diagnostic substring; the dask message
-    # is "band=-1 out of range for 3-band file." so any 0-based caller
-    # gets identical signal regardless of which backend they pick.
-    assert "band=-1 out of range" in str(eager_exc.value)
-    assert "band=-1 out of range" in str(dask_exc.value)
-
-
-def test_backend_parity_band_equal_to_samples(multiband_tiff_path):
-    """Local eager and dask paths agree on the off-by-one rejection."""
-    from xrspatial.geotiff import read_geotiff_dask
-    from xrspatial.geotiff._reader import read_to_array
-
-    path, _ = multiband_tiff_path
-
-    with pytest.raises(IndexError) as eager_exc:
-        read_to_array(path, band=3)
-    with pytest.raises(IndexError) as dask_exc:
-        read_geotiff_dask(path, chunks=4, band=3)
-
-    assert "band=3 out of range" in str(eager_exc.value)
-    assert "band=3 out of range" in str(dask_exc.value)
diff --git a/xrspatial/geotiff/tests/test_compression.py b/xrspatial/geotiff/tests/test_compression.py
deleted file mode 100644
index b8f5bc5d1..000000000
--- a/xrspatial/geotiff/tests/test_compression.py
+++ /dev/null
@@ -1,118 +0,0 @@
-"""Tests for compression codecs."""
-from __future__ import annotations
-
-import numpy as np
-import pytest
-
-from xrspatial.geotiff._compression import (COMPRESSION_DEFLATE, COMPRESSION_LZW, COMPRESSION_NONE,
-                                            compress, decompress, deflate_compress,
-                                            deflate_decompress, lzw_compress, lzw_decompress,
-                                            predictor_decode, predictor_encode)
-
-
-class TestDeflate:
-    def test_round_trip(self):
-        data = b'hello world! ' * 100
-        compressed = deflate_compress(data)
-        assert compressed != data
-        assert deflate_decompress(compressed) == data
-
-    def test_empty(self):
-        compressed = deflate_compress(b'')
-        assert deflate_decompress(compressed) == b''
-
-    def test_binary_data(self):
-        data = bytes(range(256)) * 10
-        compressed = deflate_compress(data)
-        assert deflate_decompress(compressed) == data
-
-
-class TestLZW:
-    def test_round_trip_simple(self):
-        data = b'ABCABCABCABC'
-        compressed = lzw_compress(data)
-        decompressed = lzw_decompress(compressed, len(data))
-        assert decompressed.tobytes() == data
-
-    def test_round_trip_repetitive(self):
-        data = b'\x00' * 1000
-        compressed = lzw_compress(data)
-        decompressed = lzw_decompress(compressed, len(data))
-        assert decompressed.tobytes() == data
-
-    def test_round_trip_sequential(self):
-        data = bytes(range(256))
-        compressed = lzw_compress(data)
-        decompressed = lzw_decompress(compressed, len(data))
-        assert decompressed.tobytes() == data
-
-    def test_round_trip_random(self):
-        rng = np.random.RandomState(42)
-        data = bytes(rng.randint(0, 256, size=500, dtype=np.uint8))
-        compressed = lzw_compress(data)
-        decompressed = lzw_decompress(compressed, len(data))
-        assert decompressed.tobytes() == data
-
-    def test_round_trip_large(self):
-        rng = np.random.RandomState(123)
-        data = bytes(rng.randint(0, 256, size=10000, dtype=np.uint8))
-        compressed = lzw_compress(data)
-        decompressed = lzw_decompress(compressed, len(data))
-        assert decompressed.tobytes() == data
-
-    def test_empty(self):
-        compressed = lzw_compress(b'')
-        decompressed = lzw_decompress(compressed, 0)
-        assert decompressed.tobytes() == b''
-
-
-class TestPredictor:
-    def test_round_trip_uint8(self):
-        # 4x4 image, 1 byte per sample
-        data = np.array([10, 20, 30, 40, 50, 60, 70, 80,
-                         90, 100, 110, 120, 130, 140, 150, 160],
-                        dtype=np.uint8)
-        encoded = predictor_encode(data.copy(), 4, 4, 1)
-        decoded = predictor_decode(encoded.copy(), 4, 4, 1)
-        np.testing.assert_array_equal(decoded, data)
-
-    def test_round_trip_float32(self):
-        # 2x3 image, 4 bytes per sample
-        arr = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], dtype=np.float32)
-        raw = np.frombuffer(arr.tobytes(), dtype=np.uint8).copy()
-        encoded = predictor_encode(raw.copy(), 3, 2, 4)
-        decoded = predictor_decode(encoded.copy(), 3, 2, 4)
-        np.testing.assert_array_equal(decoded, raw)
-
-    def test_predictor_encode_differences(self):
-        # First pixel unchanged, rest are differences
-        data = np.array([10, 20, 30, 40], dtype=np.uint8)
-        encoded = predictor_encode(data.copy(), 4, 1, 1)
-        assert encoded[0] == 10
-        assert encoded[1] == 10  # 20 - 10
-        assert encoded[2] == 10  # 30 - 20
-        assert encoded[3] == 10  # 40 - 30
-
-
-class TestDispatch:
-    def test_none(self):
-        data = b'hello'
-        assert decompress(data, COMPRESSION_NONE).tobytes() == data
-        assert compress(data, COMPRESSION_NONE) == data
-
-    def test_deflate(self):
-        data = b'test data ' * 50
-        compressed = compress(data, COMPRESSION_DEFLATE)
-        assert decompress(compressed, COMPRESSION_DEFLATE).tobytes() == data
-
-    def test_lzw(self):
-        data = b'ABCABC' * 20
-        compressed = compress(data, COMPRESSION_LZW)
-        decompressed = decompress(compressed, COMPRESSION_LZW, len(data))
-        assert decompressed.tobytes() == data
-
-    def test_unsupported(self):
-        with pytest.raises(ValueError, match="Unsupported compression"):
-            decompress(b'', 99)
-        with pytest.raises(ValueError, match="Unsupported compression"):
-            compress(b'', 99)
diff --git a/xrspatial/geotiff/tests/test_descending_coords_1716.py b/xrspatial/geotiff/tests/test_descending_coords_1716.py
deleted file mode 100644
index ff045b820..000000000
--- a/xrspatial/geotiff/tests/test_descending_coords_1716.py
+++ /dev/null
@@ -1,127 +0,0 @@
-"""Regression tests for issue #1716.
-
-``to_geotiff`` previously stored ``abs(pixel_width)`` / ``abs(pixel_height)``
-in ModelPixelScaleTag and the reader hard-coded a north-up reconstruction.
-DataArrays with descending x or ascending y silently round-tripped with the
-wrong georeference.  The writer now emits ModelTransformationTag (34264)
-for non-standard orientations so the sign survives the round trip.
-"""
-from __future__ import annotations
-
-import numpy as np
-import xarray as xr
-
-from xrspatial.geotiff import open_geotiff, to_geotiff
-from xrspatial.geotiff._geotags import (TAG_MODEL_PIXEL_SCALE, TAG_MODEL_TIEPOINT,
-                                        TAG_MODEL_TRANSFORMATION)
-from xrspatial.geotiff._header import parse_all_ifds, parse_header
-
-
-def _ifd_tag_ids(path: str) -> set[int]:
-    with open(path, 'rb') as fh:
-        data = fh.read()
-    header = parse_header(data)
-    ifds = parse_all_ifds(data, header)
-    return set(ifds[0].entries.keys())
-
-
-def _make_da(x_coords: np.ndarray, y_coords: np.ndarray) -> xr.DataArray:
-    arr = np.arange(len(y_coords) * len(x_coords), dtype=np.float32)
-    arr = arr.reshape(len(y_coords), len(x_coords))
-    return xr.DataArray(
-        arr,
-        dims=('y', 'x'),
-        coords={'y': y_coords, 'x': x_coords},
-    )
-
-
-def test_descending_x_roundtrip(tmp_path):
-    """Descending x coords survive the round trip."""
-    # x decreases left-to-right (unusual but valid)
-    x = np.array([200.0, 190.0, 180.0, 170.0, 160.0], dtype=np.float64)
-    y = np.array([50.0, 40.0, 30.0, 20.0], dtype=np.float64)  # north-up
-    da = _make_da(x, y)
-
-    out = tmp_path / 'tmp_1716_desc_x.tif'
-    to_geotiff(da, str(out), crs=4326)
-
-    loaded = open_geotiff(str(out))
-    np.testing.assert_allclose(loaded.coords['x'].values, x)
-    np.testing.assert_allclose(loaded.coords['y'].values, y)
-    np.testing.assert_array_equal(loaded.values, da.values)
-
-
-def test_ascending_y_roundtrip(tmp_path):
-    """Ascending y coords survive the round trip."""
-    x = np.array([160.0, 170.0, 180.0, 190.0, 200.0], dtype=np.float64)
-    # y increases top-to-bottom (south-up)
-    y = np.array([20.0, 30.0, 40.0, 50.0], dtype=np.float64)
-    da = _make_da(x, y)
-
-    out = tmp_path / 'tmp_1716_asc_y.tif'
-    to_geotiff(da, str(out), crs=4326)
-
-    loaded = open_geotiff(str(out))
-    np.testing.assert_allclose(loaded.coords['x'].values, x)
-    np.testing.assert_allclose(loaded.coords['y'].values, y)
-    np.testing.assert_array_equal(loaded.values, da.values)
-
-
-def test_descending_x_and_ascending_y_roundtrip(tmp_path):
-    """Both axes flipped relative to north-up."""
-    x = np.array([200.0, 190.0, 180.0, 170.0, 160.0], dtype=np.float64)
-    y = np.array([20.0, 30.0, 40.0, 50.0], dtype=np.float64)
-    da = _make_da(x, y)
-
-    out = tmp_path / 'tmp_1716_desc_x_asc_y.tif'
-    to_geotiff(da, str(out), crs=4326)
-
-    loaded = open_geotiff(str(out))
-    np.testing.assert_allclose(loaded.coords['x'].values, x)
-    np.testing.assert_allclose(loaded.coords['y'].values, y)
-    np.testing.assert_array_equal(loaded.values, da.values)
-
-
-def test_north_up_still_uses_pixel_scale_and_tiepoint(tmp_path):
-    """Standard north-up orientation keeps ModelPixelScale + ModelTiepoint."""
-    x = np.array([160.0, 170.0, 180.0, 190.0, 200.0], dtype=np.float64)
-    y = np.array([50.0, 40.0, 30.0, 20.0], dtype=np.float64)
-    da = _make_da(x, y)
-
-    out = tmp_path / 'tmp_1716_north_up.tif'
-    to_geotiff(da, str(out), crs=4326)
-
-    tag_ids = _ifd_tag_ids(str(out))
-    assert TAG_MODEL_PIXEL_SCALE in tag_ids
-    assert TAG_MODEL_TIEPOINT in tag_ids
-    assert TAG_MODEL_TRANSFORMATION not in tag_ids
-
-
-def test_descending_x_uses_transformation_tag(tmp_path):
-    """Non-standard orientation emits ModelTransformationTag and skips
-    the scale/tiepoint pair."""
-    x = np.array([200.0, 190.0, 180.0, 170.0, 160.0], dtype=np.float64)
-    y = np.array([50.0, 40.0, 30.0, 20.0], dtype=np.float64)
-    da = _make_da(x, y)
-
-    out = tmp_path / 'tmp_1716_desc_x_tags.tif'
-    to_geotiff(da, str(out), crs=4326)
-
-    tag_ids = _ifd_tag_ids(str(out))
-    assert TAG_MODEL_TRANSFORMATION in tag_ids
-    assert TAG_MODEL_PIXEL_SCALE not in tag_ids
-    assert TAG_MODEL_TIEPOINT not in tag_ids
-
-
-def test_ascending_y_uses_transformation_tag(tmp_path):
-    x = np.array([160.0, 170.0, 180.0, 190.0, 200.0], dtype=np.float64)
-    y = np.array([20.0, 30.0, 40.0, 50.0], dtype=np.float64)
-    da = _make_da(x, y)
-
-    out = tmp_path / 'tmp_1716_asc_y_tags.tif'
-    to_geotiff(da, str(out), crs=4326)
-
-    tag_ids = _ifd_tag_ids(str(out))
-    assert TAG_MODEL_TRANSFORMATION in tag_ids
-    assert TAG_MODEL_PIXEL_SCALE not in tag_ids
-    assert TAG_MODEL_TIEPOINT not in tag_ids
diff --git a/xrspatial/geotiff/tests/test_dtype_read.py b/xrspatial/geotiff/tests/test_dtype_read.py
deleted file mode 100644
index 0026e1364..000000000
--- a/xrspatial/geotiff/tests/test_dtype_read.py
+++ /dev/null
@@ -1,116 +0,0 @@
-"""Tests for dtype parameter on open_geotiff."""
-import numpy as np
-import pytest
-import xarray as xr
-
-from xrspatial.geotiff import open_geotiff, to_geotiff
-
-
-@pytest.fixture
-def float64_tif(tmp_path):
-    """Write a float64 GeoTIFF for dtype cast tests."""
-    arr = np.random.default_rng(99).random((80, 80)).astype(np.float64)
-    y = np.linspace(40.0, 41.0, 80)
-    x = np.linspace(-105.0, -104.0, 80)
-    da = xr.DataArray(arr, dims=['y', 'x'],
-                      coords={'y': y, 'x': x},
-                      attrs={'crs': 4326})
-    path = str(tmp_path / 'test_1083_f64.tif')
-    to_geotiff(da, path, compression='none')
-    return path, arr
-
-
-@pytest.fixture
-def uint16_tif(tmp_path):
-    """Write a uint16 GeoTIFF for dtype cast tests."""
-    arr = np.random.default_rng(77).integers(0, 10000, (60, 60),
-                                             dtype=np.uint16)
-    y = np.linspace(40.0, 41.0, 60)
-    x = np.linspace(-105.0, -104.0, 60)
-    da = xr.DataArray(arr, dims=['y', 'x'],
-                      coords={'y': y, 'x': x},
-                      attrs={'crs': 4326})
-    path = str(tmp_path / 'test_1083_u16.tif')
-    to_geotiff(da, path, compression='none')
-    return path, arr
-
-
-class TestDtypeEager:
-    def test_float64_to_float32(self, float64_tif):
-        path, orig = float64_tif
-        result = open_geotiff(path, dtype='float32')
-        assert result.dtype == np.float32
-        np.testing.assert_array_almost_equal(
-            result.values, orig.astype(np.float32), decimal=6)
-
-    def test_float64_to_float16(self, float64_tif):
-        path, orig = float64_tif
-        result = open_geotiff(path, dtype=np.float16)
-        assert result.dtype == np.float16
-
-    def test_uint16_to_int32(self, uint16_tif):
-        path, orig = uint16_tif
-        result = open_geotiff(path, dtype='int32')
-        assert result.dtype == np.int32
-        np.testing.assert_array_equal(result.values, orig.astype(np.int32))
-
-    def test_uint16_to_uint8(self, uint16_tif):
-        path, _ = uint16_tif
-        result = open_geotiff(path, dtype='uint8')
-        assert result.dtype == np.uint8
-
-    def test_float_to_int_raises(self, float64_tif):
-        path, _ = float64_tif
-        with pytest.raises(ValueError, match='float.*int'):
-            open_geotiff(path, dtype='int32')
-
-    def test_dtype_none_preserves_native(self, float64_tif):
-        path, _ = float64_tif
-        result = open_geotiff(path, dtype=None)
-        assert result.dtype == np.float64
-
-    def test_int_with_nodata_float_to_int_raises(self, tmp_path):
-        """uint16 file with nodata: nodata masking promotes to float64, so float->int validation fires."""  # noqa: E501
-        arr = np.array([[1, 2], [3, 9999]], dtype=np.uint16)
-        y = np.linspace(40.0, 41.0, 2)
-        x = np.linspace(-105.0, -104.0, 2)
-        da = xr.DataArray(arr, dims=['y', 'x'],
-                          coords={'y': y, 'x': x},
-                          attrs={'crs': 4326, 'nodata': 9999.0})
-        path = str(tmp_path / 'test_1083_nodata_int_eager.tif')
-        to_geotiff(da, path, compression='none')
-        with pytest.raises(ValueError, match='float.*int'):
-            open_geotiff(path, dtype='int32')
-
-
-class TestDtypeDask:
-    def test_float64_to_float32_dask(self, float64_tif):
-        path, orig = float64_tif
-        result = open_geotiff(path, dtype='float32', chunks=40)
-        assert result.dtype == np.float32
-        computed = result.values
-        np.testing.assert_array_almost_equal(
-            computed, orig.astype(np.float32), decimal=6)
-
-    def test_chunks_are_target_dtype(self, float64_tif):
-        path, _ = float64_tif
-        result = open_geotiff(path, dtype='float32', chunks=40)
-        assert result.data.dtype == np.float32
-
-    def test_float_to_int_raises_dask(self, float64_tif):
-        path, _ = float64_tif
-        with pytest.raises(ValueError, match='float.*int'):
-            open_geotiff(path, dtype='int32', chunks=40)
-
-    def test_int_with_nodata_float_to_int_raises_dask(self, tmp_path):
-        """uint16 file with nodata: nodata masking promotes to float64, so float->int validation fires."""  # noqa: E501
-        arr = np.array([[1, 2], [3, 9999]], dtype=np.uint16)
-        y = np.linspace(40.0, 41.0, 2)
-        x = np.linspace(-105.0, -104.0, 2)
-        da = xr.DataArray(arr, dims=['y', 'x'],
-                          coords={'y': y, 'x': x},
-                          attrs={'crs': 4326, 'nodata': 9999.0})
-        path = str(tmp_path / 'test_1083_nodata_int_dask.tif')
-        to_geotiff(da, path, compression='none')
-        with pytest.raises(ValueError, match='float.*int'):
-            open_geotiff(path, dtype='int32', chunks=2)
diff --git a/xrspatial/geotiff/tests/test_float16_read_1941.py b/xrspatial/geotiff/tests/test_float16_read_1941.py
deleted file mode 100644
index e642bc90d..000000000
--- a/xrspatial/geotiff/tests/test_float16_read_1941.py
+++ /dev/null
@@ -1,138 +0,0 @@
-"""Regression tests for issue #1941.
-
-External GeoTIFFs that store IEEE half-precision floats (``BitsPerSample
-=16`` + ``SampleFormat=3``) used to raise ``ValueError("Unsupported
-BitsPerSample=16, SampleFormat=3")`` from ``tiff_dtype_to_numpy``. The
-writer auto-promotes float16 inputs to float32 before encoding, so the
-write side could not produce such a file, but reads from rasterio /
-GDAL / tifffile-produced files broke read-parity.
-
-The fix:
-
-* ``tiff_dtype_to_numpy(16, 3)`` returns ``np.float32`` (symmetric with
-  the writer's auto-promotion).
-* A new ``tiff_storage_dtype`` returns ``np.float16`` for the same key
-  so the byte-view in ``_decode_strip_or_tile`` reads the raw 2-byte
-  samples correctly before casting to float32.
-* The GPU paths fall back to CPU decode when bps != dtype.itemsize * 8,
-  matching the existing stripped-layout fallback.
-"""
-from __future__ import annotations
-
-import numpy as np
-import pytest
-
-from xrspatial.geotiff import open_geotiff, read_geotiff_dask
-from xrspatial.geotiff._dtypes import (SAMPLE_FORMAT_FLOAT, SAMPLE_FORMAT_INT, SAMPLE_FORMAT_UINT,
-                                       tiff_dtype_to_numpy, tiff_storage_dtype)
-
-
-class TestDtypeMap:
-    """The dtype map auto-promotes float16 on read."""
-
-    def test_tiff_dtype_to_numpy_float16(self):
-        assert tiff_dtype_to_numpy(16, SAMPLE_FORMAT_FLOAT) == np.float32
-
-    def test_tiff_storage_dtype_float16(self):
-        assert tiff_storage_dtype(16, SAMPLE_FORMAT_FLOAT) == np.float16
-
-    def test_tiff_storage_dtype_delegates_for_non_promoted(self):
-        # Non-promoted keys behave identically.
-        for bps, sf in [
-            (8, SAMPLE_FORMAT_UINT),
-            (16, SAMPLE_FORMAT_UINT),
-            (16, SAMPLE_FORMAT_INT),
-            (32, SAMPLE_FORMAT_FLOAT),
-            (64, SAMPLE_FORMAT_FLOAT),
-        ]:
-            assert tiff_storage_dtype(bps, sf) == tiff_dtype_to_numpy(bps, sf)
-
-
-@pytest.fixture
-def float16_tif(tmp_path):
-    """Write a small float16 GeoTIFF using tifffile.
-
-    tifffile encodes numpy float16 with ``BitsPerSample=16`` and
-    ``SampleFormat=3``, which is what an external rasterio / GDAL caller
-    would produce.
-    """
-    tifffile = pytest.importorskip("tifffile")
-    arr = np.array(
-        [[0.0, 1.0, 2.0, 3.0],
-         [-1.0, -2.0, -3.0, -4.0],
-         [0.5, 1.5, 2.5, 3.5],
-         [100.0, 200.0, 300.0, 400.0]],
-        dtype=np.float16,
-    )
-    path = tmp_path / "f16.tif"
-    tifffile.imwrite(str(path), arr, compression=None)
-    return path, arr
-
-
-class TestEagerFloat16Read:
-    """``open_geotiff`` decodes an external float16 file to float32."""
-
-    def test_open_geotiff_returns_float32(self, float16_tif):
-        path, arr = float16_tif
-        result = open_geotiff(str(path))
-        assert result.dtype == np.float32
-        # Float16 values fit exactly in float32, so equality is well-defined.
-        np.testing.assert_array_equal(result.values, arr.astype(np.float32))
-
-    def test_open_geotiff_dask_returns_float32(self, float16_tif):
-        path, arr = float16_tif
-        result = read_geotiff_dask(str(path), chunks=2)
-        assert result.dtype == np.float32
-        np.testing.assert_array_equal(
-            result.compute().values, arr.astype(np.float32))
-
-
-class TestPredictor3Float16:
-    """Predictor=3 + float16 on disk also decodes correctly."""
-
-    def test_predictor3_float16_round_trip(self, tmp_path):
-        tifffile = pytest.importorskip("tifffile")
-        pytest.importorskip("imagecodecs")  # required for predictor=3
-        arr = np.linspace(-1.0, 1.0, 16).astype(np.float16).reshape(4, 4)
-        path = tmp_path / "pred3_f16.tif"
-        tifffile.imwrite(
-            str(path), arr, predictor=3, compression="deflate")
-
-        result = open_geotiff(str(path))
-        assert result.dtype == np.float32
-        np.testing.assert_array_equal(
-            result.values, arr.astype(np.float32))
-
-
-class TestRegressionGuards:
-    """The promotion did not change non-float16 behaviour."""
-
-    def test_float32_still_float32(self, tmp_path):
-        tifffile = pytest.importorskip("tifffile")
-        arr = np.arange(16, dtype=np.float32).reshape(4, 4)
-        path = tmp_path / "f32.tif"
-        tifffile.imwrite(str(path), arr)
-
-        result = open_geotiff(str(path))
-        assert result.dtype == np.float32
-        np.testing.assert_array_equal(result.values, arr)
-
-    def test_float64_still_float64(self, tmp_path):
-        tifffile = pytest.importorskip("tifffile")
-        arr = np.arange(16, dtype=np.float64).reshape(4, 4)
-        path = tmp_path / "f64.tif"
-        tifffile.imwrite(str(path), arr)
-
-        result = open_geotiff(str(path))
-        assert result.dtype == np.float64
-        np.testing.assert_array_equal(result.values, arr)
-
-    def test_uint16_still_uint16(self, tmp_path):
-        tifffile = pytest.importorskip("tifffile")
-        arr = np.arange(16, dtype=np.uint16).reshape(4, 4)
-        path = tmp_path / "u16.tif"
-        tifffile.imwrite(str(path), arr)
-
-        result = open_geotiff(str(path))
-        assert result.dtype == np.uint16
-        np.testing.assert_array_equal(result.values, arr)
diff --git a/xrspatial/geotiff/tests/test_float16_read_gpu_1941.py b/xrspatial/geotiff/tests/test_float16_read_gpu_1941.py
deleted file mode 100644
index 3983e08e0..000000000
--- a/xrspatial/geotiff/tests/test_float16_read_gpu_1941.py
+++ /dev/null
@@ -1,340 +0,0 @@
-"""GPU backend coverage for issue #1941 (float16 read).
-
-#1941 added float16 auto-promotion on read by making
-``tiff_dtype_to_numpy(16, SAMPLE_FORMAT_FLOAT)`` return ``float32`` and
-adding the on-disk ``tiff_storage_dtype`` companion. The eager numpy and
-dask paths are covered by ``test_float16_read_1941.py``; this module
-closes the GPU and dask+GPU coverage gap.
-
-A regression that:
-
-* dropped the ``bps_mismatch`` stripped/odd-bps fallback at
-  ``_backends/gpu.py:357`` would route float16 stripped reads through
-  the tiled GPU decoder and mis-decode the half-precision samples;
-* dropped the ``bps_first == 16 and sample_format == SAMPLE_FORMAT_FLOAT``
-  early-out at ``_backends/gpu.py:791`` in ``_gds_chunk_path_available``
-  would send tiled float16 chunked reads down the kvikIO GDS path and
-  mis-stride the buffer;
-* dropped the entry at ``(16, SAMPLE_FORMAT_FLOAT) -> float32`` in
-  ``tiff_dtype_to_numpy`` would surface as ``ValueError("Unsupported
-  BitsPerSample=16, SampleFormat=3")`` from the GPU read paths.
-
-Every test goes through ``read_geotiff_gpu`` directly or through
-``open_geotiff(..., gpu=True)`` so the dispatcher path is also wired in.
-Builds without cupy or CUDA skip the whole suite via the module-level
-``pytestmark`` gate driven by ``_gpu_available()``.
-"""
-from __future__ import annotations
-
-import importlib.util
-
-import numpy as np
-import pytest
-
-
-def _gpu_available() -> bool:
-    if importlib.util.find_spec("cupy") is None:
-        return False
-    try:
-        import cupy
-
-        return bool(cupy.cuda.is_available())
-    except Exception:
-        return False
-
-
-_HAS_GPU = _gpu_available()
-pytestmark = pytest.mark.skipif(
-    not _HAS_GPU, reason="cupy + CUDA required for GPU float16 read tests",
-)
-
-
-@pytest.fixture
-def float16_stripped_tif(tmp_path):
-    """Stripped float16 GeoTIFF: triggers the bps_mismatch CPU fallback.
-
-    ``tifffile.imwrite`` without ``tile=`` produces a stripped layout, so
-    the GPU reader hits ``bps_mismatch=True`` (file_dtype.itemsize*8 == 32
-    but bps == 16) and falls back to ``_read_to_array`` on CPU before
-    copying to device.
-    """
-    tifffile = pytest.importorskip("tifffile")
-    arr = np.array(
-        [[0.0, 1.0, 2.0, 3.0],
-         [-1.0, -2.0, -3.0, -4.0],
-         [0.5, 1.5, 2.5, 3.5],
-         [100.0, 200.0, 300.0, 400.0]],
-        dtype=np.float16,
-    )
-    path = tmp_path / "f16_stripped.tif"
-    tifffile.imwrite(str(path), arr, compression=None)
-    return path, arr
-
-
-@pytest.fixture
-def float16_tiled_tif(tmp_path):
-    """Multi-tile float16 GeoTIFF: 32x32 image, 16x16 tiles (2x2 grid).
-
-    Tiled and deflate-compressed. The 2x2 tile grid exercises inter-tile
-    reassembly in the decoder path so a regression that mis-stitched
-    adjacent tiles would surface here. ``bps_mismatch`` short-circuits
-    the tiled GPU decode path and routes through the CPU decoder; the
-    GDS path is also gated off via ``_gds_chunk_path_available``
-    returning False for (bps=16, sf=3).
-    """
-    tifffile = pytest.importorskip("tifffile")
-    arr = np.arange(1024, dtype=np.float16).reshape(32, 32)
-    path = tmp_path / "f16_tiled.tif"
-    tifffile.imwrite(
-        str(path), arr, compression="deflate", tile=(16, 16))
-    return path, arr
-
-
-@pytest.fixture
-def float16_tiled_uncompressed_tif(tmp_path):
-    """Tiled uncompressed float16 GeoTIFF.
-
-    Mirrors ``float16_tiled_tif`` but with ``compression=None`` so the
-    tile-decode path is exercised without an extra deflate codec call.
-    Tile size 16 is the smallest tifffile allows.
-    """
-    tifffile = pytest.importorskip("tifffile")
-    arr = np.arange(256, dtype=np.float16).reshape(16, 16)
-    path = tmp_path / "f16_tiled_none.tif"
-    tifffile.imwrite(
-        str(path), arr, compression=None, tile=(16, 16))
-    return path, arr
-
-
-class TestEagerGPUReadFloat16:
-    """``read_geotiff_gpu`` returns float32 for stripped float16 input."""
-
-    def test_read_geotiff_gpu_stripped_returns_float32(
-        self, float16_stripped_tif
-    ):
-        from xrspatial.geotiff import read_geotiff_gpu
-
-        path, arr = float16_stripped_tif
-        result = read_geotiff_gpu(str(path))
-        assert result.dtype == np.float32, (
-            f"GPU read of float16 must return float32, got {result.dtype}"
-        )
-        np.testing.assert_array_equal(
-            result.data.get(), arr.astype(np.float32))
-
-    def test_read_geotiff_gpu_tiled_returns_float32(
-        self, float16_tiled_tif
-    ):
-        from xrspatial.geotiff import read_geotiff_gpu
-
-        path, arr = float16_tiled_tif
-        result = read_geotiff_gpu(str(path))
-        assert result.dtype == np.float32
-        np.testing.assert_array_equal(
-            result.data.get(), arr.astype(np.float32))
-
-    def test_read_geotiff_gpu_tiled_uncompressed_returns_float32(
-        self, float16_tiled_uncompressed_tif
-    ):
-        from xrspatial.geotiff import read_geotiff_gpu
-
-        path, arr = float16_tiled_uncompressed_tif
-        result = read_geotiff_gpu(str(path))
-        assert result.dtype == np.float32
-        np.testing.assert_array_equal(
-            result.data.get(), arr.astype(np.float32))
-
-    def test_open_geotiff_gpu_dispatcher_float16(self, float16_tiled_tif):
-        """``open_geotiff(gpu=True)`` dispatches correctly for float16."""
-        from xrspatial.geotiff import open_geotiff
-
-        path, arr = float16_tiled_tif
-        result = open_geotiff(str(path), gpu=True)
-        assert result.dtype == np.float32
-        np.testing.assert_array_equal(
-            result.data.get(), arr.astype(np.float32))
-
-
-class TestGPUWindowedFloat16:
-    """Windowed GPU reads honour the bps_mismatch fallback path."""
-
-    def test_read_geotiff_gpu_windowed_stripped(self, float16_stripped_tif):
-        from xrspatial.geotiff import read_geotiff_gpu
-
-        path, arr = float16_stripped_tif
-        result = read_geotiff_gpu(str(path), window=(0, 0, 2, 2))
-        assert result.dtype == np.float32
-        assert result.shape == (2, 2)
-        np.testing.assert_array_equal(
-            result.data.get(), arr[:2, :2].astype(np.float32))
-
-    def test_read_geotiff_gpu_windowed_tiled(self, float16_tiled_tif):
-        from xrspatial.geotiff import read_geotiff_gpu
-
-        path, arr = float16_tiled_tif
-        result = read_geotiff_gpu(str(path), window=(0, 0, 8, 8))
-        assert result.dtype == np.float32
-        assert result.shape == (8, 8)
-        np.testing.assert_array_equal(
-            result.data.get(), arr[:8, :8].astype(np.float32))
-
-
-class TestDaskGPUFloat16:
-    """``open_geotiff(chunks=, gpu=True)`` decodes float16 correctly."""
-
-    def test_dask_gpu_tiled_float16(self, float16_tiled_tif):
-        from xrspatial.geotiff import open_geotiff
-
-        path, arr = float16_tiled_tif
-        result = open_geotiff(str(path), chunks=8, gpu=True)
-        assert result.dtype == np.float32, (
-            f"dask+GPU read of float16 must return float32, got {result.dtype}"
-        )
-        # Compute the dask array; under dask+cupy, .compute() yields a
-        # cupy-backed DataArray, so the .data.get() step pulls to host.
-        computed = result.compute()
-        np.testing.assert_array_equal(
-            computed.data.get(), arr.astype(np.float32))
-
-    def test_read_geotiff_gpu_chunks_kwarg_float16(self, float16_tiled_tif):
-        """``read_geotiff_gpu(chunks=)`` also routes correctly."""
-        from xrspatial.geotiff import read_geotiff_gpu
-
-        path, arr = float16_tiled_tif
-        result = read_geotiff_gpu(str(path), chunks=8)
-        assert result.dtype == np.float32
-        computed = result.compute()
-        np.testing.assert_array_equal(
-            computed.data.get(), arr.astype(np.float32))
-
-
-class TestGDSPathGatedOffForFloat16:
-    """``_gds_chunk_path_available`` returns False for (bps=16, sf=3).
-
-    Direct structural test of the gating logic added in #1941 to keep the
-    KvikIO GDS chunked path from mis-decoding half-precision tiles. A
-    regression dropping the float16 guard would silently corrupt every
-    chunked GPU read of a float16 source.
-    """
-
-    def test_gds_path_gated_off_for_float16(self, float16_tiled_tif):
-        pytest.importorskip("kvikio", exc_type=ImportError)
-
-        from xrspatial.geotiff._backends.gpu import _gds_chunk_path_available
-        from xrspatial.geotiff._header import parse_all_ifds, parse_header
-
-        path, _ = float16_tiled_tif
-        with open(str(path), "rb") as f:
-            data = f.read()
-        header = parse_header(data)
-        ifds = parse_all_ifds(data, header)
-        ifd = ifds[0]
-
-        # Sanity-check fixture: tiled, bps=16, sample_format=3 (float)
-        from xrspatial.geotiff._dtypes import SAMPLE_FORMAT_FLOAT
-        assert ifd.is_tiled, "fixture sanity: tiled layout expected"
-        # Mirror the production unpacking pattern at gpu.py:791
-        # (bps_first[0] if bps_first else 0) so an empty BitsPerSample
-        # tuple would not raise IndexError here.
-        bps_first = ifd.bits_per_sample
-        if isinstance(bps_first, tuple):
-            bps = bps_first[0] if bps_first else 0
-        else:
-            bps = bps_first
-        assert bps == 16, "fixture sanity: bps=16 expected"
-        assert ifd.sample_format == SAMPLE_FORMAT_FLOAT
-
-        result = _gds_chunk_path_available(
-            str(path), ifd, has_sparse_tile=False, orientation=1)
-        assert result is False, (
-            "_gds_chunk_path_available must return False for "
-            "(bps=16, sf=float) so the GDS chunked path does not "
-            "mis-decode half-precision tiles."
-        )
-
-    def test_gds_path_allowed_for_float32_tiled(self, tmp_path):
-        """Sanity: GDS path remains allowed for a float32 tiled file.
-
-        Pins that the float16 guard at gpu.py:791 fires only on
-        (bps=16, sf=float), not on every tiled float file. A regression
-        widening the guard to all floats would silently disable the
-        GDS path on every float32 tiled COG.
-        """
-        tifffile = pytest.importorskip("tifffile")
-        pytest.importorskip("kvikio", exc_type=ImportError)
-
-        arr = np.arange(256, dtype=np.float32).reshape(16, 16)
-        path = tmp_path / "f32_tiled.tif"
-        tifffile.imwrite(
-            str(path), arr, compression="deflate", tile=(16, 16))
-
-        from xrspatial.geotiff._backends.gpu import _gds_chunk_path_available
-        from xrspatial.geotiff._header import parse_all_ifds, parse_header
-
-        with open(str(path), "rb") as f:
-            data = f.read()
-        header = parse_header(data)
-        ifds = parse_all_ifds(data, header)
-
-        result = _gds_chunk_path_available(
-            str(path), ifds[0], has_sparse_tile=False, orientation=1)
-        assert result is True, (
-            "_gds_chunk_path_available must remain True for "
-            "(bps=32, sf=float) tiled files so the kvikio GDS chunk "
-            "path still applies."
-        )
-
-
-class TestBackendParityFloat16:
-    """All four backends agree pixel-exact on float16 input."""
-
-    def test_eager_numpy_equals_gpu(self, float16_tiled_tif):
-        from xrspatial.geotiff import open_geotiff
-
-        path, _ = float16_tiled_tif
-        cpu = open_geotiff(str(path))
-        gpu = open_geotiff(str(path), gpu=True)
-
-        assert cpu.dtype == gpu.dtype == np.float32
-        np.testing.assert_array_equal(np.asarray(cpu), gpu.data.get())
-
-    def test_eager_numpy_equals_dask_gpu(self, float16_tiled_tif):
-        from xrspatial.geotiff import open_geotiff
-
-        path, _ = float16_tiled_tif
-        cpu = open_geotiff(str(path))
-        dask_gpu = open_geotiff(str(path), chunks=8, gpu=True).compute()
-
-        assert cpu.dtype == dask_gpu.dtype == np.float32
-        np.testing.assert_array_equal(
-            np.asarray(cpu), dask_gpu.data.get())
-
-    def test_dask_numpy_equals_dask_gpu(self, float16_tiled_tif):
-        from xrspatial.geotiff import open_geotiff, read_geotiff_dask
-
-        path, _ = float16_tiled_tif
-        dask_cpu = read_geotiff_dask(str(path), chunks=8).compute()
-        dask_gpu = open_geotiff(str(path), chunks=8, gpu=True).compute()
-
-        np.testing.assert_array_equal(
-            np.asarray(dask_cpu), dask_gpu.data.get())
-
-
-class TestPredictor3Float16GPU:
-    """Predictor=3 + float16 on disk also decodes correctly on GPU."""
-
-    def test_predictor3_float16_gpu_round_trip(self, tmp_path):
-        tifffile = pytest.importorskip("tifffile")
-        pytest.importorskip("imagecodecs")  # required for predictor=3
-
-        from xrspatial.geotiff import read_geotiff_gpu
-
-        arr = np.linspace(-1.0, 1.0, 16).astype(np.float16).reshape(4, 4)
-        path = tmp_path / "pred3_f16.tif"
-        tifffile.imwrite(
-            str(path), arr, predictor=3, compression="deflate")
-
-        result = read_geotiff_gpu(str(path))
-        assert result.dtype == np.float32
-        np.testing.assert_array_equal(
-            result.data.get(), arr.astype(np.float32))
diff --git a/xrspatial/geotiff/tests/test_gpu_tile_byte_cap_2026_05_18.py b/xrspatial/geotiff/tests/test_gpu_tile_byte_cap_2026_05_18.py
deleted file mode 100644
index 127aef86a..000000000
--- a/xrspatial/geotiff/tests/test_gpu_tile_byte_cap_2026_05_18.py
+++ /dev/null
@@ -1,156 +0,0 @@
-"""GPU read path per-tile byte cap (security sweep follow-up).
-
-The CPU readers ``_read_tiles`` (xrspatial/geotiff/_reader.py:2084) and
-``_fetch_decode_cog_http_tiles`` (xrspatial/geotiff/_reader.py:2563)
-reject a tile whose declared ``TileByteCount`` exceeds the env-driven
-``_max_tile_bytes_from_env()`` cap (default 256 MiB). The eager GPU
-read path in ``xrspatial.geotiff._backends.gpu.read_geotiff_gpu`` did
-not run the same check; ``validate_tile_layout`` bounds the offsets
-array length but not the byte-count entries. A crafted local TIFF with
-a multi-hundred-MB ``TileByteCount`` could then pass through to GPU
-decode, where ``_check_gpu_memory`` only checks the aggregate against
-~90% of free VRAM and applies no per-tile limit, leaving the GPU path
-more permissive than the CPU paths.
-
-The GPU eager path now applies the same per-tile cap so the CPU and
-GPU contracts agree. These tests cover the rejection, the wording of
-the rejection message, the env-override escape hatch, and the
-legitimate-read pass-through under the default cap.
-
-Mirrors the structure of ``test_local_tile_byte_cap_1664.py`` for the
-CPU paths so a side-by-side comparison is easy.
-"""
-from __future__ import annotations
-
-import importlib.util
-
-import numpy as np
-import pytest
-import xarray as xr
-
-from xrspatial.geotiff import read_geotiff_gpu, to_geotiff
-
-from ._helpers.tiff_surgery import patch_byte_counts as _patch_byte_counts
-
-
-def _cupy_available() -> bool:
-    if importlib.util.find_spec("cupy") is None:
-        return False
-    try:
-        import cupy
-
-        return bool(cupy.cuda.is_available())
-    except Exception:
-        return False
-
-
-_HAS_GPU = _cupy_available()
-_gpu_only = pytest.mark.skipif(
-    not _HAS_GPU, reason="cupy + CUDA required for the GPU read path",
-)
-
-
-def _build_forged_tiled_cog(tmp_path, byte_count_value: int) -> str:
-    """Write a real tiled COG, patch every TileByteCounts entry, return path."""
-    arr = np.arange(64 * 64, dtype=np.float32).reshape(64, 64)
-    da = xr.DataArray(arr, dims=["y", "x"])
-    path = str(tmp_path / "forged_gpu_tiles_2026_05_18.tif")
-    to_geotiff(da, path, tile_size=32, compression="deflate")
-    with open(path, "rb") as f:
-        data = bytearray(f.read())
-    _patch_byte_counts(data, 325, byte_count_value)
-    with open(path, "wb") as f:
-        f.write(data)
-    return path
-
-
-# ---------------------------------------------------------------------------
-# GPU eager path: per-tile byte cap
-# ---------------------------------------------------------------------------
-
-
-class TestGpuTileByteCap:
-    @_gpu_only
-    def test_huge_tile_byte_count_rejected(self, tmp_path, monkeypatch):
-        """A local tile with a huge TileByteCount raises before GPU decode."""
-        path = _build_forged_tiled_cog(tmp_path, 100 * 1024 * 1024)
-        monkeypatch.setenv("XRSPATIAL_COG_MAX_TILE_BYTES", str(1024 * 1024))
-
-        with pytest.raises(ValueError, match="TileByteCount"):
-            read_geotiff_gpu(path)
-
-    @_gpu_only
-    def test_error_message_names_value_and_cap(self, tmp_path, monkeypatch):
-        path = _build_forged_tiled_cog(tmp_path, 50 * 1024 * 1024)
-        monkeypatch.setenv("XRSPATIAL_COG_MAX_TILE_BYTES", str(1024))
-
-        with pytest.raises(ValueError) as excinfo:
-            read_geotiff_gpu(path)
-        msg = str(excinfo.value)
-        # The forged value (52,428,800) and the cap (1,024) both appear.
-        assert "52,428,800" in msg or "52428800" in msg
-        assert "1,024" in msg or "1024" in msg
-        assert "denial-of-service" in msg.lower() or "malformed" in msg
-
-    @_gpu_only
-    def test_normal_gpu_read_under_default_cap(self, tmp_path):
-        """Legitimate GPU reads with the default cap still succeed."""
-        arr = np.arange(64 * 64, dtype=np.float32).reshape(64, 64)
-        da = xr.DataArray(arr, dims=["y", "x"])
-        path = str(tmp_path / "normal_gpu_2026_05_18.tif")
-        to_geotiff(da, path, tile_size=32, compression="deflate")
-
-        result = read_geotiff_gpu(path)
-        # CuPy -> numpy for comparison.
-        np.testing.assert_array_equal(result.data.get(), arr)
-
-    @_gpu_only
-    def test_env_override_lifts_cap(self, tmp_path, monkeypatch):
-        """A user with legitimate large tiles can lift the cap via env.
-
-        The truncated forged payload makes the downstream codec raise;
-        the check below verifies only that whatever error fires is
-        *not* the cap rejection. Catch the broad ``Exception`` so the
-        test stays focused on the cap-loop contract rather than
-        chasing every decoder failure mode, but still inspect the
-        message string to make sure a regression that re-fires the cap
-        through a different error path would be visible.
-        """
-        path = _build_forged_tiled_cog(tmp_path, 50 * 1024 * 1024)
-        monkeypatch.setenv(
-            "XRSPATIAL_COG_MAX_TILE_BYTES", str(64 * 1024 * 1024))
-
-        try:
-            read_geotiff_gpu(path)
-        except Exception as exc:
-            assert "exceeds the per-tile safety cap" not in str(exc), (
-                "cap loop fired despite the env override lifting the cap"
-            )
-
-
-# ---------------------------------------------------------------------------
-# Dask + GPU chunked path: same per-tile cap (added in the review pass)
-# ---------------------------------------------------------------------------
-
-
-class TestGpuChunkedTileByteCap:
-    @_gpu_only
-    def test_chunked_huge_tile_byte_count_rejected(
-            self, tmp_path, monkeypatch):
-        """Sibling check on the dask + GPU chunked path.
-
-        ``_read_geotiff_gpu_chunked_gds`` parses the IFDs and then fans
-        out per-chunk GDS reads. Without the cap, the chunked path
-        would build a graph that still pulls the forged tile per task;
-        the metadata-time check rejects the file before any graph is
-        built.
-        """
-        path = _build_forged_tiled_cog(tmp_path, 100 * 1024 * 1024)
-        monkeypatch.setenv(
-            "XRSPATIAL_COG_MAX_TILE_BYTES", str(1024 * 1024))
-
-        with pytest.raises(ValueError, match="TileByteCount"):
-            # ``chunks`` enables the dask + GPU pipeline; the read path
-            # internally routes through ``_read_geotiff_gpu_chunked_gds``
-            # when the file qualifies for the GDS chunked fast path.
-            read_geotiff_gpu(path, chunks=32)