Skip to content

Reject cog=True with tiled=False at the writer boundary (#2312)#2318

Merged
brendancol merged 2 commits into
mainfrom
issue-2312
May 22, 2026
Merged

Reject cog=True with tiled=False at the writer boundary (#2312)#2318
brendancol merged 2 commits into
mainfrom
issue-2312

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Summary

Closes #2312.

  • Reject cog=True combined with tiled=False at the to_geotiff public boundary with a ValueError that names the COG-spec constraint and both caller-side fixes (tiled=True or cog=False). Same shape as the COG input gates pinned in PR Pin actionable failure modes for unsupported COG writer inputs (#2286 prod-ready wave B) #2301 / commit f5fbad5.
  • Add a defense-in-depth gate inside _writer._write so direct callers of the array-level entry point cannot bypass the public wrapper to produce the malformed file.
  • Comment the now-dead cog=True arm of the "tile_size is ignored when tiled=False" warning. Under the new gate, cog=True always implies tiled=True, so the warning fires only on the cog=False, tiled=False path now.

The writer used to silently produce a non-COG strip TIFF for cog=True, tiled=False, violating the stable COG contract promoted in #2300.

Backend coverage

Writer-side fix. Both the eager (to_geotiff) and array-level (_write) entry points gate the combination, which covers the numpy and dask CPU paths. The GPU writer (write_geotiff_gpu) already rejects tiled=False independently, so the four-backend matrix (numpy / cupy / dask+numpy / dask+cupy) is covered.

Test plan

  • Public to_geotiff(cog=True, tiled=False) raises with the expected message tokens.
  • The tile_size kwarg with cog=True, tiled=False still raises and does not emit the misleading "tile_size is ignored" warning.
  • Low-level _writer.write(cog=True, tiled=False) also raises.
  • Valid tiled-COG path (cog=True, tiled=True) still round-trips.
  • cog=True with default tiled still works.
  • cog=False, tiled=False strip path still works.
  • Full xrspatial/geotiff/tests/ suite: 5220 passed, 68 skipped locally.

The COG specification requires a tiled internal layout. The writer
used to accept cog=True with tiled=False, warn that tile_size was
ignored, then write strips through _write -- producing a malformed
file that claimed to be a COG. Reject the combination at to_geotiff's
public boundary and add a defense-in-depth gate in _writer._write so
direct callers cannot bypass the wrapper. Comment the now-dead
tile_size-ignored warning arm for the cog=True case.

Tests in test_cog_requires_tiled_2312.py pin the rejection (public
and low-level), the absence of the misleading tile_size warning on
the cog=True arm, the valid tiled-COG smoke path, and the
cog=False/tiled=False strip path as a negative control.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 22, 2026
Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: Reject cog=True with tiled=False at the writer boundary (#2312)

Blockers (must fix before merge)

None.

Suggestions (should fix, not blocking)

  • xrspatial/geotiff/_writers/eager.py lines 143-150 (tiled docstring entry): the new rejection is invisible to a reader scanning only that entry. Add a sentence saying tiled=False is incompatible with cog=True and raises ValueError. Symmetric with the tile_size entry that already documents the strip-mode warning.
  • xrspatial/geotiff/_writers/eager.py lines 159-163 (cog docstring entry): mirror the same note on the cog side. A caller reading "how do I write a COG" lands here first, so the tiled=True requirement should appear here too.

Nits (optional improvements)

  • xrspatial/geotiff/_writer.py line 481 and xrspatial/geotiff/_writers/eager.py line 569: the two ValueError message strings are byte-identical. If one gets reworded, the other will drift. Pulling the message into a module-level constant (e.g. _COG_REQUIRES_TILED_MSG) would keep them in sync. The tests already pin substrings on both sides, so the drift risk is low; the constant is just an explicit invariant.
  • xrspatial/geotiff/_writer.py line 473: the defense-in-depth gate is inline rather than inside _validate_lowlevel_write_kwargs (line 135), where the other push-down byte-affecting checks live. Moving it would broaden that helper's signature with tiled and cog, which may not be worth it. Flagging only for consideration; the inline gate is fine.

What looks good

  • The error message is actionable: it names the violated constraint (COG spec requires tiled layout), the two fixes the caller can apply (tiled=True or cog=False), and links to the issue. Same shape as the COG gates pinned in PR #2301 / commit f5fbad5.
  • The defense-in-depth gate in _writer._write is at the right depth. It fires before _write_stripped runs and before the overview-pyramid block, so a direct caller cannot produce the malformed strip-plus-overviews file by bypassing to_geotiff.
  • Test file is well-structured: separate rows for the public rejection, the low-level rejection, the "tile_size warning must not fire" pin, both happy-path smoke tests (explicit and default tiled), and a negative control for the strip-without-COG path.
  • The dead "tile_size is ignored" warning branch is documented in place rather than deleted. The branch still serves the cog=False, tiled=False, tile_size != 256 case, so the comment correctly scopes the change to the cog=True arm.
  • Temp file names carry the _2312 issue suffix, consistent with the project convention.

Checklist

  • Algorithm matches reference/paper: COG spec requires tiled layout.
  • All implemented backends produce consistent results: the eager and array-level entry points both gate the combination; dask streaming falls through to _write (which gates again); the GPU path already rejected tiled=False independently.
  • NaN handling: not applicable (validation-only change).
  • Edge cases covered by tests: explicit tiled=False, with tile_size, default tiled, strip-without-COG control.
  • Dask chunk boundaries handled correctly: not applicable (no dask compute paths changed).
  • No premature materialization or unnecessary copies: not applicable.
  • Benchmark exists or is not needed: not needed.
  • README feature matrix updated: not applicable (no new function or backend change).
  • Docstrings present: the test file docstring documents the new ValueError; the suggestion above is to also mention the constraint on the tiled / cog parameter entries in the public docstring.

Apply review suggestions from PR #2318:

- Document the cog=True / tiled=False incompatibility in the public
  to_geotiff docstring on both the tiled and cog parameter entries.
  A reader scanning either entry now sees the constraint without
  having to discover it by raising ValueError.
- Extract the two byte-identical ValueError message strings into a
  module-level constant _COG_REQUIRES_TILED_MSG in _writer.py and
  import it into _writers/eager.py. Eliminates the drift risk
  between the public-boundary raise and the defense-in-depth raise.

The substring assertions in test_cog_requires_tiled_2312.py still
pin the actionable tokens (tiled=True, cog=False, COG) on both raise
sites, so the contract is unchanged.

The other review nit (move the gate inside _validate_lowlevel_write_kwargs)
is dismissed: that helper would have to grow tiled and cog params,
which broadens its signature past what the rest of its checks need.
The inline gate is fine.
Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review (follow-up): Reject cog=True with tiled=False at the writer boundary (#2312)

Re-review after commit dfc617f.

Status of original findings

Suggestions (both fixed):

  • tiled docstring entry now says tiled=False is incompatible with cog=True and raises ValueError.
  • cog docstring entry now says the parameter requires tiled=True.

Nits:

  • Fixed: shared error message extracted to _COG_REQUIRES_TILED_MSG in _writer.py, imported into _writers/eager.py. Both raise sites use the constant, so a future reword cannot make them drift.
  • Dismissed: moving the defense-in-depth gate into _validate_lowlevel_write_kwargs would require adding tiled and cog parameters to that helper. Its existing signature only carries kwargs whose validation is independent of the layout decision (compression, max_z_error, crs_*, allow_unparseable_crs). Two more parameters for one check is wider than the consolidation is worth. The inline gate is fine and the comment at the call site explains why it lives there.

New checks

No new findings. The 6 tests in test_cog_requires_tiled_2312.py still pass after the follow-up commit; the wider test_cog_invalid_input_errors_2286.py suite (19 tests) is unaffected.

What looks good (continued)

The shared constant lives in _writer.py (the array-level module), which is the lower of the two modules that raise the error. Both raise sites import from a single source of truth. The docstring updates are parameter-scoped, so the rest of the long to_geotiff docstring did not shift, and the diff stays narrow.

@brendancol brendancol merged commit 577db26 into main May 22, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

to_geotiff(cog=True, tiled=False) writes a non-COG strip TIFF

1 participant