[Bug] Some windows that do overlap an ingested raster in world coordinates are silently dropped during rslearn raster materialization because the pixel-bounds intersection logic can treat them as non-overlapping. #560

@robmarkcole

Some windows that do overlap an ingested raster in world coordinates are silently dropped during rslearn raster materialization because the pixel-bounds intersection logic can treat them as non-overlapping.

What happened

We were materializing an ERA5LandMonthlyMeans raster layer into field windows.

The source ERA5 tile was valid and overlapped the target window in world coordinates. A direct reprojection from the tile-store GeoTIFF into the target window grid produced valid temperature values.

However, rslearn materialized the window as all zeros and still marked the layer completed.

Relevant code

  • rslearn/dataset/materialize.py, around line 88
  • rslearn/tile_stores/default.py, around line 189

The current flow is:

  1. get_raster_bounds() warps the source raster to the target CRS and returns normalized pixel bounds as (minx, miny, maxx, maxy).
  2. read_raster_window_from_tiles() intersects those source bounds with the window bounds.
  3. If the computed intersection has non-positive width or height, the item is skipped.

Suspected bug

get_raster_bounds() normalizes the warped source bounds to (minx, miny, maxx, maxy), but the window bounds are effectively being carried as (left, top, right, bottom) in pixel space.

That mismatch can make a real overlap look empty.
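To illustrate the kind of mismatch suspected here, the sketch below intersects a normalized box with one whose corners are stored as (left, top, right, bottom) with top greater than bottom. The values and the helper names `intersect` and `normalize` are hypothetical and are not taken from rslearn; this only shows the general mechanism, not the actual code path.

```python
from typing import Tuple

Bounds = Tuple[int, int, int, int]


def intersect(a: Bounds, b: Bounds) -> Bounds:
    """Intersect two boxes, assuming both are (minx, miny, maxx, maxy)."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))


def normalize(b: Bounds) -> Bounds:
    """Reorder corners so that min <= max on each axis."""
    return (min(b[0], b[2]), min(b[1], b[3]), max(b[0], b[2]), max(b[1], b[3]))


# Hypothetical values chosen for illustration only.
source = (0, 0, 10, 10)  # already (minx, miny, maxx, maxy)
window = (2, 8, 6, 3)    # (left, top, right, bottom) with top > bottom

raw = intersect(window, source)                # corners used as-is
fixed = intersect(normalize(window), source)   # corners reordered first

raw_height = raw[3] - raw[1]        # negative: overlap looks empty
fixed_height = fixed[3] - fixed[1]  # positive: overlap is found
print(raw, raw_height)
print(fixed, fixed_height)
```

In this toy case the window lies entirely inside the source, yet the unnormalized intersection has a negative height.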

Concrete example

For one failing window:

  • Window bounds: (39098, -479739, 39154, -479639)
  • Source ERA5 bounds in the same projection: (37833, -480633, 40370, -479787)

The current intersection logic computes:

intersection = (
    max(bounds[0], src_bounds[0]),
    max(bounds[1], src_bounds[1]),
    min(bounds[2], src_bounds[2]),
    min(bounds[3], src_bounds[3]),
)
# -> (39098, -479739, 39154, -479787)

This yields a negative height (-479787 - (-479739) = -48), so rslearn treats the intersection as empty and skips the raster read.
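The numbers above can be checked with a standalone snippet. It uses only the two bounds tuples from this report; the variable names `window_bounds`, `width`, `height`, and `skipped` are mine, and the skip condition is my reading of step 3 in the flow described earlier, not a copy of the rslearn source.

```python
# Bounds copied from this report.
window_bounds = (39098, -479739, 39154, -479639)
src_bounds = (37833, -480633, 40370, -479787)

# Same element-wise intersection as in the report.
intersection = (
    max(window_bounds[0], src_bounds[0]),
    max(window_bounds[1], src_bounds[1]),
    min(window_bounds[2], src_bounds[2]),
    min(window_bounds[3], src_bounds[3]),
)

width = intersection[2] - intersection[0]
height = intersection[3] - intersection[1]

# Assumed skip rule: non-positive width or height means "no overlap".
skipped = width <= 0 or height <= 0
print(intersection, width, height, skipped)
```

Note that both tuples here already have their second value below their fourth, so simply reordering corners would leave this result unchanged; whether the two tuples use the same y-axis direction would need to be checked separately.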

But a direct reprojection from the tile-store GeoTIFF into the same target window grid produced valid data:

valid pixels: 5600
temperature range: about 282.18 K

So the source tile and window do overlap in world coordinates, but the pixel-bounds intersection drops them.

This causes silent data loss:

  • materialized rasters can be all zeros even though valid source data exist
  • the layer may still be marked completed
  • downstream analysis can treat the zero raster as real data

Suggested fix

Normalize window bounds and source bounds to the same pixel-bounds convention before intersecting them.

At minimum, it would help to:

  • make the bounds convention explicit and consistent across materialization
  • raise an error or strong warning when a source item overlaps in world coordinates but is skipped due to pixel-bounds intersection
  • avoid marking the layer completed if every source item was skipped for this reason
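As a rough sketch of the last two points, the guard below logs a warning for each skipped item and reports whether the layer should be marked completed. Everything here is hypothetical: `intersect_or_none` and `should_mark_completed` are illustrative names, not rslearn functions, and the sketch assumes both inputs already use the (minx, miny, maxx, maxy) convention. It also does not check world-coordinate overlap, which would need the projection information.

```python
import logging
from typing import Iterable, Optional, Tuple

Bounds = Tuple[int, int, int, int]
logger = logging.getLogger(__name__)


def intersect_or_none(a: Bounds, b: Bounds) -> Optional[Bounds]:
    """Return the intersection of two boxes, or None if it is empty."""
    box = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    if box[2] <= box[0] or box[3] <= box[1]:
        return None
    return box


def should_mark_completed(window_bounds: Bounds, item_bounds: Iterable[Bounds]) -> bool:
    """Return False when there were items but every one of them was skipped."""
    items = list(item_bounds)
    readable = 0
    for bounds in items:
        if intersect_or_none(window_bounds, bounds) is None:
            # Make the skip visible instead of silently dropping the item.
            logger.warning(
                "Skipping item with bounds %s: empty pixel intersection with window %s",
                bounds,
                window_bounds,
            )
        else:
            readable += 1
    if items and readable == 0:
        logger.error("All %d items were skipped for window %s", len(items), window_bounds)
        return False
    return True
```

With the bounds from the failing window above, `should_mark_completed` returns False, so the caller could leave the layer incomplete rather than writing an all-zero raster.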

Additional context

This was easier to trigger because the ERA5 request bbox was tight enough that CDS returned a very small raster (1 x 3 cells), so edge cases near the tile boundary became more obvious. But the underlying issue appears to be in pixel-bounds handling during materialization, not in the data source itself.
