Fix review feedback: CRS guard, logging, upgrade loop #20
- Guard against src.crs being None (AttributeError → epsg=None)
- Log failed metadata extractions instead of silently swallowing them
- Fix infinite re-extraction for files with no EPSG: detect old-format rows by checking the transform column (not epsg) to distinguish genuinely missing metadata from files that were extracted but lack a CRS
- Accurate docstring: two remote reads per URL during extraction

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
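The first two items can be illustrated with a minimal sketch. This is not the code in stac_utils.py; the function name `extract_metadata`, the returned dict shape, and the broad `except Exception` are assumptions made for illustration, and `src` is assumed to be a dataset-like object exposing `crs` (possibly None) and `transform`.

```python
import json
import logging
from types import SimpleNamespace

logger = logging.getLogger(__name__)


def extract_metadata(src, url):
    """Hypothetical extractor: returns epsg and transform for one source."""
    try:
        # Guard: a file without a CRS definition has src.crs = None, so
        # calling src.crs.to_epsg() unguarded would raise AttributeError.
        epsg = src.crs.to_epsg() if src.crs is not None else None
        # On success, transform is always a non-null JSON string,
        # even when epsg is None.
        transform = json.dumps(list(src.transform))
        return {"epsg": epsg, "transform": transform}
    except Exception as exc:
        # Log the failure instead of silently swallowing it.
        logger.warning("Metadata extraction failed for %s: %s", url, exc)
        return {"epsg": None, "transform": None}


# Stand-in objects for illustration only (not real datasets).
affine = (1.0, 0.0, 0.0, 0.0, -1.0, 0.0)
with_crs = SimpleNamespace(crs=SimpleNamespace(to_epsg=lambda: 4326), transform=affine)
without_crs = SimpleNamespace(crs=None, transform=affine)
broken = SimpleNamespace(crs=None)  # no transform attribute, so extraction fails

meta_with_crs = extract_metadata(with_crs, "a.tif")
meta_without_crs = extract_metadata(without_crs, "b.tif")
meta_broken = extract_metadata(broken, "c.tif")
```

In this sketch a file with no CRS still yields a populated transform, while a failed extraction yields transform=None and a log record; the upgrade-loop fix depends on that difference.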
Review

Changes are correct and fix the three issues cleanly.

Upgrade loop fix (item_create.py:156-161): The transform column is the right sentinel. On a successful extraction, transform is always a non-null JSON string even when epsg=None. On a failed extraction, transform=None (read back as NaN from the CSV). So pd.isna(row.get('transform')) correctly distinguishes 'never extracted' from 'extracted but no CRS'. The previous epsg-based check conflated those two states.

One edge case to be aware of: Line 158 checks row.get('is_geotiff'). Pandas reads CSV boolean columns as bool by default in recent versions, but if is_geotiff loads as the string 'True' due to mixed-type inference, the truthiness check still passes for those rows. Note that a string 'False' would be truthy as well, so the check is only reliable while the column parses as bool. Not a bug in this change, just worth knowing.

CRS guard (stac_utils.py:113): Correct. src.crs can be None for GeoTIFFs missing a CRS definition; the guard prevents AttributeError.

Logging (stac_utils.py:131-132): Better than a silent swallow. No issues.

No bugs introduced. LGTM.
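The sentinel argument from the review can be sketched as follows. The helper names `is_missing` and `needs_extraction` are hypothetical, `is_missing` is a standard-library stand-in for pd.isna on scalars, and combining the is_geotiff check with the transform check is an assumption about how item_create.py uses them.

```python
import math
from typing import Any, Mapping


def is_missing(value: Any) -> bool:
    """Stand-in for pd.isna on scalars: True for None or float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def needs_extraction(row: Mapping[str, Any]) -> bool:
    """Assumed rule: re-extract a GeoTIFF row only if transform is missing."""
    return bool(row.get("is_geotiff")) and is_missing(row.get("transform"))


def needs_extraction_old(row: Mapping[str, Any]) -> bool:
    """The previous epsg-based check, shown for comparison."""
    return bool(row.get("is_geotiff")) and is_missing(row.get("epsg"))


nan = float("nan")
# Row whose extraction failed or never ran: both columns are empty.
never_extracted = {"is_geotiff": True, "epsg": nan, "transform": nan}
# Row extracted successfully from a file that has no CRS.
extracted_no_crs = {
    "is_geotiff": True,
    "epsg": nan,
    "transform": "[1.0, 0.0, 0.0, 0.0, -1.0, 0.0]",
}

new_results = (needs_extraction(never_extracted), needs_extraction(extracted_no_crs))
old_results = (needs_extraction_old(never_extracted), needs_extraction_old(extracted_no_crs))
```

Under these assumptions the epsg-based check flags both rows, which is the repeated re-extraction described in the commit message, while the transform-based check flags only the row that was never extracted.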
Summary
Fixes from Claude Code review of #19:
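- Guard src.crs.to_epsg() against None CRS (returns epsg=None instead of AttributeError)
- Log failed metadata extractions instead of silently swallowing them
- Detect old-format rows via the transform column (not epsg) so files with valid metadata but no EPSG are not re-extracted every run

Test plan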
🤖 Generated with Claude Code