fix: handle MPC comet data with unparseable orbital elements #401

Open
mrosseel wants to merge 1 commit into brickbots:main from mrosseel:fix/comet-dtype-crash

@mrosseel
Collaborator

Summary

  • MPC's comet data file (CometEls.txt) periodically includes rows that Skyfield's load_comets_dataframe() can't parse correctly (e.g. MPEC 2026-F34 added 27 historical comets with malformed fixed-width fields). When even a single row has an unparseable perihelion_year, pandas falls back to dtype=object for the entire column, so every year value becomes a string like '2028' instead of an integer. Skyfield's comet_orbit() then crashes on every comet, not just the malformed ones, with numpy UFuncNoLoopError: ufunc 'add' did not contain a loop with signature matching types (dtype('<U4'), dtype('float64')).
  • Apply pd.to_numeric(errors="coerce") to all orbital element columns after loading, so they have numeric dtypes regardless of bad rows
  • Drop rows where essential fields (perihelion year/month/day, eccentricity, distance) couldn't be parsed
  • Add a try/except safety net around individual comet orbit calculations so a single bad entry can't crash the entire catalog
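
The three changes above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper names clean_comets and safe_orbits are hypothetical, the column names are assumed to match what load_comets_dataframe() returns, and the compute callback stands in for the real per-comet orbit calculation.

```python
import logging

import pandas as pd

# Assumed names of the essential orbital element columns.
ESSENTIAL = [
    "perihelion_year",
    "perihelion_month",
    "perihelion_day",
    "perihelion_distance_au",
    "eccentricity",
]


def clean_comets(df: pd.DataFrame) -> pd.DataFrame:
    """Coerce essential columns to numeric and drop rows that fail to parse."""
    df = df.copy()
    for col in ESSENTIAL:
        # Unparseable values become NaN instead of leaving strings behind.
        df[col] = pd.to_numeric(df[col], errors="coerce")
    return df.dropna(subset=ESSENTIAL).reset_index(drop=True)


def safe_orbits(df: pd.DataFrame, compute) -> dict:
    """Run compute(row) for each comet, skipping any row that raises."""
    results = {}
    for _, row in df.iterrows():
        try:
            results[row["designation"]] = compute(row)
        except Exception as exc:  # safety net: one bad comet must not abort the loop
            logging.warning("Skipping comet %s: %s", row["designation"], exc)
    return results


if __name__ == "__main__":
    # Synthetic frame: one malformed year leaves the whole column as strings.
    raw = pd.DataFrame(
        {
            "designation": ["A", "B", "C"],
            "perihelion_year": ["2028", "20x8", "2025"],
            "perihelion_month": [3, 7, 1],
            "perihelion_day": [12.5, 1.0, 30.25],
            "perihelion_distance_au": [1.2, 0.9, 2.4],
            "eccentricity": [0.99, 1.0, 0.5],
        }
    )
    cleaned = clean_comets(raw)
    print(cleaned.dtypes)
    print(safe_orbits(cleaned, lambda row: row["perihelion_year"] + 0.5))
```

Coercing before dropping keeps the good rows usable even when the bad rows cannot be identified in advance. The try/except only guards against failures the dtype fix does not cover.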

Test plan

  • Verified the crash reproduces with the March 17, 2026 MPC data file (297 KB, 1728 rows, 27 unparseable)
  • Verified the fix correctly parses the remaining 1701 comets after dropping the bad rows
  • Verified smaller/older MPC files (no bad rows) still work unchanged
  • CI lint/format passes

🤖 Generated with Claude Code

Skyfield's load_comets_dataframe() can return string dtypes for numeric
columns when the MPC data file contains rows it can't parse (e.g. the
MPEC 2026-F34 batch of 27 historical comets with malformed fields).
Even a few bad rows cause pandas to fall back to dtype=object for the
entire perihelion_year column, which then crashes comet_orbit() for
*every* comet with a numpy UFuncNoLoopError.

Fix: force numeric coercion with pd.to_numeric() after loading, drop
rows with unparseable essential fields, and add a try/except safety
net around individual comet processing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>