Error::InvalidRecord is overloaded across unrelated failure modes #3176

@atharvalade

Description

In the Iceberg sink's write_data_with_options function (router/mod.rs), five different failure modes all map to the same Error::InvalidRecord:

  • Arrow schema mapping failure (line ~224) — config/schema problem
  • Record batch deserialization failure (line ~239) — data format problem
  • Parquet write failure (line ~243) — I/O problem
  • Transaction apply failure (line ~262) — catalog state problem
  • Transaction commit failure (line ~271) — catalog I/O problem

This makes it impossible to programmatically distinguish between a schema mismatch (fix your table definition), a corrupt message (skip and continue), and a catalog outage (retry later).

Logs help somewhat, but callers of this function only see Error::InvalidRecord regardless of root cause.
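To make the problem concrete, here is a simplified illustration of the pattern, not the actual code from router/mod.rs. The `Stage` enum, `run_stage`, and `write_data` are hypothetical stand-ins for the five steps listed above; only the `Error::InvalidRecord` name comes from the sink.

```rust
// Hypothetical stand-in for the sink's error type; only the variant name
// InvalidRecord is taken from this issue.
#[derive(Debug, PartialEq)]
enum Error {
    InvalidRecord,
}

// Hypothetical labels for the five steps described above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Stage {
    SchemaMapping,
    Deserialize,
    ParquetWrite,
    TxApply,
    TxCommit,
}

const STAGES: [Stage; 5] = [
    Stage::SchemaMapping,
    Stage::Deserialize,
    Stage::ParquetWrite,
    Stage::TxApply,
    Stage::TxCommit,
];

// Simulates one step; fails with a descriptive message if it is the
// stage selected to fail.
fn run_stage(stage: Stage, failing: Option<Stage>) -> Result<(), String> {
    if failing == Some(stage) {
        Err(format!("{stage:?} failed"))
    } else {
        Ok(())
    }
}

// Every step discards its own detail and maps to the same variant,
// so the caller cannot tell which step failed.
fn write_data(failing: Option<Stage>) -> Result<(), Error> {
    for stage in STAGES {
        run_stage(stage, failing).map_err(|_detail| Error::InvalidRecord)?;
    }
    Ok(())
}
```

In this sketch a schema mapping failure and a commit failure produce identical `Err(Error::InvalidRecord)` values, which is the ambiguity described above.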

Fix: Introduce distinct error variants: Error::SchemaMismatch, Error::WriteFailure, Error::CatalogError, etc. Or at minimum use Error::InitError(detail) for the cases that already accept a string, and reserve InvalidRecord for actual data-level issues.
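A minimal sketch of what the split could look like, under some assumptions: the `String` payloads, the `Recovery` enum, and `recovery_for` are hypothetical and only illustrate how a caller might branch once the variants are distinct. Treating a write failure as retryable is also an assumption, not something the current code establishes.

```rust
use std::fmt;

// Variant names follow the proposal above; the String payloads are an
// assumption for carrying the underlying error detail.
#[derive(Debug)]
enum Error {
    SchemaMismatch(String), // Arrow schema mapping failed
    InvalidRecord,          // the record itself could not be deserialized
    WriteFailure(String),   // Parquet write failed
    CatalogError(String),   // transaction apply or commit failed
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::SchemaMismatch(d) => write!(f, "schema mismatch: {d}"),
            Error::InvalidRecord => write!(f, "invalid record"),
            Error::WriteFailure(d) => write!(f, "write failure: {d}"),
            Error::CatalogError(d) => write!(f, "catalog error: {d}"),
        }
    }
}

impl std::error::Error for Error {}

// Hypothetical caller-side policy, mirroring the three reactions
// described in this issue.
#[derive(Debug, PartialEq)]
enum Recovery {
    FixTableDefinition,
    SkipMessage,
    RetryLater,
}

fn recovery_for(err: &Error) -> Recovery {
    match err {
        Error::SchemaMismatch(_) => Recovery::FixTableDefinition,
        Error::InvalidRecord => Recovery::SkipMessage,
        // Assumption: I/O and catalog problems are transient.
        Error::WriteFailure(_) | Error::CatalogError(_) => Recovery::RetryLater,
    }
}
```

With variants like these, the match in `recovery_for` is exhaustive, so adding a new failure mode later forces every caller to decide how to handle it.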
