fix(connectors): replace overloaded InvalidRecord with distinct error variants #3194

Open
atharvalade wants to merge 7 commits into apache:master from atharvalade:fix/distinct-iceberg-sink-error-variants

@atharvalade atharvalade commented Apr 29, 2026

Which issue does this PR close?

Closes #3176

Rationale

Error::InvalidRecord was used for five unrelated failure modes in the Iceberg sink's write_data function, making it impossible for callers to distinguish schema mismatches from I/O failures from catalog outages.

What changed?

The Iceberg sink mapped Arrow schema conversion errors, Parquet write failures, and Iceberg catalog transaction failures all to Error::InvalidRecord. Callers could not programmatically decide whether to fix a table definition, skip a corrupt message, or retry a catalog outage.

Three new SDK error variants — SchemaMismatch(String), WriteFailure(String), CatalogError(String) — replace the overloaded InvalidRecord at the appropriate call sites. InvalidRecord is preserved only for the genuine record-batch deserialization error.
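
As a rough illustration of what the distinct variants enable on the caller side, here is a minimal std-only sketch. It is not the SDK's actual definition: the real `Error` enum has more variants and appears to use `#[error("...")]` attributes, so `Display` is hand-written here, and every message string except "Write failure: {0}" is assumed. The `Action` enum and the `decide` function are hypothetical names, and the action chosen for `WriteFailure` is a guess, since the description does not say how callers should treat it.

```rust
use std::fmt;

/// Simplified stand-in for the SDK error type. Only the variants named in
/// this description are modelled; `InvalidRecord` is assumed to carry no payload.
#[derive(Debug, Clone, PartialEq)]
pub enum Error {
    InvalidRecord,
    SchemaMismatch(String),
    WriteFailure(String),
    CatalogError(String),
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Message texts are assumptions, except "Write failure: ...".
            Error::InvalidRecord => write!(f, "Invalid record"),
            Error::SchemaMismatch(msg) => write!(f, "Schema mismatch: {msg}"),
            Error::WriteFailure(msg) => write!(f, "Write failure: {msg}"),
            Error::CatalogError(msg) => write!(f, "Catalog error: {msg}"),
        }
    }
}

impl std::error::Error for Error {}

/// Hypothetical caller-side policy, named only for this example.
#[derive(Debug, PartialEq)]
pub enum Action {
    SkipMessage,
    FixTableDefinition,
    SurfaceToOperator,
    RetryLater,
}

/// Maps each failure mode to a different reaction. With a single overloaded
/// `InvalidRecord`, every arm below would collapse into one.
pub fn decide(err: &Error) -> Action {
    match err {
        Error::InvalidRecord => Action::SkipMessage,
        Error::SchemaMismatch(_) => Action::FixTableDefinition,
        // Assumed policy: the description does not prescribe one for I/O failures.
        Error::WriteFailure(_) => Action::SurfaceToOperator,
        Error::CatalogError(_) => Action::RetryLater,
    }
}
```

Because the `match` is exhaustive, adding a further variant later forces every such caller to decide how to handle it at compile time. Whether a catalog failure is actually worth retrying depends on where in the transaction it arose.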

Local Execution

  • Passed
  • Pre-commit hooks ran

AI Usage

  • Opus 4.6
  • Writing comments, writing PR Description
  • Yes

codecov Bot commented Apr 29, 2026

Codecov Report

❌ Patch coverage is 0% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 56.53%. Comparing base (1ae123f) to head (e963fe4).

Files with missing lines                                Patch %   Lines
...re/connectors/sinks/iceberg_sink/src/router/mod.rs   0.00%     5 Missing ⚠️
Additional details and impacted files
@@              Coverage Diff              @@
##             master    #3194       +/-   ##
=============================================
- Coverage     73.78%   56.53%   -17.25%     
  Complexity      943      943               
=============================================
  Files          1200     1199        -1     
  Lines        109094    96879    -12215     
  Branches      85994    73779    -12215     
=============================================
- Hits          80492    54770    -25722     
- Misses        25866    39520    +13654     
+ Partials       2736     2589      -147     
Components    Coverage Δ
Rust Core     52.16% <0.00%> (-22.75%) ⬇️
Java SDK      58.44% <ø> (ø)
C# SDK        69.47% <ø> (ø)
Python SDK    81.43% <ø> (ø)
Node SDK      91.44% <ø> (ø)
Go SDK        39.80% <ø> (ø)

Files with missing lines                                Coverage Δ
core/connectors/sdk/src/lib.rs                          56.17% <ø> (ø)
...re/connectors/sinks/iceberg_sink/src/router/mod.rs   39.23% <0.00%> (ø)

... and 272 files with indirect coverage changes

Comment thread core/connectors/sdk/src/lib.rs Outdated
#[error("Write failure: {0}")]
WriteFailure(String),
/// A catalog or transaction-level failure (e.g. applying or committing an
/// Iceberg transaction). Callers may retry on transient catalog outages.
Contributor

doc "callers may retry on transient catalog outages" is misleading. action.apply() at router/mod.rs:213 is in-memory transaction prep - deterministic failures (invalid partition spec, schema validation) cannot be retried. only tx.commit(catalog) at router/mod.rs:222 hits the network. suggest dropping the retry claim, or splitting into ApplyError (deterministic) vs CommitError (transient-eligible).

Contributor Author

split CatalogError into TransactionApplyError and CatalogCommitError
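
A minimal sketch of the retry policy this split makes possible, assuming the two variants each wrap a `String` like the other new variants. The `run_with_retry` helper is hypothetical and not part of the SDK or the sink; it only shows that an apply failure can short-circuit while a commit failure stays eligible for another attempt.

```rust
/// Simplified stand-in holding only the two variants from the split.
#[derive(Debug, Clone, PartialEq)]
pub enum Error {
    /// In-memory transaction preparation failed; assumed deterministic.
    TransactionApplyError(String),
    /// The catalog commit failed; may be transient (for example, a network outage).
    CatalogCommitError(String),
}

/// Hypothetical helper: calls `attempt` with the 1-based attempt number, up to
/// `max_attempts` times. Returns the attempt number that succeeded, stops
/// immediately on an apply error, and returns the last commit error once the
/// attempts are exhausted.
pub fn run_with_retry<F>(max_attempts: u32, mut attempt: F) -> Result<u32, Error>
where
    F: FnMut(u32) -> Result<(), Error>,
{
    let mut last_err = None;
    for n in 1..=max_attempts {
        match attempt(n) {
            Ok(()) => return Ok(n),
            // Retrying cannot change a deterministic failure, so give up now.
            Err(e @ Error::TransactionApplyError(_)) => return Err(e),
            // Possibly transient: remember the error and try again.
            Err(e @ Error::CatalogCommitError(_)) => last_err = Some(e),
        }
    }
    // Only reachable with `max_attempts == 0` when `last_err` is `None`.
    Err(last_err.unwrap_or_else(|| Error::CatalogCommitError("no attempts made".to_string())))
}
```

The sketch deliberately omits backoff and jitter between attempts; a real caller would likely add a delay before retrying a commit.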

@github-actions

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs.

If you need a review, please ensure CI is green and the PR is rebased on the latest master. Don't hesitate to ping the maintainers - either @core on Discord or by mentioning them directly here on the PR.

Thank you for your contribution!

@github-actions github-actions Bot added stale Inactive issue or pull request and removed stale Inactive issue or pull request labels May 13, 2026

hubcio commented May 14, 2026

/author

@github-actions github-actions Bot added the S-waiting-on-author PR is waiting on author response label May 14, 2026
@atharvalade
Contributor Author

/ready

@github-actions github-actions Bot added S-waiting-on-review PR is waiting on a reviewer and removed S-waiting-on-author PR is waiting on author response labels May 17, 2026
Labels

S-waiting-on-review PR is waiting on a reviewer

Development

Successfully merging this pull request may close these issues.

Error::InvalidRecord is overloaded across unrelated failure modes

3 participants