Skip to content

Check if merge is aborted during file integrity checksums#16281

Open
tlrx wants to merge 3 commits into
apache:mainfrom
tlrx:abort-check-in-checksum-entire-file
Open

Check if merge is aborted during file integrity checksums#16281
tlrx wants to merge 3 commits into
apache:mainfrom
tlrx:abort-check-in-checksum-entire-file

Conversation

@tlrx

@tlrx tlrx commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

This pull request follows the discussion on #16086.

It adds a new CodecUtil.checksumEntireFile(IndexInput, OneMerge) that periodically checks whether the merge has been aborted while reading through the file. This avoids spending time checksumming large files when the merge is already cancelled. The check runs every 1 MB. This idea was suggested by Robert (thanks!).

It also adds a checkIntegrity(OneMerge) to all codec reader/producer base classes, with a default implementation that delegates to the existing checkIntegrity(). I think I updated all current codec implementations to propagate the merge to CodecUtil.checksumEntireFile(IndexInput, OneMerge).

An alternative change would be to make checkIntegrity(OneMerge) abstract and checkIntegrity() final, delegating to checkIntegrity(null). That would prevent subclasses from silently ignoring the merge parameter but it would require updating all (backwards and third party) codec implementations. Or just replace checkIntegrity() by checkIntegrity(OneMerge). Happy to hear our thoughts on this.

Close #13354

This pull request follows the discussion on apache#16086.

It adds a new `CodecUtil.checksumEntireFile(IndexInput, OneMerge)` that periodically checks whether the merge has been aborted while reading through the file. This avoids spending time checksumming large files when the merge is already cancelled. The check runs every 1 MB. This idea was suggested by Robert (thanks!).

It also adds a `checkIntegrity(OneMerge)` to all codec reader/producer base classes, with a default implementation that delegates to the existing `checkIntegrity()`. I think I updated all current codec implementations to propagate the merge to CodecUtil.checksumEntireFile(IndexInput, OneMerge).

An alternative change would be to make `checkIntegrity(OneMerge)` abstract and `checkIntegrity()` final, delegating to `checkIntegrity(null)`. That would prevent subclasses from silently ignoring the merge parameter but it would require updating all (backwards and third party) codec implementations. Or just replace `checkIntegrity()` by `checkIntegrity(OneMerge)`. Happy to hear our thoughts on this.
@tlrx

tlrx commented Jun 22, 2026

Copy link
Copy Markdown
Contributor Author

@mikemccand @rmuir If you have the time I'd be happy to have your feedback on this, thanks!

@rmuir

rmuir commented Jun 22, 2026

Copy link
Copy Markdown
Member

An alternative change would be to make checkIntegrity(OneMerge) abstract and checkIntegrity() final, delegating to checkIntegrity(null). That would prevent subclasses from silently ignoring the merge parameter but it would require updating all (backwards and third party) codec implementations. Or just replace checkIntegrity() by checkIntegrity(OneMerge). Happy to hear our thoughts on this

Maybe we can avoid the delegator and just change the signature of the existing functions? Delegators can sometimes cause problems, maybe it is best to avoid them with this one

}

static long checksumEntireFile(
IndexInput input, MergePolicy.OneMerge merge, long abortCheckInterval) throws IOException {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great. maybe at some point this could be generalized into a built-in functional interface/predicate? I do see it as just "passing in an optional progress function".

I feel like the API would be cleaner, but when I looked more, it doesn't seem easy.

for now it seems passing OneMerge directly, like you have it, is probably the reasonable choice. The MergeAbortedException is not "normal" and gets handled in a special way by IndexWriter.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your feedback. Having tried various approach previously, I also think passing OneMerge is the simpler for now, but happy to revisit this if you change your mind.

@tlrx

tlrx commented Jun 24, 2026

Copy link
Copy Markdown
Contributor Author

Maybe we can avoid the delegator and just change the signature of the existing functions?

I pushed 73aa8b9 to change checkIntegrity() to checkIntegrity(OneMerge). It's a large change but the results looks better to me.

@tlrx tlrx requested a review from rmuir June 24, 2026 12:50
@rmuir

rmuir commented Jun 24, 2026

Copy link
Copy Markdown
Member

looks great. I'm just waiting for release activity to settle before merging anything big into main.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Merges sometimes do lots of work even after being aborted

2 participants