feat: add IncrementalFileCleanup strategy and dispatch in ExpireSnapshots::Finalize #648
shangxinli wants to merge 12 commits into
Implement the file cleanup logic that was missing from the expire
snapshots feature (the original PR noted "TODO: File recycling will
be added in a followup PR").
Port the "reachable file cleanup" strategy from Java's
ReachableFileCleanup, following the same phased approach:
Phase 1: Collect manifest paths from expired and retained snapshots
Phase 2: Prune manifests still referenced by retained snapshots
Phase 3: Find data files only in manifests being deleted, subtract
files still reachable from retained manifests (kAll only)
Phase 4: Delete orphaned manifest files
Phase 5: Delete manifest lists from expired snapshots
Phase 6: Delete expired statistics and partition statistics files
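The set arithmetic behind phases 2 and 3 can be sketched as follows. This is an illustration only, not the code in this PR: `PathSet`, `ManifestContents`, `CleanupPlan`, and `PlanCleanup` are hypothetical names, and a plain in-memory map stands in for reading manifests through FileIO.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

using PathSet = std::unordered_set<std::string>;

// Hypothetical stand-in for manifest contents: manifest path -> data file paths.
using ManifestContents = std::unordered_map<std::string, std::vector<std::string>>;

struct CleanupPlan {
  PathSet manifests_to_delete;
  PathSet data_files_to_delete;
};

CleanupPlan PlanCleanup(const PathSet& expired_manifests,
                        const PathSet& retained_manifests,
                        const ManifestContents& contents) {
  CleanupPlan plan;
  // Phase 2: drop every manifest that a retained snapshot still references.
  for (const auto& path : expired_manifests) {
    if (retained_manifests.count(path) == 0) plan.manifests_to_delete.insert(path);
  }
  // Phase 3, first pass: candidate data files come only from doomed manifests.
  for (const auto& manifest : plan.manifests_to_delete) {
    auto it = contents.find(manifest);
    if (it == contents.end()) continue;
    plan.data_files_to_delete.insert(it->second.begin(), it->second.end());
  }
  // Phase 3, second pass: subtract files still listed by a retained manifest.
  for (const auto& manifest : retained_manifests) {
    auto it = contents.find(manifest);
    if (it == contents.end()) continue;
    for (const auto& file : it->second) plan.data_files_to_delete.erase(file);
  }
  return plan;
}
```

With expired manifests {m1, m2}, retained manifests {m2, m3}, and m1 listing files a and b while m2 lists b, only m1 and file a end up in the plan: b survives because a retained manifest still lists it.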
Key design decisions matching Java parity:
- Best-effort deletion: suppress errors on individual file deletions
to avoid blocking metadata updates (Java suppressFailureWhenFinished)
- Branch/tag awareness: retained snapshot set includes all snapshots
reachable from any ref (branch or tag), preventing false-positive
deletions of files still referenced by non-main branches
- Data file safety: only delete data files from manifests that are
themselves being deleted, then subtract any files still reachable
from retained manifests (two-pass approach from ReachableFileCleanup)
- Respect CleanupLevel: kNone skips all, kMetadataOnly skips data
files, kAll cleans everything
- FileIO abstraction: uses FileIO::DeleteFile for filesystem
compatibility (S3, HDFS, local), with custom DeleteWith() override
- Statistics cleanup via snapshot ID membership in retained set
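A minimal sketch of how best-effort deletion and the cleanup level might interact, under stated assumptions: the enum, the `DeleteFn` callback, and both functions below are hypothetical stand-ins, and the callback reports success as a bool rather than going through FileIO::DeleteFile.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical mirror of the three cleanup levels named above.
enum class CleanupLevel { kNone, kMetadataOnly, kAll };

// Hypothetical delete callback: returns true when the file was removed.
using DeleteFn = std::function<bool(const std::string&)>;

// Attempts every deletion and never stops early; the failure count is
// returned only so a caller could log it.
int DeleteBestEffort(const std::vector<std::string>& paths, const DeleteFn& delete_fn) {
  int failures = 0;
  for (const auto& path : paths) {
    if (!delete_fn(path)) ++failures;
  }
  return failures;
}

int CleanUp(CleanupLevel level, const std::vector<std::string>& metadata_files,
            const std::vector<std::string>& data_files, const DeleteFn& delete_fn) {
  if (level == CleanupLevel::kNone) return 0;  // skip all cleanup
  int failures = 0;
  if (level == CleanupLevel::kAll) {
    failures += DeleteBestEffort(data_files, delete_fn);  // data files only at kAll
  }
  failures += DeleteBestEffort(metadata_files, delete_fn);
  return failures;
}
```

The point of the shape is that a failed deletion is counted, not propagated, so one unreachable object cannot block the remaining deletions or the metadata update.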
TODOs for follow-up:
- Multi-threaded file deletion (Java uses Tasks.foreach with executor)
- IncrementalFileCleanup strategy for linear ancestry optimization
(Java uses this when no branches/cherry-picks involved)
- Fix O(M*S) I/O: Pre-cache ManifestFile objects in manifest_cache_ during Phase 1 (ReadManifestsForSnapshot), eliminating repeated manifest list reads in FindDataFilesToDelete.
- Fix storage leak: Use LiveEntries() instead of Entries() to match Java's ManifestFiles.readPaths behavior (only ADDED/EXISTING entries).
- Fix data loss risk: When reading a retained manifest fails, abort data file deletion entirely instead of silently continuing. Java retries and throws on failure here.
- Fix statistics file deletion: Use path-based set difference instead of snapshot_id-only check, preventing erroneous deletion of statistics files shared across snapshots.
- Remove goto anti-pattern: Extract ManifestFile lookup into MakeManifestReader() helper and use manifest_cache_ for direct lookup.
- Improve API: FindDataFilesToDelete now returns Result<unordered_set<string>> instead of using a mutable out-parameter.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Mirror Java's file cleanup class hierarchy for expire snapshots:
- Add abstract FileCleanupStrategy with shared DeleteFile() and ExpiredStatisticsFilePaths() utilities (path-based set difference)
- Add ReachableFileCleanup concrete class owning manifest_cache_, ReadManifestsForSnapshot(), and FindDataFilesToDelete()
- Move MakeManifestReader() to a free function in anonymous namespace using ICEBERG_ASSIGN_OR_RAISE
- Remove cleanup-specific private methods and manifest_cache_ from ExpireSnapshots class; Finalize() now delegates to the strategy
- Clear apply_result_ after consumption in Finalize()
- Rename DeleteFilePath to DeleteFile; use std::ignore for FileIO return
- Remove manifest_list.h and manifest_reader.h from the header
… stats file deletion
P0: ReadManifestsForSnapshot now returns bool. If any retained snapshot's manifest list cannot be read, phases 2-4 (manifest and data file deletion) are skipped entirely. An incomplete retained set makes it unsafe to compute manifests_to_delete, as manifests still referenced by unreadable snapshots would be wrongly included. This matches Java's throwFailureWhenFinished behavior in ReachableFileCleanup. Manifest list deletion (phase 5) is unaffected since it is keyed on expired snapshots only.
P1: Remove physical statistics and partition-statistics file deletion (the former phase 6). RemoveStatistics/RemovePartitionStatistics are still not called in RemoveSnapshots (the TODO in table_metadata.cc), so the committed metadata still references those files after they would be deleted on disk. Deletion is deferred until the metadata-level removal is wired in, at which point the two operations can be kept in sync.
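The P0 rule (an incomplete retained set must block manifest and data file deletion) could look roughly like this. The sketch is hypothetical: `ManifestListReader` and `CollectRetainedManifests` are illustrative names, and std::optional stands in for whatever error type the real reader returns.

```cpp
#include <cstdint>
#include <functional>
#include <optional>
#include <string>
#include <unordered_set>
#include <vector>

using PathSet = std::unordered_set<std::string>;

// Hypothetical reader: manifest paths for a snapshot, or nullopt when the
// snapshot's manifest list cannot be read.
using ManifestListReader =
    std::function<std::optional<std::vector<std::string>>(int64_t)>;

// Returns nullopt as soon as one retained snapshot is unreadable. A caller
// would then skip phases 2-4, because a partial retained set would make
// still-referenced manifests look deletable.
std::optional<PathSet> CollectRetainedManifests(
    const std::vector<int64_t>& retained_snapshot_ids, const ManifestListReader& read) {
  PathSet retained;
  for (int64_t id : retained_snapshot_ids) {
    auto manifests = read(id);
    if (!manifests) return std::nullopt;  // fail closed, do not continue
    retained.insert(manifests->begin(), manifests->end());
  }
  return retained;
}
```

Returning "no answer" rather than a partial set forces the caller to handle the failure explicitly instead of deleting against incomplete information.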
…hots::Finalize
Mirrors Java's IncrementalFileCleanup for the linear-ancestry case: each manifest is attributed to its writer snapshot, so two passes are enough instead of the full reachability scan. Cherry-pick protection via SnapshotSummaryFields::kSourceSnapshotId is preserved.
Finalize() now picks IncrementalFileCleanup when the expiration is "simple" (no explicit snapshot IDs, no removed snapshots outside the current main ancestry, and no retained snapshots outside the current main ancestry), and falls back to ReachableFileCleanup otherwise. The dispatch matches Java RemoveSnapshots.cleanExpiredSnapshots.
Two existing cleanup tests (DeletesExpiredFiles, IgnoresExpiredDeleteManifestReadFailures) used an empty current manifest list, which is an unreachable-orphan scenario that only ReachableFileCleanup can resolve. They now call ExpireSnapshotId() to force the reachable path, which keeps their original intent and matches Java behavior. New tests cover both dispatch branches.
The merge of main into pr-a-incremental-cleanup auto-resolved by
keeping both copies in two places:
* transaction.cc: a duplicate Result<const TableMetadata*>
finalize_result definition, which made the file fail to compile.
* expire_snapshots.cc: an orphaned ReachableFileCleanup direct call
plus the obsolete TODO comment after the now-correct dispatch
inside Finalize(), which made clang-format fail in CI.
Drop the duplicates so the file compiles and matches the intended
post-merge state.
try {
  picked_ancestor_snapshot_ids.insert(std::stoll(it->second));
} catch (...) {
  // Malformed source-snapshot-id; skip rather than fail cleanup.
Java fails cleanup when source-snapshot-id is malformed. Please fail closed here (return/propagate an error) instead of skipping, so cherry-pick protection cannot be bypassed.
Good catch — fixed in 7184725. Both parse sites now return InvalidArgument (matching Java propagating NumberFormatException), so a malformed source-snapshot-id aborts cleanup instead of silently bypassing cherry-pick protection.
Java's IncrementalFileCleanup propagates the NumberFormatException when source-snapshot-id can't be parsed, so cherry-pick protection cannot be silently bypassed. Mirror that behavior by returning InvalidArgument from both parse sites instead of skipping the entry. Addresses wgtmac review feedback on PR apache#648.
auto it = summary.find(SnapshotSummaryFields::kSourceSnapshotId);
if (it == summary.end()) continue;
try {
  picked_ancestor_snapshot_ids.insert(std::stoll(it->second));
Please use the repo integer parser instead of std::stoll (e.g. StringUtils::ParseNumber<int64_t>, with + handling if needed). It returns Result and rejects trailing junk like Java.
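For context on the trailing-junk concern: std::stoll stops at the first non-digit, so "123abc" parses as 123 without an error. A strict parse can be sketched with std::from_chars. The helper below is hypothetical and is not the repository's StringUtils::ParseNumber, whose exact signature is not shown in this thread; std::optional stands in for a Result type.

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical strict parser: succeeds only if the whole input is a valid
// base-10 int64. Trailing characters, empty input, and overflow all fail.
std::optional<int64_t> ParseInt64Strict(std::string_view text) {
  if (text.empty()) return std::nullopt;
  int64_t value = 0;
  const char* first = text.data();
  const char* last = first + text.size();
  auto [ptr, ec] = std::from_chars(first, last, value);
  // ec reports invalid input or out-of-range; ptr != last means junk remains.
  if (ec != std::errc{} || ptr != last) return std::nullopt;
  return value;
}
```

Note that std::from_chars does not accept a leading plus sign, so "+5" fails here; that is presumably what the "+ handling if needed" remark refers to, and a caller wanting to accept it would have to strip the sign first.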
} else {
  strategy = std::make_unique<ReachableFileCleanup>(ctx_->table->io(), delete_func_);
}
return strategy->CleanFiles(metadata_before_expiration, metadata_after_expiration,
This status is ignored by the normal Commit() paths today. Please add a commit-path test and propagate parsing/planning errors, otherwise malformed source-snapshot-id still reports success.
/// \brief Incremental file cleanup strategy for simple linear-ancestry expirations.
///
/// Mirrors Java's IncrementalFileCleanup. Only safe when:
Please avoid Java-specific wording in production comments. Keep this to the invariant and dispatch conditions, and drop the “Mirrors Java” phrasing here and below.
                 metadata_after_expiration) &&
    !HasNonMainSnapshots(metadata_after_expiration);

std::unique_ptr<FileCleanupStrategy> strategy;
This does not need heap allocation or a virtual dispatch object. Please instantiate the selected strategy in each branch and return CleanFiles(...) directly.
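One way to read this suggestion, as a sketch under assumptions: the two structs below are hypothetical stand-ins for the real strategies (which take a FileIO and a delete callback), and `ExpirationShape` is an illustrative bundle of the three "simple expiration" conditions from the commit message.

```cpp
#include <string>

// Hypothetical minimal stand-ins for the two cleanup strategies.
struct IncrementalCleanup {
  std::string CleanFiles() const { return "incremental"; }
};
struct ReachableCleanup {
  std::string CleanFiles() const { return "reachable"; }
};

// Hypothetical summary of the dispatch inputs.
struct ExpirationShape {
  bool has_explicit_snapshot_ids;
  bool removed_non_main_snapshots;
  bool retained_non_main_snapshots;
};

bool IsSimpleExpiration(const ExpirationShape& shape) {
  return !shape.has_explicit_snapshot_ids && !shape.removed_non_main_snapshots &&
         !shape.retained_non_main_snapshots;
}

// Each branch builds its strategy on the stack and returns the result
// directly, so no std::unique_ptr or base-class pointer is needed.
std::string Dispatch(const ExpirationShape& shape) {
  if (IsSimpleExpiration(shape)) {
    return IncrementalCleanup{}.CleanFiles();
  }
  return ReachableCleanup{}.CleanFiles();
}
```

Because the concrete type is known in each branch, the call can be resolved statically; whether the shared base class is still worth keeping for the common helpers is a separate question.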
  EXPECT_THAT(deleted_files, testing::Not(testing::Contains(reused_statistics_path)));
}

// Linear-ancestry, no specified ID: dispatch must pick IncrementalFileCleanup.
This comment is doing too much and calls out Java again. Please trim it to the setup expectation: no explicit ID should take the incremental path and preserve the added data file.
Status CleanFiles(const TableMetadata& metadata_before_expiration,
                  const TableMetadata& metadata_after_expiration,
                  const std::unordered_set<int64_t>& expired_snapshot_ids,
Java derives expired IDs from beforeExpiration and afterExpiration inside cleanup. Please do the same here instead of passing a separate set that can drift from the two metadata states.
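Deriving the set inside cleanup amounts to a set difference over snapshot IDs. A minimal sketch, assuming a reduced `MetadataView` that carries only snapshot IDs (the real TableMetadata API is not reproduced here, and `ExpiredSnapshotIds` is a hypothetical name):

```cpp
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical reduced view of table metadata: snapshot IDs only.
struct MetadataView {
  std::vector<int64_t> snapshot_ids;
};

// Expired = present before expiration, absent after. Computing it from the
// two metadata states means it cannot disagree with them.
std::unordered_set<int64_t> ExpiredSnapshotIds(const MetadataView& before,
                                               const MetadataView& after) {
  std::unordered_set<int64_t> retained(after.snapshot_ids.begin(),
                                       after.snapshot_ids.end());
  std::unordered_set<int64_t> expired;
  for (int64_t id : before.snapshot_ids) {
    if (retained.count(id) == 0) expired.insert(id);
  }
  return expired;
}
```

With snapshots {1, 2, 3, 4} before and {3, 4} after, the derived set is {1, 2}, and the third parameter of CleanFiles becomes unnecessary.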