Skip to content

feat(transaction): add ExpireSnapshotsAction#2591

Open
dhruvarya-db wants to merge 1 commit into
apache:mainfrom
dhruvarya-db:feat-expire-snapshots-action
Open

feat(transaction): add ExpireSnapshotsAction#2591
dhruvarya-db wants to merge 1 commit into
apache:mainfrom
dhruvarya-db:feat-expire-snapshots-action

Conversation

@dhruvarya-db

@dhruvarya-db dhruvarya-db commented Jun 5, 2026

Copy link
Copy Markdown

Which issue does this PR close?

Revives the effort to close #2145 (originally filed as #1454), which has been attempted before but never landed. This is the first of a small series of PRs and covers the metadata-only piece; physical file cleanup will build on top of it in follow-ups.

What changes are included in this PR?

Adds an ExpireSnapshotsAction to the transaction API, reachable as tx.expire_snapshots() and following the same convention as the other actions. It selects the snapshots to expire and emits a single RemoveSnapshots update. It deliberately does not touch any files; that is left to a follow-up so this change stays small and easy to review.

Selection mirrors Java's RemoveSnapshots. Explicit ids (expire_snapshot_ids) and age-based expiry (expire_older_than_ms) are combined rather than mutually exclusive: a snapshot is expired if it is named explicitly or is older than the cutoff. retain_last keeps the most recent snapshots even when older than the cutoff and only bounds the age path. Any snapshot referenced by a branch or tag (including the current snapshot) is protected from expiry, and naming such an id explicitly is an error, since TableMetadataBuilder::remove_snapshots would otherwise drop the ref silently. Per-ref retention windows (max-snapshot-age, max-ref-age, per-branch min-snapshots-to-keep) are left for a later PR.

Are these changes tested?

Yes. Unit tests cover explicit-id expiry, age filtering with retain_last, unknown ids being ignored, and that the action registers on a transaction, plus the ref-safety behavior: the current snapshot and a tagged snapshot cannot be expired explicitly, and age-based expiry skips a tagged snapshot.

@dhruvarya-db dhruvarya-db force-pushed the feat-expire-snapshots-action branch 3 times, most recently from fce97ed to 9b45819 Compare June 5, 2026 01:21
Add a metadata-only transaction action that removes snapshots, emitting a
single RemoveSnapshots update. Physical file cleanup is left to a follow-up
maintenance operation built on top of this action.

Ref apache#1454
@dhruvarya-db dhruvarya-db force-pushed the feat-expire-snapshots-action branch from 9b45819 to 0bdd123 Compare June 5, 2026 01:46

@laskoviymishka laskoviymishka left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few things I'd fix before merge:

  • Branch ancestors aren't protected. Age expiry only protects ref heads, so a shared ancestor of two branches can be expired while still reachable via parent_snapshot_id, leaving a dangling pointer. Same root cause makes retain_last global instead of per-branch (Java walks each branch's ancestry). I think this needs a per-branch ancestor walk.
  • Commit requirement is too weak. We only emit UuidMatch, which leaves a race where a concurrent writer's new snapshot gets orphaned. SnapshotProducer guards this with a RefSnapshotIdMatch on main — I'd mirror that.
  • expire_snapshot_ids replaces instead of accumulating, unlike add_data_files and friends.
  • gc.enabled isn't checked — Java refuses to expire when it's false.
  • The combined explicit + age test the description mentions doesn't actually exist.

Happy to take another pass once these are addressed.


let mut snapshot_ids: Vec<i64> = to_expire.into_iter().collect();
snapshot_ids.sort_unstable();
Ok(snapshot_ids)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a subtle correctness issue here that's the deepest thing in the PR.

We protect only ref heads in protected_snapshot_ids, but age expiry walks a global pool. For two branches sharing a common base, the shared ancestor isn't a head, so it gets expired even though it's still reachable via parent_snapshot_id — afterwards that pointer dangles and tools that walk full ancestry (Flink, Trino CDC) see truncated history.

The same root cause bites retain_last: Java's computeBranchSnapshotsToRetain applies it per-branch along each ancestry chain, but here it's the N globally-most-recent across all branches. With branch A [A1,A2,A3] and branch B [B1,B2,B3], retain_last=2 keeps [B2,B3] globally and expires A2, which Java would retain.

I'd build the protected set by walking each branch's ancestor chain via parent_snapshot_id, stopping when count >= retain_last and timestamp < cutoff, then union that with the head set before age expiry. That's a bigger change, but I don't think we can ship age-based expiry without it. wdyt?

use crate::transaction::tests::make_v2_table;

// `make_v2_table` carries an older snapshot (ts 1515100955770) and a current
// snapshot (ts 1555100955770).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UuidMatch alone leaves a race window. If another writer commits a new snapshot between snapshot_ids_to_expire and update_table, that snapshot is now live but its parent may be in our to-expire set — removing the parent orphans the new branch history, and the UUID check won't catch it because the table UUID is unchanged.

SnapshotProducer guards against exactly this with a RefSnapshotIdMatch on main (see snapshot.rs ~517). I'd add the same here:

TableRequirement::RefSnapshotIdMatch {
    r#ref: "main".to_string(),
    snapshot_id: table.metadata().current_snapshot_id(),
},

Thoughts?

pub fn retain_last(mut self, retain_last: usize) -> Self {
self.retain_last = Some(retain_last);
self
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This replaces rather than accumulates, so expire_snapshot_ids(vec![A]).expire_snapshot_ids(vec![B]) silently drops A. Every other builder (add_data_files etc.) uses .extend(), so the inconsistency is surprising, and since this is public surface, switching to accumulate later would be a breaking change.

I'd accumulate and take an iterator to match add_data_files:

pub fn expire_snapshot_ids(mut self, snapshot_ids: impl IntoIterator<Item = i64>) -> Self {
    self.snapshot_ids.extend(snapshot_ids);
    self
}

If replace is intentional, I'd rename to set_snapshot_ids and document it — but accumulate feels right here.


#[cfg(test)]
mod tests {
use std::sync::Arc;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We never check gc.enabled before expiring. Java's RemoveSnapshots validates it in the constructor and refuses to run when it's false, since expiring metadata defeats a user's explicit decision to disable GC.

I'd read it at the top of commit (or snapshot_ids_to_expire) and bail:

if table.metadata().properties().get("gc.enabled").map(|v| v == "false").unwrap_or(false) {
    return Err(Error::new(ErrorKind::DataInvalid,
        "Cannot expire snapshots: gc.enabled is false"));
}


#[tokio::test]
async fn test_cannot_expire_current_snapshot() {
let table = make_v2_table();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description says the combined explicit + age path is tested, but I don't see a test that sets both — and it's the primary interaction between the two code paths, so it's the most likely regression site.

I'd add one with non-overlapping targets: expire_snapshot_ids(vec![X]).expire_older_than_ms(cutoff) where an age-qualifying snapshot Y is distinct from X, asserting both land in the result.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add ExpireSnapshotsAction for snapshot lifecycle management

3 participants