Skip to content

build(deps): bump Lance 6.0.1 → 7.0.0 (correct-by-design substrate alignment)#229

Merged
ragnorc merged 8 commits into
mainfrom
ragnorc/lance-upgrade-impact-audit
Jun 14, 2026
Merged

build(deps): bump Lance 6.0.1 → 7.0.0 (correct-by-design substrate alignment)#229
ragnorc merged 8 commits into
mainfrom
ragnorc/lance-upgrade-impact-audit

Conversation

@ragnorc

@ragnorc ragnorc commented Jun 14, 2026

Copy link
Copy Markdown
Contributor

Summary

Initial Lance 6.0.1 → 7.0.0 substrate bump and the immediate changes required to be correct-by-design and aligned with the substrate. Realizes the dev-graph ticket iss-lance-v7-bump; scoped to the bump only (follow-on adoptions are separate tickets).

Arrow stays 58, DataFusion stays 53 — no change to the compiler/DataFusion surface. The only transitive bump is object_store 0.12.5 → 0.13.2. 141 upstream commits were reviewed (6.0.1 → 7.0.0); no fixes are lost (the 6.0.x release-branch backports are all forward-ported).

Changes (4 commits)

  1. build(deps)lance-* → 7.0.0, object_store → 0.13.2, and a mandatory roaring 0.11.4 pin.
    • object_store 0.13 moved get/put/head/rename/delete behind a new ObjectStoreExt trait → use object_store::ObjectStoreExt in storage.rs + db/manifest/namespace.rs (no call-site changes; mirrors Lance's own PR #6672).
    • roaring 0.11.4 is required: lance 7.0.0's UpdatedFragmentOffsets (lance#6650) derives Eq over HashMap<u64, RoaringBitmap>, which needs RoaringBitmap: Eq (added in roaring 0.11.4). The loose roaring = "0.11" constraint otherwise keeps the broken 0.11.3 and lance itself fails to compile.
    • lance#6774 flips merge-insert INSERT _row_created_at_version to the commit version → lance_version_columns assertion updated; production change-detection (keyed on _row_last_updated_at_version + ID membership) is unaffected.
  2. fix(storage) — pin WriteParams::auto_cleanup = None at all 11 production sites. lance#6755 flipped this default off; on 6.0.1 the on-by-default cleanup could silently GC versions __manifest pins for time-travel. We own cleanup explicitly, so this removes the default dependency and locks in the snapshot-safe behavior.
  3. test(lance) — pin BTREE range-boundary correctness (lance#6796 / issue #6792) via a surface guard reproducing the exact 5-row + BTREE shape.
  4. docs(dev) — new 7.0.0 alignment stanza in docs/dev/lance.md; AGENTS.md substrate 6.x → 7.x; invariants.md deletes/vector gap updated (lance#6658 shipped → MR-A unblocked but separate).

Verification

  • cargo test --workspace --locked: 174 passed, including the full engine suite (surface guards, lance_version_columns, point_in_time, maintenance, consistency, recovery, writes, search, traversal).
  • The 2 failing tests (local_cluster_apply_then_server_boots_from_cluster_state, local_cluster_full_lifecycle_declare_serve_evolve_delete, both spawned-server multi-restart e2es) reproduce identically on unmodified 6.0.1 — they are pre-existing/sandbox-environmental (auth/startup-shaped: 401 missing bearer token; missing queries array), not caused by this bump. testing.md documents these as sandbox-sensitive with OMNIGRAPH_SKIP_SYSTEM_E2E=1. The single-server cluster e2e passes.
  • scripts/check-agents-md.sh: OK (57 links, 54 docs).

Out of scope (separate tickets, unblocked by this bump)

  • MR-A staged two-phase delete_where + retire D2 (iss-950) — lance#6658 shipped in 7.0.0.
  • shared-memory test substrate, incremental index reconciler, stable-row-id traversal.
  • Still upstream-blocked: vector two-phase (lance#6666), blob-column compaction.
  • An omnigraph release version bump (Version surveyed: 0.7.0) is intentionally deferred.

🤖 Generated with Claude Code


Note

High Risk
This is a major storage-substrate bump affecting every Lance write/commit path and version GC behavior; incorrect auto-cleanup handling could delete time-travel/snapshot versions, though the PR adds explicit skips and regression tests for that.

Overview
Bumps the workspace from Lance 6.0.1 to 7.0.0 (Arrow 58 / DataFusion 53 unchanged) and object_store 0.12.5 → 0.13.2, with a Cargo.lock pin on roaring 0.11.4 so Lance 7 compiles.

Dependency / API touch-ups: ObjectStoreExt is imported where object_store 0.13 moved convenience methods off the core trait (storage.rs, manifest namespace). Docs (AGENTS.md, docs/dev/lance.md, invariants.md) now describe Lance 7.x and record the alignment audit (lance#6774, #6755, #6796, #6658).

Snapshot / time-travel safety (lance#6755): Production writes set WriteParams { auto_cleanup: None, skip_auto_cleanup: true } on creates/appends/overwrites, CommitBuilder::with_skip_auto_cleanup(true) on commit_staged, and MergeInsertBuilder::skip_auto_cleanup(true) on __manifest CAS — so Lance’s per-commit hook cannot GC versions still pinned by __manifest, including on graphs upgraded from 6.x with legacy stored auto-cleanup config.

Behavior & tests: Change-feed comments and lance_version_columns expectations reflect lance#6774 (merge-insert INSERT rows stamp _row_created_at_version at commit version; UPDATE still preserves created_at and bumps _row_last_updated_at_version). New guards cover BTREE range boundaries (lance#6796) and skip_auto_cleanup (surface + staged_writes regression). delete_where stays inline; docs note MR-A is unblocked now that lance#6658 shipped in 7.0.0.

Reviewed by Cursor Bugbot for commit e2099fc. Bugbot is set up for automated code reviews on this repo. Configure here.

Greptile Summary

This PR upgrades Lance 6.0.1 → 7.0.0 (object_store 0.12.5 → 0.13.2, roaring 0.11.4 pin via lock file) and makes the codebase correct-by-design against the two critical behavior changes: lance#6755's default-off auto-cleanup and lance#6774's INSERT _row_created_at_version stamping.

  • Auto-cleanup suppression: auto_cleanup: None + skip_auto_cleanup: true are applied at all 11 WriteParams sites; CommitBuilder::with_skip_auto_cleanup(true) is wired into commit_staged; MergeInsertBuilder::skip_auto_cleanup(true) protects the __manifest publisher.
  • Version-column semantics: lance_version_columns.rs assertions updated for lance#6774 and the previously-flagged missing bob.2/bob.3 UPDATE invariant guards are now added.
  • New surface guards: Guard 17 pins the BTREE range-boundary fix (lance#6796); Guard 18 provides a self-validating positive/negative control for skip-auto-cleanup; staged_writes.rs adds a full-stack commit_staged regression against a legacy cleanup config.

Confidence Score: 5/5

Safe to merge; all write paths that commit through CommitBuilder, WriteParams, or MergeInsertBuilder now suppress Lance's per-commit auto-cleanup hook, protecting __manifest-pinned versions on both new and pre-bump graphs.

Every production write path has been updated with skip_auto_cleanup. The previously missing update-invariant assertions in lance_version_columns.rs are now resolved. Guard 18 includes a self-validating negative control. The roaring 0.11.4 transitive pin is correctly locked via Cargo.lock and CI uses --locked.

No files require special attention beyond the already-known stale 6.0.1 comments in namespace.rs flagged in a prior review.

Important Files Changed

Filename Overview
crates/omnigraph/src/table_store.rs Adds auto_cleanup: None + skip_auto_cleanup: true across all WriteParams sites and CommitBuilder::with_skip_auto_cleanup(true) on commit_staged; all staging paths covered.
crates/omnigraph/src/db/manifest/publisher.rs Adds merge_builder.skip_auto_cleanup(true) to the __manifest publisher's committed MergeInsertBuilder path.
crates/omnigraph/tests/lance_surface_guards.rs Adds Guard 17 (BTREE range-boundary fix) and Guard 18 (skip_auto_cleanup suppresses legacy GC config) with self-validating negative control.
crates/omnigraph/tests/lance_version_columns.rs Updates assertions for lance#6774 INSERT stamping change; adds previously missing bob.2 and bob.3 invariant assertions guarding change-feed correctness.
crates/omnigraph/tests/staged_writes.rs Adds commit_staged_skips_auto_cleanup_so_pinned_versions_survive regression test against a pre-bump legacy auto_cleanup config.
crates/omnigraph/src/db/manifest/namespace.rs Adds ObjectStoreExt import for object_store 0.13; two inline comments still reference lance-namespace 6.0.1 — flagged in prior review, not addressed here.
Cargo.toml Bumps all lance-* crates 6.0.1 → 7.0.0 and object_store 0.12.5 → 0.13.2; roaring 0.11.4 locked via Cargo.lock.

Comments Outside Diff (3)

  1. crates/omnigraph/tests/lance_version_columns.rs, line 263-270 (link)

    P1 The update test documents both invariants in comments — "created_at should be preserved (v1), updated_at should be bumped (v2)" — but only asserts bob.1 == 99 (the data value). The load-bearing contract for changes/mod.rs's diff_table_same_lineage is that _row_last_updated_at_version is bumped on every update; a future Lance regression there would silently miss all updates in change feeds without any test turning red. Since this PR is already updating assertions in this file for the lance#6774 behavior change, completing the guard here is natural.

    Fix in Claude Code

  2. crates/omnigraph/src/db/manifest/namespace.rs, line 235-239 (link)

    P2 This inline comment still names lance-namespace 6.0.1 after the bump to 7.0.0. The same stale reference appears in StagedTableNamespace::describe_table ~150 lines below. AGENTS.md rule 6 says to fix stale text in the same change rather than leaving it silently wrong.

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

    Fix in Claude Code

  3. crates/omnigraph/tests/lance_version_columns.rs, line 263-270 (link)

    P1 The update test documents two invariants in comments — "created_at should be preserved (v1), updated_at should be bumped (v2)" — but only asserts bob.1 == 99 (the data value). The load-bearing contract for changes/mod.rs's diff_table_same_lineage is that _row_last_updated_at_version is bumped on every merge_insert UPDATE; a future Lance regression here would silently drop all updates from change feeds without any test turning red. Since this PR is already updating assertions in this file for the lance#6774 behavior change, the guard should be completed here.

    Fix in Claude Code

Reviews (3): Last reviewed commit: "test(lance): assert created_at-preserved..." | Re-trigger Greptile

ragnorc added 4 commits June 14, 2026 19:03
….11.4)

Arrow stays 58 and DataFusion stays 53 (no change). The only transitive bump
is object_store 0.12.5 → 0.13.2. 141 upstream commits reviewed; no fixes lost
(the 6.0.x release-branch backports are all forward-ported into 7.0.0).

- object_store 0.13 moved get/put/head/rename/delete behind a new ObjectStoreExt
  trait (list/list_with_delimiter/put_opts stay on the core trait). Add
  `use object_store::ObjectStoreExt` in storage.rs and db/manifest/namespace.rs;
  no call-site changes. Mirrors Lance's own migration in PR #6672.
- roaring pinned to 0.11.4 (cargo update -p roaring --precise 0.11.4). Lance
  7.0.0's UpdatedFragmentOffsets newtype (lance#6650) derives Eq over
  HashMap<u64, RoaringBitmap>, which needs RoaringBitmap: Eq, added in roaring
  0.11.4; the loose `roaring = "0.11"` constraint otherwise resolves 0.11.3 and
  lance itself fails to compile.
- lance#6774: merge-insert INSERT rows now stamp _row_created_at_version with the
  commit version (was a fallback of 1). Flip the lance_version_columns assertion
  to `== v2` and correct the changes/mod.rs rationale comment. Production
  change-detection keys on _row_last_updated_at_version + ID membership, so its
  logic is unaffected.

Refs lance#6650, lance#6774, lance#6672.
…t flip)

lance#6755 flipped the WriteParams::auto_cleanup default from on (a full cleanup
pass every 20th commit) to None. On 6.0.1 the on-by-default hook could silently
GC versions that __manifest pins for snapshots/time-travel. OmniGraph owns
cleanup explicitly (optimize.rs::cleanup_all_tables) and never set auto_cleanup,
so it was relying on a default that is both wrong for our snapshot model and now
changed upstream.

Pin auto_cleanup: None explicitly at all 11 production WriteParams sites
(table_store ×6, commit_graph ×2, recovery_audit ×1, manifest/graph ×2 — the
__manifest + sub-table Create paths). Removes the dependency on a default-flag
value and locks in the snapshot-safe behavior regardless of future upstream
re-flips.

Refs lance#6755.
lance#6796 (issue #6792) fixed a BTREE scalar-index range-query bound
inclusiveness bug: `x <= hi AND x > lo` returned the wrong boundary row.
Add lance_surface_guards.rs::btree_range_query_boundary_is_correct, which
reproduces the exact #6792 shape (5 rows + an explicit BTREE drives the index
path even on tiny data) and pins the corrected inclusive-<= / exclusive->
semantics. It turns red if a future Lance regression reintroduces the bug.

OmniGraph today builds BTREE only on string @key columns and queries them by
equality/IN, so its current patterns do not hit this; the guard protects any
future BTREE-range path (BTREE-on-properties, range-on-key).

Refs lance#6796.
- docs/dev/lance.md: new 2026-06-14 alignment stanza for the 6.0.1 → 7.0.0 bump
  (object_store ObjectStoreExt move, roaring 0.11.4, #6774/#6796/#6755 behavior,
  #6658 shipped → MR-A unblocked but separate, #6666 + blob compaction still
  open); prior 6.0.1 stanza demoted to historical.
- AGENTS.md: storage substrate 6.x → 7.x (line + architecture diagram).
- docs/dev/invariants.md: deletes/vector known gap updated — the staged
  two-phase delete API (lance#6658) now exists and MR-A is unblocked, but
  delete_where stays inline and D2 stays in place until the migration lands;
  create_vector_index still gated on lance#6666.
@ragnorc ragnorc requested a review from aaltshuler as a code owner June 14, 2026 17:04

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 29d61d47a5

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

let params = WriteParams {
mode: WriteMode::Append,
allow_external_blob_outside_bases: true,
auto_cleanup: None,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Disable cleanup on staged commits

When writing to a graph whose Lance tables were created before this bump, those datasets can already carry lance.auto_cleanup.interval / older_than config from the old default. Setting auto_cleanup: None here only affects create-time config and is not used by the later commit_staged CommitBuilder, whose skip_auto_cleanup default remains false, so upgraded graphs can still run Lance auto-cleanup after enough writes and delete versions that older OmniGraph manifests/time-travel snapshots still pin. The staged commit path needs to explicitly skip auto-cleanup or clear/migrate the legacy dataset config before writes continue.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid P1 — verified against Lance 7.0.0 source and fixed in 90a2028.

Confirmed: 6.0.1 defaulted WriteParams::auto_cleanup = Some(AutoCleanupParams::default()) (ON), auto_cleanup is create-time-only ("no effect on existing datasets" per write.rs), and the hook fires off the dataset's stored config at commit time (io/commit.rs: if !commit_config.skip_auto_cleanup). So auto_cleanup: None alone left pre-bump graphs exposed, and the staged path (commit_stagedCommitBuilder, skip_auto_cleanup default false) would still GC __manifest-pinned versions.

Fix: skip_auto_cleanup on every commit path —

  • commit_stagedCommitBuilder::with_skip_auto_cleanup(true) (staged data path)
  • __manifest publisher → MergeInsertBuilder::skip_auto_cleanup(true)
  • all 11 WriteParamsskip_auto_cleanup: true (direct write/append; auto_cleanup: None kept so new datasets store no config)

Tests: lance_surface_guards::skip_auto_cleanup_suppresses_version_gc (negative control proves the config GCs without the skip; positive proves the skip preserves the version) and staged_writes::commit_staged_skips_auto_cleanup_so_pinned_versions_survive (omnigraph's commit_staged on a legacy-config dataset). Both green.

ragnorc and others added 4 commits June 14, 2026 19:41
…-impact-audit

# Conflicts:
#	crates/omnigraph/tests/lance_surface_guards.rs
#	docs/dev/lance.md
…sets

Addresses PR #229 review (Codex P1). `WriteParams::auto_cleanup` is create-time
config with no effect on existing datasets (Lance write.rs docs), so the previous
`auto_cleanup: None` change alone did NOT protect graphs created before the v7
bump: 6.0.1 defaulted auto_cleanup ON, leaving `lance.auto_cleanup.*` config on
those datasets, and Lance's per-commit hook (io/commit.rs: `if
!commit_config.skip_auto_cleanup`) fires off that stored config — so omnigraph's
own writes would GC versions the __manifest pins for snapshots/time-travel.

Skip the hook on every commit path, covering new and legacy datasets alike:
- commit_staged: CommitBuilder::with_skip_auto_cleanup(true) — the staged data path.
- __manifest publisher: MergeInsertBuilder::skip_auto_cleanup(true).
- all 11 WriteParams: skip_auto_cleanup: true (direct Dataset::write/append paths;
  auto_cleanup: None retained so new datasets store no cleanup config at all).

Tests:
- lance_surface_guards::skip_auto_cleanup_suppresses_version_gc — substrate:
  negative control (config GCs v1 without skip) + with-skip survival.
- staged_writes::commit_staged_skips_auto_cleanup_so_pinned_versions_survive —
  omnigraph usage: commit_staged on a legacy-config dataset preserves the pinned
  create version.

Refs lance#6755.
…_insert UPDATE

Addresses PR #229 review follow-up. `lance_merge_insert_update_preserves_created_at_version`
documented (in a comment) that a merge_insert UPDATE preserves created_at and
bumps updated_at, but only asserted the value change — leaving the change-feed
invariant unguarded. Add the two missing assertions:

- bob created_at == v1 (preserved across UPDATE; what the test name promises;
  lance#6774 only changed INSERT-row stamping).
- bob updated_at == v2 (bumped to the commit version) — the invariant
  OmniGraph's insert/update classification relies on (changes/mod.rs keys on
  _row_last_updated_at_version). A regression here would silently drop updates
  from the diff/change feed.
@ragnorc

ragnorc commented Jun 14, 2026

Copy link
Copy Markdown
Contributor Author

Addressed the follow-up (lance_merge_insert_update_preserves_created_at_version documented but never asserted the version-column behavior) in e2099fc.

The test now asserts both halves it previously only described in a comment:

  • bob.created_at == v1 — preserved across the UPDATE (what the test name promises; lance#6774 only changed INSERT-row stamping).
  • bob.updated_at == v2 — bumped to the commit version. This is the change-feed invariant OmniGraph's insert/update classification relies on (changes/mod.rs keys on _row_last_updated_at_version); a regression here would silently drop updates from the diff/change feed.

cargo test -p omnigraph-engine --test lance_version_columns green (3/3).

@ragnorc ragnorc merged commit 67baf61 into main Jun 14, 2026
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant