MR-A: migrate delete_where to staged two-phase (gated on Lance v7.x bump) #112

@aaltshuler

Summary

Migrate delete_where from inline-commit to the staged-write contract via Lance's new DeleteBuilder::execute_uncommitted API. Retires the parse-time D₂ mutation rule, retires the inline_committed reopen branch in exec/mutation.rs, and extends recovery sidecar coverage to delete-only commits.

Trigger / blocker

Gated on Lance v7.x bump. Lance #6658 closed 2026-05-14, but the public API was not backported to the 6.x line: binary search confirms that pub async fn execute_uncommitted on DeleteBuilder first appears in v7.0.0-beta.10. v7.0.0-rc.1 dropped 2026-05-21; GA likely within ~1 week.

Order of operations:

  1. Lance v7.0 GA ships.
  2. Bump PR (Lance 6 → 7), analogous to #111, chore(lance): bump 4.0.0 → 6.0.1 (DataFusion 52→53, Arrow 57→58). Smaller surface than the 4→6 PR: we've already absorbed the DatasetIndexExt move, namespace renames, DataFusion 53, Arrow 58.
  3. This MR lands on top.

Today's state (Lance 6.0.1)

delete_where is an inline-commit residual on TableStorage. Four call sites depend on the inline behavior:

| File:line | Context |
| --- | --- |
| crates/omnigraph/src/exec/mutation.rs:1222 | Delete-only mutation path (target table) |
| crates/omnigraph/src/exec/mutation.rs:1272 | Cascade delete on edge tables |
| crates/omnigraph/src/exec/mutation.rs:1320 | Delete-only mutation path (alt branch) |
| crates/omnigraph/src/exec/merge.rs:1019 | Branch merge applying the delete diff (the only mid-query op that advances Lance HEAD today) |

The inline behavior carries three downstream consequences:

  • The parse-time D₂ rule at crates/omnigraph/src/exec/mutation.rs:837 calls enforce_no_mixed_destructive_constructive(&ir) (defined at line 648). Rejects any mutation that mixes inserts/updates with deletes. User-visible error in docs/user/errors.md:12.
  • The inline_committed reopen branch at crates/omnigraph/src/exec/mutation.rs:598+ — ~70 lines dedicated to handling "Lance HEAD already moved on this table during this query."
  • Recovery sidecar gap. Delete commits are outside the all-or-nothing recovery protocol; mitigation today is D₂ (so the worst case is a single committed delete with no follow-up commit to coordinate).
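
To make the D₂ rule concrete, here is a minimal sketch of the kind of check it describes. The real helper takes the mutation IR (`&ir`), which this issue does not show, so `MutationOp` and the `&[MutationOp]` parameter below are hypothetical stand-ins, and the error text is illustrative rather than the message in docs/user/errors.md.

```rust
/// Hypothetical, simplified stand-in for one operation in the mutation IR.
/// The real IR type in exec/mutation.rs is not shown in this issue.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MutationOp {
    Insert,
    Update,
    Delete,
}

/// Sketch of the D₂ check: reject any mutation that mixes deletes
/// (destructive) with inserts/updates (constructive).
pub fn enforce_no_mixed_destructive_constructive(ops: &[MutationOp]) -> Result<(), String> {
    let has_delete = ops.iter().any(|op| matches!(op, MutationOp::Delete));
    let has_constructive = ops
        .iter()
        .any(|op| matches!(op, MutationOp::Insert | MutationOp::Update));

    if has_delete && has_constructive {
        // Illustrative message only.
        return Err("mutation mixes deletes with inserts/updates".to_string());
    }
    Ok(())
}
```

Under this model, a delete-only or insert/update-only mutation passes, and any mix is rejected at parse time. That rejection is the behavior this MR removes.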

Scope when unblocked

| # | Work | Effort | Notes |
| --- | --- | --- | --- |
| 1 | Bump Lance to v7.x | small | Mechanical. Add new guard lance_surface_guards.rs::_compile_delete_builder_execute_uncommitted_signature pinning the new method. |
| 2 | Add stage_delete to TableStorage trait | small | New method in crates/omnigraph/src/storage_layer.rs, parallel to stage_append / stage_merge_insert. Signature: `async fn stage_delete(&self, snap: &SnapshotHandle, filter: &str) -> Result<StagedHandle>`. |
| 3 | Implement in table_store.rs | small | Parallel to inline delete_where (~line 749). Call shape: `DeleteBuilder::new(ds, filter).execute_uncommitted().await?`, returning `UncommittedDelete { transaction, affected_rows, num_deleted_rows }`. The transaction feeds commit_staged. |
| 4 | Migrate 4 call sites | medium | The merge call site (exec/merge.rs:1019) is the highest-value: it's the only mid-query op that advances Lance HEAD today. The three exec/mutation.rs sites are mechanical once staging plumbing exists. |
| 5 | Retire enforce_no_mixed_destructive_constructive | small | Delete the helper at exec/mutation.rs:648 and the call site at line 837. |
| 6 | Retire the inline_committed reopen branch | small-medium | The ~70-line block in open_table_for_mutation at exec/mutation.rs:598+. Every delete is staged now, so there is no in-query Lance HEAD advance to coordinate. |
| 7 | Extend recovery sidecar to delete-only commits | medium | New classifier arm in crates/omnigraph/src/db/manifest/recovery.rs for "staged-delete-only commit" (zero new fragments, non-empty removed_fragment_ids). |
| 8 | Docs | small | docs/dev/invariants.md:39,101,124; docs/dev/runs.md:48,80,83,109,118; docs/dev/execution.md:89; docs/user/errors.md:12; docs/user/query-language.md:71; crates/omnigraph/src/storage_layer.rs module doc. All flip from "deletes inline" → "deletes staged, mixed mode supported." |
| 9 | Tests | medium | Add staged_writes::stage_delete_does_not_advance_head_until_commit, staged_writes::stage_delete_then_rollback_leaves_head_unchanged, recovery::recovery_classifies_stage_delete_only_commit. Delete staged_writes::delete_where_advances_head_inline_documents_residual (it pins behavior we no longer want). Add a mutation test asserting INSERT + DELETE in one query succeeds end-to-end (the D₂-retirement canary). |

Total: ~3-4 days once Lance v7 lands.
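
The two-phase contract that the new tests are meant to pin can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: `ToyTable` and `StagedDelete` are hypothetical, the code is synchronous and std-only, and the strict version-equality check is a toy conflict rule rather than how Lance resolves conflicts at commit time. It does not call the Lance API.

```rust
use std::collections::BTreeSet;

/// Toy in-memory table; `head` plays the role of the Lance HEAD version.
#[derive(Debug, Default)]
pub struct ToyTable {
    pub head: u64,
    pub rows: BTreeSet<u64>,
}

/// Hypothetical stand-in for a staged handle: the version the delete was
/// computed against, plus the row ids it would remove.
#[derive(Debug)]
pub struct StagedDelete {
    pub read_version: u64,
    pub doomed: Vec<u64>,
}

impl ToyTable {
    /// Phase 1: compute the delete without touching `head` or `rows`.
    pub fn stage_delete(&self, pred: impl Fn(u64) -> bool) -> StagedDelete {
        StagedDelete {
            read_version: self.head,
            doomed: self.rows.iter().copied().filter(|r| pred(*r)).collect(),
        }
    }

    /// Phase 2: apply the staged delete and advance `head` exactly once.
    pub fn commit_staged(&mut self, staged: StagedDelete) -> Result<u64, String> {
        // Toy conflict rule: refuse if HEAD moved since staging.
        if staged.read_version != self.head {
            return Err(format!(
                "stale staged delete: read v{}, head v{}",
                staged.read_version, self.head
            ));
        }
        for row in &staged.doomed {
            self.rows.remove(row);
        }
        self.head += 1;
        Ok(self.head)
    }
}
```

In this model, rollback is simply dropping the `StagedDelete` without committing it, which is the property stage_delete_then_rollback_leaves_head_unchanged would assert against the real storage layer.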

Wins

  1. One inline-commit residual gone. TableStorage gets closer to its goal shape; the forbidden_apis.rs guard tightens.
  2. D₂ parse-time rule disappears. User-visible: graph queries can do DELETE + INSERT in one mutation — today consumers have to split into two.
  3. Recovery coverage extends. Delete-only commits get the same all-or-nothing sidecar treatment as constructive writes. One fewer class of "Lance HEAD ahead of manifest" drift.
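
The classifier arm for delete-only commits could look roughly like the sketch below. It encodes only the rule stated in the scope table (zero new fragments, non-empty removed_fragment_ids). `CommitSummary`, `CommitKind`, and `classify` are hypothetical names, and the `Constructive` and `Empty` arms are assumptions added to make the match exhaustive; the actual types in recovery.rs are not shown in this issue.

```rust
/// Hypothetical summary of what a commit changed; field names are illustrative.
#[derive(Debug, Default)]
pub struct CommitSummary {
    pub new_fragment_ids: Vec<u64>,
    pub removed_fragment_ids: Vec<u64>,
}

/// Hypothetical classification result used by the recovery sidecar.
#[derive(Debug, PartialEq, Eq)]
pub enum CommitKind {
    /// Zero new fragments, non-empty removed_fragment_ids.
    StagedDeleteOnly,
    /// At least one new fragment (assumed to cover append/merge-insert).
    Constructive,
    /// Nothing added or removed (assumed no-op case).
    Empty,
}

pub fn classify(summary: &CommitSummary) -> CommitKind {
    match (
        summary.new_fragment_ids.is_empty(),
        summary.removed_fragment_ids.is_empty(),
    ) {
        (true, false) => CommitKind::StagedDeleteOnly,
        (true, true) => CommitKind::Empty,
        (false, _) => CommitKind::Constructive,
    }
}
```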

The complexity that gets deleted in this MR is roughly equal to the complexity added — the trait grows one method, but the engine sheds ~100 lines (D₂ helper, inline_committed branch, residual disclaimers, the failing-on-mixed-mode test scaffolding).

Cross-refs
