docs: datafusion-future-improvements.md (post-Lance-6 DF state + future work) #114
Open
aaltshuler wants to merge 3 commits into
Captures the post-PR #111 (Lance 4→6) + PR #113 (structured Expr pushdown) DataFusion state in one place, so future maintainers don't have to re-derive what's done, what's free, and what's still on the table from chat history.

Structure:
- Direct touchpoints (only 2, so a narrow surface)
- Shipped: PR-by-PR delta of what's landed
- Passive wins active on DF 53 (PR-linked, with where-it-bites-us notes)
- Still on the table, ranked by tier:
  - T1: structural, unblocked today (hydrate_nodes Expr pushdown)
  - T2: gated on Lance v7 (delete Expr via MR-A / issue #112)
  - T3: future-shape unlocks (extension planner, expression placement, etc.)
  - T4: won't reach us without major changes (custom ExecutionPlan territory)
- Upstream cadence note (Lance dictates the DF version)
- Maintenance section

Linked from docs/dev/index.md so the check-agents-md CI guard passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lance 7.0.0 shipped stable 2026-05-28 and still pins datafusion = "^53" / arrow = "^58" (verified against the published 7.0.0 dependency manifest), so the pending 6.0.1 -> 7.0.0 bump is not a DataFusion bump: the "Passive wins" table is unchanged.

- Current-pin stanza: note 7.0.0 is available upstream and holds DF ^53.
- Tier 2: the delete-Expr item's upstream gate (execute_uncommitted, lance#6658) is now satisfied (in 7.0.0 stable); reframe the trigger as our own 6->7 bump rather than waiting on a Lance release.
- Upstream cadence: correct the pre-release speculation: 7.0.0 stayed on DF 53; a DF 54/55 jump is deferred to a later Lance.
- Drop the brittle exec/query.rs:771-796 line range (drifted; hydrate_nodes is at 863 on main) in favor of the stable function name.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
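The reasoning in the commit above (a Lance 6.0.1 -> 7.0.0 bump is not a DataFusion bump because the pin is still "^53") rests on Cargo's caret semantics: for a nonzero major, "^53" accepts any 53.x.y and rejects 54.0.0. A minimal, std-only sketch of that rule follows. The helper names `parse_version` and `caret_major_matches` are hypothetical and not part of this repo, Cargo, or Lance, and the concrete version strings are illustrative only.

```rust
/// Parse a plain "major.minor.patch" string; returns None on malformed input.
/// Hypothetical helper for illustration; real code would use the `semver` crate.
fn parse_version(v: &str) -> Option<(u64, u64, u64)> {
    let mut parts = v.split('.').map(|p| p.parse::<u64>().ok());
    let major = parts.next()??;
    let minor = parts.next()??;
    let patch = parts.next()??;
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some((major, minor, patch))
}

/// Caret requirement given as a bare nonzero major (e.g. "^53"):
/// matches >=53.0.0 and <54.0.0, i.e. exactly the versions whose major equals 53.
/// Assumption: req_major > 0 (caret rules differ for 0.x versions).
fn caret_major_matches(req_major: u64, version: &str) -> bool {
    match parse_version(version) {
        Some((major, _, _)) => major == req_major,
        None => false,
    }
}

fn main() {
    // Illustrative version strings, not a statement about published releases.
    for v in ["53.0.0", "53.2.1", "54.0.0"] {
        println!("{v} satisfies ^53: {}", caret_major_matches(53, v));
    }
}
```

Under this rule, a DataFusion major change can only reach the codebase when a Lance release moves its own pin, which is the sense in which Lance dictates the DF version.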
Summary
New dev doc capturing the DataFusion state of the codebase after PR #111 (Lance 4 → 6) and PR #113 (structured Expr pushdown). The chat-history-only record of "what we did, what's free, what's still on the table" gets a permanent home.
Structure
Why now
We've made two material DataFusion-side moves in two PRs: bumping DataFusion to 53 and switching the bulk of read-path pushdown to structured Expr. Without a doc, the next time someone asks "where are we with DataFusion?" we re-derive the answer from chat history. The doc costs ~120 lines and saves that re-derivation on every future ask.
Test plan
- scripts/check-agents-md.sh clean (cross-link integrity: 35 links, 34 docs)
- docs/dev/index.md

🤖 Generated with Claude Code