Commit f4ee574
Blog post on writing table providers (#161)
* Initial commit for blog post on writing table providers
* Minor text changes
* Add acknowledgement
* Add note about when to add push down filters
Co-authored-by: Yongting You <2010youy01@gmail.com>
* Address a variety of user feedback
* Update links
* pelican processing didn't handle backticks in links well
* Add an explanation of different ways to use FileFormat for a ListingTable
* Address alamb review feedback
- Clarify intro sentence to mention planning/execution work
- Label TableProvider as Logical Plan and ExecutionPlan as Physical Plan
- Change "four phases" to "several phases" (list has 5 items)
- "Some logical optimizations" and "rewrites such as" to signal non-exhaustive lists
- Clarify scan() comment: "don't do any execution work here"
- Rewrite partitioning section to lead with simple advice (match data layout)
before covering target_partitions and hash partitioning subtleties
- Narrow CPU thread pool advice: spawn_blocking is for blocking/long-running
work, not all CPU work
- Add "scan is single-threaded" as a reason to keep scan() lightweight
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* update date
* Add link to thread_pools example for blocking work section
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* remove use statements from example
* revert section on scan_with_args and drop to single line
* Move 'Choosing the Right Starting Point' before Layer 1
Addresses alamb's suggestion to move the section earlier so readers
understand what level of work is required before diving in.
- Moved section to just before Layer 1: TableProvider
- Trimmed the file-based path detail to a short paragraph with links
(the full trait hierarchy was too deep for an intro-position section)
- Removed RecordBatchStreamAdapter reference (not yet introduced at
that point in the article)
- Added a sentence orienting the reader to what the rest of the post covers
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Fix pre-publish review issues
- Fix use-after-move bug in DatePartitionedExec construction (dirs.len()
called after dirs moved into struct field)
- Fix incorrect import: SessionState → catalog::Session in CountingTable
example
- Remove double space before scan_with_args link
- Add missing blank line before '### Using EXPLAIN' heading
- Split dense 'Only Push Down Filters' paragraph for readability
- Change 'full working example' to 'illustrative example' for the
filter pushdown code that contains todo!() stubs
- Use 'Rerun is building' instead of repeating [Rerun.io] link
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Add reviewer acknowledgements to blog post
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Add 'Get Involved' section
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Final pre-publish fixes
- Fix grammar: "Best practices are" → "Best practice is"
- Remove unused StringArray import from complete example
- Fix outdated arrow-datafusion repo link → apache/datafusion
- Add missing reviewers to acknowledgements: adriangb, kevinjqliu, Omega359
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* make it alphabetical
---------
Co-authored-by: Yongting You <2010youy01@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent 64c98bf commit f4ee574
1 file changed
Lines changed: 916 additions & 0 deletions