From 1fe3d2eb1e3e4acb223c90374b6f419a3644cedd Mon Sep 17 00:00:00 2001
From: Tim Saucer <timsaucer@gmail.com>
Date: Fri, 20 Mar 2026 18:03:52 -0400
Subject: [PATCH 01/19] Initial commit for blog post on writing table providers

---
 .../2026-03-20-writing-table-providers.md     | 643 ++++++++++++++++++
 1 file changed, 643 insertions(+)
 create mode 100644 content/blog/2026-03-20-writing-table-providers.md
diff --git a/content/blog/2026-03-20-writing-table-providers.md b/content/blog/2026-03-20-writing-table-providers.md
new file mode 100644
index 00000000..f87c2585
--- /dev/null
+++ b/content/blog/2026-03-20-writing-table-providers.md
@@ -0,0 +1,643 @@
+---
+layout: post
+title: "Writing Custom Table Providers in Apache DataFusion"
+date: 2026-03-20
+author: Tim Saucer
+categories: [tutorial]
+---
+<!--
+{% comment %}
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to you under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+{% endcomment %}
+-->
+
+[TOC]
+
+One of DataFusion's greatest strengths is its extensibility. If your data lives
+in a custom format, behind an API, or in a system that DataFusion does not
+natively support, you can teach DataFusion to read it by implementing a
+**custom table provider**. This post walks through the three layers you need to
+understand and explains where your work should actually happen.
+
+## The Three Layers
+
+---
+
+When DataFusion executes a query against a table, three abstractions collaborate
+to produce results:
+
+1. **[`TableProvider`]** -- Describes the table (schema, capabilities) and
+   produces an execution plan when queried.
+2. **[`ExecutionPlan`]** -- Describes *how* to compute the result: partitioning,
+   ordering, and child plan relationships.
+3. **[`SendableRecordBatchStream`]** -- The async stream that *actually does the
+   work*, yielding `RecordBatch`es one at a time.
+
+Think of these as a funnel: `TableProvider::scan()` is called once during
+planning to create an `ExecutionPlan`, then `ExecutionPlan::execute()` is called
+once per partition to create a stream, and those streams are where rows are
+actually produced during execution.
+
+[`TableProvider`]: https://docs.rs/datafusion/latest/datafusion/catalog/trait.TableProvider.html
+[`ExecutionPlan`]: https://docs.rs/datafusion/latest/datafusion/physical_plan/trait.ExecutionPlan.html
+[`SendableRecordBatchStream`]: https://docs.rs/datafusion/latest/datafusion/execution/type.SendableRecordBatchStream.html
+
+## Layer 1: TableProvider
+
+---
+
+A [`TableProvider`] represents a queryable data source. For a minimal read-only
+table, you need four methods:
+
+```rust
+#[async_trait::async_trait]
+impl TableProvider for MyTable {
+    fn as_any(&self) -> &dyn Any { self }
+
+    fn schema(&self) -> SchemaRef {
+        Arc::clone(&self.schema)
+    }
+
+    fn table_type(&self) -> TableType {
+        TableType::Base
+    }
+
+    async fn scan(
+        &self,
+        state: &dyn Session,
+        projection: Option<&Vec<usize>>,
+        filters: &[Expr],
+        limit: Option<usize>,
+    ) -> Result<Arc<dyn ExecutionPlan>> {
+        // Build and return an ExecutionPlan -- keep this lightweight!
+        Ok(Arc::new(MyExecPlan::new(
+            Arc::clone(&self.schema),
+            projection,
+            limit,
+        )))
+    }
+}
+```
+
+The `scan` method is the heart of `TableProvider`. It receives three pushdown
+hints from the optimizer, each reducing the amount of data your source needs
+to produce:
+
+- **`projection`** -- Which columns are needed. This reduces the **width** of
+  the output. If your source supports it, read only these columns rather than
+  the full schema.
+- **`filters`** -- Predicates the engine would like you to apply during the
+  scan. This reduces the **number of rows** by skipping data that does not
+  match. Implement `supports_filters_pushdown` to advertise which filters you
+  can handle.
+- **`limit`** -- A row count cap. This also reduces the **number of rows** --
+  if you can stop reading early once you have produced enough rows, this avoids
+  unnecessary work.
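+
+To make the `projection` and `limit` hints concrete, here is a minimal,
+std-only sketch. It models a table as one `Vec<i64>` per column; `scan_columns`
+is a hypothetical helper for illustration, not part of DataFusion.
+
+```rust
+/// Simplified model of a columnar table: one Vec<i64> per column.
+/// Illustration only; this is not a DataFusion API.
+fn scan_columns(
+    table: &[Vec<i64>],
+    projection: Option<&Vec<usize>>,
+    limit: Option<usize>,
+) -> Vec<Vec<i64>> {
+    // No projection means "all columns, in schema order".
+    let indices: Vec<usize> = match projection {
+        Some(p) => p.clone(),
+        None => (0..table.len()).collect(),
+    };
+    indices
+        .iter()
+        .map(|&i| {
+            let column = &table[i];
+            // No limit means "every row"; otherwise stop reading early.
+            let rows = limit.map_or(column.len(), |n| n.min(column.len()));
+            column[..rows].to_vec()
+        })
+        .collect()
+}
+```
+
+Note that the output columns follow the order of the projection indices, not
+the order of the original schema.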
+
+### Keep `scan()` Lightweight
+
+This is a critical point: **`scan()` runs during planning, not execution.** It
+should return quickly. Avoid performing I/O, network calls, or heavy
+computation here. The `scan` method's job is to *describe* how the data will be
+produced, not to produce it. All the real work belongs in the stream
+(Layer 3).
+
+A common pitfall is to fetch data or open connections in `scan()`. This blocks
+the planning thread and can cause timeouts or deadlocks, especially if the query
+involves multiple tables or subqueries that all need to be planned before
+execution begins.
+
+### Existing Implementations to Learn From
+
+DataFusion ships several `TableProvider` implementations that are excellent
+references:
+
+- **[`MemTable`]** -- Holds data in memory as `Vec<RecordBatch>`. The simplest
+  possible provider; great for tests and small datasets.
+- **[`StreamTable`]** -- Wraps a user-provided stream factory. Useful when your
+  data arrives as a continuous stream (e.g., from Kafka or a socket).
+- **[`SortedTableProvider`]** -- Wraps another `TableProvider` and advertises a
+  known sort order, enabling the optimizer to skip redundant sorts.
+
+[`MemTable`]: https://docs.rs/datafusion/latest/datafusion/datasource/memory/struct.MemTable.html
+[`StreamTable`]: https://docs.rs/datafusion/latest/datafusion/datasource/stream/struct.StreamTable.html
+[`SortedTableProvider`]: https://docs.rs/datafusion/latest/datafusion/datasource/struct.SortedTableProvider.html
+
+## Layer 2: ExecutionPlan
+
+---
+
+An [`ExecutionPlan`] is a node in the physical query plan tree. Your table
+provider's `scan()` method returns one. The required methods are:
+
+```rust
+impl ExecutionPlan for MyExecPlan {
+    fn name(&self) -> &str { "MyExecPlan" }
+
+    fn as_any(&self) -> &dyn Any { self }
+
+    fn properties(&self) -> &PlanProperties {
+        &self.properties
+    }
+
+    fn children(&self) -> Vec<&Arc<dyn ExecutionPlan>> {
+        vec![]  // Leaf node -- no children
+    }
+
+    fn with_new_children(
+        self: Arc<Self>,
+        children: Vec<Arc<dyn ExecutionPlan>>,
+    ) -> Result<Arc<dyn ExecutionPlan>> {
+        assert!(children.is_empty());
+        Ok(self)
+    }
+
+    fn execute(
+        &self,
+        partition: usize,
+        context: Arc<TaskContext>,
+    ) -> Result<SendableRecordBatchStream> {
+        // This is where you build and return your stream
+        // ...
+    }
+}
+```
+
+The key properties to set correctly in [`PlanProperties`] are **output
+partitioning** and **output ordering**.
+
+**Output partitioning** tells the engine how many partitions your data has,
+which determines parallelism. If your source naturally partitions data (e.g.,
+by file or by shard), expose that here.
+
+**Output ordering** declares whether your data is naturally sorted. This
+enables the optimizer to avoid inserting a `SortExec` when a query requires
+ordered data. Getting this right can be a significant performance win.
+
+### Partitioning Strategies
+
+Since `execute()` is called once per partition, partitioning directly controls
+the parallelism of your table scan. Each partition runs on its own task, so
+more partitions means more concurrent work -- up to the number of available
+cores.
+
+Consider how your data source naturally divides its data:
+
+- **By file or object:** If you are reading from S3, each file can be a
+  partition. DataFusion will read them in parallel.
+- **By shard or region:** If your source is a sharded database, each shard
+  maps naturally to a partition.
+- **By key range:** If your data is keyed (e.g., by timestamp or customer ID),
+  you can split it into ranges.
+
+Getting partitioning right matters because it affects everything downstream in
+the plan. When DataFusion needs to perform an aggregation or join, it
+repartitions data by hashing the relevant columns. If your source already
+produces data partitioned by the join or group-by key, DataFusion can skip the
+repartition step entirely -- avoiding a potentially expensive shuffle.
+
+For example, if you are building a table provider for a system that stores
+data partitioned by `customer_id`, and a common query groups by `customer_id`:
+
+```sql
+SELECT customer_id, SUM(amount)
+FROM my_table
+GROUP BY customer_id;
+```
+
+If you declare your output partitioning as `Hash([customer_id], N)`, the
+optimizer recognizes that the data is already distributed correctly for the
+aggregation and eliminates the `RepartitionExec` that would otherwise appear
+in the plan. You can verify this with `EXPLAIN` (more on this below).
+
+Conversely, if you report `UnknownPartitioning`, DataFusion must assume the
+worst case and will always insert repartitioning operators as needed.
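+
+The reason hash partitioning removes the shuffle can be sketched without any
+DataFusion types. In this simplified model, `customer_id % n` stands in for a
+real hash function, and the `Row` type and both function names are
+hypothetical.
+
+```rust
+use std::collections::HashMap;
+
+/// Hypothetical row type for this sketch: (customer_id, amount).
+type Row = (u64, i64);
+
+/// Split rows into `n` partitions by key, so each key lands in exactly one
+/// partition. `customer_id % n` is a stand-in for a real hash function.
+fn partition_by_customer(rows: &[Row], n: usize) -> Vec<Vec<Row>> {
+    let mut partitions = vec![Vec::new(); n];
+    for &(customer_id, amount) in rows {
+        partitions[(customer_id % n as u64) as usize].push((customer_id, amount));
+    }
+    partitions
+}
+
+/// SUM(amount) GROUP BY customer_id, computed inside one partition only.
+fn sum_by_customer(partition: &[Row]) -> HashMap<u64, i64> {
+    let mut totals = HashMap::new();
+    for &(customer_id, amount) in partition {
+        *totals.entry(customer_id).or_insert(0) += amount;
+    }
+    totals
+}
+```
+
+Because no `customer_id` appears in two partitions, each partition's totals
+are already final, and no rows need to move between partitions before
+aggregating.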
+
+### Keep `execute()` Lightweight Too
+
+Like `scan()`, the `execute()` method should construct and return a stream
+without doing heavy work. The actual data production happens when the stream
+is polled. Do not block on async operations here -- build the stream and let
+the runtime drive it.
+
+### Existing Implementations to Learn From
+
+- **[`StreamingTableExec`]** -- Executes a streaming table scan. It takes one
+  `PartitionStream` per output partition and calls it to produce that
+  partition's stream. Good reference for wrapping external streams.
+- **[`DataSourceExec`]** -- The execution plan behind DataFusion's built-in file
+  scanning (Parquet, CSV, JSON). It demonstrates sophisticated partitioning,
+  filter pushdown, and projection pushdown.
+
+[`StreamingTableExec`]: https://docs.rs/datafusion/latest/datafusion/datasource/stream/struct.StreamingTableExec.html
+[`DataSourceExec`]: https://docs.rs/datafusion/latest/datafusion/datasource/struct.DataSourceExec.html
+[`PlanProperties`]: https://docs.rs/datafusion/latest/datafusion/physical_plan/struct.PlanProperties.html
+
+## Layer 3: SendableRecordBatchStream
+
+---
+
+[`SendableRecordBatchStream`] is where the real work happens. It is defined as:
+
+```rust
+type SendableRecordBatchStream =
+    Pin<Box<dyn RecordBatchStream<Item = Result<RecordBatch>> + Send>>;
+```
+
+This is an async stream of `RecordBatch`es that can be sent across threads. When
+the DataFusion runtime polls this stream, your code runs: reading files, calling
+APIs, transforming data, etc.
+
+### Using RecordBatchStreamAdapter
+
+The easiest way to create a `SendableRecordBatchStream` is with
+[`RecordBatchStreamAdapter`]. It bridges any `futures::Stream<Item =
+Result<RecordBatch>>` into the `SendableRecordBatchStream` type:
+
+```rust
+use datafusion::physical_plan::stream::RecordBatchStreamAdapter;
+use futures::StreamExt; // brings `flat_map` into scope
+
+fn execute(
+    &self,
+    partition: usize,
+    context: Arc<TaskContext>,
+) -> Result<SendableRecordBatchStream> {
+    let schema = self.schema();
+    let config = self.config.clone();
+
+    let stream = futures::stream::once(async move {
+        // ALL the heavy work happens here, inside the stream:
+        // - Open connections
+        // - Read data from external sources
+        // - Transform and batch the results
+        // Assumes `fetch_data_from_source` returns `Result<Vec<RecordBatch>>`.
+        let batches = fetch_data_from_source(&config).await?;
+        Ok(batches)
+    })
+    // Flatten the single `Vec<RecordBatch>` into one stream item per batch.
+    .flat_map(|result: Result<Vec<RecordBatch>>| match result {
+        Ok(batches) => {
+            futures::stream::iter(batches.into_iter().map(Ok).collect::<Vec<_>>())
+        }
+        Err(e) => futures::stream::iter(vec![Err(e)]),
+    });
+
+    Ok(Box::pin(RecordBatchStreamAdapter::new(schema, stream)))
+}
+```
+
+[`RecordBatchStreamAdapter`]: https://docs.rs/datafusion/latest/datafusion/physical_plan/stream/struct.RecordBatchStreamAdapter.html
+
+### CPU-Intensive Work: Use a Separate Thread Pool
+
+If your stream performs CPU-intensive work (parsing, decompression, complex
+transformations), avoid blocking the tokio runtime. Instead, offload to a
+dedicated thread pool and send results back through a channel:
+
+```rust
+fn execute(
+    &self,
+    partition: usize,
+    context: Arc<TaskContext>,
+) -> Result<SendableRecordBatchStream> {
+    let schema = self.schema();
+    let config = self.config.clone();
+
+    let (tx, rx) = tokio::sync::mpsc::channel(2);
+
+    // Spawn CPU-heavy work on a blocking thread pool
+    tokio::task::spawn_blocking(move || {
+        let batches = generate_data(&config);
+        for batch in batches {
+            if tx.blocking_send(Ok(batch)).is_err() {
+                break; // Receiver dropped, query was cancelled
+            }
+        }
+    });
+
+    let stream = tokio_stream::wrappers::ReceiverStream::new(rx);
+    Ok(Box::pin(RecordBatchStreamAdapter::new(schema, stream)))
+}
+```
+
+This pattern keeps the async runtime responsive while your data generation
+runs on its own threads.
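+
+The same hand-off can be tried outside of tokio. The sketch below uses only
+the standard library: a plain thread stands in for `spawn_blocking`, and
+`std::sync::mpsc::sync_channel` stands in for the bounded tokio channel.
+`spawn_producer` and its batch contents are hypothetical.
+
+```rust
+use std::sync::mpsc::{sync_channel, Receiver};
+use std::thread;
+
+/// Produces `count` batches on a worker thread and hands them over through a
+/// bounded channel, so the producer can run at most `capacity` batches ahead.
+fn spawn_producer(count: usize, capacity: usize) -> Receiver<Vec<u64>> {
+    let (tx, rx) = sync_channel(capacity);
+    thread::spawn(move || {
+        for i in 0..count {
+            // Stand-in for CPU-heavy work such as parsing or decompression.
+            let batch: Vec<u64> = (0..4).map(|x| x + 4 * i as u64).collect();
+            if tx.send(batch).is_err() {
+                break; // Receiver dropped: the consumer went away, stop early.
+            }
+        }
+    });
+    rx
+}
+```
+
+The small channel capacity provides backpressure: the producer blocks once
+the buffer is full instead of generating the whole dataset ahead of the
+consumer.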
+
+## Where Should the Work Happen?
+
+---
+
+This table summarizes what belongs at each layer:
+
+| Layer | Runs During | Should Do | Should NOT Do |
+|---|---|---|---|
+| `TableProvider::scan()` | Planning | Build an `ExecutionPlan` with metadata | I/O, network calls, heavy computation |
+| `ExecutionPlan::execute()` | Execution (once per partition) | Construct a stream, set up channels | Block on async work, read data |
+| `RecordBatchStream` (polling) | Execution | All I/O, computation, data production | -- |
+
+The guiding principle: **push work as late as possible.** Planning should be
+fast so the optimizer can do its job. Execution setup should be fast so all
+partitions can start promptly. The stream is where you spend time producing
+data.
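+
+A small std-only model shows what "late" means in practice. Here a lazy
+iterator stands in for the stream, and a counter records how many (pretend)
+expensive reads have happened. The `execute` function below is a hypothetical
+stand-in, not the `ExecutionPlan` method.
+
+```rust
+use std::cell::Cell;
+use std::rc::Rc;
+
+/// Stand-in for `execute()`: returns a lazy iterator and does no work itself.
+/// `reads` counts how many pretend expensive reads have happened so far.
+fn execute(rows: usize, reads: Rc<Cell<usize>>) -> impl Iterator<Item = usize> {
+    (0..rows).map(move |row| {
+        // The expensive part runs only when the consumer pulls a row.
+        reads.set(reads.get() + 1);
+        row
+    })
+}
+```
+
+Calling `execute(1000, counter)` leaves the counter at zero. If the consumer
+then takes only three rows, only three reads occur, which is the behavior a
+`LIMIT` query benefits from.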
+
+### Why This Matters
+
+When `scan()` does heavy work, several problems arise:
+
+1. **Planning becomes slow.** If a query touches 10 tables and each `scan()`
+   takes 500ms, planning alone takes 5 seconds before any data flows.
+2. **The optimizer cannot help.** The optimizer runs between planning and
+   execution. If you have already fetched data during planning, optimizations
+   like predicate pushdown or partition pruning cannot reduce the work.
+3. **Resource management breaks down.** DataFusion manages concurrency and
+   memory during execution. Work done during planning bypasses these controls.
+
+## Filter Pushdown: Doing Less Work
+
+---
+
+One of the most impactful optimizations you can add to a custom table provider
+is **filter pushdown** -- letting the source skip data that the query does not
+need, rather than reading everything and filtering it afterward.
+
+### How Filter Pushdown Works
+
+When DataFusion plans a query with a `WHERE` clause, it passes the filter
+predicates to your `scan()` method as the `filters` parameter. By default,
+DataFusion assumes your provider cannot handle any filters and inserts a
+`FilterExec` node above your scan to apply them. But if your source *can*
+evaluate some predicates during scanning -- for example, by skipping files,
+partitions, or row groups that cannot match -- you can eliminate a huge amount
+of unnecessary I/O.
+
+To opt in, implement `supports_filters_pushdown`:
+
+```rust
+fn supports_filters_pushdown(
+    &self,
+    filters: &[&Expr],
+) -> Result<Vec<TableProviderFilterPushDown>> {
+    Ok(filters.iter().map(|f| {
+        match f {
+            // We can fully evaluate equality filters on
+            // the partition column at the source
+            Expr::BinaryExpr(BinaryExpr {
+                left, op: Operator::Eq, right
+            }) if is_partition_column(left) || is_partition_column(right) => {
+                TableProviderFilterPushDown::Exact
+            }
+            // All other filters: let DataFusion handle them
+            _ => TableProviderFilterPushDown::Unsupported,
+        }
+    }).collect())
+}
+```
+
+The three possible responses for each filter are:
+
+- **`Exact`** -- Your source guarantees that no output rows will have a false
+  value for this predicate. Because the filter is fully evaluated at the source,
+  DataFusion will **not** add a `FilterExec` for it.
+- **`Inexact`** -- Your source has the ability to reduce the data produced, but
+  the output may still include rows that do not satisfy the predicate. For
+  example, you might skip entire files based on metadata statistics but not
+  filter individual rows within a file. DataFusion will still add a `FilterExec`
+  above your scan to remove any remaining rows that slipped through.
+- **`Unsupported`** -- Your source ignores this filter entirely. DataFusion
+  handles it.
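+
+The effect of the three answers on the rest of the plan can be modeled in a
+few lines. This is a std-only sketch: `PushDown` mirrors the variants above
+but is not DataFusion's `TableProviderFilterPushDown`, filters are plain
+strings, and `residual_filters` is a hypothetical helper.
+
+```rust
+/// Mirrors the three answers described above; a stand-in type for this sketch.
+#[derive(Clone, Copy, Debug, PartialEq)]
+enum PushDown {
+    Exact,
+    Inexact,
+    Unsupported,
+}
+
+/// Returns the filters the engine must still evaluate above the scan.
+/// Only `Exact` filters are dropped; `Inexact` ones are re-checked.
+fn residual_filters<'a>(filters: &[&'a str], answers: &[PushDown]) -> Vec<&'a str> {
+    filters
+        .iter()
+        .zip(answers)
+        .filter(|(_, answer)| **answer != PushDown::Exact)
+        .map(|(filter, _)| *filter)
+        .collect()
+}
+```
+
+In this model, answering `Inexact` never changes query results. It only gives
+the source a chance to skip data before the filter runs again.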
+
+### Why Filter Pushdown Matters
+
+Consider a table with 1 billion rows partitioned by `region`, and a query:
+
+```sql
+SELECT * FROM events WHERE region = 'us-east-1' AND event_type = 'click';
+```
+
+**Without filter pushdown:** Your table provider reads all 1 billion rows
+across all regions. DataFusion then applies both filters, discarding the vast
+majority of the data.
+
+**With filter pushdown on `region`:** Your `scan()` method sees the
+`region = 'us-east-1'` filter and constructs an execution plan that only reads
+the `us-east-1` partition. If that partition holds 100 million rows, you have
+just eliminated 90% of the I/O. DataFusion still applies the `event_type`
+filter via `FilterExec` if you reported it as `Unsupported`.
+
+### Using EXPLAIN to Debug Your Table Provider
+
+The `EXPLAIN` statement is your best tool for understanding what DataFusion is
+actually doing with your table provider. It shows the physical plan that
+DataFusion will execute, including any operators it inserted:
+
+```sql
+EXPLAIN SELECT * FROM events WHERE region = 'us-east-1' AND event_type = 'click';
+```
+
+If you are using DataFrames, call `.explain(verbose, analyze)`:
+`.explain(false, false)` shows the logical and physical plans, and passing
+`analyze = true` runs the query and reports metrics, like `EXPLAIN ANALYZE`.
+
+**Before filter pushdown**, the plan might look like:
+
+```text
+FilterExec: region@0 = us-east-1 AND event_type@1 = click
+  MyExecPlan: partitions=50
+```
+
+Here DataFusion is reading all 50 partitions and filtering everything
+afterward. The `FilterExec` above your scan is doing all the predicate work.
+
+**After implementing pushdown for `region`** (reported as `Exact`):
+
+```text
+FilterExec: event_type@1 = click
+  MyExecPlan: partitions=5, filter=[region = us-east-1]
+```
+
+Now your exec reads only the 5 partitions for `us-east-1`, and the remaining
+`FilterExec` only handles the `event_type` predicate. The `region` filter has
+been fully absorbed by your scan.
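+
+The pruning step behind that plan can be sketched with the standard library
+alone. The model below assumes each partition carries a region label and a
+row count as metadata; `PartitionMeta`, `prune`, and `fraction_skipped` are
+hypothetical names for illustration.
+
+```rust
+/// Hypothetical partition metadata: a region label and a row count.
+struct PartitionMeta {
+    region: &'static str,
+    rows: u64,
+}
+
+/// Keep only the partitions whose metadata can match `region = wanted`.
+fn prune<'a>(parts: &'a [PartitionMeta], wanted: &str) -> Vec<&'a PartitionMeta> {
+    parts.iter().filter(|p| p.region == wanted).collect()
+}
+
+/// Fraction of rows that never need to be read after pruning.
+fn fraction_skipped(parts: &[PartitionMeta], wanted: &str) -> f64 {
+    let total: u64 = parts.iter().map(|p| p.rows).sum();
+    let kept: u64 = prune(parts, wanted).iter().map(|p| p.rows).sum();
+    1.0 - kept as f64 / total as f64
+}
+```
+
+With 50 partitions of 20 million rows each, spread evenly over 10 regions,
+pruning to one region keeps 5 partitions and skips 90% of the rows, matching
+the numbers used in this section.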
+
+**After implementing pushdown for both filters** (both `Exact`):
+
+```text
+MyExecPlan: partitions=5, filter=[region = us-east-1 AND event_type = click]
+```
+
+No `FilterExec` at all -- your source handles everything.
+
+Similarly, `EXPLAIN` will reveal whether DataFusion is inserting unnecessary
+`SortExec` or `RepartitionExec` nodes that you could eliminate by declaring
+better output properties. Whenever your queries seem slower than expected,
+`EXPLAIN` is the first place to look.
+
+## Putting It All Together
+
+---
+
+Here is a minimal but complete example of a custom table provider that generates
+data lazily during streaming:
+
+```rust
+use std::any::Any;
+use std::sync::Arc;
+
+use arrow::array::Int64Array;
+use arrow::datatypes::{DataType, Field, Schema, SchemaRef};
+use arrow::record_batch::RecordBatch;
+use datafusion::catalog::{Session, TableProvider};
+use datafusion::common::Result;
+use datafusion::datasource::TableType;
+use datafusion::execution::TaskContext;
+use datafusion::execution::SendableRecordBatchStream;
+use datafusion::logical_expr::Expr;
+use datafusion::physical_expr::EquivalenceProperties;
+use datafusion::physical_plan::execution_plan::{Boundedness, EmissionType};
+use datafusion::physical_plan::stream::RecordBatchStreamAdapter;
+use datafusion::physical_plan::{
+    ExecutionPlan, Partitioning, PlanProperties,
+};
+use futures::stream;
+
+/// A table provider that generates sequential numbers on demand.
+#[derive(Debug)]
+struct CountingTable {
+    schema: SchemaRef,
+    num_partitions: usize,
+    rows_per_partition: usize,
+}
+
+impl CountingTable {
+    fn new(num_partitions: usize, rows_per_partition: usize) -> Self {
+        let schema = Arc::new(Schema::new(vec![
+            Field::new("partition", DataType::Int64, false),
+            Field::new("value", DataType::Int64, false),
+        ]));
+        Self { schema, num_partitions, rows_per_partition }
+    }
+}
+
+#[async_trait::async_trait]
+impl TableProvider for CountingTable {
+    fn as_any(&self) -> &dyn Any { self }
+    fn schema(&self) -> SchemaRef { Arc::clone(&self.schema) }
+    fn table_type(&self) -> TableType { TableType::Base }
+
+    async fn scan(
+        &self,
+        _state: &dyn Session,
+        projection: Option<&Vec<usize>>,
+        _filters: &[Expr],
+        limit: Option<usize>,
+    ) -> Result<Arc<dyn ExecutionPlan>> {
+        // Light work only: build the plan with metadata
+        Ok(Arc::new(CountingExec {
+            schema: Arc::clone(&self.schema),
+            num_partitions: self.num_partitions,
+            rows_per_partition: limit
+                .unwrap_or(self.rows_per_partition)
+                .min(self.rows_per_partition),
+            properties: PlanProperties::new(
+                EquivalenceProperties::new(Arc::clone(&self.schema)),
+                Partitioning::UnknownPartitioning(self.num_partitions),
+                EmissionType::Incremental,
+                Boundedness::Bounded,
+            ),
+        }))
+    }
+}
+
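+#[derive(Debug)]
+struct CountingExec {
+    schema: SchemaRef,
+    num_partitions: usize,
+    rows_per_partition: usize,
+    properties: PlanProperties,
+}
+
+// `ExecutionPlan` requires `DisplayAs`; it controls how the node prints in EXPLAIN.
+impl datafusion::physical_plan::DisplayAs for CountingExec {
+    fn fmt_as(
+        &self,
+        _t: datafusion::physical_plan::DisplayFormatType,
+        f: &mut std::fmt::Formatter,
+    ) -> std::fmt::Result {
+        write!(f, "CountingExec: partitions={}", self.num_partitions)
+    }
+}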
+
+impl ExecutionPlan for CountingExec {
+    fn name(&self) -> &str { "CountingExec" }
+    fn as_any(&self) -> &dyn Any { self }
+    fn properties(&self) -> &PlanProperties { &self.properties }
+    fn children(&self) -> Vec<&Arc<dyn ExecutionPlan>> { vec![] }
+
+    fn with_new_children(
+        self: Arc<Self>,
+        _children: Vec<Arc<dyn ExecutionPlan>>,
+    ) -> Result<Arc<dyn ExecutionPlan>> {
+        Ok(self)
+    }
+
+    fn execute(
+        &self,
+        partition: usize,
+        _context: Arc<TaskContext>,
+    ) -> Result<SendableRecordBatchStream> {
+        let schema = Arc::clone(&self.schema);
+        let rows = self.rows_per_partition;
+
+        // The heavy work (data generation) happens inside the stream,
+        // not here in execute().
+        let batch_stream = stream::once(async move {
+            let partitions = Int64Array::from(
+                vec![partition as i64; rows],
+            );
+            let values = Int64Array::from(
+                (0..rows as i64).collect::<Vec<_>>(),
+            );
+            let batch = RecordBatch::try_new(
+                Arc::clone(&schema),
+                vec![Arc::new(partitions), Arc::new(values)],
+            )?;
+            Ok(batch)
+        });
+
+        Ok(Box::pin(RecordBatchStreamAdapter::new(
+            Arc::clone(&self.schema),
+            batch_stream,
+        )))
+    }
+}
+```
+
+## Choosing the Right Starting Point
+
+---
+
+Not every custom data source requires implementing all three layers from
+scratch. DataFusion provides building blocks that let you plug in at whatever
+level makes sense:
+
+| If your data is... | Start with | You implement |
+|---|---|---|
+| Already in `RecordBatch`es in memory | [`MemTable`] | Nothing -- just construct it |
+| An async stream of batches | [`StreamTable`] | A stream factory |
+| A table with known sort order | [`SortedTableProvider`] wrapping another provider | The inner provider |
+| A custom source needing full control | `TableProvider` + `ExecutionPlan` + stream | All three layers |
+
+For most integrations, [`StreamTable`] combined with
+[`RecordBatchStreamAdapter`] provides a good balance of simplicity and
+flexibility. You provide a closure that returns a stream, and DataFusion handles
+the rest.
+
+## Further Reading
+
+---
+
+- [TableProvider API docs][`TableProvider`]
+- [ExecutionPlan API docs][`ExecutionPlan`]
+- [SendableRecordBatchStream API docs][`SendableRecordBatchStream`]
+- [GitHub issue discussing table provider examples](https://github.com/apache/datafusion/issues/16821)
+- [DataFusion examples directory](https://github.com/apache/datafusion/tree/main/datafusion-examples/examples) --
+  contains working examples including custom table providers
+
+---
+
+*Note: Portions of this blog post were written with the assistance of an AI agent.*

From dea6e3faefc0d206338df0914e9c91dca0ab10f8 Mon Sep 17 00:00:00 2001
From: Tim Saucer <timsaucer@gmail.com>
Date: Fri, 20 Mar 2026 18:11:19 -0400
Subject: [PATCH 02/19] Minor text changes

---
 content/blog/2026-03-20-writing-table-providers.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/content/blog/2026-03-20-writing-table-providers.md b/content/blog/2026-03-20-writing-table-providers.md
index f87c2585..8321c75f 100644
--- a/content/blog/2026-03-20-writing-table-providers.md
+++ b/content/blog/2026-03-20-writing-table-providers.md
@@ -1,8 +1,8 @@
 ---
 layout: post
-title: "Writing Custom Table Providers in Apache DataFusion"
+title: Writing Custom Table Providers in Apache DataFusion
 date: 2026-03-20
-author: Tim Saucer
+author: timsaucer
 categories: [tutorial]
 ---
 <!--

From 2de8b7b8a0aa081d8cf295a2483eac5a48b105ee Mon Sep 17 00:00:00 2001
From: Tim Saucer <timsaucer@gmail.com>
Date: Fri, 20 Mar 2026 18:20:35 -0400
Subject: [PATCH 03/19] Add acknowledgement

---
 content/blog/2026-03-20-writing-table-providers.md | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/content/blog/2026-03-20-writing-table-providers.md b/content/blog/2026-03-20-writing-table-providers.md
index 8321c75f..26c3238f 100644
--- a/content/blog/2026-03-20-writing-table-providers.md
+++ b/content/blog/2026-03-20-writing-table-providers.md
@@ -2,7 +2,7 @@
 layout: post
 title: Writing Custom Table Providers in Apache DataFusion
 date: 2026-03-20
-author: timsaucer
+author: Tim Saucer (rerun.io)
 categories: [tutorial]
 ---
 <!--
@@ -627,6 +627,14 @@ For most integrations, [`StreamTable`] combined with
 flexibility. You provide a closure that returns a stream, and DataFusion handles
 the rest.
 
+## Acknowledgements
+
+I would like to thank [Rerun.io] for sponsoring the development of this work. [Rerun.io]
+is building a data visualization system for Physical AI and makes heavy use of DataFusion
+table providers for data analytics.
+
+[Rerun.io]: https://rerun.io
+
 ## Further Reading
 
 ---

From 0a66a206e67f8568db8dcaebb7789751faa1d487 Mon Sep 17 00:00:00 2001
From: Tim Saucer <timsaucer@gmail.com>
Date: Mon, 23 Mar 2026 12:06:17 -0400
Subject: [PATCH 04/19] Add note about when to add push down filters

Co-authored-by: Yongting You <2010youy01@gmail.com>
---
 content/blog/2026-03-20-writing-table-providers.md | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/content/blog/2026-03-20-writing-table-providers.md b/content/blog/2026-03-20-writing-table-providers.md
index 26c3238f..15f1199a 100644
--- a/content/blog/2026-03-20-writing-table-providers.md
+++ b/content/blog/2026-03-20-writing-table-providers.md
@@ -432,6 +432,11 @@ the `us-east-1` partition. If that partition holds 100 million rows, you have
 just eliminated 90% of the I/O. DataFusion still applies the `event_type`
 filter via `FilterExec` if you reported it as `Unsupported`.
 
+### Only Push Down Filters When the Data Source Can Do Better
+
+DataFusion already pushes filters as close to the data source as possible, typically placing them directly above the scan. `FilterExec` is also highly optimized, with vectorized evaluation and type-specialized kernels for fast predicate evaluation.
+
+Because of this, you should only implement filter pushdown when your data source can do strictly better, for example by using metadata to skip data early and avoid I/O. If your data source cannot eliminate I/O in this way, it is usually better to let DataFusion handle the filter, as its in-memory execution is already highly efficient (unless there are additional opportunities for deeper, application-specific optimizations).
 ### Using EXPLAIN to Debug Your Table Provider
 
 The `EXPLAIN` statement is your best tool for understanding what DataFusion is

From df813ec0308ebdeab95a25890ad9c3352f0a59aa Mon Sep 17 00:00:00 2001
From: Tim Saucer <timsaucer@gmail.com>
Date: Mon, 23 Mar 2026 12:36:11 -0400
Subject: [PATCH 05/19] Address a variety of user feedback

---
 .../2026-03-20-writing-table-providers.md     | 273 +++++++++++++++++-
 1 file changed, 266 insertions(+), 7 deletions(-)

diff --git a/content/blog/2026-03-20-writing-table-providers.md b/content/blog/2026-03-20-writing-table-providers.md
index 15f1199a..d8624786 100644
--- a/content/blog/2026-03-20-writing-table-providers.md
+++ b/content/blog/2026-03-20-writing-table-providers.md
@@ -55,6 +55,68 @@ actually produced during execution.
 [`ExecutionPlan`]: https://docs.rs/datafusion/latest/datafusion/physical_plan/trait.ExecutionPlan.html
 [`SendableRecordBatchStream`]: https://docs.rs/datafusion/latest/datafusion/execution/type.SendableRecordBatchStream.html
 
+## Background: Logical and Physical Planning
+
+---
+
+Before diving into the three layers, it helps to understand how DataFusion
+processes a query. There are four phases between a SQL string (or DataFrame
+call) and streaming results:
+
+```text
+SQL / DataFrame API
+  → Logical Plan          (abstract: what to compute)
+  → Logical Optimization  (rewrite rules that preserve semantics)
+  → Physical Plan         (concrete: how to compute it)
+  → Physical Optimization (hardware- and data-aware rewrites)
+  → Execution             (streaming RecordBatches)
+```
+
+### Logical Planning
+
+A **logical plan** describes *what* the query computes without specifying *how*.
+It is a tree of relational operators -- `TableScan`, `Filter`, `Projection`,
+`Aggregate`, `Join`, `Sort`, `Limit`, and so on. The logical optimizer rewrites
+this tree to reduce work while preserving the query's meaning. Key logical
+optimizations include:
+
+- **Predicate pushdown** -- moves filters as close to the data source as
+  possible, so fewer rows flow through the rest of the plan.
+- **Projection pruning** -- eliminates columns that are never referenced
+  downstream, reducing memory and I/O.
+- **Expression simplification** -- rewrites expressions like `1 = 1` or
+  `x AND true` into simpler forms.
+- **Subquery decorrelation** -- converts correlated `IN` / `EXISTS` subqueries
+  into more efficient semi-joins.
+- **Limit pushdown** -- pushes `LIMIT` earlier in the plan so operators
+  produce less data.
+
+### Physical Planning
+
+The **physical planner** converts the optimized logical plan into an
+`ExecutionPlan` tree -- the concrete plan that will actually run. This is where
+decisions like "use a hash join vs. a sort-merge join" or "how many partitions
+to scan" are made. The physical optimizer then refines this tree further:
+
+- **Distribution enforcement** -- inserts `RepartitionExec` nodes so that data
+  is partitioned correctly for joins and aggregations.
+- **Sort enforcement** -- inserts `SortExec` nodes where ordering is required,
+  and removes them where the data is already sorted.
+- **Join selection** -- picks the most efficient join strategy based on
+  statistics and table sizes.
+- **Aggregate optimization** -- combines partial and final aggregation stages,
+  and can use exact statistics to skip scanning entirely.
+
+### Why This Matters for Table Providers
+
+Your `TableProvider` sits at the boundary between logical and physical planning.
+During logical optimization, DataFusion determines which filters and projections
+*could* be pushed down to the source. When `scan()` is called during physical
+planning, the results arrive as its `projection`, `filters`, and `limit`
+arguments. By implementing capabilities like `supports_filters_pushdown`, you
+influence what the optimizer can do -- and the metadata you declare in your
+`ExecutionPlan` (partitioning, ordering) directly affects which physical
+optimizations apply.
+
 ## Layer 1: TableProvider
 
 ---
@@ -128,12 +190,16 @@ references:
   possible provider; great for tests and small datasets.
 - **[`StreamTable`]** -- Wraps a user-provided stream factory. Useful when your
   data arrives as a continuous stream (e.g., from Kafka or a socket).
-- **[`SortedTableProvider`]** -- Wraps another `TableProvider` and advertises a
-  known sort order, enabling the optimizer to skip redundant sorts.
+- **[`ListingTable`]** -- The file-based data source behind DataFusion's
+  built-in Parquet, CSV, and JSON support. Demonstrates sophisticated filter
+  and projection pushdown, file pruning, and schema inference.
+- **[`ViewTable`]** -- Wraps a logical plan, representing a SQL view. Useful
+  if your provider is best expressed as a transformation of other tables.
 
 [`MemTable`]: https://docs.rs/datafusion/latest/datafusion/datasource/memory/struct.MemTable.html
 [`StreamTable`]: https://docs.rs/datafusion/latest/datafusion/datasource/stream/struct.StreamTable.html
-[`SortedTableProvider`]: https://docs.rs/datafusion/latest/datafusion/datasource/struct.SortedTableProvider.html
+[`ListingTable`]: https://docs.rs/datafusion/latest/datafusion/datasource/listing/struct.ListingTable.html
+[`ViewTable`]: https://docs.rs/datafusion/latest/datafusion/datasource/view/struct.ViewTable.html
 
 ## Layer 2: ExecutionPlan
 
@@ -189,9 +255,39 @@ ordered data. Getting this right can be a significant performance win.
 ### Partitioning Strategies
 
 Since `execute()` is called once per partition, partitioning directly controls
-the parallelism of your table scan. Each partition runs on its own task, so
-more partitions means more concurrent work -- up to the number of available
-cores.
+the parallelism of your table scan. Each partition produces an independent
+stream that DataFusion schedules as a **task** on the tokio runtime. It is
+important to distinguish tasks from threads: tasks are lightweight units of
+async work that are multiplexed onto a thread pool. You can have many more
+tasks (partitions) than physical threads -- the runtime will interleave them
+efficiently as they await I/O or yield.
+
+That said, having *too many* partitions is not free. Each partition adds
+scheduling overhead, and downstream operators like aggregations and joins may
+need to repartition the data to match their requirements. You can use the
+session configuration to find the **target partition count**, which reflects
+how many partitions DataFusion's optimizer expects to work with:
+
+```rust
+async fn scan(
+    &self,
+    state: &dyn Session,
+    projection: Option<&Vec<usize>>,
+    filters: &[Expr],
+    limit: Option<usize>,
+) -> Result<Arc<dyn ExecutionPlan>> {
+    let target_partitions = state.config().target_partitions();
+    // Use target_partitions to decide how many partitions to expose.
+    // If your source naturally has more partitions, consider coalescing
+    // them down to target_partitions to avoid unnecessary repartitioning.
+    // ...
+}
+```
+
+If your source produces data in exactly `target_partitions` partitions, the
+optimizer is less likely to insert a `RepartitionExec` above your scan --
+avoiding an expensive data shuffle. `target_partitions` defaults to the number
+of CPU cores; for a small dataset, setting it to 1 avoids repartitioning
+overhead entirely.
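+
+The coalescing idea can be sketched without any DataFusion types. The following
+is a minimal illustration, assuming the source's natural units (files, shards)
+can be listed up front. The name `coalesce_into_partitions` is hypothetical and
+not part of DataFusion; inside `scan()`, the `target_partitions` argument would
+come from `state.config().target_partitions()`.
+
+```rust
+/// Distribute `items` (e.g. file paths) round-robin into at most
+/// `target_partitions` groups. Each group would become one scan partition.
+fn coalesce_into_partitions<T>(
+    items: Vec<T>,
+    target_partitions: usize,
+) -> Vec<Vec<T>> {
+    // Never create more groups than there are items, and treat a target
+    // of 0 as 1 so that a non-empty input always gets at least one group.
+    let n = target_partitions.max(1).min(items.len());
+    let mut groups: Vec<Vec<T>> = (0..n).map(|_| Vec::new()).collect();
+    for (i, item) in items.into_iter().enumerate() {
+        groups[i % n].push(item);
+    }
+    groups
+}
+```
+
+With 10 files and a target of 4, this yields groups of 3, 3, 2, and 2 files.
+Round-robin is the simplest assignment; if file sizes vary widely, balancing
+groups by byte size may serve better.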
 
 Consider how your data source naturally divides its data:
 
@@ -485,6 +581,166 @@ Similarly, `EXPLAIN` will reveal whether DataFusion is inserting unnecessary
 better output properties. Whenever your queries seem slower than expected,
 `EXPLAIN` is the first place to look.
 
+### A Complete Filter Pushdown Example
+
+To make filter pushdown concrete, here is a worked example; the `ExecutionPlan`
+(`DatePartitionedExec`) and the expression-matching helpers are left as stubs.
+Imagine a table provider that reads from a set of date-partitioned directories
+on disk
+(e.g., `data/2026-03-01/`, `data/2026-03-02/`, ...). Each directory contains
+one or more Parquet files for that date. By pushing down a filter on the `date`
+column, the provider can skip entire directories -- avoiding the I/O of listing
+and reading files that cannot possibly match the query.
+
+```rust
+use std::any::Any;
+use std::collections::HashMap;
+use std::sync::Arc;
+
+use arrow::datatypes::{DataType, Field, Schema, SchemaRef};
+use arrow::array::{Date32Array, Float64Array, StringArray};
+use arrow::record_batch::RecordBatch;
+use datafusion::catalog::{Session, TableProvider};
+use datafusion::common::Result;
+use datafusion::datasource::TableType;
+use datafusion::execution::SendableRecordBatchStream;
+use datafusion::logical_expr::Expr;
+use datafusion::logical_expr::TableProviderFilterPushDown;
+use datafusion::physical_expr::EquivalenceProperties;
+use datafusion::physical_plan::execution_plan::{Boundedness, EmissionType};
+use datafusion::physical_plan::stream::RecordBatchStreamAdapter;
+use datafusion::physical_plan::{
+    ExecutionPlan, Partitioning, PlanProperties,
+};
+use futures::stream;
+
+/// A table provider backed by date-partitioned directories.
+/// Each date directory contains data files; by filtering on the
+/// `date` column we can skip entire directories of I/O.
+struct DatePartitionedTable {
+    schema: SchemaRef,
+    /// Maps date strings ("2026-03-01") to directory paths
+    partitions: HashMap<String, String>,
+}
+
+#[async_trait::async_trait]
+impl TableProvider for DatePartitionedTable {
+    fn as_any(&self) -> &dyn Any { self }
+    fn schema(&self) -> SchemaRef { Arc::clone(&self.schema) }
+    fn table_type(&self) -> TableType { TableType::Base }
+
+    fn supports_filters_pushdown(
+        &self,
+        filters: &[&Expr],
+    ) -> Result<Vec<TableProviderFilterPushDown>> {
+        Ok(filters.iter().map(|f| {
+            if Self::is_date_equality_filter(f) {
+                // We can fully evaluate this: we will only read
+                // directories matching the date, so no rows with
+                // a different date will appear in the output.
+                TableProviderFilterPushDown::Exact
+            } else {
+                TableProviderFilterPushDown::Unsupported
+            }
+        }).collect())
+    }
+
+    async fn scan(
+        &self,
+        _state: &dyn Session,
+        projection: Option<&Vec<usize>>,
+        filters: &[Expr],
+        limit: Option<usize>,
+    ) -> Result<Arc<dyn ExecutionPlan>> {
+        // Determine which date partitions to read by inspecting
+        // the pushed-down filters. This is the key optimization:
+        // we decide *during planning* which directories to scan,
+        // so that execution never touches irrelevant data.
+        let dates_to_read: Vec<String> = self
+            .extract_date_values(filters)
+            .unwrap_or_else(||
+                self.partitions.keys().cloned().collect()
+            );
+
+        let dirs: Vec<String> = dates_to_read
+            .iter()
+            .filter_map(|d| self.partitions.get(d).cloned())
+            .collect();
+        // Capture the count before `dirs` is moved into the plan below.
+        let num_partitions = dirs.len();
+
+        // NOTE: `projection` and `limit` are ignored here for brevity; a
+        // real provider must apply the projection to the output schema.
+        Ok(Arc::new(DatePartitionedExec {
+            schema: Arc::clone(&self.schema),
+            directories: dirs,
+            properties: PlanProperties::new(
+                EquivalenceProperties::new(
+                    Arc::clone(&self.schema),
+                ),
+                // One partition per date directory -- these
+                // will be read in parallel.
+                Partitioning::UnknownPartitioning(num_partitions),
+                EmissionType::Incremental,
+                Boundedness::Bounded,
+            ),
+        }))
+    }
+}
+
+impl DatePartitionedTable {
+    /// Check if a filter is an equality comparison on the `date` column.
+    fn is_date_equality_filter(expr: &Expr) -> bool {
+        // In practice, match on BinaryExpr { left, op: Eq, right }
+        // and check if either side references the "date" column.
+        // Simplified here for clarity.
+        todo!("match on date equality expressions")
+    }
+
+    /// Extract date literal values from pushed-down equality filters.
+    fn extract_date_values(&self, filters: &[Expr]) -> Option<Vec<String>> {
+        // Parse filters like `date = '2026-03-01'` and return
+        // the literal date strings. Returns None if no date
+        // filters are present (meaning: read all partitions).
+        todo!("extract date literals from filter expressions")
+    }
+}
+```
+
+The key insight is that the filter pushdown decision (`supports_filters_pushdown`)
+and the partition pruning (`scan()`) work together: the first tells DataFusion
+that a `FilterExec` is unnecessary for the `date` predicate, and the second
+ensures that only the relevant directories are scanned. The actual file reading
+happens later, in the stream produced by `execute()`.
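+
+The two `todo!()` helpers can be illustrated without DataFusion's real `Expr`
+type. The sketch below is an assumption-laden model, not the provider's actual
+code: `MiniExpr`, `date_literal`, and `prune_partitions` are hypothetical names,
+and real code would instead match on `Expr::BinaryExpr` with `Operator::Eq`.
+It shows the pruning logic only: find date equality predicates, then keep the
+directories that satisfy all of them.
+
+```rust
+use std::collections::HashMap;
+
+/// Hypothetical, heavily simplified stand-in for DataFusion's `Expr`.
+#[derive(Debug, Clone, PartialEq)]
+enum MiniExpr {
+    Column(String),
+    Literal(String),
+    Eq(Box<MiniExpr>, Box<MiniExpr>),
+}
+
+/// Return the literal from `date = '<lit>'` or `'<lit>' = date`, if any.
+fn date_literal(expr: &MiniExpr) -> Option<&str> {
+    let MiniExpr::Eq(left, right) = expr else { return None };
+    match (left.as_ref(), right.as_ref()) {
+        (MiniExpr::Column(c), MiniExpr::Literal(v))
+        | (MiniExpr::Literal(v), MiniExpr::Column(c))
+            if c == "date" => Some(v.as_str()),
+        _ => None,
+    }
+}
+
+/// Directories to scan. With no date predicate, every directory is kept.
+fn prune_partitions(
+    partitions: &HashMap<String, String>,
+    filters: &[MiniExpr],
+) -> Vec<String> {
+    let dates: Vec<&str> = filters.iter().filter_map(date_literal).collect();
+    let mut dirs: Vec<String> = partitions
+        .iter()
+        // Filters are ANDed, so a directory must match every date predicate.
+        .filter(|(date, _)| dates.iter().all(|d| *d == date.as_str()))
+        .map(|(_, dir)| dir.clone())
+        .collect();
+    dirs.sort(); // HashMap iteration order is unspecified
+    dirs
+}
+```
+
+Note one design choice in this sketch: two contradictory predicates such as
+`date = '2026-03-01' AND date = '2026-03-02'` prune every directory, which
+matches the semantics of an `Exact` pushdown.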
+
+## `scan` vs `scan_with_args`
+
+---
+
+The examples above all use the `scan()` method, which receives projection,
+filters, and limit as separate parameters. DataFusion also provides
+[`scan_with_args()`], which bundles these into a structured [`ScanArgs`]
+parameter:
+
+```rust
+async fn scan_with_args<'a>(
+    &self,
+    state: &dyn Session,
+    args: ScanArgs<'a>,
+) -> Result<ScanResult> {
+    let projection = args.projection();
+    let filters = args.filters();
+    let limit = args.limit();
+    // ...
+}
+```
+
+`ScanArgs` is designed to be extensible -- new scan parameters can be added
+without breaking existing implementations. It also carries additional context
+not available in `scan()`, such as a `preferred_ordering` hint that lets the
+optimizer request a specific output order from your provider.
+
+If you are building a new table provider, consider implementing
+`scan_with_args()` as well as `scan()`. The default implementation of
+`scan_with_args()` delegates to `scan()`, so existing providers that only
+implement `scan()` continue to work without changes. Because `scan()` is still
+a required trait method, a provider that puts its logic in `scan_with_args()`
+can make `scan()` a thin wrapper that builds a `ScanArgs` and forwards to it.
+
+[`scan_with_args()`]: https://docs.rs/datafusion/latest/datafusion/catalog/trait.TableProvider.html#method.scan_with_args
+[`ScanArgs`]: https://docs.rs/datafusion/latest/datafusion/catalog/struct.ScanArgs.html
+
 ## Putting It All Together
 
 ---
@@ -624,9 +880,12 @@ level makes sense:
 |---|---|---|
 | Already in `RecordBatch`es in memory | [`MemTable`] | Nothing -- just construct it |
 | An async stream of batches | [`StreamTable`] | A stream factory |
-| A table with known sort order | [`SortedTableProvider`] wrapping another provider | The inner provider |
+| A logical transformation of other tables | [`ViewTable`] wrapping a logical plan | The logical plan |
+| Files on disk or object storage | [`ListingTable`] with a custom [`FileFormat`] | The file format |
 | A custom source needing full control | `TableProvider` + `ExecutionPlan` + stream | All three layers |
 
+[`FileFormat`]: https://docs.rs/datafusion/latest/datafusion/datasource/file_format/trait.FileFormat.html
+
 For most integrations, [`StreamTable`] combined with
 [`RecordBatchStreamAdapter`] provides a good balance of simplicity and
 flexibility. You provide a closure that returns a stream, and DataFusion handles

From d72f9c61ae21c84e81def72ceb14089279ad67e2 Mon Sep 17 00:00:00 2001
From: Tim Saucer <timsaucer@gmail.com>
Date: Mon, 23 Mar 2026 12:43:40 -0400
Subject: [PATCH 06/19] Update links

---
 content/blog/2026-03-20-writing-table-providers.md | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/content/blog/2026-03-20-writing-table-providers.md b/content/blog/2026-03-20-writing-table-providers.md
index d8624786..5adb2a45 100644
--- a/content/blog/2026-03-20-writing-table-providers.md
+++ b/content/blog/2026-03-20-writing-table-providers.md
@@ -903,10 +903,9 @@ table providers for working with data analytics.
 
 ---
 
-- [TableProvider API docs][`TableProvider`]
-- [ExecutionPlan API docs][`ExecutionPlan`]
-- [SendableRecordBatchStream API docs][`SendableRecordBatchStream`]
-- [GitHub issue discussing table provider examples](https://github.com/apache/datafusion/issues/16821)
+- [`TableProvider` API docs](https://docs.rs/datafusion/latest/datafusion/catalog/trait.TableProvider.html)
+- [`ExecutionPlan` API docs](https://docs.rs/datafusion/latest/datafusion/physical_plan/trait.ExecutionPlan.html)
+- [`SendableRecordBatchStream` API docs](https://docs.rs/datafusion/latest/datafusion/execution/type.SendableRecordBatchStream.html)
 - [DataFusion examples directory](https://github.com/apache/datafusion/tree/main/datafusion-examples/examples) --
   contains working examples including custom table providers
 

From e03d448d7bd58c7b285f9be9f933064cf0d27672 Mon Sep 17 00:00:00 2001
From: Tim Saucer <timsaucer@gmail.com>
Date: Mon, 23 Mar 2026 12:50:05 -0400
Subject: [PATCH 07/19] pelican processing didn't handle backticks in links
 well

---
 .../2026-03-20-writing-table-providers.md     | 72 +++++++++----------
 1 file changed, 35 insertions(+), 37 deletions(-)

diff --git a/content/blog/2026-03-20-writing-table-providers.md b/content/blog/2026-03-20-writing-table-providers.md
index 5adb2a45..a25e1d42 100644
--- a/content/blog/2026-03-20-writing-table-providers.md
+++ b/content/blog/2026-03-20-writing-table-providers.md
@@ -39,11 +39,11 @@ understand and explains where your work should actually happen.
 When DataFusion executes a query against a table, three abstractions collaborate
 to produce results:
 
-1. **[`TableProvider`]** -- Describes the table (schema, capabilities) and
+1. **[TableProvider]** -- Describes the table (schema, capabilities) and
    produces an execution plan when queried.
-2. **[`ExecutionPlan`]** -- Describes *how* to compute the result: partitioning,
+2. **[ExecutionPlan]** -- Describes *how* to compute the result: partitioning,
    ordering, and child plan relationships.
-3. **[`SendableRecordBatchStream`]** -- The async stream that *actually does the
+3. **[SendableRecordBatchStream]** -- The async stream that *actually does the
    work*, yielding `RecordBatch`es one at a time.
 
 Think of these as a funnel: `TableProvider::scan()` is called once during
@@ -51,9 +51,16 @@ planning to create an `ExecutionPlan`, then `ExecutionPlan::execute()` is called
 once per partition to create a stream, and those streams are where rows are
 actually produced during execution.
 
-[`TableProvider`]: https://docs.rs/datafusion/latest/datafusion/catalog/trait.TableProvider.html
-[`ExecutionPlan`]: https://docs.rs/datafusion/latest/datafusion/physical_plan/trait.ExecutionPlan.html
-[`SendableRecordBatchStream`]: https://docs.rs/datafusion/latest/datafusion/execution/type.SendableRecordBatchStream.html
+[TableProvider]: https://docs.rs/datafusion/latest/datafusion/catalog/trait.TableProvider.html
+[ExecutionPlan]: https://docs.rs/datafusion/latest/datafusion/physical_plan/trait.ExecutionPlan.html
+[SendableRecordBatchStream]: https://docs.rs/datafusion/latest/datafusion/execution/type.SendableRecordBatchStream.html
+[MemTable]: https://docs.rs/datafusion/latest/datafusion/datasource/memory/struct.MemTable.html
+[StreamTable]: https://docs.rs/datafusion/latest/datafusion/datasource/stream/struct.StreamTable.html
+[ListingTable]: https://docs.rs/datafusion/latest/datafusion/datasource/listing/struct.ListingTable.html
+[ViewTable]: https://docs.rs/datafusion/latest/datafusion/datasource/view/struct.ViewTable.html
+[PlanProperties]: https://docs.rs/datafusion/latest/datafusion/physical_plan/struct.PlanProperties.html
+[StreamingTableExec]: https://docs.rs/datafusion/latest/datafusion/datasource/stream/struct.StreamingTableExec.html
+[DataSourceExec]: https://docs.rs/datafusion/latest/datafusion/datasource/struct.DataSourceExec.html
 
 ## Background: Logical and Physical Planning
 
@@ -121,7 +128,7 @@ affects which physical optimizations apply.
 
 ---
 
-A [`TableProvider`] represents a queryable data source. For a minimal read-only
+A [TableProvider] represents a queryable data source. For a minimal read-only
 table, you need four methods:
 
 ```rust
@@ -186,26 +193,21 @@ execution begins.
 DataFusion ships several `TableProvider` implementations that are excellent
 references:
 
-- **[`MemTable`]** -- Holds data in memory as `Vec<RecordBatch>`. The simplest
+- **[MemTable]** -- Holds data in memory as `Vec<RecordBatch>`. The simplest
   possible provider; great for tests and small datasets.
-- **[`StreamTable`]** -- Wraps a user-provided stream factory. Useful when your
+- **[StreamTable]** -- Wraps a user-provided stream factory. Useful when your
   data arrives as a continuous stream (e.g., from Kafka or a socket).
-- **[`ListingTable`]** -- The file-based data source behind DataFusion's
+- **[ListingTable]** -- The file-based data source behind DataFusion's
   built-in Parquet, CSV, and JSON support. Demonstrates sophisticated filter
   and projection pushdown, file pruning, and schema inference.
-- **[`ViewTable`]** -- Wraps a logical plan, representing a SQL view. Useful
+- **[ViewTable]** -- Wraps a logical plan, representing a SQL view. Useful
   if your provider is best expressed as a transformation of other tables.
 
-[`MemTable`]: https://docs.rs/datafusion/latest/datafusion/datasource/memory/struct.MemTable.html
-[`StreamTable`]: https://docs.rs/datafusion/latest/datafusion/datasource/stream/struct.StreamTable.html
-[`ListingTable`]: https://docs.rs/datafusion/latest/datafusion/datasource/listing/struct.ListingTable.html
-[`ViewTable`]: https://docs.rs/datafusion/latest/datafusion/datasource/view/struct.ViewTable.html
-
 ## Layer 2: ExecutionPlan
 
 ---
 
-An [`ExecutionPlan`] is a node in the physical query plan tree. Your table
+An [ExecutionPlan] is a node in the physical query plan tree. Your table
 provider's `scan()` method returns one. The required methods are:
 
 ```rust
@@ -241,7 +243,7 @@ impl ExecutionPlan for MyExecPlan {
 }
 ```
 
-The key properties to set correctly in [`PlanProperties`] are **output
+The key properties to set correctly in [PlanProperties] are **output
 partitioning** and **output ordering**.
 
 **Output partitioning** tells the engine how many partitions your data has,
@@ -330,22 +332,18 @@ the runtime drive it.
 
 ### Existing Implementations to Learn From
 
-- **[`StreamingTableExec`]** -- Executes a streaming table scan. It takes a
+- **[StreamingTableExec]** -- Executes a streaming table scan. It takes a
   stream factory (a closure that produces streams) and handles partitioning.
   Good reference for wrapping external streams.
-- **[`DataSourceExec`]** -- The execution plan behind DataFusion's built-in file
+- **[DataSourceExec]** -- The execution plan behind DataFusion's built-in file
   scanning (Parquet, CSV, JSON). It demonstrates sophisticated partitioning,
   filter pushdown, and projection pushdown.
 
-[`StreamingTableExec`]: https://docs.rs/datafusion/latest/datafusion/datasource/stream/struct.StreamingTableExec.html
-[`DataSourceExec`]: https://docs.rs/datafusion/latest/datafusion/datasource/struct.DataSourceExec.html
-[`PlanProperties`]: https://docs.rs/datafusion/latest/datafusion/physical_plan/struct.PlanProperties.html
-
 ## Layer 3: SendableRecordBatchStream
 
 ---
 
-[`SendableRecordBatchStream`] is where the real work happens. It is defined as:
+[SendableRecordBatchStream] is where the real work happens. It is defined as:
 
 ```rust
 type SendableRecordBatchStream =
@@ -359,7 +357,7 @@ APIs, transforming data, etc.
 ### Using RecordBatchStreamAdapter
 
 The easiest way to create a `SendableRecordBatchStream` is with
-[`RecordBatchStreamAdapter`]. It bridges any `futures::Stream<Item =
+[RecordBatchStreamAdapter]. It bridges any `futures::Stream<Item =
 Result<RecordBatch>>` into the `SendableRecordBatchStream` type:
 
 ```rust
@@ -390,7 +388,7 @@ fn execute(
 }
 ```
 
-[`RecordBatchStreamAdapter`]: https://docs.rs/datafusion/latest/datafusion/physical_plan/stream/struct.RecordBatchStreamAdapter.html
+[RecordBatchStreamAdapter]: https://docs.rs/datafusion/latest/datafusion/physical_plan/stream/struct.RecordBatchStreamAdapter.html
 
 ### CPU-Intensive Work: Use a Separate Thread Pool
 
@@ -713,7 +711,7 @@ happens later, in the stream produced by `execute()`.
 
 The examples above all use the `scan()` method, which receives projection,
 filters, and limit as separate parameters. DataFusion also provides
-[`scan_with_args()`], which bundles these into a structured [`ScanArgs`]
+[scan_with_args()], which bundles these into a structured [ScanArgs]
 parameter:
 
 ```rust
@@ -738,8 +736,8 @@ If you are building a new table provider, consider implementing
 delegates to `scan_with_args()`, so you only need to implement one. Existing
 providers that already implement `scan()` will continue to work without changes.
 
-[`scan_with_args()`]: https://docs.rs/datafusion/latest/datafusion/catalog/trait.TableProvider.html#method.scan_with_args
-[`ScanArgs`]: https://docs.rs/datafusion/latest/datafusion/catalog/struct.ScanArgs.html
+[scan_with_args()]: https://docs.rs/datafusion/latest/datafusion/catalog/trait.TableProvider.html#method.scan_with_args
+[ScanArgs]: https://docs.rs/datafusion/latest/datafusion/catalog/struct.ScanArgs.html
 
 ## Putting It All Together
 
@@ -878,16 +876,16 @@ level makes sense:
 
 | If your data is... | Start with | You implement |
 |---|---|---|
-| Already in `RecordBatch`es in memory | [`MemTable`] | Nothing -- just construct it |
-| An async stream of batches | [`StreamTable`] | A stream factory |
-| A logical transformation of other tables | [`ViewTable`] wrapping a logical plan | The logical plan |
-| Files on disk or object storage | [`ListingTable`] with a custom [`FileFormat`] | The file format |
+| Already in `RecordBatch`es in memory | [MemTable] | Nothing -- just construct it |
+| An async stream of batches | [StreamTable] | A stream factory |
+| A logical transformation of other tables | [ViewTable] wrapping a logical plan | The logical plan |
+| Files on disk or object storage | [ListingTable] with a custom [FileFormat] | The file format |
 | A custom source needing full control | `TableProvider` + `ExecutionPlan` + stream | All three layers |
 
-[`FileFormat`]: https://docs.rs/datafusion/latest/datafusion/datasource/file_format/trait.FileFormat.html
+[FileFormat]: https://docs.rs/datafusion/latest/datafusion/datasource/file_format/trait.FileFormat.html
 
-For most integrations, [`StreamTable`] combined with
-[`RecordBatchStreamAdapter`] provides a good balance of simplicity and
+For most integrations, [StreamTable] combined with
+[RecordBatchStreamAdapter] provides a good balance of simplicity and
 flexibility. You provide a closure that returns a stream, and DataFusion handles
 the rest.
 

From dfc052034b28644cf680b07dbbf57ffe859cd6c7 Mon Sep 17 00:00:00 2001
From: Tim Saucer <timsaucer@gmail.com>
Date: Tue, 31 Mar 2026 14:48:56 -0400
Subject: [PATCH 08/19] Add an explanation of different ways to use FileFormat
 for a ListingTable

---
 .../2026-03-20-writing-table-providers.md     | 47 ++++++++++++++++++-
 1 file changed, 45 insertions(+), 2 deletions(-)

diff --git a/content/blog/2026-03-20-writing-table-providers.md b/content/blog/2026-03-20-writing-table-providers.md
index a25e1d42..30d47a7c 100644
--- a/content/blog/2026-03-20-writing-table-providers.md
+++ b/content/blog/2026-03-20-writing-table-providers.md
@@ -879,12 +879,55 @@ level makes sense:
 | Already in `RecordBatch`es in memory | [MemTable] | Nothing -- just construct it |
 | An async stream of batches | [StreamTable] | A stream factory |
 | A logical transformation of other tables | [ViewTable] wrapping a logical plan | The logical plan |
-| Files on disk or object storage | [ListingTable] with a custom [FileFormat] | The file format |
+| A variant of an existing file format | [ListingTable] with a custom [FileFormat] wrapping an existing one | A thin `FileFormat` wrapper |
+| Files in a custom format on disk or object storage | [ListingTable] with a custom [FileFormat], [FileSource], and [FileOpener] | The format, source, and opener |
 | A custom source needing full control | `TableProvider` + `ExecutionPlan` + stream | All three layers |
 
 [FileFormat]: https://docs.rs/datafusion/latest/datafusion/datasource/file_format/trait.FileFormat.html
+[FileSource]: https://docs.rs/datafusion-datasource/latest/datafusion_datasource/file/trait.FileSource.html
+[FileOpener]: https://docs.rs/datafusion-datasource/latest/datafusion_datasource/file_stream/trait.FileOpener.html
 
-For most integrations, [StreamTable] combined with
+### The File-Based Path: FileFormat, FileSource, and FileOpener
+
+If your data lives in files (local disk or object storage like S3), you do not
+need to build a `TableProvider` and `ExecutionPlan` from scratch. Instead, you
+can plug into [ListingTable] by implementing a stack of three traits:
+
+1. **[FileFormat]** -- The planning-level abstraction. Handles schema inference
+   (`infer_schema`), statistics (`infer_stats`), and produces a `FileSource` via
+   its `file_source()` method. If your format is a variant of an existing one,
+   you can wrap an existing `FileFormat` and delegate most methods.
+2. **[FileSource]** -- The execution-level configuration. Holds format-specific
+   settings and creates a `FileOpener` in `create_file_opener()`. You can also
+   override provided methods for optimization hooks like filter pushdown,
+   projection pushdown, and repartitioning.
+3. **[FileOpener]** -- The I/O layer. Has a single method,
+   `open(PartitionedFile)`, that reads a file (or byte range within a file)
+   and returns an async stream of `RecordBatch`es.
+
+The relationship flows downward:
+
+```text
+FileFormat  (planning: schema inference, statistics)
+  └── file_source() → FileSource  (execution: config + optimization hooks)
+        └── create_file_opener() → FileOpener  (I/O: reads files → RecordBatches)
+```
+
+`ListingTable` handles everything else: file discovery, partition column
+inference, and wiring the result into a [DataSourceExec] execution plan. You
+get file pruning, projection pushdown, and parallelism across files for free.
+
+If your format is a variant of an existing one, the [custom_file_format example]
+shows how to wrap `CsvFormat` to create a TSV format with minimal code -- you
+only need to implement `FileFormat`. For a fully custom format, a good approach
+is to study the built-in implementations like [ParquetSource] and [ParquetOpener]
+to understand the full `FileSource` → `FileOpener` contract.
+
+[custom_file_format example]: https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/custom_data_source/custom_file_format.rs
+[ParquetSource]: https://docs.rs/datafusion/latest/datafusion/datasource/file_format/parquet/struct.ParquetSource.html
+[ParquetOpener]: https://docs.rs/datafusion/latest/datafusion/datasource/file_format/parquet/struct.ParquetOpener.html
+
+For most non-file integrations, [StreamTable] combined with
 [RecordBatchStreamAdapter] provides a good balance of simplicity and
 flexibility. You provide a closure that returns a stream, and DataFusion handles
 the rest.

From f3aa83e79411fc2fc4a041755d04b6a855771bd2 Mon Sep 17 00:00:00 2001
From: Tim Saucer <timsaucer@gmail.com>
Date: Tue, 31 Mar 2026 15:08:45 -0400
Subject: [PATCH 09/19] Address alamb review feedback

- Clarify intro sentence to mention planning/execution work
- Label TableProvider as Logical Plan and ExecutionPlan as Physical Plan
- Change "four phases" to "several phases" (list has 5 items)
- "Some logical optimizations" and "rewrites such as" to signal non-exhaustive lists
- Clarify scan() comment: "don't do any execution work here"
- Rewrite partitioning section to lead with simple advice (match data layout)
  before covering target_partitions and hash partitioning subtleties
- Narrow CPU thread pool advice: spawn_blocking is for blocking/long-running
  work, not all CPU work
- Add "scan is single-threaded" as a reason to keep scan() lightweight

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---
 .../2026-03-20-writing-table-providers.md     | 96 ++++++++++---------
 1 file changed, 52 insertions(+), 44 deletions(-)

diff --git a/content/blog/2026-03-20-writing-table-providers.md b/content/blog/2026-03-20-writing-table-providers.md
index 30d47a7c..1547a0c9 100644
--- a/content/blog/2026-03-20-writing-table-providers.md
+++ b/content/blog/2026-03-20-writing-table-providers.md
@@ -30,7 +30,7 @@ One of DataFusion's greatest strengths is its extensibility. If your data lives
 in a custom format, behind an API, or in a system that DataFusion does not
 natively support, you can teach DataFusion to read it by implementing a
 **custom table provider**. This post walks through the three layers you need to
-understand and explains where your work should actually happen.
+understand to design a table provider, and explains where planning and execution work should happen.
 
 ## The Three Layers
 
@@ -40,9 +40,9 @@ When DataFusion executes a query against a table, three abstractions collaborate
 to produce results:
 
 1. **[TableProvider]** -- Describes the table (schema, capabilities) and
-   produces an execution plan when queried.
+   produces an execution plan when queried. This is part of the **Logical Plan**.
 2. **[ExecutionPlan]** -- Describes *how* to compute the result: partitioning,
-   ordering, and child plan relationships.
+   ordering, and child plan relationships. This is part of the **Physical Plan**.
 3. **[SendableRecordBatchStream]** -- The async stream that *actually does the
    work*, yielding `RecordBatch`es one at a time.
 
@@ -67,7 +67,7 @@ actually produced during execution.
 ---
 
 Before diving into the three layers, it helps to understand how DataFusion
-processes a query. There are four phases between a SQL string (or DataFrame
+processes a query. There are several phases between a SQL string (or DataFrame
 call) and streaming results:
 
 ```text
@@ -84,7 +84,7 @@ SQL / DataFrame API
 A **logical plan** describes *what* the query computes without specifying *how*.
 It is a tree of relational operators -- `TableScan`, `Filter`, `Projection`,
 `Aggregate`, `Join`, `Sort`, `Limit`, and so on. The logical optimizer rewrites
-this tree to reduce work while preserving the query's meaning. Key logical
+this tree to reduce work while preserving the query's meaning. Some logical
 optimizations include:
 
 - **Predicate pushdown** -- moves filters as close to the data source as
@@ -103,7 +103,7 @@ optimizations include:
 The **physical planner** converts the optimized logical plan into an
 `ExecutionPlan` tree -- the concrete plan that will actually run. This is where
 decisions like "use a hash join vs. a sort-merge join" or "how many partitions
-to scan" are made. The physical optimizer then refines this tree further:
+to scan" are made. The physical optimizer then refines this tree further with rewrites such as:
 
 - **Distribution enforcement** -- inserts `RepartitionExec` nodes so that data
   is partitioned correctly for joins and aggregations.
@@ -150,7 +150,7 @@ impl TableProvider for MyTable {
         filters: &[Expr],
         limit: Option<usize>,
     ) -> Result<Arc<dyn ExecutionPlan>> {
-        // Build and return an ExecutionPlan -- keep this lightweight!
+        // Build and return an ExecutionPlan. Keep this lightweight: do no execution work here.
         Ok(Arc::new(MyExecPlan::new(
             Arc::clone(&self.schema),
             projection,
@@ -264,11 +264,29 @@ async work that are multiplexed onto a thread pool. You can have many more
 tasks (partitions) than physical threads -- the runtime will interleave them
 efficiently as they await I/O or yield.
 
-That said, having *too many* partitions is not free. Each partition adds
-scheduling overhead, and downstream operators like aggregations and joins may
-need to repartition the data to match their requirements. You can use the
-session configuration to find the **target partition count**, which reflects
-how many partitions DataFusion's optimizer expects to work with:
+**Start simple: match your data's natural layout.** If you have 4 files, expose
+4 partitions. If your source has 8 shards, expose 8 partitions. DataFusion will
+insert a `RepartitionExec` above your scan when downstream operators need a
+different distribution. You can also implement the
+[`repartitioned`](https://docs.rs/datafusion/latest/datafusion/physical_plan/trait.ExecutionPlan.html#method.repartitioned)
+method on your `ExecutionPlan` to let DataFusion request a different partition
+count directly from your source, avoiding the extra operator entirely.
+
+Consider how your data source naturally divides its data:
+
+- **By file or object:** If you are reading from S3, each file can be a
+  partition. DataFusion will read them in parallel.
+- **By shard or region:** If your source is a sharded database, each shard
+  maps naturally to a partition.
+- **By key range:** If your data is keyed (e.g., by timestamp or customer ID),
+  you can split it into ranges.
+
+**Advanced: aligning with `target_partitions`.** Once you have something
+working, you can tune further. Having *too many* partitions is not free: each
+partition adds scheduling overhead, and downstream operators may need to
+repartition the data anyway. The session configuration exposes a
+**target partition count** that reflects how many partitions the optimizer
+expects to work with:
 
 ```rust
 async fn scan(
@@ -279,35 +297,19 @@ async fn scan(
     limit: Option<usize>,
 ) -> Result<Arc<dyn ExecutionPlan>> {
     let target_partitions = state.config().target_partitions();
-    // Use target_partitions to decide how many partitions to expose.
-    // If your source naturally has more partitions, consider coalescing
-    // them down to target_partitions to avoid unnecessary repartitioning.
+    // Optionally coalesce or split partitions to match target_partitions.
     // ...
 }
 ```
 
 If your source produces data in exactly `target_partitions` partitions, the
-optimizer is less likely to insert a `RepartitionExec` above your scan --
-avoiding an expensive data shuffle. For small datasets, `target_partitions` may
-be set to 1, which avoids any repartitioning overhead entirely.
-
-Consider how your data source naturally divides its data:
-
-- **By file or object:** If you are reading from S3, each file can be a
-  partition. DataFusion will read them in parallel.
-- **By shard or region:** If your source is a sharded database, each shard
-  maps naturally to a partition.
-- **By key range:** If your data is keyed (e.g., by timestamp or customer ID),
-  you can split it into ranges.
-
-Getting partitioning right matters because it affects everything downstream in
-the plan. When DataFusion needs to perform an aggregation or join, it
-repartitions data by hashing the relevant columns. If your source already
-produces data partitioned by the join or group-by key, DataFusion can skip the
-repartition step entirely -- avoiding a potentially expensive shuffle.
+optimizer is less likely to insert a `RepartitionExec` above your scan.
+For small datasets, `target_partitions` may be set to 1, which avoids any
+repartitioning overhead entirely.
 
-For example, if you are building a table provider for a system that stores
-data partitioned by `customer_id`, and a common query groups by `customer_id`:
+**Advanced: declaring hash partitioning.** If your source stores data
+pre-partitioned by a specific key (e.g., `customer_id`), you can declare this
+in your output partitioning. For a query like:
 
 ```sql
 SELECT customer_id, SUM(amount)
@@ -390,11 +392,14 @@ fn execute(
 
 [RecordBatchStreamAdapter]: https://docs.rs/datafusion/latest/datafusion/physical_plan/stream/struct.RecordBatchStreamAdapter.html
 
-### CPU-Intensive Work: Use a Separate Thread Pool
+### Blocking Work: Use a Separate Thread Pool
 
-If your stream performs CPU-intensive work (parsing, decompression, complex
-transformations), avoid blocking the tokio runtime. Instead, offload to a
-dedicated thread pool and send results back through a channel:
+If your stream performs **blocking** work -- such as blocking I/O, or CPU work
+that runs for hundreds of milliseconds without yielding -- you must avoid
+blocking the tokio async runtime. Short CPU work (e.g., parsing a batch in a
+few milliseconds) is fine to do inline as long as your code yields back to the
+runtime frequently. But for long-running synchronous work that cannot yield,
+offload to a dedicated thread pool and send results back through a channel:
 
 ```rust
 fn execute(
@@ -407,7 +412,7 @@ fn execute(
 
     let (tx, rx) = tokio::sync::mpsc::channel(2);
 
-    // Spawn CPU-heavy work on a blocking thread pool
+    // Spawn blocking work on a dedicated thread pool
     tokio::task::spawn_blocking(move || {
         let batches = generate_data(&config);
         for batch in batches {
@@ -422,8 +427,8 @@ fn execute(
 }
 ```
 
-This pattern keeps the async runtime responsive while your data generation
-runs on its own threads.
+This pattern keeps the async runtime responsive while long-running synchronous
+work runs on its own threads.
 
 ## Where Should the Work Happen?
 
@@ -448,10 +453,13 @@ When `scan()` does heavy work, several problems arise:
 
 1. **Planning becomes slow.** If a query touches 10 tables and each `scan()`
    takes 500ms, planning alone takes 5 seconds before any data flows.
-2. **The optimizer cannot help.** The optimizer runs between planning and
+2. **The work cannot be parallelized.** `scan()` runs on a single thread during
+   planning, so any work done there cannot benefit from the parallel execution
+   that DataFusion provides across partitions.
+3. **The optimizer cannot help.** The optimizer runs between planning and
    execution. If you have already fetched data during planning, optimizations
    like predicate pushdown or partition pruning cannot reduce the work.
-3. **Resource management breaks down.** DataFusion manages concurrency and
+4. **Resource management breaks down.** DataFusion manages concurrency and
    memory during execution. Work done during planning bypasses these controls.
 
 ## Filter Pushdown: Doing Less Work

From 46edd5adfaa7912990398201d811ecefdc31b22d Mon Sep 17 00:00:00 2001
From: Tim Saucer <timsaucer@gmail.com>
Date: Tue, 31 Mar 2026 15:17:29 -0400
Subject: [PATCH 10/19] update date

---
 ...table-providers.md => 2026-03-31-writing-table-providers.md} | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
 rename content/blog/{2026-03-20-writing-table-providers.md => 2026-03-31-writing-table-providers.md} (99%)

diff --git a/content/blog/2026-03-20-writing-table-providers.md b/content/blog/2026-03-31-writing-table-providers.md
similarity index 99%
rename from content/blog/2026-03-20-writing-table-providers.md
rename to content/blog/2026-03-31-writing-table-providers.md
index 1547a0c9..52dcc5fc 100644
--- a/content/blog/2026-03-20-writing-table-providers.md
+++ b/content/blog/2026-03-31-writing-table-providers.md
@@ -1,7 +1,7 @@
 ---
 layout: post
 title: Writing Custom Table Providers in Apache DataFusion
-date: 2026-03-20
+date: 2026-03-31
 author: Tim Saucer (rerun.io)
 categories: [tutorial]
 ---

From e0766277263f3b721873ae85544d0f7912c67a35 Mon Sep 17 00:00:00 2001
From: Tim Saucer <timsaucer@gmail.com>
Date: Tue, 31 Mar 2026 15:25:12 -0400
Subject: [PATCH 11/19] Add link to thread_pools example for blocking work
 section

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---
 content/blog/2026-03-31-writing-table-providers.md | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/content/blog/2026-03-31-writing-table-providers.md b/content/blog/2026-03-31-writing-table-providers.md
index 52dcc5fc..5787ce39 100644
--- a/content/blog/2026-03-31-writing-table-providers.md
+++ b/content/blog/2026-03-31-writing-table-providers.md
@@ -428,7 +428,10 @@ fn execute(
 ```
 
 This pattern keeps the async runtime responsive while long-running synchronous
-work runs on its own threads.
+work runs on its own threads. For a working example that shows how to configure
+separate thread pools for I/O and CPU work, see the
+[thread_pools example](https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/query_planning/thread_pools.rs)
+in the DataFusion repository.
 
 ## Where Should the Work Happen?
 

From 981d98d77e9baa9bce5e1e2d7d8545997377d220 Mon Sep 17 00:00:00 2001
From: Tim Saucer <timsaucer@gmail.com>
Date: Tue, 31 Mar 2026 15:31:03 -0400
Subject: [PATCH 12/19] remove use statements from example

---
 .../2026-03-31-writing-table-providers.md     | 21 -------------------
 1 file changed, 21 deletions(-)

diff --git a/content/blog/2026-03-31-writing-table-providers.md b/content/blog/2026-03-31-writing-table-providers.md
index 5787ce39..37a6a9af 100644
--- a/content/blog/2026-03-31-writing-table-providers.md
+++ b/content/blog/2026-03-31-writing-table-providers.md
@@ -600,27 +600,6 @@ column, the provider can skip entire directories -- avoiding the I/O of listing
 and reading files that cannot possibly match the query.
 
 ```rust
-use std::any::Any;
-use std::collections::HashMap;
-use std::sync::Arc;
-
-use arrow::datatypes::{DataType, Field, Schema, SchemaRef};
-use arrow::array::{Date32Array, Float64Array, StringArray};
-use arrow::record_batch::RecordBatch;
-use datafusion::catalog::TableProvider;
-use datafusion::common::Result;
-use datafusion::datasource::TableType;
-use datafusion::execution::SendableRecordBatchStream;
-use datafusion::logical_expr::Expr;
-use datafusion::logical_expr::TableProviderFilterPushDown;
-use datafusion::physical_expr::EquivalenceProperties;
-use datafusion::physical_plan::execution_plan::{Boundedness, EmissionType};
-use datafusion::physical_plan::stream::RecordBatchStreamAdapter;
-use datafusion::physical_plan::{
-    ExecutionPlan, Partitioning, PlanProperties,
-};
-use futures::stream;
-
 /// A table provider backed by date-partitioned directories.
 /// Each date directory contains data files; by filtering on the
 /// `date` column we can skip entire directories of I/O.

From 5c61689962ba51e946573fd36f8a7319941637dd Mon Sep 17 00:00:00 2001
From: Tim Saucer <timsaucer@gmail.com>
Date: Tue, 31 Mar 2026 15:35:46 -0400
Subject: [PATCH 13/19] revert section on scan_with_args and drop to single
 line

---
 .../2026-03-31-writing-table-providers.md     | 37 ++-----------------
 1 file changed, 3 insertions(+), 34 deletions(-)

diff --git a/content/blog/2026-03-31-writing-table-providers.md b/content/blog/2026-03-31-writing-table-providers.md
index 37a6a9af..1bf46e73 100644
--- a/content/blog/2026-03-31-writing-table-providers.md
+++ b/content/blog/2026-03-31-writing-table-providers.md
@@ -175,6 +175,9 @@ to produce:
   if you can stop reading early once you have produced enough rows, this avoids
   unnecessary work.
 
+You can also use the  [scan_with_args()](https://docs.rs/datafusion/latest/datafusion/catalog/trait.TableProvider.html#method.scan_with_args)
+variant that provides additional pushdown information for other advanced use cases.
+
 ### Keep `scan()` Lightweight
 
 This is a critical point: **`scan()` runs during planning, not execution.** It
@@ -695,40 +698,6 @@ that a `FilterExec` is unnecessary for the `date` predicate, and the second
 ensures that only the relevant directories are scanned. The actual file reading
 happens later, in the stream produced by `execute()`.
 
-## `scan` vs `scan_with_args`
-
----
-
-The examples above all use the `scan()` method, which receives projection,
-filters, and limit as separate parameters. DataFusion also provides
-[scan_with_args()], which bundles these into a structured [ScanArgs]
-parameter:
-
-```rust
-async fn scan_with_args(
-    &self,
-    args: ScanArgs<'_>,
-) -> Result<ScanResult> {
-    let projection = args.projection();
-    let filters = args.filters();
-    let limit = args.limit();
-    // ...
-}
-```
-
-`ScanArgs` is designed to be extensible -- new scan parameters can be added
-without breaking existing implementations. It also carries additional context
-not available in `scan()`, such as a `preferred_ordering` hint that lets the
-optimizer request a specific output order from your provider.
-
-If you are building a new table provider, consider implementing
-`scan_with_args()` instead of `scan()`. The default implementation of `scan()`
-delegates to `scan_with_args()`, so you only need to implement one. Existing
-providers that already implement `scan()` will continue to work without changes.
-
-[scan_with_args()]: https://docs.rs/datafusion/latest/datafusion/catalog/trait.TableProvider.html#method.scan_with_args
-[ScanArgs]: https://docs.rs/datafusion/latest/datafusion/catalog/struct.ScanArgs.html
-
 ## Putting It All Together
 
 ---

From 0e8ef09b1e1fc94ba90dbb0048fb15f4e9e65407 Mon Sep 17 00:00:00 2001
From: Tim Saucer <timsaucer@gmail.com>
Date: Tue, 31 Mar 2026 15:39:16 -0400
Subject: [PATCH 14/19] Move 'Choosing the Right Starting Point' before Layer 1

Addresses alamb's suggestion to move the section earlier so readers
understand what level of work is required before diving in.

- Moved section to just before Layer 1: TableProvider
- Trimmed the file-based path detail to a short paragraph with links
  (the full trait hierarchy was too deep for an intro-position section)
- Removed RecordBatchStreamAdapter reference (not yet introduced at
  that point in the article)
- Added a sentence orienting the reader to what the rest of the post covers

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---
 .../2026-03-31-writing-table-providers.md     | 101 ++++++------------
 1 file changed, 35 insertions(+), 66 deletions(-)

diff --git a/content/blog/2026-03-31-writing-table-providers.md b/content/blog/2026-03-31-writing-table-providers.md
index 1bf46e73..ce2436e7 100644
--- a/content/blog/2026-03-31-writing-table-providers.md
+++ b/content/blog/2026-03-31-writing-table-providers.md
@@ -124,6 +124,41 @@ planning, those hints are passed to you. By implementing capabilities like
 metadata you declare in your `ExecutionPlan` (partitioning, ordering) directly
 affects which physical optimizations apply.
 
+## Choosing the Right Starting Point
+
+---
+
+Not every custom data source requires implementing all three layers from
+scratch. DataFusion provides building blocks that let you plug in at whatever
+level makes sense:
+
+| If your data is... | Start with | You implement |
+|---|---|---|
+| Already in `RecordBatch`es in memory | [MemTable] | Nothing -- just construct it |
+| An async stream of batches | [StreamTable] | A stream factory |
+| A logical transformation of other tables | [ViewTable] wrapping a logical plan | The logical plan |
+| A variant of an existing file format | [ListingTable] with a custom [FileFormat] wrapping an existing one | A thin `FileFormat` wrapper |
+| Files in a custom format on disk or object storage | [ListingTable] with a custom [FileFormat], [FileSource], and [FileOpener] | The format, source, and opener |
+| A custom source needing full control | `TableProvider` + `ExecutionPlan` + stream | All three layers |
+
+[FileFormat]: https://docs.rs/datafusion/latest/datafusion/datasource/file_format/trait.FileFormat.html
+[FileSource]: https://docs.rs/datafusion-datasource/latest/datafusion_datasource/file/trait.FileSource.html
+[FileOpener]: https://docs.rs/datafusion-datasource/latest/datafusion_datasource/file_stream/trait.FileOpener.html
+
+If your data is file-based, `ListingTable` handles file discovery, partition
+column inference, and plan construction -- you only need to implement
+`FileFormat`, `FileSource`, and `FileOpener` to describe how to read your
+files. See the [custom_file_format example] for a minimal wrapping approach,
+or [ParquetSource] and [ParquetOpener] for a full custom implementation to
+use as a reference.
+
+[custom_file_format example]: https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/custom_data_source/custom_file_format.rs
+[ParquetSource]: https://docs.rs/datafusion/latest/datafusion/datasource/file_format/parquet/struct.ParquetSource.html
+[ParquetOpener]: https://docs.rs/datafusion/latest/datafusion/datasource/file_format/parquet/struct.ParquetOpener.html
+
+The rest of this post focuses on the full `TableProvider` + `ExecutionPlan` +
+stream path, which gives you complete control and applies to any data source.
+
 ## Layer 1: TableProvider
 
 ---
@@ -825,72 +860,6 @@ impl ExecutionPlan for CountingExec {
 }
 ```
 
-## Choosing the Right Starting Point
-
----
-
-Not every custom data source requires implementing all three layers from
-scratch. DataFusion provides building blocks that let you plug in at whatever
-level makes sense:
-
-| If your data is... | Start with | You implement |
-|---|---|---|
-| Already in `RecordBatch`es in memory | [MemTable] | Nothing -- just construct it |
-| An async stream of batches | [StreamTable] | A stream factory |
-| A logical transformation of other tables | [ViewTable] wrapping a logical plan | The logical plan |
-| A variant of an existing file format | [ListingTable] with a custom [FileFormat] wrapping an existing one | A thin `FileFormat` wrapper |
-| Files in a custom format on disk or object storage | [ListingTable] with a custom [FileFormat], [FileSource], and [FileOpener] | The format, source, and opener |
-| A custom source needing full control | `TableProvider` + `ExecutionPlan` + stream | All three layers |
-
-[FileFormat]: https://docs.rs/datafusion/latest/datafusion/datasource/file_format/trait.FileFormat.html
-[FileSource]: https://docs.rs/datafusion-datasource/latest/datafusion_datasource/file/trait.FileSource.html
-[FileOpener]: https://docs.rs/datafusion-datasource/latest/datafusion_datasource/file_stream/trait.FileOpener.html
-
-### The File-Based Path: FileFormat, FileSource, and FileOpener
-
-If your data lives in files (local disk or object storage like S3), you do not
-need to build a `TableProvider` and `ExecutionPlan` from scratch. Instead, you
-can plug into [ListingTable] by implementing a stack of three traits:
-
-1. **[FileFormat]** -- The planning-level abstraction. Handles schema inference
-   (`infer_schema`), statistics (`infer_stats`), and produces a `FileSource` via
-   its `file_source()` method. If your format is a variant of an existing one,
-   you can wrap an existing `FileFormat` and delegate most methods.
-2. **[FileSource]** -- The execution-level configuration. Holds format-specific
-   settings and creates a `FileOpener` in `create_file_opener()`. You can also
-   override provided methods for optimization hooks like filter pushdown,
-   projection pushdown, and repartitioning.
-3. **[FileOpener]** -- The I/O layer. Has a single method,
-   `open(PartitionedFile)`, that reads a file (or byte range within a file)
-   and returns an async stream of `RecordBatch`es.
-
-The relationship flows downward:
-
-```text
-FileFormat  (planning: schema inference, statistics)
-  └── file_source() → FileSource  (execution: config + optimization hooks)
-        └── create_file_opener() → FileOpener  (I/O: reads files → RecordBatches)
-```
-
-`ListingTable` handles everything else: file discovery, partition column
-inference, and wiring the result into a [DataSourceExec] execution plan. You
-get file pruning, projection pushdown, and parallelism across files for free.
-
-If your format is a variant of an existing one, the [custom_file_format example]
-shows how to wrap `CsvFormat` to create a TSV format with minimal code -- you
-only need to implement `FileFormat`. For a fully custom format, a good approach
-is to study the built-in implementations like [ParquetSource] and [ParquetOpener]
-to understand the full `FileSource` → `FileOpener` contract.
-
-[custom_file_format example]: https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/custom_data_source/custom_file_format.rs
-[ParquetSource]: https://docs.rs/datafusion/latest/datafusion/datasource/file_format/parquet/struct.ParquetSource.html
-[ParquetOpener]: https://docs.rs/datafusion/latest/datafusion/datasource/file_format/parquet/struct.ParquetOpener.html
-
-For most non-file integrations, [StreamTable] combined with
-[RecordBatchStreamAdapter] provides a good balance of simplicity and
-flexibility. You provide a closure that returns a stream, and DataFusion handles
-the rest.
-
 ## Acknowledgements
 
 I would like to thank [Rerun.io] for sponsoring the development of this work. [Rerun.io]

From a7d1060d7f0b40a98130319f6bdaf5cfbd00425d Mon Sep 17 00:00:00 2001
From: Tim Saucer <timsaucer@gmail.com>
Date: Tue, 31 Mar 2026 15:53:40 -0400
Subject: [PATCH 15/19] Fix pre-publish review issues
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Fix use-after-move bug in DatePartitionedExec construction (dirs.len()
  called after dirs moved into struct field)
- Fix incorrect import: SessionState → catalog::Session in CountingTable
  example
- Remove double space before scan_with_args link
- Add missing blank line before '### Using EXPLAIN' heading
- Split dense 'Only Push Down Filters' paragraph for readability
- Change 'full working example' to 'illustrative example' for the
  filter pushdown code that contains todo!() stubs
- Use 'Rerun is building' instead of repeating [Rerun.io] link

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---
 .../2026-03-31-writing-table-providers.md     | 20 ++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/content/blog/2026-03-31-writing-table-providers.md b/content/blog/2026-03-31-writing-table-providers.md
index ce2436e7..c780144d 100644
--- a/content/blog/2026-03-31-writing-table-providers.md
+++ b/content/blog/2026-03-31-writing-table-providers.md
@@ -210,7 +210,7 @@ to produce:
   if you can stop reading early once you have produced enough rows, this avoids
   unnecessary work.
 
-You can also use the  [scan_with_args()](https://docs.rs/datafusion/latest/datafusion/catalog/trait.TableProvider.html#method.scan_with_args)
+You can also use the [scan_with_args()](https://docs.rs/datafusion/latest/datafusion/catalog/trait.TableProvider.html#method.scan_with_args)
 variant that provides additional pushdown information for other advanced use cases.
 
 ### Keep `scan()` Lightweight
@@ -579,7 +579,12 @@ filter via `FilterExec` if you reported it as `Unsupported`.
 
 DataFusion already pushes filters as close to the data source as possible, typically placing them directly above the scan. `FilterExec` is also highly optimized, with vectorized evaluation and type-specialized kernels for fast predicate evaluation.
 
-Because of this, you should only implement filter pushdown when your data source can do strictly better. For example, avoid I/O by skipping data early using metadata. If your data source cannot eliminate I/O in this way, it is usually better to let DataFusion handle the filter, as its in-memory execution is already highly efficient (unless there are additional opportunities for deeper, application-specific optimizations).
+Because of this, you should only implement filter pushdown when your data source
+can do strictly better -- for example, by avoiding I/O entirely through
+skipping files or partitions based on metadata. If your data source cannot
+eliminate I/O in this way, it is usually better to let DataFusion handle the
+filter, as its in-memory execution is already highly efficient.
+
 ### Using EXPLAIN to Debug Your Table Provider
 
 The `EXPLAIN` statement is your best tool for understanding what DataFusion is
@@ -630,7 +635,7 @@ better output properties. Whenever your queries seem slower than expected,
 
 ### A Complete Filter Pushdown Example
 
-To make filter pushdown concrete, here is a full working example. Imagine a
+To make filter pushdown concrete, here is an illustrative example. Imagine a
 table provider that reads from a set of date-partitioned directories on disk
 (e.g., `data/2026-03-01/`, `data/2026-03-02/`, ...). Each directory contains
 one or more Parquet files for that date. By pushing down a filter on the `date`
@@ -690,6 +695,7 @@ impl TableProvider for DatePartitionedTable {
             .iter()
             .filter_map(|d| self.partitions.get(d).cloned())
             .collect();
+        let num_dirs = dirs.len();
 
         Ok(Arc::new(DatePartitionedExec {
             schema: Arc::clone(&self.schema),
@@ -700,7 +706,7 @@ impl TableProvider for DatePartitionedTable {
                 ),
                 // One partition per date directory -- these
                 // will be read in parallel.
-                Partitioning::UnknownPartitioning(dirs.len()),
+                Partitioning::UnknownPartitioning(num_dirs),
                 EmissionType::Incremental,
                 Boundedness::Bounded,
             ),
@@ -750,7 +756,7 @@ use arrow::record_batch::RecordBatch;
 use datafusion::catalog::TableProvider;
 use datafusion::common::Result;
 use datafusion::datasource::TableType;
-use datafusion::execution::context::SessionState;
+use datafusion::catalog::Session;
 use datafusion::execution::SendableRecordBatchStream;
 use datafusion::logical_expr::Expr;
 use datafusion::physical_expr::EquivalenceProperties;
@@ -862,8 +868,8 @@ impl ExecutionPlan for CountingExec {
 
 ## Acknowledgements
 
-I would like to thank [Rerun.io] for sponsoring the development of this work. [Rerun.io]
-is building a data visualization system for Physical AI and makes heavy use of DataFusion
+I would like to thank [Rerun.io] for sponsoring the development of this work. Rerun is
+building a data visualization system for Physical AI and makes heavy use of DataFusion
 table providers for working with data analytics.
 
 [Rerun.io]: https://rerun.io

From 3d74c054e39319bef8197ffbf4ff3013fada1742 Mon Sep 17 00:00:00 2001
From: Tim Saucer <timsaucer@gmail.com>
Date: Tue, 31 Mar 2026 15:55:26 -0400
Subject: [PATCH 16/19] Add reviewer acknowledgements to blog post

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---
 content/blog/2026-03-31-writing-table-providers.md | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/content/blog/2026-03-31-writing-table-providers.md b/content/blog/2026-03-31-writing-table-providers.md
index c780144d..4d0820f3 100644
--- a/content/blog/2026-03-31-writing-table-providers.md
+++ b/content/blog/2026-03-31-writing-table-providers.md
@@ -872,6 +872,14 @@ I would like to thank [Rerun.io] for sponsoring the development of this work. Re
 building a data visualization system for Physical AI and makes heavy use of DataFusion
 table providers for working with data analytics.
 
+I would also like to thank the reviewers of this post for their helpful feedback and
+suggestions: [@alamb], [@2010YOUY01], [@pgwhalen], and [@stuhood].
+
+[@alamb]: https://github.com/alamb
+[@2010YOUY01]: https://github.com/2010YOUY01
+[@pgwhalen]: https://github.com/pgwhalen
+[@stuhood]: https://github.com/stuhood
+
 [Rerun.io]: https://rerun.io
 
 ## Further Reading

From 735ea2f8e1da6d3b0c3037ed5715c25e7acef72c Mon Sep 17 00:00:00 2001
From: Tim Saucer <timsaucer@gmail.com>
Date: Tue, 31 Mar 2026 15:56:07 -0400
Subject: [PATCH 17/19] Add 'Get Involved' section

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---
 .../blog/2026-03-31-writing-table-providers.md    | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/content/blog/2026-03-31-writing-table-providers.md b/content/blog/2026-03-31-writing-table-providers.md
index 4d0820f3..10db0014 100644
--- a/content/blog/2026-03-31-writing-table-providers.md
+++ b/content/blog/2026-03-31-writing-table-providers.md
@@ -882,6 +882,21 @@ suggestions: [@alamb], [@2010YOUY01], [@pgwhalen], and [@stuhood].
 
 [Rerun.io]: https://rerun.io
 
+## Get Involved
+
+DataFusion is not a project built or driven by a single person, company, or
+foundation. Our community of users and contributors works together to build a
+shared technology that none of us could have built alone.
+
+If you are interested in joining us, we would love to have you. You can try out
+DataFusion on some of your own data and projects and let us know how it goes,
+contribute suggestions, documentation, bug reports, or a PR with documentation,
+tests, or code. A list of open issues suitable for beginners is [here], and you
+can find out how to reach us on the [communication doc].
+
+[here]: https://github.com/apache/arrow-datafusion/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22
+[communication doc]: https://datafusion.apache.org/contributor-guide/communication.html
+
 ## Further Reading
 
 ---

From fc9653462f80a61bc1e0bf8af4ace78b91521567 Mon Sep 17 00:00:00 2001
From: Tim Saucer <timsaucer@gmail.com>
Date: Tue, 31 Mar 2026 15:59:52 -0400
Subject: [PATCH 18/19] Final pre-publish fixes
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Fix grammar: "Best practices are" → "Best practice is"
- Remove unused StringArray import from complete example
- Fix outdated arrow-datafusion repo link → apache/datafusion
- Add missing reviewers to acknowledgements: adriangb, kevinjqliu, Omega359

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---
 content/blog/2026-03-31-writing-table-providers.md | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/content/blog/2026-03-31-writing-table-providers.md b/content/blog/2026-03-31-writing-table-providers.md
index 10db0014..d8dddc68 100644
--- a/content/blog/2026-03-31-writing-table-providers.md
+++ b/content/blog/2026-03-31-writing-table-providers.md
@@ -216,7 +216,7 @@ variant that provides additional pushdown information for other advanced use cas
 ### Keep `scan()` Lightweight
 
 This is a critical point: **`scan()` runs during planning, not execution.** It
-should return quickly. Best practices are to avoid performing I/O, network
+should return quickly. Best practice is to avoid performing I/O, network
 calls, or heavy computation here. The `scan` method's job is to *describe* how
 the data will be produced, not to produce it. All the real work belongs in the
 stream (Layer 3).
@@ -750,7 +750,7 @@ data lazily during streaming:
 use std::any::Any;
 use std::sync::Arc;
 
-use arrow::array::{Int64Array, StringArray};
+use arrow::array::Int64Array;
 use arrow::datatypes::{DataType, Field, Schema, SchemaRef};
 use arrow::record_batch::RecordBatch;
 use datafusion::catalog::TableProvider;
@@ -873,10 +873,14 @@ building a data visualization system for Physical AI and makes heavy use of Data
 table providers for working with data analytics.
 
 I would also like to thank the reviewers of this post for their helpful feedback and
-suggestions: [@alamb], [@2010YOUY01], [@pgwhalen], and [@stuhood].
+suggestions: [@adriangb], [@alamb], [@2010YOUY01], [@kevinjqliu], [@Omega359],
+[@pgwhalen], and [@stuhood].
 
+[@adriangb]: https://github.com/adriangb
 [@alamb]: https://github.com/alamb
 [@2010YOUY01]: https://github.com/2010YOUY01
+[@kevinjqliu]: https://github.com/kevinjqliu
+[@Omega359]: https://github.com/Omega359
 [@pgwhalen]: https://github.com/pgwhalen
 [@stuhood]: https://github.com/stuhood
 
@@ -894,7 +898,7 @@ contribute suggestions, documentation, bug reports, or a PR with documentation,
 tests, or code. A list of open issues suitable for beginners is [here], and you
 can find out how to reach us on the [communication doc].
 
-[here]: https://github.com/apache/arrow-datafusion/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22
+[here]: https://github.com/apache/datafusion/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22
 [communication doc]: https://datafusion.apache.org/contributor-guide/communication.html
 
 ## Further Reading

From d286cba510a5a64853ab09f38bfb0e25a7175658 Mon Sep 17 00:00:00 2001
From: Tim Saucer <timsaucer@gmail.com>
Date: Tue, 31 Mar 2026 16:01:13 -0400
Subject: [PATCH 19/19] make it alphabetical

---
 content/blog/2026-03-31-writing-table-providers.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/content/blog/2026-03-31-writing-table-providers.md b/content/blog/2026-03-31-writing-table-providers.md
index d8dddc68..1a5da513 100644
--- a/content/blog/2026-03-31-writing-table-providers.md
+++ b/content/blog/2026-03-31-writing-table-providers.md
@@ -873,12 +873,12 @@ building a data visualization system for Physical AI and makes heavy use of Data
 table providers for working with data analytics.
 
 I would also like to thank the reviewers of this post for their helpful feedback and
-suggestions: [@adriangb], [@alamb], [@2010YOUY01], [@kevinjqliu], [@Omega359],
+suggestions: [@2010YOUY01], [@adriangb], [@alamb], [@kevinjqliu], [@Omega359],
 [@pgwhalen], and [@stuhood].
 
+[@2010YOUY01]: https://github.com/2010YOUY01
 [@adriangb]: https://github.com/adriangb
 [@alamb]: https://github.com/alamb
-[@2010YOUY01]: https://github.com/2010YOUY01
 [@kevinjqliu]: https://github.com/kevinjqliu
 [@Omega359]: https://github.com/Omega359
 [@pgwhalen]: https://github.com/pgwhalen