apache · alamb · Apr 1, 2026 · Mar 28, 2026
diff --git a/content/blog/2024-01-19-datafusion-34.0.0.md b/content/blog/2024-01-19-datafusion-34.0.0.md
@@ -112,17 +112,17 @@ more than 2x faster on [ClickBench] compared to version `25.0.0`, as shown below
 
 [ClickBench]: https://benchmark.clickhouse.com/
 
-<figure style="text-align: center;">
-  <img src="/blog/images/datafusion-34.0.0/compare-new.png" width="100%" class="img-fluid" alt="Fig 1: Adaptive Arrow schema architecture overview.">
+<figure class="text-center">
+  <img src="/blog/images/datafusion-34.0.0/compare-new.png" class="img-fluid" alt="Fig 1: Adaptive Arrow schema architecture overview.">
   <figcaption>
     <b>Figure 1</b>: Performance improvement between <code>25.0.0</code> and <code>34.0.0</code> on ClickBench. 
     Note that DataFusion <code>25.0.0</code>, could not run several queries due to 
     unsupported SQL (Q9, Q11, Q12, Q14) or memory requirements (Q33).
   </figcaption>
 </figure>
 
-<figure style="text-align: center;">
-  <img src="/blog/images/datafusion-34.0.0/compare.png" width="100%" class="img-fluid" alt="Fig 1: Adaptive Arrow schema architecture overview.">
+<figure class="text-center">
+  <img src="/blog/images/datafusion-34.0.0/compare.png" class="img-fluid" alt="Fig 1: Adaptive Arrow schema architecture overview.">
   <figcaption>
     <b>Figure 2</b>: Total query runtime for DataFusion <code>34.0.0</code> and DataFusion <code>25.0.0</code>.
   </figcaption>

diff --git a/content/blog/2024-03-06-comet-donation.md b/content/blog/2024-03-06-comet-donation.md
@@ -35,10 +35,9 @@ accelerate Spark workloads. It is designed as a drop-in
 replacement for Spark's JVM based SQL execution engine and offers significant
 performance improvements for some workloads as shown below.
 
-<figure style="text-align: center;">
+<figure class="text-center">
   <img 
     src="/blog/images/datafusion-comet/comet-architecture.png" 
-    width="100%" 
     class="img-fluid" 
     alt="Fig 1: Adaptive Arrow schema architecture overview."
   >

diff --git a/content/blog/2024-08-20-python-datafusion-40.0.0.md b/content/blog/2024-08-20-python-datafusion-40.0.0.md
@@ -68,10 +68,9 @@ Modern IDEs use language servers such as
 hints, and identify usage errors. These are major tools in the python user community. With this
 release, users can fully use these tools in their workflow.
 
-<figure style="text-align: center;">
+<figure class="text-center">
   <img 
     src="/blog/images/python-datafusion-40.0.0/vscode_hover_tooltip.png" 
-    width="100%"
     class="img-fluid"
     alt="Fig 1: Enhanced tooltips in an IDE."
   >
@@ -84,10 +83,9 @@ release, users can fully use these tools in their workflow.
 By having the type annotations, these IDEs can also identify quickly when a user has incorrectly
 used a function's arguments as shown in Figure 2.
 
-<figure style="text-align: center;">
+<figure class="text-center">
   <img 
     src="/blog/images/python-datafusion-40.0.0/pylance_error_checking.png"
-    width="100%"
     class="img-fluid"
     alt="Fig 2: Error checking in static analysis"
   >

diff --git a/content/blog/2025-03-11-ordering-analysis.md b/content/blog/2025-03-11-ordering-analysis.md
@@ -31,7 +31,7 @@ limitations under the License.
 
 ## Introduction
 In this blog post, we explain when an ordering requirement of an operator is satisfied by its input data. This analysis is essential for order-based optimizations and is often more complex than one might initially think.
-<blockquote style="border-left: 4px solid #007bff; padding: 10px; background-color: #f8f9fa;">
+<blockquote class="border-start border-primary border-4 ps-3 py-2 bg-light">
     <strong>Ordering Requirement</strong> for an operator describes how the input data to that operator must be sorted for the operator to compute the correct result. It is the job of the planner to make sure that these requirements are satisfied during execution (See DataFusion <a href="https://docs.rs/datafusion/latest/datafusion/physical_optimizer/enforce_sorting/struct.EnforceSorting.html" target="_blank">EnforceSorting</a> for an implementation of such a rule).
 </blockquote>
 
@@ -134,7 +134,7 @@ Let's start by creating an example table that we will refer throughout the post.
 
 <br>
 
-<blockquote style="border-left: 4px solid #007bff; padding: 10px; background-color: #f8f9fa;">
+<blockquote class="border-start border-primary border-4 ps-3 py-2 bg-light">
 <strong>How can a table have multiple orderings?</strong> At first glance it may seem counterintuitive for a table to have more than one valid ordering. However, during query execution such scenarios can arise.
 
 For example consider the following query:
@@ -197,7 +197,7 @@ To solve the shortcomings above DataFusion needs to track of following propertie
 - Equivalent Expression Groups (will be explained shortly)
 - Succinct Valid Orderings (will be explained shortly)
 
-<blockquote style="border-left: 4px solid #007bff; padding: 10px; background-color: #f8f9fa;">
+<blockquote class="border-start border-primary border-4 ps-3 py-2 bg-light">
     <strong>Note:</strong> These properties are implemented in the <code>EquivalenceProperties</code> structure in <code>DataFusion</code>, please see the <a href="https://github.com/apache/datafusion/blob/f47ea73b87eec4af044f9b9923baf042682615b2/datafusion/physical-expr/src/equivalence/properties/mod.rs#L134" target="_blank">source</a> for more details<br>
 </blockquote>
 
@@ -210,7 +210,7 @@ For instance in the example table:
 
 - Columns `hostname` and `currency` are constant because every row in the table has the same value (`'app.example.com'` for `hostname`, and `'USD'` for `currency`) for these columns.
 
-<blockquote style="border-left: 4px solid #007bff; padding: 10px; background-color: #f8f9fa;">
+<blockquote class="border-start border-primary border-4 ps-3 py-2 bg-light">
     <strong>Note:</strong> Constant expressions can arise during query execution. For example, in following query:<br>
     <code>SELECT hostname FROM logs</code><br><code>WHERE hostname='app.example.com'</code> <br>
     after filtering is done, for subsequent operators the <code>hostname</code> column will be constant.
@@ -221,7 +221,7 @@ Equivalent expression groups are expressions that always hold the same value acr
 
 In the example table, the expressions `price` and `price_cloned` form one equivalence group, and `time` and `time_cloned` form another equivalence group.
 
-<blockquote style="border-left: 4px solid #007bff; padding: 10px; background-color: #f8f9fa;">
+<blockquote class="border-start border-primary border-4 ps-3 py-2 bg-light">
     <strong>Note:</strong> Equivalent expression groups can arise during the query execution. For example, in the following query:<br>
     <code>SELECT time, time as time_cloned FROM logs</code> <br>
     after the projection is done, for subsequent operators <code>time</code> and <code>time_cloned</code> will form an equivalence group. As another example, in the following query:<br>
@@ -293,7 +293,7 @@ Following third and fourth constraints for the simplified table, the succinct va
 `[time_bin ASC]`,  
 `[time ASC]`  
 
-<blockquote style="border-left: 4px solid #007bff; padding: 10px; background-color: #f8f9fa;">
+<blockquote class="border-start border-primary border-4 ps-3 py-2 bg-light">
 <p><strong>How can DataFusion find orderings?</strong></p> 
 DataFusion's <code>CREATE EXTERNAL TABLE</code> has a <code>WITH ORDER</code> clause (see <a href="https://datafusion.apache.org/user-guide/sql/ddl.html#create-external-table">docs</a>) to specify the known orderings of the table during table creation. For example the following query:<br>
 <pre><code>

diff --git a/content/blog/2025-03-30-datafusion-python-46.0.0.md b/content/blog/2025-03-30-datafusion-python-46.0.0.md
@@ -177,10 +177,9 @@ for the user to click on it to expand the data.
 In the below view you can see an example of some of these features such as the
 expandable text and scroll bars.
 
-<figure style="text-align: center;">
+<figure class="text-center">
   <img 
     src="/blog/images/python-datafusion-46.0.0/html_rendering.png" 
-    width="100%"
     class="img-fluid"
     alt="Fig 1: Example html rendering in a jupyter notebook."
   >

diff --git a/content/blog/2025-04-10-fastest-tpch-generator.md b/content/blog/2025-04-10-fastest-tpch-generator.md
@@ -25,16 +25,6 @@ limitations under the License.
 
 [TOC]
 
-<style>
-/* Table borders */
-table, th, td {
-  border: 1px solid black;
-  border-collapse: collapse;
-}
-th, td {
-  padding: 3px;
-}
-</style>
 **TLDR: TPC-H SF=100 in 1min using tpchgen-rs vs 30min+ with dbgen**.
 
 3 members of the [Apache DataFusion] community used Rust and open source
@@ -135,7 +125,7 @@ bound on the Scale Factor.
 
 **Figure 2**: Example TBL formatted output of `dbgen` for the `LINEITEM` table
 
-<table>
+<table class="table table-bordered">
   <tr>
    <td><strong>Scale Factor</strong>
    </td>
@@ -308,7 +298,7 @@ compatible port, and knew about the performance shortcomings and how to approach
 them.
 
 
-<table>
+<table class="table table-bordered">
   <tr>
    <td><strong>Scale Factor</strong>
    </td>
@@ -356,7 +346,7 @@ list of optimizations:
 At the time of writing, single threaded performance is now 2.5x-2.7x faster than the initial version, as shown in Table 3.
 
 
-<table>
+<table class="table table-bordered">
   <tr>
    <td><strong>Scale Factor</strong>
    </td>
@@ -412,7 +402,7 @@ When writing to `/dev/null` tpchgen  generates the entire dataset in 25 seconds
 (4 GB/s).
 
 
-<table>
+<table class="table table-bordered">
   <tr>
    <td><strong>Scale Factor</strong>
    </td>
@@ -516,7 +506,7 @@ or CSV, tpchgen-cli creates the full SF=100 parquet format dataset in less than
 [a small 300 line PR]: https://github.com/clflushopt/tpchgen-rs/pull/61
 [Rust Parquet writer]: https://crates.io/crates/parquet
 
-<table>
+<table class="table table-bordered">
   <tr>
    <td><strong>Scale Factor</strong>
    </td>

diff --git a/content/blog/2025-06-30-cancellation.md b/content/blog/2025-06-30-cancellation.md
@@ -359,7 +359,7 @@ To illustrate what this process looks like, let's have a look at the execution o
 If we assume a task budget of 1 unit, each time Tokio schedules the task would result in the following sequence of function calls.
 
 <figure>
-<img src="/blog/images/task-cancellation/tokio_budget.png" style="width: 100%; max-width: 100%" class="img-fluid" alt="Sequence diagram showing how the tokio task budget is used and reset."
+<img src="/blog/images/task-cancellation/tokio_budget.png" class="img-fluid" alt="Sequence diagram showing how the tokio task budget is used and reset."
 />
 <figcaption>Tokio task budget system, assuming the task budget is set to 1, for the plan above.</figcaption>
 </figure>

diff --git a/content/blog/2025-07-11-datafusion-47.0.0.md b/content/blog/2025-07-11-datafusion-47.0.0.md
@@ -193,7 +193,7 @@ logging, or metrics) across thread boundaries without depending on any specific
 You can use the [JoinSetTracer] API to instrument DataFusion plans with your own tracing or logging libraries, or
 use pre-integrated community crates such as the [datafusion-tracing] crate.
 
-<div style="text-align: center;">
+<div class="text-center">
   <a href="https://github.com/datafusion-contrib/datafusion-tracing">
     <img
       src="/blog/images/datafusion-47.0.0/datafusion-telemetry.png"

diff --git a/content/blog/2025-07-14-user-defined-parquet-indexes.md b/content/blog/2025-07-14-user-defined-parquet-indexes.md
@@ -173,24 +173,24 @@ A **distinct value index** stores the unique values of a specific column. This t
 
 For example, if the files contain a column named `Category` like this:
 
-<table style="border-collapse:collapse;">
+<table class="table table-bordered">
   <tr>
-    <td style="border:1px solid #888;padding:2px 6px;"><b><code>Category</code></b></td>
+    <td><b><code>Category</code></b></td>
   </tr>
   <tr>
-    <td style="border:1px solid #888;padding:2px 6px;"><code>foo</code></td>
+    <td><code>foo</code></td>
   </tr>
   <tr>
-    <td style="border:1px solid #888;padding:2px 6px;"><code>bar</code></td>
+    <td><code>bar</code></td>
   </tr>
   <tr>
-    <td style="border:1px solid #888;padding:2px 6px;"><code>...</code></td>
+    <td><code>...</code></td>
   </tr>
   <tr>
-    <td style="border:1px solid #888;padding:2px 6px;"><code>baz</code></td>
+    <td><code>baz</code></td>
   </tr>
   <tr>
-    <td style="border:1px solid #888;padding:2px 6px;"><code>foo</code></td>
+    <td><code>foo</code></td>
   </tr>
 </table>
 

diff --git a/content/blog/2025-12-15-avoid-consecutive-repartitions.md b/content/blog/2025-12-15-avoid-consecutive-repartitions.md
@@ -25,18 +25,17 @@ limitations under the License.
 {% endcomment %}
 -->
 
-<div style="display: flex; align-items: center; gap: 20px; margin-bottom: 20px;">
-<div style="flex: 1;">
+<div class="row align-items-center mb-3">
+<div class="col-md-7">
 
 Databases are some of the most complex yet interesting pieces of software. They are amazing pieces of abstraction: query engines optimize and execute complex plans, storage engines provide sophisticated infrastructure as the backbone of the system, while intricate file formats lay the groundwork for particular workloads. All of this is exposed by a user-friendly interface and query languages (typically a dialect of SQL).
 <br><br>
 Starting a journey learning about database internals can be daunting. With so many topics that are whole PhD degrees themselves, finding a place to start is difficult. In this blog post, I will share my early journey in the database world and a quick lesson on one of the first topics I dove into. If you are new to the space, this post will help you get your first foot into the database world, and if you are already a veteran, you may still learn something new.
 
 </div>
-<div style="flex: 0 0 40%; text-align: center;">
+<div class="col-md-5 text-center">
 <img
   src="/blog/images/avoid-consecutive-repartitions/database_system_diagram.png"
-  width="100%"
   class="img-fluid"
   alt="Database System Components"
 />
@@ -122,18 +121,17 @@ Partitioning is a "divide-and-conquer" approach to executing a query. Each parti
 
 #### **Round-Robin Repartitioning**
 
-<div style="display: flex; align-items: top; gap: 20px; margin-bottom: 20px;">
-<div style="flex: 1;">
+<div class="row align-items-start mb-3">
+<div class="col-md-9">
 
 Round-robin repartitioning is the simplest partitioning strategy. Incoming data is processed in batches (chunks of rows), and these batches are distributed across partitions cyclically or sequentially, with each new batch assigned to the next available partition.
 <br><br>
 Round-robin repartitioning is useful when the data grouping isn't known or when aiming for an even distribution across partitions. Because it simply assigns batches in order without inspecting their contents, it is a low-overhead way to increase parallelism for downstream operations.
 
 </div>
-<div style="flex: 0 0 25%; text-align: center;">
+<div class="col-md-3 text-center">
 <img
   src="/blog/images/avoid-consecutive-repartitions/round_robin_repartitioning.png"
-  width="100%"
   class="img-fluid"
   alt="Round-Robin Repartitioning"
 />
@@ -142,18 +140,17 @@ Round-robin repartitioning is useful when the data grouping isn't known or when
 
 #### **Hash Repartitioning**
 
-<div style="display: flex; align-items: top; gap: 20px; margin-bottom: 20px;">
-<div style="flex: 1;">
+<div class="row align-items-start mb-3">
+<div class="col-md-9">
 
 Hash repartitioning distributes data based on a hash function applied to one or more columns, called the partitioning key. Rows with the same hash value are placed in the same partition.
 <br><br>
 Hash repartitioning is useful when working with grouped data. Imagine you have a database containing information on company sales, and you are looking to find the total revenue each store produced. Hash repartitioning would make this query much more efficient. Rather than iterating over the data on a single thread and keeping a running sum for each store, it would be better to hash repartition on the store column and have multiple threads calculate individual store sales.
 
 </div>
-<div style="flex: 0 0 25%; text-align: center;">
+<div class="col-md-3 text-center">
 <img
   src="/blog/images/avoid-consecutive-repartitions/hash_repartitioning.png"
-  width="100%"
   class="img-fluid"
   alt="Hash Repartitioning"
 />