
optimize(parquet): Nested list batching child.write calls #10085

Open: mapleFU wants to merge 6 commits into apache:main from mapleFU:nested-list-optimize-write

@mapleFU mapleFU commented Jun 6, 2026

Member

Which issue does this PR close?

Rationale for this change

Reduce the number of recursive `child.write` calls made when writing nested lists.

What changes are included in this PR?

Split the list write function into two paths: one that stamps slot boundaries directly from the offsets, and one that finds them with a backward scan.
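
As a rough illustration of the direct-by-offsets path (a sketch under stated assumptions, not the code in this PR): when the child is a leaf, every child element contributes exactly one repetition level, so the first level of each list slot can be stamped straight from the offsets. The helper name `stamp_by_offsets`, its parameters, and the plain `usize` offsets are hypothetical and exist only for this example.

```rust
/// Hypothetical helper, not the PR's actual function.
///
/// `tail` holds the repetition levels appended by one batched child write
/// covering `offsets[0]..offsets[last]`. Assumes the child is a leaf (one
/// level per element) and that every slot in the run is non-empty.
fn stamp_by_offsets(tail: &mut [i16], offsets: &[usize], list_rep_level: i16) {
    if offsets.len() < 2 {
        return;
    }
    let base = offsets[0];
    // The final offset marks the end of the run, not the start of a slot.
    for &start in &offsets[..offsets.len() - 1] {
        // The first element of a slot starts a new list at the parent level.
        tail[start - base] = list_rep_level - 1;
    }
}
```

With three slots covering child elements `3..5`, `5..6`, and `6..8`, the stamped positions in the tail are 0, 2, and 3; no scan of the levels is needed.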

Are these changes tested?

Covered by existing tests.

Are there any user-facing changes?

No

@github-actions github-actions Bot added the parquet Changes to the parquet crate label Jun 6, 2026
/// counting child-element starts to find and stamp slot boundaries.
///
/// Scan backward because we don't know start offset before writing.
fn write_list_scan<O: OffsetSizeTrait>(

Member Author

I originally wanted to remove the branch for writing non-nested children. However, the benchmark shows that adding one more branch hurts performance, so I split it into two functions.

for rep in rep_levels.iter_mut().rev() {
// This could use `==`, since the list write is recursive and the child is written
// before the parent.
if *rep <= ctx.rep_level {

Member Author

I kept `<=` because the benchmark shows no performance difference between `<=` and `==`.

// before the parent.
if *rep <= ctx.rep_level {
seen += 1;
if seen == next_stamp_at {

Member Author

Maybe we can use SIMD for this in the future; it is not critical in this branch. Alternatively, we could batch the check when the list offsets are large, i.e. check the levels batch by batch instead of one by one.
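
For readers following the thread, the backward scan discussed in these comments can be sketched roughly as below. This is a simplified illustration, not the PR's `write_list_scan`: the function name `stamp_by_backward_scan`, the plain `usize` offsets, and the bare `i16` level slice are assumptions made for the example. It also assumes every slot in the run is non-empty. The scan starts at the end because the number of levels a nested child produces per element is not known up front.

```rust
/// Hypothetical sketch of stamping slot boundaries by scanning backward.
///
/// `rep_levels` ends with the levels produced by one batched child write
/// covering `offsets[0]..offsets[last]`; earlier rows may precede them.
fn stamp_by_backward_scan(rep_levels: &mut [i16], offsets: &[usize], list_rep_level: i16) {
    if offsets.len() < 2 {
        return;
    }
    let end = offsets[offsets.len() - 1];
    // Slots are visited last to first; `slot` is the one stamped next.
    let mut slot = offsets.len() - 2;
    // Number of child-element starts between the end and this slot's start.
    let mut next_stamp_at = end - offsets[slot];
    let mut seen = 0usize;
    for rep in rep_levels.iter_mut().rev() {
        // Levels above `list_rep_level` belong to deeper nesting and are skipped.
        if *rep <= list_rep_level {
            seen += 1;
            if seen == next_stamp_at {
                *rep = list_rep_level - 1;
                if slot == 0 {
                    break;
                }
                slot -= 1;
                next_stamp_at = end - offsets[slot];
            }
        }
    }
}
```

For `[[[1, 2], [3]], [[4]]]`, the inner list leaves the levels `[1, 2, 1, 1]`; scanning with outer offsets `[0, 2, 3]` and `list_rep_level = 1` stamps the starts of the two outer slots and yields `[0, 2, 1, 0]`. In this example every level the scan counts is exactly `list_rep_level`, which is consistent with the remark that `==` would behave the same as `<=`.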

Copilot AI left a comment

Contributor

Pull request overview

This PR optimizes Parquet level generation for nested Arrow List types by batching child.write(...) calls and applying repetition-level backfilling in larger chunks, reducing recursive write-call overhead for deeply nested list structures.

Changes:

  • Split list writing into two specialized hot paths: an offsets-based “direct” backfill for last-level list children and a backward-scan backfill for nested repetition cases
  • Extracted a shared run-classification loop (write_list_impl) to batch null/empty/non-empty runs while keeping monomorphized backfill strategies
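
The run classification described above might look roughly like the following sketch. It is not the PR's `write_list_impl`: the `SlotKind` enum, the `classify_runs` function, the `bool` validity slice, and the `usize` offsets are all hypothetical stand-ins. The idea it illustrates is that consecutive slots of the same kind are grouped, so a non-empty run over slots `i..j` can be served by a single child write over `offsets[i]..offsets[j]`.

```rust
use std::ops::Range;

/// Hypothetical classification of a list slot.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SlotKind {
    Null,
    Empty,
    NonEmpty,
}

/// Classify slot `i`; a null slot is `Null` even if its offset range is non-empty.
fn classify(offsets: &[usize], valid: &[bool], i: usize) -> SlotKind {
    if !valid[i] {
        SlotKind::Null
    } else if offsets[i] == offsets[i + 1] {
        SlotKind::Empty
    } else {
        SlotKind::NonEmpty
    }
}

/// Group consecutive slots of the same kind into runs of slot indices.
/// Expects `offsets.len() == valid.len() + 1`.
fn classify_runs(offsets: &[usize], valid: &[bool]) -> Vec<(SlotKind, Range<usize>)> {
    let mut runs = Vec::new();
    let n = valid.len();
    let mut start = 0;
    while start < n {
        let kind = classify(offsets, valid, start);
        let mut end = start + 1;
        while end < n && classify(offsets, valid, end) == kind {
            end += 1;
        }
        runs.push((kind, start..end));
        start = end;
    }
    runs
}
```

A caller would then dispatch once per run rather than once per slot, handing each non-empty run to whichever backfill strategy applies.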

Comment thread parquet/src/arrow/arrow_writer/levels.rs Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@mapleFU mapleFU requested a review from alamb June 11, 2026 02:00
alamb commented Jun 11, 2026

Contributor

run benchmark arrow_writer

@adriangbot


🤖 Arrow criterion benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4682497841-533-jtp2z 6.12.68+ #1 SMP Sat May 2 07:49:07 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing nested-list-optimize-write (0d16db7) to c3e0684 (merge-base) diff
BENCH_NAME=arrow_writer
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench arrow_writer
BENCH_FILTER=
Results will be posted here when complete



@adriangbot


🤖 Arrow criterion benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

group                                              main                                   nested-list-optimize-write
-----                                              ----                                   --------------------------
bool/bloom_filter                                  1.00     12.9±0.04ms    19.4 MB/sec    1.02     13.2±0.09ms    19.0 MB/sec
bool/cdc                                           1.00     15.7±0.06ms    15.9 MB/sec    1.01     15.9±0.09ms    15.7 MB/sec
bool/default                                       1.00     10.7±0.04ms    23.3 MB/sec    1.04     11.2±0.09ms    22.4 MB/sec
bool/parquet_2                                     1.00     14.6±0.05ms    17.2 MB/sec    1.02     14.9±0.07ms    16.8 MB/sec
bool/zstd                                          1.00     11.3±0.05ms    22.1 MB/sec    1.02     11.6±0.07ms    21.6 MB/sec
bool/zstd_parquet_2                                1.00     14.9±0.06ms    16.7 MB/sec    1.02     15.2±0.10ms    16.5 MB/sec
bool_non_null/bloom_filter                         1.00      7.0±0.02ms    17.9 MB/sec    1.01      7.0±0.02ms    17.8 MB/sec
bool_non_null/cdc                                  1.00      6.8±0.03ms    18.3 MB/sec    1.01      6.9±0.06ms    18.1 MB/sec
bool_non_null/default                              1.00      4.3±0.02ms    29.4 MB/sec    1.01      4.3±0.02ms    29.1 MB/sec
bool_non_null/parquet_2                            1.00      9.0±0.04ms    13.9 MB/sec    1.01      9.1±0.03ms    13.8 MB/sec
bool_non_null/zstd                                 1.00      4.6±0.03ms    27.1 MB/sec    1.01      4.7±0.02ms    26.9 MB/sec
bool_non_null/zstd_parquet_2                       1.00      9.4±0.04ms    13.3 MB/sec    1.01      9.5±0.03ms    13.2 MB/sec
float_with_nans/bloom_filter                       1.00     92.8±0.38ms   150.9 MB/sec    1.00     92.6±0.27ms   151.2 MB/sec
float_with_nans/cdc                                1.00     81.7±0.21ms   171.4 MB/sec    1.00     81.8±0.15ms   171.1 MB/sec
float_with_nans/default                            1.00     74.6±0.20ms   187.7 MB/sec    1.00     74.8±0.17ms   187.2 MB/sec
float_with_nans/parquet_2                          1.00     94.6±0.36ms   148.0 MB/sec    1.01     95.1±0.33ms   147.2 MB/sec
float_with_nans/zstd                               1.00    112.4±0.20ms   124.5 MB/sec    1.00    112.5±0.20ms   124.4 MB/sec
float_with_nans/zstd_parquet_2                     1.00    131.9±0.31ms   106.1 MB/sec    1.00    132.2±0.26ms   105.9 MB/sec
large_string_non_null/bloom_filter                 1.00     71.8±0.15ms     3.5 GB/sec    1.02     73.2±0.21ms     3.4 GB/sec
large_string_non_null/cdc                          1.00    243.6±1.07ms  1050.7 MB/sec    1.00    244.6±1.38ms  1046.7 MB/sec
large_string_non_null/default                      1.00     54.3±0.15ms     4.6 GB/sec    1.02     55.3±0.15ms     4.5 GB/sec
large_string_non_null/parquet_2                    1.00     54.3±0.15ms     4.6 GB/sec    1.02     55.3±0.21ms     4.5 GB/sec
large_string_non_null/zstd                         1.00     54.3±0.12ms     4.6 GB/sec    1.01     55.1±0.23ms     4.5 GB/sec
large_string_non_null/zstd_parquet_2               1.00     54.4±0.12ms     4.6 GB/sec    1.01     55.2±0.21ms     4.5 GB/sec
list_nested/bloom_filter                           1.08    158.9±0.30ms   181.6 MB/sec    1.00    146.6±0.20ms   196.8 MB/sec
list_nested/cdc                                    1.06    175.0±0.24ms   164.9 MB/sec    1.00    164.7±0.14ms   175.2 MB/sec
list_nested/default                                1.07    143.9±0.21ms   200.5 MB/sec    1.00    134.0±0.22ms   215.3 MB/sec
list_nested/parquet_2                              1.07    157.3±0.28ms   183.5 MB/sec    1.00    147.4±0.47ms   195.8 MB/sec
list_nested/zstd                                   1.07    153.1±0.33ms   188.5 MB/sec    1.00    142.6±0.14ms   202.4 MB/sec
list_nested/zstd_parquet_2                         1.06    170.2±0.22ms   169.6 MB/sec    1.00    159.9±0.20ms   180.5 MB/sec
list_primitive/bloom_filter                        1.00    287.6±1.67ms  1896.6 MB/sec    1.04    300.5±1.86ms  1815.0 MB/sec
list_primitive/cdc                                 1.01    323.0±2.01ms  1688.6 MB/sec    1.00    320.4±0.41ms  1702.2 MB/sec
list_primitive/default                             1.00    212.1±1.78ms     2.5 GB/sec    1.03    219.3±3.53ms     2.4 GB/sec
list_primitive/parquet_2                           1.00    232.7±0.37ms     2.3 GB/sec    1.01    235.2±0.52ms     2.3 GB/sec
list_primitive/zstd                                1.00    460.4±3.09ms  1184.6 MB/sec    1.00    460.2±0.45ms  1185.0 MB/sec
list_primitive/zstd_parquet_2                      1.00    457.2±5.60ms  1192.7 MB/sec    1.00    458.9±0.41ms  1188.5 MB/sec
list_primitive_non_null/bloom_filter               1.00    399.7±3.75ms  1361.6 MB/sec    1.03    410.2±6.92ms  1326.7 MB/sec
list_primitive_non_null/cdc                        1.01   414.4±10.18ms  1313.3 MB/sec    1.00    411.4±8.30ms  1323.0 MB/sec
list_primitive_non_null/default                    1.00    266.8±4.78ms  2040.1 MB/sec    1.03    274.5±8.13ms  1983.0 MB/sec
list_primitive_non_null/parquet_2                  1.00   284.4±13.10ms  1913.5 MB/sec    1.00    283.7±1.31ms  1918.5 MB/sec
list_primitive_non_null/zstd                       1.00    693.3±4.57ms   785.0 MB/sec    1.00   691.6±12.90ms   786.9 MB/sec
list_primitive_non_null/zstd_parquet_2             1.00    661.2±5.57ms   823.1 MB/sec    1.02    676.6±4.85ms   804.4 MB/sec
list_primitive_sparse_99pct_null/bloom_filter      1.00     12.7±0.11ms     2.9 GB/sec    1.01     12.9±0.06ms     2.8 GB/sec
list_primitive_sparse_99pct_null/cdc               1.00     24.2±0.08ms  1546.2 MB/sec    1.01     24.4±0.11ms  1531.3 MB/sec
list_primitive_sparse_99pct_null/default           1.00     12.4±0.06ms     3.0 GB/sec    1.00     12.4±0.08ms     2.9 GB/sec
list_primitive_sparse_99pct_null/parquet_2         1.00     12.3±0.07ms     3.0 GB/sec    1.01     12.5±0.06ms     2.9 GB/sec
list_primitive_sparse_99pct_null/zstd              1.00     14.2±0.08ms     2.6 GB/sec    1.01     14.3±0.06ms     2.6 GB/sec
list_primitive_sparse_99pct_null/zstd_parquet_2    1.00     12.4±0.08ms     2.9 GB/sec    1.01     12.5±0.07ms     2.9 GB/sec
list_struct_with_list/bloom_filter                 1.11    316.0±0.89ms   144.3 MB/sec    1.00    283.8±0.43ms   160.7 MB/sec
list_struct_with_list/cdc                          1.08    349.1±0.57ms   130.6 MB/sec    1.00    324.1±0.44ms   140.7 MB/sec
list_struct_with_list/default                      1.10    284.7±0.76ms   160.1 MB/sec    1.00    258.8±0.44ms   176.2 MB/sec
list_struct_with_list/parquet_2                    1.10    302.7±0.65ms   150.6 MB/sec    1.00    276.4±0.47ms   165.0 MB/sec
list_struct_with_list/zstd                         1.09    306.5±0.76ms   148.8 MB/sec    1.00    280.4±0.51ms   162.6 MB/sec
list_struct_with_list/zstd_parquet_2               1.09    325.8±0.59ms   140.0 MB/sec    1.00    300.1±0.51ms   151.9 MB/sec
primitive/bloom_filter                             1.00    148.2±2.01ms   302.7 MB/sec    1.01    149.8±0.52ms   299.6 MB/sec
primitive/cdc                                      1.00    157.8±0.45ms   284.4 MB/sec    1.02    160.5±0.54ms   279.6 MB/sec
primitive/default                                  1.00    117.1±0.65ms   383.2 MB/sec    1.02    119.7±0.50ms   375.0 MB/sec
primitive/parquet_2                                1.00    132.8±0.80ms   337.9 MB/sec    1.01    134.4±0.50ms   333.8 MB/sec
primitive/zstd                                     1.00    146.8±0.42ms   305.7 MB/sec    1.02    149.2±0.43ms   300.7 MB/sec
primitive/zstd_parquet_2                           1.00    166.2±0.68ms   270.0 MB/sec    1.01    168.2±0.82ms   266.8 MB/sec
primitive_all_null/bloom_filter                    1.00    924.8±3.99µs    47.4 GB/sec    1.02    939.9±3.51µs    46.6 GB/sec
primitive_all_null/cdc                             1.00     19.1±0.30ms     2.3 GB/sec    1.02     19.5±0.58ms     2.2 GB/sec
primitive_all_null/default                         1.00    313.0±1.25µs   140.0 GB/sec    1.02    320.8±1.35µs   136.6 GB/sec
primitive_all_null/parquet_2                       1.00    317.9±1.35µs   137.9 GB/sec    1.00    319.3±1.30µs   137.2 GB/sec
primitive_all_null/zstd                            1.00    429.5±1.27µs   102.0 GB/sec    1.02    437.2±1.54µs   100.2 GB/sec
primitive_all_null/zstd_parquet_2                  1.00    398.7±1.35µs   109.9 GB/sec    1.01    403.8±1.89µs   108.5 GB/sec
primitive_non_null/bloom_filter                    1.01    106.4±0.39ms   413.7 MB/sec    1.00    105.1±0.25ms   418.8 MB/sec
primitive_non_null/cdc                             1.00     90.7±0.25ms   485.2 MB/sec    1.00     90.3±0.33ms   487.1 MB/sec
primitive_non_null/default                         1.00     67.7±0.23ms   650.3 MB/sec    1.00     67.6±0.27ms   650.9 MB/sec
primitive_non_null/parquet_2                       1.00     89.8±0.23ms   490.2 MB/sec    1.00     89.5±0.26ms   491.9 MB/sec
primitive_non_null/zstd                            1.00     98.6±0.17ms   446.0 MB/sec    1.00     98.4±0.18ms   447.3 MB/sec
primitive_non_null/zstd_parquet_2                  1.00    123.5±0.20ms   356.3 MB/sec    1.00    123.2±0.19ms   357.1 MB/sec
primitive_sparse_99pct_null/bloom_filter           1.00     12.5±0.13ms     3.5 GB/sec    1.02     12.8±0.16ms     3.4 GB/sec
primitive_sparse_99pct_null/cdc                    1.00     30.3±0.26ms  1481.2 MB/sec    1.00     30.3±0.31ms  1479.6 MB/sec
primitive_sparse_99pct_null/default                1.00     10.8±0.07ms     4.0 GB/sec    1.01     10.9±0.08ms     4.0 GB/sec
primitive_sparse_99pct_null/parquet_2              1.00     10.8±0.07ms     4.0 GB/sec    1.01     10.9±0.08ms     4.0 GB/sec
primitive_sparse_99pct_null/zstd                   1.00     14.2±0.08ms     3.1 GB/sec    1.02     14.4±0.10ms     3.0 GB/sec
primitive_sparse_99pct_null/zstd_parquet_2         1.00     12.8±0.08ms     3.4 GB/sec    1.02     13.0±0.07ms     3.4 GB/sec
short_string_non_null/bloom_filter                 1.03     27.7±0.07ms   433.1 MB/sec    1.00     26.9±0.09ms   446.2 MB/sec
short_string_non_null/cdc                          1.00     20.0±0.06ms   598.7 MB/sec    1.01     20.2±0.05ms   593.3 MB/sec
short_string_non_null/default                      1.00     15.9±0.05ms   755.3 MB/sec    1.01     16.0±0.09ms   748.3 MB/sec
short_string_non_null/parquet_2                    1.00     25.6±0.06ms   467.9 MB/sec    1.01     25.9±0.05ms   463.6 MB/sec
short_string_non_null/zstd                         1.00     35.4±0.09ms   338.6 MB/sec    1.01     35.6±0.10ms   336.9 MB/sec
short_string_non_null/zstd_parquet_2               1.00     28.5±0.06ms   420.8 MB/sec    1.01     28.7±0.05ms   418.3 MB/sec
string/bloom_filter                                1.00   217.7±14.09ms     2.4 GB/sec    1.09   236.2±25.14ms     2.2 GB/sec
string/cdc                                         1.00    219.0±6.49ms     2.3 GB/sec    1.01   221.8±15.61ms     2.3 GB/sec
string/default                                     1.07   140.3±20.36ms     3.6 GB/sec    1.00   130.9±18.99ms     3.9 GB/sec
string/parquet_2                                   1.00   161.0±23.66ms     3.2 GB/sec    1.23    197.3±2.34ms     2.6 GB/sec
string/zstd                                        1.00   440.6±17.07ms  1190.0 MB/sec    1.01   446.6±20.24ms  1173.9 MB/sec
string/zstd_parquet_2                              1.00    407.9±9.59ms  1285.1 MB/sec    1.01   413.2±13.23ms  1268.9 MB/sec
string_and_binary_view/bloom_filter                1.01     64.5±0.20ms   499.8 MB/sec    1.00     64.0±0.37ms   503.7 MB/sec
string_and_binary_view/cdc                         1.00     59.3±0.30ms   544.1 MB/sec    1.01     59.7±0.25ms   540.6 MB/sec
string_and_binary_view/default                     1.00     48.4±0.31ms   667.0 MB/sec    1.02     49.2±0.19ms   655.0 MB/sec
string_and_binary_view/parquet_2                   1.00     59.4±0.17ms   542.6 MB/sec    1.01     60.2±0.29ms   535.5 MB/sec
string_and_binary_view/zstd                        1.00     84.9±0.16ms   379.7 MB/sec    1.01     85.9±0.26ms   375.5 MB/sec
string_and_binary_view/zstd_parquet_2              1.00     73.3±0.19ms   439.7 MB/sec    1.01     74.1±0.22ms   435.4 MB/sec
string_dictionary/bloom_filter                     1.00     88.5±0.73ms     2.9 GB/sec    1.03     91.2±1.45ms     2.8 GB/sec
string_dictionary/cdc                              1.00     50.7±0.28ms     5.1 GB/sec    1.03     52.0±0.48ms     5.0 GB/sec
string_dictionary/default                          1.00     46.8±0.83ms     5.5 GB/sec    1.03     48.4±1.20ms     5.3 GB/sec
string_dictionary/parquet_2                        1.00     54.3±0.22ms     4.8 GB/sec    1.01     55.1±0.31ms     4.7 GB/sec
string_dictionary/zstd                             1.00    207.8±1.49ms  1271.1 MB/sec    1.01    209.0±1.56ms  1263.7 MB/sec
string_dictionary/zstd_parquet_2                   1.00    198.6±0.20ms  1330.0 MB/sec    1.00    199.5±0.22ms  1323.9 MB/sec
string_non_null/bloom_filter                       1.00   245.5±16.29ms     2.1 GB/sec    1.03   253.3±11.27ms     2.0 GB/sec
string_non_null/cdc                                1.00    262.4±8.78ms  1996.6 MB/sec    1.04    272.4±6.31ms  1924.0 MB/sec
string_non_null/default                            1.00   135.4±13.98ms     3.8 GB/sec    1.04   140.7±12.80ms     3.6 GB/sec
string_non_null/parquet_2                          1.03   141.8±13.58ms     3.6 GB/sec    1.00   137.8±10.65ms     3.7 GB/sec
string_non_null/zstd                               1.04    556.9±4.16ms   941.0 MB/sec    1.00    537.9±2.82ms   974.2 MB/sec
string_non_null/zstd_parquet_2                     1.00    504.7±0.42ms  1038.3 MB/sec    1.01    508.5±8.08ms  1030.4 MB/sec
struct_all_null/bloom_filter                       1.00    390.6±1.46µs    40.3 GB/sec    1.01    394.2±1.41µs    39.9 GB/sec
struct_all_null/cdc                                1.00      7.8±0.06ms     2.0 GB/sec    1.02      7.9±0.15ms  2037.6 MB/sec
struct_all_null/default                            1.00    136.6±0.48µs   115.3 GB/sec    1.01    137.7±0.59µs   114.3 GB/sec
struct_all_null/parquet_2                          1.00    137.9±0.49µs   114.2 GB/sec    1.00    137.8±0.56µs   114.3 GB/sec
struct_all_null/zstd                               1.00    187.2±0.68µs    84.1 GB/sec    1.01    188.5±1.13µs    83.6 GB/sec
struct_all_null/zstd_parquet_2                     1.00    172.6±0.83µs    91.2 GB/sec    1.01    174.2±0.87µs    90.4 GB/sec
struct_non_null/bloom_filter                       1.02     46.8±0.15ms   342.0 MB/sec    1.00     46.0±0.13ms   347.9 MB/sec
struct_non_null/cdc                                1.01     46.0±0.81ms   347.7 MB/sec    1.00     45.7±0.12ms   350.2 MB/sec
struct_non_null/default                            1.00     32.1±0.12ms   498.7 MB/sec    1.00     32.2±0.11ms   497.4 MB/sec
struct_non_null/parquet_2                          1.00     40.8±0.12ms   391.9 MB/sec    1.00     40.9±0.12ms   390.9 MB/sec
struct_non_null/zstd                               1.00     40.7±0.11ms   393.4 MB/sec    1.00     40.9±0.10ms   391.7 MB/sec
struct_non_null/zstd_parquet_2                     1.00     54.9±0.17ms   291.7 MB/sec    1.00     55.0±0.12ms   290.9 MB/sec
struct_sparse_99pct_null/bloom_filter              1.00      6.5±0.05ms     2.4 GB/sec    1.01      6.6±0.05ms     2.4 GB/sec
struct_sparse_99pct_null/cdc                       1.03     14.0±0.07ms  1154.6 MB/sec    1.00     13.6±0.12ms  1186.3 MB/sec
struct_sparse_99pct_null/default                   1.00      6.0±0.03ms     2.6 GB/sec    1.00      6.0±0.03ms     2.6 GB/sec
struct_sparse_99pct_null/parquet_2                 1.01      6.1±0.04ms     2.6 GB/sec    1.00      6.0±0.03ms     2.6 GB/sec
struct_sparse_99pct_null/zstd                      1.01      7.5±0.06ms     2.1 GB/sec    1.00      7.4±0.04ms     2.1 GB/sec
struct_sparse_99pct_null/zstd_parquet_2            1.02      6.9±0.04ms     2.3 GB/sec    1.00      6.8±0.03ms     2.3 GB/sec
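
To read the table: the ratio column is relative to the faster side, so `1.07` for `main` on `list_nested/default` means `main` took about 7% longer than the branch. The small helper below (a hypothetical `percent_time_reduction`, not part of the benchmark tooling) converts the two reported means into a time reduction. Applied to the figures above, `list_nested/default` (143.9 ms to 134.0 ms) comes out at roughly 6.9% and `list_struct_with_list/bloom_filter` (316.0 ms to 283.8 ms) at roughly 10.2%.

```rust
/// Hypothetical helper: percentage of wall time saved by the branch
/// relative to the base measurement.
fn percent_time_reduction(base_ms: f64, branch_ms: f64) -> f64 {
    (base_ms - branch_ms) / base_ms * 100.0
}
```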

Resource Usage

Metric         base (merge-base)    branch
------         -----------------    ------
Wall time      2370.5s              2355.5s
Peak memory    2.7 GiB              2.7 GiB
Avg memory     2.5 GiB              2.5 GiB
CPU user       2310.8s              2284.3s
CPU sys        53.5s                67.5s
Peak spill     0 B                  0 B


Labels

parquet Changes to the parquet crate

Development

Successfully merging this pull request may close these issues.

optimize(parquet): write list without repeated child write call

4 participants