removed clippy ignore statement #10111

Merged
alamb merged 2 commits into apache:main from Rich-T-kid:rich-T-kid/arrow-ipc-cleanup
Jun 11, 2026
Conversation

@Rich-T-kid

Contributor

Which issue does this PR close?

Resolves a review comment on #10044.

Rationale for this change

Code in this file is hard to navigate, and it's unclear what is happening.

What changes are included in this PR?

This PR introduces IpcMetadataBuilder, a struct that groups the nodes and buffers vecs previously passed separately into write_array_data(). It also removes the redundant num_rows and null_count parameters by deriving them from array_data directly. Together these changes reduce write_array_data() from 10 arguments to 7, which eliminates the #[allow(clippy::too_many_arguments)] suppression. Doc comments are added to clarify the two-channel output model: IpcMetadataBuilder holds the flatbuffer header metadata, and IpcBodySink receives the raw Arrow data bytes.
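
As a rough illustration of the two-channel model described above, here is a self-contained sketch. It uses hypothetical stand-in types (ArrayDataSketch, BodySink, BufferMeta, and a simplified FieldNode) rather than the real arrow-ipc definitions, and the reduced signature is not the actual write_array_data() signature. It also assumes no padding or alignment between buffers, which the real writer does handle.

```rust
// Hypothetical stand-ins for illustration only; the real types live in
// the arrow-ipc and arrow-data crates and differ from these.
#[derive(Debug, Clone, Copy, PartialEq)]
struct FieldNode {
    length: i64,
    null_count: i64,
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct BufferMeta {
    offset: i64,
    length: i64,
}

/// Groups the two metadata vectors that used to be passed separately.
#[derive(Debug, Default)]
struct IpcMetadataBuilder {
    nodes: Vec<FieldNode>,
    buffers: Vec<BufferMeta>,
}

/// Simplified stand-in for an array: a length, a null count, and raw buffers.
struct ArrayDataSketch {
    len: usize,
    null_count: usize,
    buffers: Vec<Vec<u8>>,
}

/// Simplified stand-in for the body sink: collects raw data bytes.
#[derive(Default)]
struct BodySink {
    bytes: Vec<u8>,
}

/// Writes one array: metadata goes to `meta`, raw bytes go to `sink`.
/// The row count and null count are derived from `array` instead of
/// being passed as extra arguments. Returns the offset after the last buffer.
fn write_array_data(
    array: &ArrayDataSketch,
    meta: &mut IpcMetadataBuilder,
    sink: &mut BodySink,
    mut offset: i64,
) -> i64 {
    meta.nodes.push(FieldNode {
        length: array.len as i64,
        null_count: array.null_count as i64,
    });
    for buf in &array.buffers {
        // Assumption: buffers are written back to back with no padding.
        meta.buffers.push(BufferMeta {
            offset,
            length: buf.len() as i64,
        });
        sink.bytes.extend_from_slice(buf);
        offset += buf.len() as i64;
    }
    offset
}
```

In this sketch the caller passes one metadata struct and one sink, so the two output channels stay visibly separate while the argument list shrinks.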

Are these changes tested?

yes

Are there any user-facing changes?

no

@github-actions github-actions bot added the "arrow" label (Changes to the arrow crate) on Jun 10, 2026
@Rich-T-kid

Contributor Author

small refactor
cc @alamb

@alamb alamb left a comment

Contributor

Looks good to me -- thank you @Rich-T-kid

Comment thread arrow-ipc/src/writer.rs Outdated
offset = encode_sink_buffer(
null_buffer,
buffers,
&mut meta.buffers,

Contributor

the next PR could potentially update this to take the whole structure here (rather than &mut meta.buffers)

Contributor Author

I thought about doing that but only meta.buffers gets used. If you like that approach more I can change it right now

Contributor

I think it is a fine follow on -- the downside as you say is that it is now less obvious what is being used. The upside is that the logic is more encapsulated (the caller doesn't have to know that only the buffers are being used 🤔 )

Contributor Author

Makes sense to me.

Contributor Author

Introduced this in the latest PR.
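
The trade-off discussed in this thread can be sketched roughly as follows. This is a self-contained illustration, not the code from the PR: MetaSketch, encode_with_field, and encode_with_struct are hypothetical names, and buffer metadata is reduced to an (offset, length) tuple.

```rust
/// Hypothetical stand-in for the metadata builder discussed in the thread.
#[derive(Debug, Default)]
struct MetaSketch {
    nodes: Vec<(i64, i64)>,
    buffers: Vec<(i64, i64)>,
}

/// Variant A: the caller passes only the field that is used.
/// The signature makes it obvious that only `buffers` is touched.
fn encode_with_field(data: &[u8], buffers: &mut Vec<(i64, i64)>, offset: i64) -> i64 {
    buffers.push((offset, data.len() as i64));
    offset + data.len() as i64
}

/// Variant B: the caller passes the whole struct.
/// The caller no longer needs to know which field is used,
/// at the cost of a less specific signature.
fn encode_with_struct(data: &[u8], meta: &mut MetaSketch, offset: i64) -> i64 {
    meta.buffers.push((offset, data.len() as i64));
    offset + data.len() as i64
}
```

Both variants record the same metadata; the difference is only in how much of the struct's layout the call site has to know about.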

Comment thread arrow-ipc/src/writer.rs
sink: &mut IpcBodySink<'_>,
nodes: &mut Vec<crate::FieldNode>,
offset: i64,
num_rows: usize,

Contributor

that is a nice cleanup

@alamb

alamb commented Jun 11, 2026

Contributor

run benchmark flight

@adriangbot


🤖 Arrow criterion benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4676159256-528-jg7ck 6.12.68+ #1 SMP Sat May 2 07:49:07 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing rich-T-kid/arrow-ipc-cleanup (f6ff6b1) to 301eb26 (merge-base) diff
BENCH_NAME=flight
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench flight
BENCH_FILTER=
Results will be posted here when complete



@adriangbot


🤖 Arrow criterion benchmark completed (GKE) | trigger


group                         main                                   rich-T-kid_arrow-ipc-cleanup
-----                         ----                                   ----------------------------
encode/dict/65536x1           1.00    269.7±0.67µs   932.0 MB/sec    1.01    273.0±1.89µs   920.9 MB/sec
encode/dict/65536x8           1.00      8.6±0.12ms   232.9 MB/sec    1.04      9.0±0.13ms   223.8 MB/sec
encode/dict/8192x1            1.00     35.3±0.03µs   924.9 MB/sec    1.00     35.4±0.11µs   922.9 MB/sec
encode/dict/8192x8            1.03    305.8±1.85µs   854.3 MB/sec    1.00    297.4±1.86µs   878.7 MB/sec
encode/fixed/65536x1          1.00     10.2±0.02µs    48.1 GB/sec    1.00     10.2±0.02µs    47.9 GB/sec
encode/fixed/65536x8          1.02   1150.3±2.18µs     3.4 GB/sec    1.00   1126.2±3.27µs     3.5 GB/sec
encode/fixed/8192x1           1.03      3.2±0.01µs    19.1 GB/sec    1.00      3.1±0.01µs    19.6 GB/sec
encode/fixed/8192x8           1.05     18.0±0.06µs    27.1 GB/sec    1.00     17.2±0.04µs    28.5 GB/sec
encode/nested/65536x1         1.00     38.0±0.23µs    32.1 GB/sec    1.00     38.2±0.29µs    32.0 GB/sec
encode/nested/65536x8         1.01      3.2±0.02ms     3.1 GB/sec    1.00      3.2±0.03ms     3.1 GB/sec
encode/nested/8192x1          1.00      5.7±0.05µs    26.9 GB/sec    1.02      5.8±0.01µs    26.5 GB/sec
encode/nested/8192x8          1.01     47.4±0.18µs    25.8 GB/sec    1.00     46.9±0.09µs    26.1 GB/sec
encode/variable/65536x1       1.14     81.6±0.40µs    26.9 GB/sec    1.00     71.4±0.36µs    30.8 GB/sec
encode/variable/65536x8       1.03      5.8±0.07ms     3.0 GB/sec    1.00      5.6±0.09ms     3.1 GB/sec
encode/variable/8192x1        1.00      7.0±0.01µs    39.0 GB/sec    1.01      7.1±0.02µs    38.5 GB/sec
encode/variable/8192x8        1.15     91.3±0.32µs    24.1 GB/sec    1.00     79.4±0.26µs    27.7 GB/sec
roundtrip/dict/65536x1        1.00  1283.0±49.45µs   195.9 MB/sec    1.02  1308.0±47.47µs   192.2 MB/sec
roundtrip/dict/65536x8        1.20     17.5±0.64ms   115.0 MB/sec    1.00     14.5±0.54ms   138.6 MB/sec
roundtrip/dict/8192x1         1.00    205.5±6.33µs   158.9 MB/sec    1.03    212.2±6.06µs   153.9 MB/sec
roundtrip/dict/8192x8         1.00  1310.3±47.41µs   199.4 MB/sec    1.03  1346.1±41.78µs   194.1 MB/sec
roundtrip/fixed/65536x1       1.00    303.7±4.93µs  1646.5 MB/sec    1.04    315.1±3.86µs  1587.2 MB/sec
roundtrip/fixed/65536x8       1.00      2.1±0.03ms  1865.1 MB/sec    1.06      2.3±0.07ms  1766.2 MB/sec
roundtrip/fixed/8192x1        1.00     92.0±1.02µs   680.2 MB/sec    1.02     93.8±0.79µs   667.3 MB/sec
roundtrip/fixed/8192x8        1.00    330.0±4.19µs  1517.2 MB/sec    1.02    338.2±5.04µs  1480.7 MB/sec
roundtrip/nested/65536x1      1.00   836.1±44.77µs  1495.3 MB/sec    1.03   861.1±39.82µs  1451.9 MB/sec
roundtrip/nested/65536x8      1.00     11.0±0.49ms   913.2 MB/sec    1.02     11.1±0.41ms   899.5 MB/sec
roundtrip/nested/8192x1       1.00    157.7±5.45µs   991.9 MB/sec    1.03    162.9±5.20µs   960.5 MB/sec
roundtrip/nested/8192x8       1.00   887.5±38.95µs  1410.3 MB/sec    1.03   914.6±42.02µs  1368.5 MB/sec
roundtrip/variable/65536x1    1.00  1225.5±53.74µs  1836.1 MB/sec    1.01  1242.3±29.71µs  1811.3 MB/sec
roundtrip/variable/65536x8    1.00     16.4±0.58ms  1097.8 MB/sec    1.04     17.0±0.50ms  1059.9 MB/sec
roundtrip/variable/8192x1     1.00    204.9±5.95µs  1373.6 MB/sec    1.03    212.0±5.91µs  1327.7 MB/sec
roundtrip/variable/8192x8     1.00  1211.2±25.79µs  1858.8 MB/sec    1.03  1253.4±25.59µs  1796.3 MB/sec

Resource Usage

base (merge-base)

Metric        Value
------        -----
Wall time     330.1s
Peak memory   94.4 MiB
Avg memory    34.0 MiB
CPU user      331.2s
CPU sys       74.5s
Peak spill    0 B

branch

Metric        Value
------        -----
Wall time     335.1s
Peak memory   91.6 MiB
Avg memory    36.7 MiB
CPU user      335.2s
CPU sys       76.7s
Peak spill    0 B


@alamb alamb merged commit cecbc72 into apache:main Jun 11, 2026
27 checks passed
@alamb

alamb commented Jun 11, 2026

Contributor

Thank you @Rich-T-kid
