feat: Add neqo_udp::Socket::send_buffer() #3389
larseggert wants to merge 3 commits into mozilla:main from
This allows sending from a `&[u8]` without copying into a `Vec<u8>`. (I have a pending Gecko patch to use `neqo-udp` for WebRTC where this avoids copying.)
Pull request overview
Adds a new `Socket::send_buffer()` API to allow sending directly from a borrowed `&[u8]` (optionally using GSO), avoiding `Vec<u8>` allocations/copies.
Changes:
- Introduce `Socket::send_buffer()` for zero-copy sends from a borrowed buffer.
- Refactor common send logic into `try_send_transmit()` and reuse it from `send_inner()`.
- Add tests (and a shared helper) covering `send_buffer()` for both single datagrams and GSO.
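To make the borrowed-buffer idea concrete, here is a minimal sketch using only the standard library. It is not the `neqo_udp` API: `send_segments` and `loopback_demo` are hypothetical names, and the loop of individual sends merely stands in for GSO, where the kernel receives one buffer plus a segment size and does the splitting itself. The point it illustrates is that a `&[u8]` can be cut into datagram-sized sub-slices and handed to the socket without first being copied into a `Vec<u8>`.

```rust
use std::io;
use std::net::UdpSocket;

/// Hypothetical helper (an assumption, not part of neqo_udp): split a borrowed
/// buffer into segments of at most `segment_size` bytes and pass each one to
/// `send_one`. Returns the number of segments handed over.
fn send_segments<F>(data: &[u8], segment_size: usize, mut send_one: F) -> io::Result<usize>
where
    F: FnMut(&[u8]) -> io::Result<usize>,
{
    if segment_size == 0 {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "segment_size must be non-zero",
        ));
    }
    let mut segments = 0;
    // `chunks` yields sub-slices of the caller's buffer, so no payload bytes
    // are copied in user space before the send call.
    for segment in data.chunks(segment_size) {
        send_one(segment)?;
        segments += 1;
    }
    Ok(segments)
}

/// Hypothetical usage: send a stack-allocated payload over loopback UDP,
/// one datagram per segment.
fn loopback_demo() -> io::Result<usize> {
    let receiver = UdpSocket::bind("127.0.0.1:0")?;
    let sender = UdpSocket::bind("127.0.0.1:0")?;
    sender.connect(receiver.local_addr()?)?;

    let payload = [0xab_u8; 2500];
    send_segments(&payload, 1000, |segment| sender.send(segment))
}
```

Calling `loopback_demo()` from a `main` function sends the payload as three datagrams. Taking the send operation as a closure keeps the splitting logic testable without a network, which loosely mirrors the PR's idea of sharing one send path between the owned and borrowed cases.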
Can you share the patch? I suggest not merging here until the Firefox patch is in a reviewable state to reduce churn on the Neqo side.
https://github.com/larseggert/firefox/tree/feat-webrtc-neqo-udp
I'll leave this as draft per @mxinden's suggestion to first stabilize the Gecko parts.
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #3389 +/- ##
==========================================
- Coverage 94.24% 94.14% -0.10%
==========================================
Files 125 129 +4
Lines 37973 38352 +379
Branches 37973 38352 +379
==========================================
+ Hits 35787 36107 +320
- Misses 1349 1390 +41
- Partials 837 855 +18
Merging this PR will degrade performance by 8.85%
Failed Interop Tests
QUIC Interop Runner, client vs. server, differences relative to
All results
Succeeded Interop Tests
QUIC Interop Runner, client vs. server
neqo-pr as client
neqo-pr as server
Unsupported Interop Tests
QUIC Interop Runner, client vs. server
neqo-pr as client
neqo-pr as server
Benchmark results
No significant performance differences relative to 378c365.
All results
transfer/1-conn/1-100mb-resp (aka. Download)/mtu-1504: No change in performance detected.
time: [202.27 ms 202.68 ms 203.14 ms]
thrpt: [492.27 MiB/s 493.38 MiB/s 494.39 MiB/s]
change:
time: [-0.0418% +0.2813% +0.6153] (p = 0.09 > 0.05)
thrpt: [-0.6115% -0.2805% +0.0418]
No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) low mild
1 (1.00%) high severe
transfer/1-conn/10_000-parallel-1b-resp (aka. RPS)/mtu-1504: Change within noise threshold.
time: [284.22 ms 286.30 ms 288.40 ms]
thrpt: [34.674 Kelem/s 34.929 Kelem/s 35.185 Kelem/s]
change:
time: [-2.3828% -1.4224% -0.5537] (p = 0.00 < 0.05)
thrpt: [+0.5568% +1.4429% +2.4409]
Change within noise threshold.
transfer/1-conn/1-1b-resp (aka. HPS)/mtu-1504: No change in performance detected.
time: [38.458 ms 38.616 ms 38.796 ms]
thrpt: [25.776 B/s 25.896 B/s 26.002 B/s]
change:
time: [-0.6236% -0.0683% +0.5339] (p = 0.82 > 0.05)
thrpt: [-0.5310% +0.0683% +0.6275]
No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
1 (1.00%) low mild
3 (3.00%) high mild
5 (5.00%) high severe
transfer/1-conn/1-100mb-req (aka. Upload)/mtu-1504: No change in performance detected.
time: [204.49 ms 205.01 ms 205.76 ms]
thrpt: [486.00 MiB/s 487.78 MiB/s 489.02 MiB/s]
change:
time: [-0.6883% -0.3301% +0.0873] (p = 0.10 > 0.05)
thrpt: [-0.0872% +0.3312% +0.6931]
No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe
decode 4096 bytes, mask ff: No change in performance detected.
time: [4.5032 µs 4.5222 µs 4.5533 µs]
change: [-0.5059% +0.0253% +0.5572] (p = 0.93 > 0.05)
No change in performance detected.
Found 27 outliers among 100 measurements (27.00%)
1 (1.00%) low severe
5 (5.00%) low mild
8 (8.00%) high mild
13 (13.00%) high severe
decode 1048576 bytes, mask ff: No change in performance detected.
time: [1.1582 ms 1.1595 ms 1.1608 ms]
change: [-0.7521% -0.2021% +0.3459] (p = 0.47 > 0.05)
No change in performance detected.
Found 16 outliers among 100 measurements (16.00%)
10 (10.00%) low severe
4 (4.00%) high mild
2 (2.00%) high severe
decode 4096 bytes, mask 7f: No change in performance detected.
time: [5.7907 µs 5.8151 µs 5.8577 µs]
change: [-0.3733% +0.0625% +0.6873] (p = 0.84 > 0.05)
No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
3 (3.00%) high mild
3 (3.00%) high severe
decode 1048576 bytes, mask 7f: No change in performance detected.
time: [1.4866 ms 1.4887 ms 1.4910 ms]
change: [-0.0804% +0.1552% +0.3827] (p = 0.20 > 0.05)
No change in performance detected.
decode 4096 bytes, mask 3f: No change in performance detected.
time: [5.5363 µs 5.5439 µs 5.5518 µs]
change: [-0.2793% -0.0081% +0.2438] (p = 0.95 > 0.05)
No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
decode 1048576 bytes, mask 3f: No change in performance detected.
time: [1.4143 ms 1.4164 ms 1.4185 ms]
change: [-0.9743% -0.3836% +0.0324] (p = 0.15 > 0.05)
No change in performance detected.
streams/simulated/1-streams/each-1000-bytes: No change in performance detected.
time: [129.68 ms 129.68 ms 129.69 ms]
thrpt: [7.5302 KiB/s 7.5304 KiB/s 7.5306 KiB/s]
change:
time: [-0.0026% +0.0011% +0.0048] (p = 0.56 > 0.05)
thrpt: [-0.0048% -0.0011% +0.0026]
No change in performance detected.
streams/simulated/1000-streams/each-1-bytes: No change in performance detected.
time: [2.5363 s 2.5366 s 2.5369 s]
thrpt: [394.18 B/s 394.23 B/s 394.27 B/s]
change:
time: [-0.0146% +0.0014% +0.0183] (p = 0.87 > 0.05)
thrpt: [-0.0183% -0.0014% +0.0146]
No change in performance detected.
streams/simulated/1000-streams/each-1000-bytes: No change in performance detected.
time: [6.5837 s 6.5899 s 6.5973 s]
thrpt: [148.02 KiB/s 148.19 KiB/s 148.33 KiB/s]
change:
time: [-0.1588% +0.0037% +0.1669] (p = 0.96 > 0.05)
thrpt: [-0.1666% -0.0037% +0.1591]
No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high severe
streams/walltime/1-streams/each-1000-bytes: No change in performance detected.
time: [587.40 µs 590.07 µs 593.03 µs]
change: [-0.1180% +0.5209% +1.1472] (p = 0.11 > 0.05)
No change in performance detected.
Found 13 outliers among 100 measurements (13.00%)
13 (13.00%) high severe
streams/walltime/1000-streams/each-1-bytes: Change within noise threshold.
time: [12.396 ms 12.415 ms 12.436 ms]
change: [-0.4607% -0.2415% -0.0077] (p = 0.04 < 0.05)
Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
streams/walltime/1000-streams/each-1000-bytes: Change within noise threshold.
time: [45.164 ms 45.216 ms 45.272 ms]
change: [-1.4562% -1.1203% -0.8833] (p = 0.00 < 0.05)
Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
1 (1.00%) low mild
3 (3.00%) high mild
3 (3.00%) high severe
coalesce_acked_from_zero 1+1 entries: No change in performance detected.
time: [92.121 ns 92.513 ns 92.938 ns]
change: [-0.6918% -0.1398% +0.4273] (p = 0.64 > 0.05)
No change in performance detected.
Found 12 outliers among 100 measurements (12.00%)
9 (9.00%) high mild
3 (3.00%) high severe
coalesce_acked_from_zero 3+1 entries: No change in performance detected.
time: [109.75 ns 110.05 ns 110.38 ns]
change: [-3.5864% -0.9570% +0.8453] (p = 0.55 > 0.05)
No change in performance detected.
Found 13 outliers among 100 measurements (13.00%)
1 (1.00%) low mild
12 (12.00%) high severe
coalesce_acked_from_zero 10+1 entries: No change in performance detected.
time: [109.51 ns 110.19 ns 110.96 ns]
change: [-1.2250% -0.2052% +0.6894] (p = 0.70 > 0.05)
No change in performance detected.
Found 18 outliers among 100 measurements (18.00%)
5 (5.00%) low severe
4 (4.00%) low mild
9 (9.00%) high severe
coalesce_acked_from_zero 1000+1 entries: No change in performance detected.
time: [94.525 ns 94.666 ns 94.824 ns]
change: [-6.4226% -2.1470% +0.3715] (p = 0.39 > 0.05)
No change in performance detected.
Found 10 outliers among 100 measurements (10.00%)
5 (5.00%) high mild
5 (5.00%) high severe
RxStreamOrderer::inbound_frame(): Change within noise threshold.
time: [109.10 ms 109.25 ms 109.44 ms]
change: [+0.8274% +1.1397% +1.3990] (p = 0.00 < 0.05)
Change within noise threshold.
Found 9 outliers among 100 measurements (9.00%)
4 (4.00%) low mild
4 (4.00%) high mild
1 (1.00%) high severe
sent::Packets::take_ranges: No change in performance detected.
time: [4.4145 µs 4.4906 µs 4.5548 µs]
change: [-5.1414% -2.4734% +0.3416] (p = 0.08 > 0.05)
No change in performance detected.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
transfer/simulated/pacing-false/varying-seeds: No change in performance detected.
time: [23.941 s 23.941 s 23.941 s]
thrpt: [171.09 KiB/s 171.09 KiB/s 171.09 KiB/s]
change:
time: [+0.0000% +0.0000% +0.0000] (p = NaN > 0.05)
thrpt: [+0.0000% +0.0000% +0.0000]
No change in performance detected.
transfer/simulated/pacing-true/varying-seeds: No change in performance detected.
time: [23.676 s 23.676 s 23.676 s]
thrpt: [173.01 KiB/s 173.01 KiB/s 173.01 KiB/s]
change:
time: [+0.0000% +0.0000% +0.0000] (p = NaN > 0.05)
thrpt: [+0.0000% +0.0000% +0.0000]
No change in performance detected.
transfer/simulated/pacing-false/same-seed: No change in performance detected.
time: [23.941 s 23.941 s 23.941 s]
thrpt: [171.09 KiB/s 171.09 KiB/s 171.09 KiB/s]
change:
time: [+0.0000% +0.0000% +0.0000] (p = NaN > 0.05)
thrpt: [+0.0000% +0.0000% +0.0000]
No change in performance detected.
transfer/simulated/pacing-true/same-seed: No change in performance detected.
time: [23.676 s 23.676 s 23.676 s]
thrpt: [173.01 KiB/s 173.01 KiB/s 173.01 KiB/s]
change:
time: [+0.0000% +0.0000% +0.0000] (p = NaN > 0.05)
thrpt: [+0.0000% +0.0000% +0.0000]
No change in performance detected.
transfer/walltime/pacing-false/varying-seeds: Change within noise threshold.
time: [23.219 ms 23.236 ms 23.253 ms]
change: [-0.5258% -0.4257% -0.3273] (p = 0.00 < 0.05)
Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe
transfer/walltime/pacing-true/varying-seeds: Change within noise threshold.
time: [24.072 ms 24.089 ms 24.106 ms]
change: [+1.0423% +1.2498% +1.4033] (p = 0.00 < 0.05)
Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
transfer/walltime/pacing-false/same-seed: Change within noise threshold.
time: [23.525 ms 23.544 ms 23.563 ms]
change: [+0.5217% +0.6665% +0.7991] (p = 0.00 < 0.05)
Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
transfer/walltime/pacing-true/same-seed: Change within noise threshold.
time: [24.106 ms 24.122 ms 24.140 ms]
change: [-0.6114% -0.3892% -0.2316] (p = 0.00 < 0.05)
Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
4 (4.00%) high mild
1 (1.00%) high severe
Client/server transfer results
Performance differences relative to 378c365. Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.
Table above only shows statistically significant changes. See all results below.
All results