feat: Use a BinaryHeap for the RxStreamOrderer #3054
larseggert wants to merge 3 commits into mozilla:main from
Claude thinks this is faster.
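To make the idea concrete, here is a minimal sketch of a heap-backed frame orderer. It is not the code in this PR and not neqo's actual `RxStreamOrderer`: the type `HeapOrderer`, its fields, and the method bodies are hypothetical and only illustrate the general technique of buffering frames in a min-heap keyed by stream offset.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical, simplified stand-in for a receive-side stream orderer.
/// This is an illustration only, not neqo's RxStreamOrderer.
#[derive(Default)]
pub struct HeapOrderer {
    /// Min-heap of (offset, data). `Reverse` flips std's max-heap ordering.
    pending: BinaryHeap<Reverse<(u64, Vec<u8>)>>,
    /// Next byte offset the reader expects.
    next_offset: u64,
}

impl HeapOrderer {
    /// Buffer a frame. The push is O(log n); nothing is merged on insert.
    pub fn inbound_frame(&mut self, offset: u64, data: &[u8]) {
        let end = offset + data.len() as u64;
        if data.is_empty() || end <= self.next_offset {
            return; // Empty, or entirely before the read position.
        }
        self.pending.push(Reverse((offset, data.to_vec())));
    }

    /// Append all contiguous bytes starting at `next_offset` to `out`.
    /// Returns the number of bytes appended.
    pub fn read_to_end(&mut self, out: &mut Vec<u8>) -> usize {
        let start_len = out.len();
        while let Some(Reverse((offset, _))) = self.pending.peek() {
            if *offset > self.next_offset {
                break; // Gap: the lowest buffered frame is not readable yet.
            }
            let Reverse((offset, data)) = self.pending.pop().expect("peeked above");
            let end = offset + data.len() as u64;
            if end > self.next_offset {
                // Skip any prefix that overlaps bytes already delivered.
                let skip = (self.next_offset - offset) as usize;
                out.extend_from_slice(&data[skip..]);
                self.next_offset = end;
            }
        }
        out.len() - start_len
    }
}
```

In this sketch, overlaps are resolved lazily at read time rather than on insert, which keeps `inbound_frame` cheap but can buffer redundant bytes. Whether the PR makes the same trade-off is not stated here.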
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #3054 +/- ##
==========================================
- Coverage 93.41% 92.69% -0.72%
==========================================
Files 124 125 +1
Lines 36234 36401 +167
Branches 36234 36401 +167
==========================================
- Hits 33847 33742 -105
- Misses 1540 1812 +272
Partials 847 847
Benchmark results
Performance differences relative to e94d8c6.
1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: No change in performance detected. time: [198.69 ms 198.96 ms 199.25 ms]
thrpt: [501.88 MiB/s 502.61 MiB/s 503.30 MiB/s]
change:
time: [−0.3215% −0.1337% +0.0589%] (p = 0.18 > 0.05)
thrpt: [−0.0589% +0.1338% +0.3225%]
1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: 💚 Performance has improved. time: [275.93 ms 277.73 ms 279.57 ms]
thrpt: [35.769 Kelem/s 36.006 Kelem/s 36.241 Kelem/s]
change:
time: [−3.4078% −2.6205% −1.7568%] (p = 0.00 < 0.05)
thrpt: [+1.7882% +2.6910% +3.5280%]
1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: No change in performance detected. time: [28.391 ms 28.466 ms 28.568 ms]
thrpt: [35.005 B/s 35.130 B/s 35.223 B/s]
change:
time: [−0.7924% −0.2954% +0.1849%] (p = 0.24 > 0.05)
thrpt: [−0.1845% +0.2963% +0.7988%]
1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client: Change within noise threshold. time: [200.33 ms 200.51 ms 200.69 ms]
thrpt: [498.27 MiB/s 498.74 MiB/s 499.19 MiB/s]
change:
time: [−1.2257% −0.9978% −0.8160%] (p = 0.00 < 0.05)
thrpt: [+0.8227% +1.0078% +1.2409%]
decode 4096 bytes, mask ff: No change in performance detected. time: [11.598 µs 11.634 µs 11.678 µs]
change: [−0.0902% +0.3590% +0.8599%] (p = 0.14 > 0.05)
decode 1048576 bytes, mask ff: No change in performance detected. time: [3.0214 ms 3.0306 ms 3.0415 ms]
change: [−0.4590% +0.0184% +0.4968%] (p = 0.91 > 0.05)
decode 4096 bytes, mask 7f: No change in performance detected. time: [19.963 µs 20.031 µs 20.114 µs]
change: [−0.1766% +0.1347% +0.4714%] (p = 0.44 > 0.05)
decode 1048576 bytes, mask 7f: No change in performance detected. time: [5.0437 ms 5.0585 ms 5.0766 ms]
change: [−0.6053% −0.1050% +0.3628%] (p = 0.68 > 0.05)
decode 4096 bytes, mask 3f: No change in performance detected. time: [8.2578 µs 8.2928 µs 8.3415 µs]
change: [−1.0254% −0.2539% +0.4810%] (p = 0.54 > 0.05)
decode 1048576 bytes, mask 3f: No change in performance detected. time: [1.5865 ms 1.5921 ms 1.5991 ms]
change: [−0.9581% −0.2435% +0.4217%] (p = 0.51 > 0.05)
1-streams/each-1000-bytes/wallclock-time: Change within noise threshold. time: [585.30 µs 588.12 µs 591.20 µs]
change: [+0.2303% +0.8014% +1.4096%] (p = 0.00 < 0.05)
1000-streams/each-1-bytes/wallclock-time: 💚 Performance has improved. time: [13.014 ms 13.054 ms 13.114 ms]
change: [−3.7565% −3.3931% −2.9509%] (p = 0.00 < 0.05)
1000-streams/each-1000-bytes/wallclock-time: 💚 Performance has improved. time: [45.996 ms 46.120 ms 46.247 ms]
change: [−3.7940% −3.3373% −2.8882%] (p = 0.00 < 0.05)
coalesce_acked_from_zero 1+1 entries: No change in performance detected. time: [88.177 ns 88.543 ns 88.901 ns]
change: [−0.9551% −0.2383% +0.3745%] (p = 0.50 > 0.05)
coalesce_acked_from_zero 3+1 entries: No change in performance detected. time: [105.88 ns 106.40 ns 107.02 ns]
change: [−0.1757% +0.5561% +1.6875%] (p = 0.32 > 0.05)
coalesce_acked_from_zero 10+1 entries: No change in performance detected. time: [105.08 ns 105.39 ns 105.78 ns]
change: [−1.7450% −0.7185% +0.0000%] (p = 0.11 > 0.05)
coalesce_acked_from_zero 1000+1 entries: No change in performance detected. time: [89.361 ns 90.527 ns 93.125 ns]
change: [−5.4906% +6.4286% +25.002%] (p = 0.63 > 0.05)
RxStreamOrderer::inbound_frame(): 💚 Performance has improved. time: [99.570 ms 99.634 ms 99.702 ms]
change: [−8.0734% −7.9411% −7.8197%] (p = 0.00 < 0.05)
BTreeMap: in-order frames
time: [920.99 µs 923.19 µs 927.32 µs]
Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) high mild
2 (2.00%) high severe
BinaryHeap: in-order frames
time: [96.227 µs 96.318 µs 96.409 µs]
Found 5 outliers among 100 measurements (5.00%)
2 (2.00%) high mild
3 (3.00%) high severe
BTreeMap: reverse-order frames
time: [187.10 µs 187.27 µs 187.45 µs]
Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe
BinaryHeap: reverse-order frames
time: [140.21 µs 140.37 µs 140.52 µs]
Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe
BTreeMap: random-order frames
time: [341.23 µs 341.59 µs 341.96 µs]
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
BinaryHeap: random-order frames
time: [131.21 µs 131.39 µs 131.64 µs]
Found 4 outliers among 100 measurements (4.00%)
2 (2.00%) high mild
2 (2.00%) high severe
BTreeMap: frames with gaps
time: [113.75 µs 113.86 µs 113.98 µs]
Found 5 outliers among 100 measurements (5.00%)
4 (4.00%) high mild
1 (1.00%) high severe
BinaryHeap: frames with gaps
time: [57.041 µs 57.225 µs 57.502 µs]
Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) high mild
2 (2.00%) high severe
BTreeMap: overlapping frames
time: [169.49 µs 169.70 µs 169.91 µs]
Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe
BinaryHeap: overlapping frames
time: [115.80 µs 115.99 µs 116.20 µs]
Found 8 outliers among 100 measurements (8.00%)
4 (4.00%) high mild
4 (4.00%) high severe
BTreeMap: read_to_end after in-order insert
time: [301.84 µs 302.09 µs 302.34 µs]
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
BinaryHeap: read_to_end after in-order insert
time: [256.56 µs 256.80 µs 257.05 µs]
Found 5 outliers among 100 measurements (5.00%)
3 (3.00%) high mild
2 (2.00%) high severe
varying_frame_counts/BTreeMap/100
time: [12.383 µs 12.396 µs 12.408 µs]
Found 5 outliers among 100 measurements (5.00%)
4 (4.00%) high mild
1 (1.00%) high severe
varying_frame_counts/BinaryHeap/100
time: [10.120 µs 10.129 µs 10.139 µs]
Found 6 outliers among 100 measurements (6.00%)
5 (5.00%) high mild
1 (1.00%) high severe
varying_frame_counts/BTreeMap/500
time: [96.575 µs 96.647 µs 96.715 µs]
Found 6 outliers among 100 measurements (6.00%)
4 (4.00%) high mild
2 (2.00%) high severe
varying_frame_counts/BinaryHeap/500
time: [57.504 µs 57.561 µs 57.618 µs]
Found 6 outliers among 100 measurements (6.00%)
3 (3.00%) high mild
3 (3.00%) high severe
varying_frame_counts/BTreeMap/1000
time: [203.39 µs 204.17 µs 205.58 µs]
Found 7 outliers among 100 measurements (7.00%)
4 (4.00%) high mild
3 (3.00%) high severe
varying_frame_counts/BinaryHeap/1000
time: [114.71 µs 114.83 µs 114.97 µs]
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
varying_frame_counts/BTreeMap/5000
time: [4.2644 ms 4.2704 ms 4.2766 ms]
varying_frame_counts/BinaryHeap/5000
time: [3.8359 ms 3.8412 ms 3.8469 ms]
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
varying_frame_counts/BTreeMap/10000
time: [9.6113 ms 9.6247 ms 9.6390 ms]
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
varying_frame_counts/BinaryHeap/10000
time: [8.9493 ms 8.9783 ms 9.0254 ms]
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe
sent::Packets::take_ranges: No change in performance detected. time: [4.6060 µs 4.7374 µs 4.8735 µs]
change: [−3.9980% +0.0194% +4.1328%] (p = 0.99 > 0.05)
transfer/pacing-false/varying-seeds/wallclock-time/run: 💚 Performance has improved. time: [23.813 ms 23.859 ms 23.913 ms]
change: [−5.7948% −5.5582% −5.3053%] (p = 0.00 < 0.05)
transfer/pacing-false/varying-seeds/simulated-time/run: No change in performance detected. time: [25.183 s 25.218 s 25.254 s]
thrpt: [162.19 KiB/s 162.42 KiB/s 162.65 KiB/s]
change:
time: [−0.1044% +0.0844% +0.2695%] (p = 0.38 > 0.05)
thrpt: [−0.2688% −0.0844% +0.1045%]
transfer/pacing-true/varying-seeds/wallclock-time/run: Change within noise threshold. time: [24.760 ms 24.819 ms 24.880 ms]
change: [−3.4095% −3.0389% −2.7060%] (p = 0.00 < 0.05)
transfer/pacing-true/varying-seeds/simulated-time/run: Change within noise threshold. time: [24.897 s 24.936 s 24.976 s]
thrpt: [164.00 KiB/s 164.26 KiB/s 164.51 KiB/s]
change:
time: [−0.5599% −0.3266% −0.0848%] (p = 0.01 < 0.05)
thrpt: [+0.0848% +0.3277% +0.5630%]
transfer/pacing-false/same-seed/wallclock-time/run: Change within noise threshold. time: [24.728 ms 24.754 ms 24.792 ms]
change: [−3.0443% −2.8605% −2.6565%] (p = 0.00 < 0.05)
transfer/pacing-false/same-seed/simulated-time/run: No change in performance detected. time: [25.710 s 25.710 s 25.710 s]
thrpt: [159.31 KiB/s 159.31 KiB/s 159.31 KiB/s]
change:
time: [+0.0000% +0.0000% +0.0000%] (p = NaN > 0.05)
thrpt: [+0.0000% +0.0000% +0.0000%]
transfer/pacing-true/same-seed/wallclock-time/run: Change within noise threshold. time: [25.735 ms 25.770 ms 25.822 ms]
change: [−2.7261% −2.5468% −2.3496%] (p = 0.00 < 0.05)
transfer/pacing-true/same-seed/simulated-time/run: No change in performance detected. time: [25.675 s 25.675 s 25.675 s]
thrpt: [159.53 KiB/s 159.53 KiB/s 159.53 KiB/s]
change:
time: [+0.0000% +0.0000% +0.0000%] (p = NaN > 0.05)
thrpt: [+0.0000% +0.0000% +0.0000%]
    group.finish();
}

criterion_group!(
That's an awful lot of benchmarking.
Yeah, no intention of leaving this in if there is a signal that the core change is at all positive.
Signed-off-by: Lars Eggert <lars@eggert.org>
Failed Interop Tests
QUIC Interop Runner, client vs. server, differences relative to b9c32c7.
neqo-latest as client
neqo-latest as server
All results
Succeeded Interop Tests
QUIC Interop Runner, client vs. server
neqo-latest as client
neqo-latest as server
Unsupported Interop Tests
QUIC Interop Runner, client vs. server
neqo-latest as client
neqo-latest as server
Client/server transfer results
Performance differences relative to b9c32c7. Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.
It's not clear that this is faster. (If you want to compare the benchmarks, don't you have to run those on main? How does that work?)
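For context on reading the numbers: the "change" lines in the benchmark results are reported relative to e94d8c6, whereas the paired BTreeMap/BinaryHeap entries carry no "change" line and appear to time both implementations side by side in the same run. Under that assumption, one rough way to compare each pair is the ratio of the reported medians. The sketch below does only that arithmetic; the function and constant names are made up for illustration, and the values are copied from the benchmark output above (the 10000-frame case converted from ms to µs).

```rust
/// Ratio of two median timings in the same unit.
/// A result above 1.0 means the candidate took less time than the baseline.
pub fn speedup(baseline: f64, candidate: f64) -> f64 {
    baseline / candidate
}

/// (benchmark, BTreeMap median in µs, BinaryHeap median in µs),
/// copied from the benchmark output in this conversation.
pub const MEDIANS_US: [(&str, f64, f64); 7] = [
    ("in-order frames", 923.19, 96.318),
    ("reverse-order frames", 187.27, 140.37),
    ("random-order frames", 341.59, 131.39),
    ("frames with gaps", 113.86, 57.225),
    ("overlapping frames", 169.70, 115.99),
    ("read_to_end after in-order insert", 302.09, 256.80),
    ("varying_frame_counts/10000", 9624.7, 8978.3),
];

/// One formatted line per benchmark pair, e.g. "name: 1.23x".
pub fn report() -> Vec<String> {
    MEDIANS_US
        .iter()
        .map(|(name, btree, heap)| format!("{}: {:.2}x", name, speedup(*btree, *heap)))
        .collect()
}
```

Printing each line of `report()` gives the per-case ratios. On these medians the in-order case comes out at roughly 9.6x while the 10000-frame case is only about 1.07x, so the size of the microbenchmark gain depends heavily on the workload. A ratio of medians also ignores the confidence intervals, so it is a reading aid rather than a significance test.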
Claude thinks this is faster.