refactor(transport/cubic): only convert max_datagram_size_f64 once#2994
refactor(transport/cubic): only convert max_datagram_size_f64 once#2994omansfeld merged 2 commits intomozilla:mainfrom
max_datagram_size_f64 once#2994Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #2994 +/- ##
==========================================
- Coverage 93.36% 93.34% -0.02%
==========================================
Files 123 123
Lines 35696 35697 +1
Branches 35696 35697 +1
==========================================
- Hits 33327 33321 -6
- Misses 1524 1534 +10
+ Partials 845 842 -3
|
|
| Branch | cubic_mss_f64 |
| Testbed | On-prem |
Click to view all benchmark results
| Benchmark | Latency | Benchmark Result nanoseconds (ns) (Result Δ%) | Upper Boundary nanoseconds (ns) (Limit %) |
|---|---|---|---|
| 1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client | 📈 view plot 🚷 view threshold | 208,900,000.00 ns(-1.64%)Baseline: 212,382,537.31 ns | 218,333,921.10 ns (95.68%) |
| 1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client | 📈 view plot 🚷 view threshold | 200,770,000.00 ns(-3.11%)Baseline: 207,220,149.25 ns | 213,166,494.59 ns (94.18%) |
| 1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client | 📈 view plot 🚷 view threshold | 28,551,000.00 ns(+1.11%)Baseline: 28,238,089.55 ns | 28,731,622.06 ns (99.37%) |
| 1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client | 📈 view plot 🚷 view threshold | 295,100,000.00 ns(+0.64%)Baseline: 293,236,417.91 ns | 299,314,514.73 ns (98.59%) |
| 1-streams/each-1000-bytes/simulated-time | 📈 view plot 🚷 view threshold | 119,140,000.00 ns(+1.75%)Baseline: 117,092,686.57 ns | 119,165,504.61 ns (99.98%) |
| 1-streams/each-1000-bytes/wallclock-time | 📈 view plot 🚷 view threshold | 597,190.00 ns(-1.96%)Baseline: 609,148.36 ns | 629,905.49 ns (94.81%) |
| 1000-streams/each-1-bytes/simulated-time | 📈 view plot 🚷 view threshold | 14,990,000,000.00 ns(+0.01%)Baseline: 14,987,835,820.90 ns | 15,006,429,992.32 ns (99.89%) |
| 1000-streams/each-1-bytes/wallclock-time | 📈 view plot 🚷 view threshold | 14,062,000.00 ns(-3.35%)Baseline: 14,548,955.22 ns | 15,061,860.75 ns (93.36%) |
| 1000-streams/each-1000-bytes/simulated-time | 📈 view plot 🚷 view threshold | 18,931,000,000.00 ns(+0.22%)Baseline: 18,889,223,880.60 ns | 19,121,446,626.32 ns (99.00%) |
| 1000-streams/each-1000-bytes/wallclock-time | 📈 view plot 🚷 view threshold | 49,631,000.00 ns(-10.26%)Baseline: 55,308,059.70 ns | 60,233,922.84 ns (82.40%) |
| RxStreamOrderer::inbound_frame() | 📈 view plot 🚷 view threshold | 108,680,000.00 ns(-0.60%)Baseline: 109,334,776.12 ns | 110,803,576.63 ns (98.08%) |
| coalesce_acked_from_zero 1+1 entries | 📈 view plot 🚷 view threshold | 89.19 ns(+0.68%)Baseline: 88.58 ns | 89.23 ns (99.96%) |
| coalesce_acked_from_zero 10+1 entries | 📈 view plot 🚷 view threshold | 106.67 ns(+0.47%)Baseline: 106.17 ns | 107.03 ns (99.66%) |
| coalesce_acked_from_zero 1000+1 entries | 📈 view plot 🚷 view threshold | 90.06 ns(+0.16%)Baseline: 89.91 ns | 93.74 ns (96.07%) |
| coalesce_acked_from_zero 3+1 entries | 📈 view plot 🚷 view threshold | 106.90 ns(+0.21%)Baseline: 106.68 ns | 107.45 ns (99.49%) |
| decode 1048576 bytes, mask 3f | 📈 view plot 🚷 view threshold | 1,590,500.00 ns(-0.13%)Baseline: 1,592,605.97 ns | 1,599,666.11 ns (99.43%) |
| decode 1048576 bytes, mask 7f | 📈 view plot 🚷 view threshold | 5,045,300.00 ns(-0.34%)Baseline: 5,062,635.82 ns | 5,078,360.48 ns (99.35%) |
| decode 1048576 bytes, mask ff | 📈 view plot 🚷 view threshold | 3,029,400.00 ns(-0.14%)Baseline: 3,033,573.13 ns | 3,046,332.33 ns (99.44%) |
| decode 4096 bytes, mask 3f | 📈 view plot 🚷 view threshold | 8,269.30 ns(-0.30%)Baseline: 8,294.57 ns | 8,338.97 ns (99.16%) |
| decode 4096 bytes, mask 7f | 📈 view plot 🚷 view threshold | 20,025.00 ns(+0.10%)Baseline: 20,004.60 ns | 20,092.64 ns (99.66%) |
| decode 4096 bytes, mask ff | 📈 view plot 🚷 view threshold | 11,843.00 ns(-0.08%)Baseline: 11,852.07 ns | 11,923.49 ns (99.32%) |
| sent::Packets::take_ranges | 📈 view plot 🚷 view threshold | 4,791.80 ns(-0.84%)Baseline: 4,832.37 ns | 5,008.53 ns (95.67%) |
| transfer/pacing-false/same-seed/simulated-time/run | 📈 view plot 🚷 view threshold | 25,152,000,000.00 ns | |
| transfer/pacing-false/same-seed/wallclock-time/run | 📈 view plot 🚷 view threshold | 25,807,000.00 ns(-0.49%)Baseline: 25,934,200.00 ns | 26,667,573.27 ns (96.77%) |
| transfer/pacing-false/varying-seeds/simulated-time/run | 📈 view plot 🚷 view threshold | 25,158,000,000.00 ns(-0.03%)Baseline: 25,165,646,153.85 ns | 25,209,363,767.86 ns (99.80%) |
| transfer/pacing-false/varying-seeds/wallclock-time/run | 📈 view plot 🚷 view threshold | 26,261,000.00 ns(-0.09%)Baseline: 26,285,815.38 ns | 26,894,554.10 ns (97.64%) |
| transfer/pacing-true/same-seed/simulated-time/run | 📈 view plot 🚷 view threshold | 25,588,000,000.00 ns | |
| transfer/pacing-true/same-seed/wallclock-time/run | 📈 view plot 🚷 view threshold | 27,225,000.00 ns(-0.86%)Baseline: 27,461,876.92 ns | 28,234,730.66 ns (96.42%) |
| transfer/pacing-true/varying-seeds/simulated-time/run | 📈 view plot 🚷 view threshold | 25,004,000,000.00 ns(+0.04%)Baseline: 24,994,892,307.69 ns | 25,041,476,466.55 ns (99.85%) |
| transfer/pacing-true/varying-seeds/wallclock-time/run | 📈 view plot 🚷 view threshold | 27,050,000.00 ns(+0.64%)Baseline: 26,877,400.00 ns | 27,528,709.49 ns (98.26%) |
|
Quick general question about our github workflow: Should I add patches to the merge queue myself once they are approved and comments are addressed (like here now), or is that done by the reviewer? (also just noting, there might be merge conflicts with #2970 so they probably shouldn't both be added to the queue at the same time) |
Failed Interop TestsQUIC Interop Runner, client vs. server, differences relative to 5d6e774. neqo-latest as client
neqo-latest as server
All resultsSucceeded Interop TestsQUIC Interop Runner, client vs. server neqo-latest as client
neqo-latest as server
Unsupported Interop TestsQUIC Interop Runner, client vs. server neqo-latest as client
neqo-latest as server
|
Client/server transfer resultsPerformance differences relative to 8441e29. Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.
Download data for |
We don't really have a fixed policy. Sometimes the reviewer will, but esp. when I think that some nits may still be fixed I will just approve and leave it for the author to enqueue. |
Well, whichever is second will need a rebase :-) |
Benchmark resultsPerformance differences relative to 8441e29. 1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: Change within noise threshold. time: [200.32 ms 200.77 ms 201.34 ms]
thrpt: [496.67 MiB/s 498.08 MiB/s 499.20 MiB/s]
change:
time: [+0.1838% +0.7369% +1.1839%] (p = 0.00 < 0.05)
thrpt: [−1.1700% −0.7315% −0.1835%]
1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: Change within noise threshold. time: [293.60 ms 295.10 ms 296.60 ms]
thrpt: [33.715 Kelem/s 33.887 Kelem/s 34.060 Kelem/s]
change:
time: [−1.8996% −1.1990% −0.4718%] (p = 0.00 < 0.05)
thrpt: [+0.4741% +1.2135% +1.9364%]
1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: No change in performance detected. time: [28.455 ms 28.551 ms 28.667 ms]
thrpt: [34.883 B/s 35.025 B/s 35.143 B/s]
change:
time: [−0.7152% −0.1500% +0.4157%] (p = 0.60 > 0.05)
thrpt: [−0.4139% +0.1503% +0.7204%]
1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client: 💔 Performance has regressed. time: [208.71 ms 208.90 ms 209.10 ms]
thrpt: [478.23 MiB/s 478.69 MiB/s 479.14 MiB/s]
change:
time: [+1.9063% +2.1734% +2.3833%] (p = 0.00 < 0.05)
thrpt: [−2.3278% −2.1272% −1.8707%]
decode 4096 bytes, mask ff: No change in performance detected. time: [11.809 µs 11.843 µs 11.885 µs]
change: [−0.7498% +0.1343% +1.1011%] (p = 0.78 > 0.05)
decode 1048576 bytes, mask ff: No change in performance detected. time: [3.0200 ms 3.0294 ms 3.0405 ms]
change: [−0.5349% −0.0327% +0.4811%] (p = 0.90 > 0.05)
decode 4096 bytes, mask 7f: No change in performance detected. time: [19.963 µs 20.025 µs 20.091 µs]
change: [−0.7817% −0.2284% +0.2363%] (p = 0.41 > 0.05)
decode 1048576 bytes, mask 7f: No change in performance detected. time: [5.0342 ms 5.0453 ms 5.0581 ms]
change: [−0.5843% −0.1787% +0.1991%] (p = 0.38 > 0.05)
decode 4096 bytes, mask 3f: Change within noise threshold. time: [8.2496 µs 8.2693 µs 8.2988 µs]
change: [−1.0909% −0.5736% −0.1330%] (p = 0.02 < 0.05)
decode 1048576 bytes, mask 3f: No change in performance detected. time: [1.5850 ms 1.5905 ms 1.5974 ms]
change: [−0.4693% +0.0563% +0.5033%] (p = 0.87 > 0.05)
1-streams/each-1000-bytes/wallclock-time: No change in performance detected. time: [595.01 µs 597.19 µs 599.68 µs]
change: [−0.0149% +0.5289% +1.0658%] (p = 0.06 > 0.05)
1000-streams/each-1-bytes/wallclock-time: Change within noise threshold. time: [14.009 ms 14.062 ms 14.136 ms]
change: [−1.4241% −0.9946% −0.4497%] (p = 0.00 < 0.05)
1000-streams/each-1000-bytes/wallclock-time: 💚 Performance has improved. time: [49.438 ms 49.631 ms 49.826 ms]
change: [−4.0619% −3.5237% −3.0129%] (p = 0.00 < 0.05)
1000-streams/each-1000-bytes/simulated-time: No change in performance detected. time: [18.763 s 18.931 s 19.098 s]
thrpt: [51.135 KiB/s 51.585 KiB/s 52.048 KiB/s]
change:
time: [−1.0872% +0.0965% +1.4040%] (p = 0.88 > 0.05)
thrpt: [−1.3845% −0.0964% +1.0992%]
coalesce_acked_from_zero 1+1 entries: No change in performance detected. time: [88.654 ns 89.188 ns 89.889 ns]
change: [−0.4586% +0.3650% +1.4553%] (p = 0.51 > 0.05)
coalesce_acked_from_zero 3+1 entries: No change in performance detected. time: [106.62 ns 106.90 ns 107.21 ns]
change: [−12.249% −4.6644% +0.3526%] (p = 0.23 > 0.05)
coalesce_acked_from_zero 10+1 entries: No change in performance detected. time: [106.21 ns 106.67 ns 107.22 ns]
change: [−0.3342% +0.1232% +0.5627%] (p = 0.59 > 0.05)
coalesce_acked_from_zero 1000+1 entries: No change in performance detected. time: [89.928 ns 90.057 ns 90.201 ns]
change: [−0.9721% +0.1828% +1.3566%] (p = 0.77 > 0.05)
RxStreamOrderer::inbound_frame(): Change within noise threshold. time: [108.62 ms 108.68 ms 108.75 ms]
change: [−1.0925% −0.9721% −0.8536%] (p = 0.00 < 0.05)
sent::Packets::take_ranges: No change in performance detected. time: [4.6749 µs 4.7918 µs 4.9071 µs]
change: [−3.1827% +0.1711% +3.3470%] (p = 0.92 > 0.05)
transfer/pacing-false/varying-seeds/wallclock-time/run: Change within noise threshold. time: [26.212 ms 26.261 ms 26.319 ms]
change: [−2.1860% −1.9174% −1.6308%] (p = 0.00 < 0.05)
transfer/pacing-false/varying-seeds/simulated-time/run: No change in performance detected. time: [25.126 s 25.158 s 25.191 s]
thrpt: [162.60 KiB/s 162.81 KiB/s 163.02 KiB/s]
change:
time: [−0.1780% +0.0256% +0.2292%] (p = 0.81 > 0.05)
thrpt: [−0.2287% −0.0256% +0.1783%]
transfer/pacing-true/varying-seeds/wallclock-time/run: No change in performance detected. time: [26.991 ms 27.050 ms 27.110 ms]
change: [−0.2704% +0.0805% +0.4207%] (p = 0.64 > 0.05)
transfer/pacing-true/varying-seeds/simulated-time/run: No change in performance detected. time: [24.970 s 25.004 s 25.038 s]
thrpt: [163.59 KiB/s 163.82 KiB/s 164.04 KiB/s]
change:
time: [−0.1177% +0.0914% +0.3110%] (p = 0.41 > 0.05)
thrpt: [−0.3100% −0.0913% +0.1179%]
transfer/pacing-false/same-seed/wallclock-time/run: Change within noise threshold. time: [25.782 ms 25.807 ms 25.835 ms]
change: [−2.7894% −2.5904% −2.4195%] (p = 0.00 < 0.05)
transfer/pacing-false/same-seed/simulated-time/run: No change in performance detected. time: [25.152 s 25.152 s 25.152 s]
thrpt: [162.85 KiB/s 162.85 KiB/s 162.85 KiB/s]
change:
time: [+0.0000% +0.0000% +0.0000%] (p = NaN > 0.05)
thrpt: [+0.0000% +0.0000% +0.0000%]
transfer/pacing-true/same-seed/wallclock-time/run: Change within noise threshold. time: [27.197 ms 27.225 ms 27.254 ms]
change: [−2.4904% −2.3264% −2.1759%] (p = 0.00 < 0.05)
transfer/pacing-true/same-seed/simulated-time/run: No change in performance detected. time: [25.588 s 25.588 s 25.588 s]
thrpt: [160.07 KiB/s 160.07 KiB/s 160.07 KiB/s]
change:
time: [+0.0000% +0.0000% +0.0000%] (p = NaN > 0.05)
thrpt: [+0.0000% +0.0000% +0.0000%]
Download data for |
As discussed with @mxinden I'm splitting out some even smaller PRs from #2967 that are purely refactors.
This one calls
convert_to_f64(max_datagram_size)once and then uses that value in all other function calls and throughoutbytes_for_cwnd_increaseinstead of converting on each use.