
refactor(transport/cubic): only convert max_datagram_size_f64 once #2994

Merged
omansfeld merged 2 commits into mozilla:main from omansfeld:cubic_mss_f64
Sep 17, 2025

Conversation

@omansfeld
Collaborator

As discussed with @mxinden, I'm splitting out some even smaller PRs from #2967 that are purely refactors.

This one calls convert_to_f64(max_datagram_size) once and then uses that value in all other function calls and throughout bytes_for_cwnd_increase instead of converting on each use.
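The shape of this refactor can be sketched as follows. This is a minimal illustration, not the code from the PR: only the names `convert_to_f64`, `max_datagram_size` and `max_datagram_size_f64` come from the description above, while the function bodies, the helper `window_in_datagrams` and the two `increase_*` functions are hypothetical stand-ins.

```rust
/// Stand-in for the `convert_to_f64` helper named in the description.
/// Assumption: a plain cast, which is exact for values below 2^53.
/// The real helper may convert differently.
fn convert_to_f64(v: usize) -> f64 {
    v as f64
}

/// Hypothetical helper that expresses a window in units of datagrams.
fn window_in_datagrams(cwnd_bytes: f64, max_datagram_size_f64: f64) -> f64 {
    cwnd_bytes / max_datagram_size_f64
}

/// Before: the integer size is converted again at every use.
fn increase_converting_each_use(
    cwnd: usize,
    target_datagrams: f64,
    max_datagram_size: usize,
) -> f64 {
    let current = window_in_datagrams(
        convert_to_f64(cwnd),
        convert_to_f64(max_datagram_size),
    );
    let missing = (target_datagrams - current).max(0.0);
    missing * convert_to_f64(max_datagram_size)
}

/// After: convert once, then pass the `f64` value to every helper
/// and reuse it in the remaining arithmetic.
fn increase_converting_once(
    cwnd: usize,
    target_datagrams: f64,
    max_datagram_size: usize,
) -> f64 {
    let max_datagram_size_f64 = convert_to_f64(max_datagram_size);
    let current = window_in_datagrams(convert_to_f64(cwnd), max_datagram_size_f64);
    let missing = (target_datagrams - current).max(0.0);
    missing * max_datagram_size_f64
}
```

Both variants compute the same value; the second one only removes the repeated conversions, which is why a change of this kind can be reviewed as a pure refactor.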

Member

@mxinden left a comment


Minor comment.

@codecov

codecov bot commented Sep 17, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.34%. Comparing base (7da737f) to head (a532be3).
⚠️ Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2994      +/-   ##
==========================================
- Coverage   93.36%   93.34%   -0.02%     
==========================================
  Files         123      123              
  Lines       35696    35697       +1     
  Branches    35696    35697       +1     
==========================================
- Hits        33327    33321       -6     
- Misses       1524     1534      +10     
+ Partials      845      842       -3     
| Components | Coverage Δ |
|---|---|
| neqo-common | 97.31% <ø> (ø) |
| neqo-crypto | 83.31% <ø> (ø) |
| neqo-http3 | 93.32% <ø> (ø) |
| neqo-qpack | 94.14% <ø> (ø) |
| neqo-transport | 94.40% <100.00%> (-0.04%) ⬇️ |
| neqo-udp | 80.48% <ø> (ø) |
| mtu | 85.57% <ø> (-0.20%) ⬇️ |

@github-actions
Contributor

github-actions bot commented Sep 17, 2025

🐰 Bencher Report

Branch: cubic_mss_f64
Testbed: On-prem

| Benchmark | Latency result (ns) | Result Δ% | Baseline (ns) | Upper boundary (ns) | Limit % |
|---|---|---|---|---|---|
| 1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client | 208,900,000.00 | -1.64% | 212,382,537.31 | 218,333,921.10 | 95.68% |
| 1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client | 200,770,000.00 | -3.11% | 207,220,149.25 | 213,166,494.59 | 94.18% |
| 1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client | 28,551,000.00 | +1.11% | 28,238,089.55 | 28,731,622.06 | 99.37% |
| 1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client | 295,100,000.00 | +0.64% | 293,236,417.91 | 299,314,514.73 | 98.59% |
| 1-streams/each-1000-bytes/simulated-time | 119,140,000.00 | +1.75% | 117,092,686.57 | 119,165,504.61 | 99.98% |
| 1-streams/each-1000-bytes/wallclock-time | 597,190.00 | -1.96% | 609,148.36 | 629,905.49 | 94.81% |
| 1000-streams/each-1-bytes/simulated-time | 14,990,000,000.00 | +0.01% | 14,987,835,820.90 | 15,006,429,992.32 | 99.89% |
| 1000-streams/each-1-bytes/wallclock-time | 14,062,000.00 | -3.35% | 14,548,955.22 | 15,061,860.75 | 93.36% |
| 1000-streams/each-1000-bytes/simulated-time | 18,931,000,000.00 | +0.22% | 18,889,223,880.60 | 19,121,446,626.32 | 99.00% |
| 1000-streams/each-1000-bytes/wallclock-time | 49,631,000.00 | -10.26% | 55,308,059.70 | 60,233,922.84 | 82.40% |
| RxStreamOrderer::inbound_frame() | 108,680,000.00 | -0.60% | 109,334,776.12 | 110,803,576.63 | 98.08% |
| coalesce_acked_from_zero 1+1 entries | 89.19 | +0.68% | 88.58 | 89.23 | 99.96% |
| coalesce_acked_from_zero 10+1 entries | 106.67 | +0.47% | 106.17 | 107.03 | 99.66% |
| coalesce_acked_from_zero 1000+1 entries | 90.06 | +0.16% | 89.91 | 93.74 | 96.07% |
| coalesce_acked_from_zero 3+1 entries | 106.90 | +0.21% | 106.68 | 107.45 | 99.49% |
| decode 1048576 bytes, mask 3f | 1,590,500.00 | -0.13% | 1,592,605.97 | 1,599,666.11 | 99.43% |
| decode 1048576 bytes, mask 7f | 5,045,300.00 | -0.34% | 5,062,635.82 | 5,078,360.48 | 99.35% |
| decode 1048576 bytes, mask ff | 3,029,400.00 | -0.14% | 3,033,573.13 | 3,046,332.33 | 99.44% |
| decode 4096 bytes, mask 3f | 8,269.30 | -0.30% | 8,294.57 | 8,338.97 | 99.16% |
| decode 4096 bytes, mask 7f | 20,025.00 | +0.10% | 20,004.60 | 20,092.64 | 99.66% |
| decode 4096 bytes, mask ff | 11,843.00 | -0.08% | 11,852.07 | 11,923.49 | 99.32% |
| sent::Packets::take_ranges | 4,791.80 | -0.84% | 4,832.37 | 5,008.53 | 95.67% |
| transfer/pacing-false/same-seed/simulated-time/run | 25,152,000,000.00 | | | | |
| transfer/pacing-false/same-seed/wallclock-time/run | 25,807,000.00 | -0.49% | 25,934,200.00 | 26,667,573.27 | 96.77% |
| transfer/pacing-false/varying-seeds/simulated-time/run | 25,158,000,000.00 | -0.03% | 25,165,646,153.85 | 25,209,363,767.86 | 99.80% |
| transfer/pacing-false/varying-seeds/wallclock-time/run | 26,261,000.00 | -0.09% | 26,285,815.38 | 26,894,554.10 | 97.64% |
| transfer/pacing-true/same-seed/simulated-time/run | 25,588,000,000.00 | | | | |
| transfer/pacing-true/same-seed/wallclock-time/run | 27,225,000.00 | -0.86% | 27,461,876.92 | 28,234,730.66 | 96.42% |
| transfer/pacing-true/varying-seeds/simulated-time/run | 25,004,000,000.00 | +0.04% | 24,994,892,307.69 | 25,041,476,466.55 | 99.85% |
| transfer/pacing-true/varying-seeds/wallclock-time/run | 27,050,000.00 | +0.64% | 26,877,400.00 | 27,528,709.49 | 98.26% |
🐰 View full continuous benchmarking report in Bencher
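
The two percentage columns in the report above appear to follow directly from the other columns. The sketch below is an assumption checked only against the numbers shown here, not a statement of how Bencher computes them; the function names are hypothetical.

```rust
/// Relative change of a result against its baseline, in percent.
/// Assumption: this is what the "Result Δ%" column reports.
fn result_delta_percent(result_ns: f64, baseline_ns: f64) -> f64 {
    (result_ns - baseline_ns) / baseline_ns * 100.0
}

/// Share of the upper boundary that a result uses up, in percent.
/// Assumption: this is what the "Limit %" column reports.
fn limit_percent(result_ns: f64, upper_boundary_ns: f64) -> f64 {
    result_ns / upper_boundary_ns * 100.0
}
```

For the Upload benchmark, `result_delta_percent(208_900_000.0, 212_382_537.31)` rounds to -1.64 and `limit_percent(208_900_000.0, 218_333_921.10)` rounds to 95.68, matching the reported values.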

@omansfeld
Collaborator Author

omansfeld commented Sep 17, 2025

Quick general question about our GitHub workflow: Should I add patches to the merge queue myself once they are approved and comments are addressed (like here now), or is that done by the reviewer?

(also just noting, there might be merge conflicts with #2970 so they probably shouldn't both be added to the queue at the same time)

@github-actions
Contributor

Failed Interop Tests

QUIC Interop Runner, client vs. server, differences relative to 5d6e774.

neqo-latest as client

neqo-latest as server

All results

Succeeded Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest as server

Unsupported Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest as server

@github-actions
Contributor

Client/server transfer results

Performance differences relative to 8441e29.

Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.

| Client vs. server (params) | Mean ± σ | Min | Max | MiB/s ± σ | Δ main (ms) | Δ main (%) |
|---|---|---|---|---|---|---|
| google vs. google | 454.4 ± 4.7 | 446.3 | 465.9 | 70.4 ± 6.8 | | |
| google vs. neqo (cubic, paced) | 277.6 ± 4.3 | 270.2 | 286.5 | 115.3 ± 7.4 | 1.1 | 0.4% |
| msquic vs. msquic | 165.8 ± 39.7 | 138.1 | 488.3 | 193.0 ± 0.8 | | |
| msquic vs. neqo (cubic, paced) | 184.2 ± 45.7 | 146.5 | 532.0 | 173.8 ± 0.7 | -2.5 | -1.4% |
| neqo vs. google (cubic, paced) | 754.6 ± 6.0 | 746.9 | 795.6 | 42.4 ± 5.3 | 0.0 | 0.0% |
| neqo vs. msquic (cubic, paced) | 157.0 ± 5.2 | 150.2 | 177.4 | 203.8 ± 6.2 | -0.8 | -0.5% |
| neqo vs. neqo (cubic) | 89.0 ± 4.1 | 82.6 | 97.5 | 359.7 ± 7.8 | 💚 -1.8 | -2.0% |
| neqo vs. neqo (cubic, paced) | 92.4 ± 5.2 | 84.5 | 116.4 | 346.5 ± 6.2 | 💔 2.3 | 2.5% |
| neqo vs. neqo (reno) | 89.4 ± 4.7 | 80.8 | 98.8 | 357.9 ± 6.8 | -0.3 | -0.4% |
| neqo vs. neqo (reno, paced) | 91.9 ± 3.9 | 84.9 | 98.8 | 348.4 ± 8.2 | 0.2 | 0.3% |
| neqo vs. quiche (cubic, paced) | 195.0 ± 4.6 | 187.8 | 206.2 | 164.1 ± 7.0 | -0.4 | -0.2% |
| neqo vs. s2n (cubic, paced) | 223.6 ± 4.7 | 212.5 | 231.2 | 143.1 ± 6.8 | 💔 3.8 | 1.7% |
| quiche vs. neqo (cubic, paced) | 152.9 ± 4.7 | 140.4 | 160.7 | 209.3 ± 6.8 | -0.1 | -0.1% |
| quiche vs. quiche | 142.1 ± 4.2 | 135.6 | 152.0 | 225.2 ± 7.6 | | |
| s2n vs. neqo (cubic, paced) | 173.7 ± 4.9 | 159.6 | 184.3 | 184.3 ± 6.5 | 1.0 | 0.6% |
| s2n vs. s2n | 247.7 ± 26.1 | 232.6 | 346.0 | 129.2 ± 1.2 | | |
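
The MiB/s figures in these results look consistent with dividing the fixed transfer size (33554432 bytes, i.e. 32 MiB) by the mean transfer time. The sketch below shows that relationship; the formula is inferred from the numbers above rather than taken from the benchmark scripts, and the constant and function names are hypothetical.

```rust
/// Transfer size stated in the results header.
const TRANSFER_BYTES: f64 = 33_554_432.0;
/// Bytes per mebibyte.
const BYTES_PER_MIB: f64 = 1_048_576.0;

/// Throughput in MiB/s for a transfer that took `mean_ms` milliseconds.
/// Assumption: the MiB/s column is derived from the mean time this way.
fn throughput_mib_per_s(mean_ms: f64) -> f64 {
    (TRANSFER_BYTES / BYTES_PER_MIB) / (mean_ms / 1000.0)
}
```

For example, a mean of 454.4 ms gives roughly 70.4 MiB/s and a mean of 165.8 ms gives roughly 193.0 MiB/s, in line with the google vs. google and msquic vs. msquic rows.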

Download data for profiler.firefox.com or download performance comparison data.

@larseggert
Collaborator

> Should I add patches to the merge queue myself once they are approved and comments are addressed (like here now), or is that done by the reviewer?

We don't really have a fixed policy. Sometimes the reviewer will, but especially when I think that some nits may still need fixing, I will just approve and leave it to the author to enqueue.

@larseggert
Collaborator

> there might be merge conflicts with #2970 so they probably shouldn't both be added to the queue at the same time

Well, whichever is second will need a rebase :-)

@omansfeld added this pull request to the merge queue on Sep 17, 2025
Merged via the queue into mozilla:main with commit ae02833 on Sep 17, 2025
116 of 118 checks passed
@omansfeld deleted the cubic_mss_f64 branch on September 17, 2025 14:26

@github-actions
Contributor

Benchmark results

Performance differences relative to 8441e29.

1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: Change within noise threshold.
       time:   [200.32 ms 200.77 ms 201.34 ms]
       thrpt:  [496.67 MiB/s 498.08 MiB/s 499.20 MiB/s]
change:
       time:   [+0.1838% +0.7369% +1.1839%] (p = 0.00 < 0.05)
       thrpt:  [−1.1700% −0.7315% −0.1835%]

Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe

1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: Change within noise threshold.
       time:   [293.60 ms 295.10 ms 296.60 ms]
       thrpt:  [33.715 Kelem/s 33.887 Kelem/s 34.060 Kelem/s]
change:
       time:   [−1.8996% −1.1990% −0.4718%] (p = 0.00 < 0.05)
       thrpt:  [+0.4741% +1.2135% +1.9364%]

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild

1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: No change in performance detected.
       time:   [28.455 ms 28.551 ms 28.667 ms]
       thrpt:  [34.883   B/s 35.025   B/s 35.143   B/s]
change:
       time:   [−0.7152% −0.1500% +0.4157%] (p = 0.60 > 0.05)
       thrpt:  [−0.4139% +0.1503% +0.7204%]

Found 22 outliers among 100 measurements (22.00%)
4 (4.00%) low severe
18 (18.00%) high severe

1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client: 💔 Performance has regressed.
       time:   [208.71 ms 208.90 ms 209.10 ms]
       thrpt:  [478.23 MiB/s 478.69 MiB/s 479.14 MiB/s]
change:
       time:   [+1.9063% +2.1734% +2.3833%] (p = 0.00 < 0.05)
       thrpt:  [−2.3278% −2.1272% −1.8707%]

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) low mild

decode 4096 bytes, mask ff: No change in performance detected.
       time:   [11.809 µs 11.843 µs 11.885 µs]
       change: [−0.7498% +0.1343% +1.1011%] (p = 0.78 > 0.05)

Found 17 outliers among 100 measurements (17.00%)
3 (3.00%) low severe
4 (4.00%) low mild
1 (1.00%) high mild
9 (9.00%) high severe

decode 1048576 bytes, mask ff: No change in performance detected.
       time:   [3.0200 ms 3.0294 ms 3.0405 ms]
       change: [−0.5349% −0.0327% +0.4811%] (p = 0.90 > 0.05)

Found 10 outliers among 100 measurements (10.00%)
10 (10.00%) high severe

decode 4096 bytes, mask 7f: No change in performance detected.
       time:   [19.963 µs 20.025 µs 20.091 µs]
       change: [−0.7817% −0.2284% +0.2363%] (p = 0.41 > 0.05)

Found 21 outliers among 100 measurements (21.00%)
2 (2.00%) low severe
3 (3.00%) low mild
1 (1.00%) high mild
15 (15.00%) high severe

decode 1048576 bytes, mask 7f: No change in performance detected.
       time:   [5.0342 ms 5.0453 ms 5.0581 ms]
       change: [−0.5843% −0.1787% +0.1991%] (p = 0.38 > 0.05)

Found 12 outliers among 100 measurements (12.00%)
12 (12.00%) high severe

decode 4096 bytes, mask 3f: Change within noise threshold.
       time:   [8.2496 µs 8.2693 µs 8.2988 µs]
       change: [−1.0909% −0.5736% −0.1330%] (p = 0.02 < 0.05)

Found 7 outliers among 100 measurements (7.00%)
2 (2.00%) high mild
5 (5.00%) high severe

decode 1048576 bytes, mask 3f: No change in performance detected.
       time:   [1.5850 ms 1.5905 ms 1.5974 ms]
       change: [−0.4693% +0.0563% +0.5033%] (p = 0.87 > 0.05)

Found 7 outliers among 100 measurements (7.00%)
2 (2.00%) high mild
5 (5.00%) high severe

1-streams/each-1000-bytes/wallclock-time: No change in performance detected.
       time:   [595.01 µs 597.19 µs 599.68 µs]
       change: [−0.0149% +0.5289% +1.0658%] (p = 0.06 > 0.05)

Found 8 outliers among 100 measurements (8.00%)
8 (8.00%) high severe

1-streams/each-1000-bytes/simulated-time: No change in performance detected.
       time:   [118.93 ms 119.14 ms 119.35 ms]
       thrpt:  [8.1821 KiB/s 8.1966 KiB/s 8.2111 KiB/s]
change:
       time:   [−0.3433% −0.0491% +0.2170%] (p = 0.73 > 0.05)
       thrpt:  [−0.2165% +0.0491% +0.3445%]

1000-streams/each-1-bytes/wallclock-time: Change within noise threshold.
       time:   [14.009 ms 14.062 ms 14.136 ms]
       change: [−1.4241% −0.9946% −0.4497%] (p = 0.00 < 0.05)

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe

1000-streams/each-1-bytes/simulated-time: No change in performance detected.
       time:   [14.976 s 14.990 s 15.003 s]
       thrpt:  [66.651 B/s 66.711 B/s 66.772 B/s]
change:
       time:   [−0.1481% −0.0256% +0.1001%] (p = 0.69 > 0.05)
       thrpt:  [−0.1000% +0.0256% +0.1483%]

1000-streams/each-1000-bytes/wallclock-time: 💚 Performance has improved.
       time:   [49.438 ms 49.631 ms 49.826 ms]
       change: [−4.0619% −3.5237% −3.0129%] (p = 0.00 < 0.05)

1000-streams/each-1000-bytes/simulated-time: No change in performance detected.
       time:   [18.763 s 18.931 s 19.098 s]
       thrpt:  [51.135 KiB/s 51.585 KiB/s 52.048 KiB/s]
change:
       time:   [−1.0872% +0.0965% +1.4040%] (p = 0.88 > 0.05)
       thrpt:  [−1.3845% −0.0964% +1.0992%]

Found 4 outliers among 100 measurements (4.00%)
2 (2.00%) low mild
2 (2.00%) high mild

coalesce_acked_from_zero 1+1 entries: No change in performance detected.
       time:   [88.654 ns 89.188 ns 89.889 ns]
       change: [−0.4586% +0.3650% +1.4553%] (p = 0.51 > 0.05)

Found 11 outliers among 100 measurements (11.00%)
8 (8.00%) high mild
3 (3.00%) high severe

coalesce_acked_from_zero 3+1 entries: No change in performance detected.
       time:   [106.62 ns 106.90 ns 107.21 ns]
       change: [−12.249% −4.6644% +0.3526%] (p = 0.23 > 0.05)

Found 10 outliers among 100 measurements (10.00%)
1 (1.00%) high mild
9 (9.00%) high severe

coalesce_acked_from_zero 10+1 entries: No change in performance detected.
       time:   [106.21 ns 106.67 ns 107.22 ns]
       change: [−0.3342% +0.1232% +0.5627%] (p = 0.59 > 0.05)

Found 13 outliers among 100 measurements (13.00%)
4 (4.00%) high mild
9 (9.00%) high severe

coalesce_acked_from_zero 1000+1 entries: No change in performance detected.
       time:   [89.928 ns 90.057 ns 90.201 ns]
       change: [−0.9721% +0.1828% +1.3566%] (p = 0.77 > 0.05)

Found 9 outliers among 100 measurements (9.00%)
1 (1.00%) high mild
8 (8.00%) high severe

RxStreamOrderer::inbound_frame(): Change within noise threshold.
       time:   [108.62 ms 108.68 ms 108.75 ms]
       change: [−1.0925% −0.9721% −0.8536%] (p = 0.00 < 0.05)

Found 9 outliers among 100 measurements (9.00%)
7 (7.00%) low mild
1 (1.00%) high mild
1 (1.00%) high severe

sent::Packets::take_ranges: No change in performance detected.
       time:   [4.6749 µs 4.7918 µs 4.9071 µs]
       change: [−3.1827% +0.1711% +3.3470%] (p = 0.92 > 0.05)

Found 3 outliers among 100 measurements (3.00%)
3 (3.00%) high mild

transfer/pacing-false/varying-seeds/wallclock-time/run: Change within noise threshold.
       time:   [26.212 ms 26.261 ms 26.319 ms]
       change: [−2.1860% −1.9174% −1.6308%] (p = 0.00 < 0.05)

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe

transfer/pacing-false/varying-seeds/simulated-time/run: No change in performance detected.
       time:   [25.126 s 25.158 s 25.191 s]
       thrpt:  [162.60 KiB/s 162.81 KiB/s 163.02 KiB/s]
change:
       time:   [−0.1780% +0.0256% +0.2292%] (p = 0.81 > 0.05)
       thrpt:  [−0.2287% −0.0256% +0.1783%]

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild

transfer/pacing-true/varying-seeds/wallclock-time/run: No change in performance detected.
       time:   [26.991 ms 27.050 ms 27.110 ms]
       change: [−0.2704% +0.0805% +0.4207%] (p = 0.64 > 0.05)

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild

transfer/pacing-true/varying-seeds/simulated-time/run: No change in performance detected.
       time:   [24.970 s 25.004 s 25.038 s]
       thrpt:  [163.59 KiB/s 163.82 KiB/s 164.04 KiB/s]
change:
       time:   [−0.1177% +0.0914% +0.3110%] (p = 0.41 > 0.05)
       thrpt:  [−0.3100% −0.0913% +0.1179%]

Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild

transfer/pacing-false/same-seed/wallclock-time/run: Change within noise threshold.
       time:   [25.782 ms 25.807 ms 25.835 ms]
       change: [−2.7894% −2.5904% −2.4195%] (p = 0.00 < 0.05)

Found 4 outliers among 100 measurements (4.00%)
2 (2.00%) low mild
1 (1.00%) high mild
1 (1.00%) high severe

transfer/pacing-false/same-seed/simulated-time/run: No change in performance detected.
       time:   [25.152 s 25.152 s 25.152 s]
       thrpt:  [162.85 KiB/s 162.85 KiB/s 162.85 KiB/s]
change:
       time:   [+0.0000% +0.0000% +0.0000%] (p = NaN > 0.05)
       thrpt:  [+0.0000% +0.0000% +0.0000%]

transfer/pacing-true/same-seed/wallclock-time/run: Change within noise threshold.
       time:   [27.197 ms 27.225 ms 27.254 ms]
       change: [−2.4904% −2.3264% −2.1759%] (p = 0.00 < 0.05)

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe

transfer/pacing-true/same-seed/simulated-time/run: No change in performance detected.
       time:   [25.588 s 25.588 s 25.588 s]
       thrpt:  [160.07 KiB/s 160.07 KiB/s 160.07 KiB/s]
change:
       time:   [+0.0000% +0.0000% +0.0000%] (p = NaN > 0.05)
       thrpt:  [+0.0000% +0.0000% +0.0000%]

Download data for profiler.firefox.com or download performance comparison data.

3 participants