feat(transport/cubic): multiplicative decrease updates#2970
mxinden merged 5 commits into mozilla:main
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #2970 +/- ##
==========================================
- Coverage 95.67% 93.36% -2.31%
==========================================
Files 123 123
Lines 35706 35703 -3
Branches 35706 35703 -3
==========================================
- Hits 34160 33333 -827
- Misses 1506 1528 +22
- Partials 40 842 +802
@mxinden This is just a refactor, and the behavior is already tested in neqo/neqo-transport/src/cc/tests/cubic.rs (lines 310 to 335 in 89cb6ea), so this PR should be ready for review/merge right now.
The main change is that fast convergence now uses a slightly different algorithm that needs one less variable to keep state. Functionally it does the same thing. For reference:
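As an illustration of a fast convergence step that keeps only a single `w_max` value, here is a minimal sketch following the formulation in RFC 9438, Section 4.7. It is not taken from this PR: the struct, its field names, and the use of `f64` are assumptions made for the example, and whether neqo's code matches it line for line is not confirmed by this page. `cwnd_prior` is included only because the PR description mentions recording it for logging.

```rust
/// Multiplicative decrease factor (beta_cubic in RFC 9438).
const CUBIC_BETA: f64 = 0.7;

/// Hypothetical, simplified CUBIC state; names are illustrative only.
#[derive(Debug, Default)]
struct CubicSketch {
    /// Window size that the cubic function will plateau at.
    w_max: f64,
    /// Congestion window right before the most recent reduction.
    cwnd_prior: f64,
}

impl CubicSketch {
    /// Applies fast convergence and multiplicative decrease on a
    /// congestion event and returns the reduced congestion window.
    fn on_congestion_event(&mut self, cwnd: f64) -> f64 {
        // Remember the pre-reduction window (assumed to be for logging).
        self.cwnd_prior = cwnd;
        // Fast convergence: if the window was cut again before it grew
        // back to the previous w_max, release extra bandwidth by setting
        // w_max below the current window.
        self.w_max = if cwnd < self.w_max {
            cwnd * (1.0 + CUBIC_BETA) / 2.0
        } else {
            cwnd
        };
        // Multiplicative decrease.
        cwnd * CUBIC_BETA
    }
}
```

With this shape, a first congestion event at a window of 100,000 bytes sets `w_max` to 100,000 and returns 70,000. A second event at 80,000 bytes, below the stored `w_max`, sets `w_max` to 68,000 and returns 56,000.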
| Branch | cubic_decrease_updates |
| Testbed | On-prem |
| Benchmark | Result (ns) | Result Δ% | Baseline (ns) | Upper Boundary (ns) | Limit % |
|---|---|---|---|---|---|
| 1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client | 207,290,000.00 | -1.83% | 211,157,640.45 | 218,765,548.43 | 94.75% |
| 1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client | 199,440,000.00 | -2.98% | 205,556,629.21 | 214,279,795.21 | 93.07% |
| 1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client | 28,491,000.00 | +0.65% | 28,305,719.10 | 28,818,517.10 | 98.86% |
| 1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client | 295,140,000.00 | +0.37% | 294,063,483.15 | 301,012,816.27 | 98.05% |
| 1-streams/each-1000-bytes/simulated-time | 119,230,000.00 | +1.40% | 117,587,191.01 | 120,307,629.74 | 99.10% |
| 1-streams/each-1000-bytes/wallclock-time | 593,520.00 | -1.93% | 605,230.22 | 629,624.70 | 94.27% |
| 1000-streams/each-1-bytes/simulated-time | 14,997,000,000.00 | +0.05% | 14,989,528,089.89 | 15,008,117,892.06 | 99.93% |
| 1000-streams/each-1-bytes/wallclock-time | 14,098,000.00 | -2.45% | 14,451,617.98 | 15,057,961.61 | 93.62% |
| 1000-streams/each-1000-bytes/simulated-time | 18,910,000,000.00 | +0.09% | 18,893,078,651.69 | 19,109,024,465.38 | 98.96% |
| 1000-streams/each-1000-bytes/wallclock-time | 50,599,000.00 | -6.68% | 54,219,505.62 | 60,428,590.95 | 83.73% |
| RxStreamOrderer::inbound_frame() | 109,530,000.00 | -0.07% | 109,604,044.94 | 111,581,896.08 | 98.16% |
| coalesce_acked_from_zero 1+1 entries | 88.92 | +0.38% | 88.58 | 89.23 | 99.66% |
| coalesce_acked_from_zero 10+1 entries | 106.60 | +0.45% | 106.12 | 107.01 | 99.62% |
| coalesce_acked_from_zero 1000+1 entries | 89.55 | -0.33% | 89.84 | 93.74 | 95.53% |
| coalesce_acked_from_zero 3+1 entries | 107.47 | +0.78% | 106.64 | 107.59 | 99.89% |
| decode 1048576 bytes, mask 3f | 1,589,200.00 | -0.21% | 1,592,615.73 | 1,600,018.62 | 99.32% |
| decode 1048576 bytes, mask 7f | 5,043,600.00 | -0.33% | 5,060,098.88 | 5,078,765.16 | 99.31% |
| decode 1048576 bytes, mask ff | 3,029,100.00 | -0.13% | 3,033,119.10 | 3,045,489.35 | 99.46% |
| decode 4096 bytes, mask 3f | 8,276.30 | -0.21% | 8,293.92 | 8,334.79 | 99.30% |
| decode 4096 bytes, mask 7f | 19,997.00 | -0.04% | 20,005.52 | 20,090.21 | 99.54% |
| decode 4096 bytes, mask ff | 11,662.00 | -1.31% | 11,817.29 | 12,009.37 | 97.11% |
| sent::Packets::take_ranges | 4,569.60 | -4.76% | 4,797.89 | 5,032.89 | 90.79% |
| transfer/pacing-false/same-seed/simulated-time/run | 25,152,000,000.00 | | | | |
| transfer/pacing-false/same-seed/wallclock-time/run | 26,525,000.00 | +1.91% | 26,026,758.62 | 26,823,767.14 | 98.89% |
| transfer/pacing-false/varying-seeds/simulated-time/run | 25,157,000,000.00 | -0.03% | 25,164,091,954.02 | 25,209,371,794.20 | 99.79% |
| transfer/pacing-false/varying-seeds/wallclock-time/run | 26,935,000.00 | +2.08% | 26,387,103.45 | 27,149,800.23 | 99.21% |
| transfer/pacing-true/same-seed/simulated-time/run | 25,588,000,000.00 | | | | |
| transfer/pacing-true/same-seed/wallclock-time/run | 27,876,000.00 | +1.21% | 27,543,632.18 | 28,353,524.57 | 98.32% |
| transfer/pacing-true/varying-seeds/simulated-time/run | 24,991,000,000.00 | -0.01% | 24,993,954,022.99 | 25,043,989,857.86 | 99.79% |
| transfer/pacing-true/varying-seeds/wallclock-time/run | 27,542,000.00 | +2.22% | 26,944,229.89 | 27,699,879.03 | 99.43% |
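
The Δ% and Limit % figures in the table above appear to be consistent with simple ratios against the baseline and the upper boundary. The following sketch shows that arithmetic; the function names are made up for illustration, and the formulas are inferred from the reported numbers rather than from the benchmarking tool's documentation.

```rust
/// Percentage change of `result` relative to `baseline`
/// (assumed meaning of the "Result Δ%" figure).
fn delta_percent(result: f64, baseline: f64) -> f64 {
    (result - baseline) / baseline * 100.0
}

/// Share of the upper boundary reached by `result`, in percent
/// (assumed meaning of the "Limit %" figure).
fn limit_percent(result: f64, upper_boundary: f64) -> f64 {
    result / upper_boundary * 100.0
}
```

For the Upload row, `delta_percent(207_290_000.0, 211_157_640.45)` gives roughly -1.83 and `limit_percent(207_290_000.0, 218_765_548.43)` gives roughly 94.75, matching the reported values after rounding.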
- fast convergence algorithm
- renaming
- updating docs
- adding cwnd_prior
- fixing tests
3ea26cb to 66aa0af
Failed Interop Tests
QUIC Interop Runner, client vs. server
- neqo-latest as client
- neqo-latest as server

Succeeded Interop Tests
QUIC Interop Runner, client vs. server
- neqo-latest as client
- neqo-latest as server

Unsupported Interop Tests
QUIC Interop Runner, client vs. server
- neqo-latest as client
- neqo-latest as server
Client/server transfer results
Performance differences relative to 8b7d440. Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.
Benchmark results
Performance differences relative to 8b7d440.
1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: Change within noise threshold. time: [199.02 ms 199.44 ms 199.95 ms]
thrpt: [500.12 MiB/s 501.41 MiB/s 502.46 MiB/s]
change:
time: [−1.2092% −0.9454% −0.6403%] (p = 0.00 < 0.05)
thrpt: [+0.6444% +0.9545% +1.2240%]
1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: No change in performance detected. time: [293.26 ms 295.14 ms 297.07 ms]
thrpt: [33.662 Kelem/s 33.883 Kelem/s 34.100 Kelem/s]
change:
time: [−0.1720% +0.6582% +1.4494%] (p = 0.12 > 0.05)
thrpt: [−1.4287% −0.6539% +0.1723%]
1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: No change in performance detected. time: [28.415 ms 28.491 ms 28.576 ms]
thrpt: [34.995 B/s 35.099 B/s 35.192 B/s]
change:
time: [−0.4699% −0.0447% +0.3760%] (p = 0.85 > 0.05)
thrpt: [−0.3746% +0.0447% +0.4721%]
1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client: Change within noise threshold. time: [207.01 ms 207.29 ms 207.60 ms]
thrpt: [481.70 MiB/s 482.42 MiB/s 483.07 MiB/s]
change:
time: [−0.6017% −0.4128% −0.2161%] (p = 0.00 < 0.05)
thrpt: [+0.2166% +0.4145% +0.6053%]
decode 4096 bytes, mask ff: No change in performance detected. time: [11.622 µs 11.662 µs 11.708 µs]
change: [−0.4545% +0.1013% +0.6223%] (p = 0.72 > 0.05)
decode 1048576 bytes, mask ff: No change in performance detected. time: [3.0196 ms 3.0291 ms 3.0403 ms]
change: [−0.2729% +0.1636% +0.6066%] (p = 0.49 > 0.05)
decode 4096 bytes, mask 7f: No change in performance detected. time: [19.941 µs 19.997 µs 20.065 µs]
change: [−0.4638% −0.0131% +0.4022%] (p = 0.95 > 0.05)
decode 1048576 bytes, mask 7f: No change in performance detected. time: [5.0327 ms 5.0436 ms 5.0551 ms]
change: [−0.4346% −0.0755% +0.2538%] (p = 0.67 > 0.05)
decode 4096 bytes, mask 3f: No change in performance detected. time: [8.2524 µs 8.2763 µs 8.3073 µs]
change: [−1.2731% −0.4022% +0.2652%] (p = 0.35 > 0.05)
decode 1048576 bytes, mask 3f: No change in performance detected. time: [1.5849 ms 1.5892 ms 1.5949 ms]
change: [−0.6173% −0.0952% +0.4157%] (p = 0.71 > 0.05)
1-streams/each-1000-bytes/wallclock-time: No change in performance detected. time: [589.88 µs 593.52 µs 597.62 µs]
change: [−1.0576% +0.0602% +1.0451%] (p = 0.92 > 0.05)
1000-streams/each-1-bytes/wallclock-time: No change in performance detected. time: [14.042 ms 14.098 ms 14.182 ms]
change: [−0.3112% +0.1620% +0.8293%] (p = 0.62 > 0.05)
1000-streams/each-1000-bytes/wallclock-time: Change within noise threshold. time: [50.409 ms 50.599 ms 50.795 ms]
change: [−1.7733% −1.2668% −0.7724%] (p = 0.00 < 0.05)
coalesce_acked_from_zero 1+1 entries: No change in performance detected. time: [88.583 ns 88.924 ns 89.263 ns]
change: [−0.7653% −0.1736% +0.3470%] (p = 0.55 > 0.05)
coalesce_acked_from_zero 3+1 entries: No change in performance detected. time: [107.08 ns 107.47 ns 107.88 ns]
change: [−0.5460% +0.0183% +0.5348%] (p = 0.96 > 0.05)
coalesce_acked_from_zero 10+1 entries: No change in performance detected. time: [106.20 ns 106.60 ns 107.09 ns]
change: [−0.2235% +0.4779% +1.5881%] (p = 0.34 > 0.05)
coalesce_acked_from_zero 1000+1 entries: No change in performance detected. time: [89.428 ns 89.551 ns 89.691 ns]
change: [−0.9254% +0.1754% +1.2288%] (p = 0.76 > 0.05)
RxStreamOrderer::inbound_frame(): Change within noise threshold. time: [109.37 ms 109.53 ms 109.79 ms]
change: [−0.7895% −0.6148% −0.3293%] (p = 0.00 < 0.05)
sent::Packets::take_ranges: No change in performance detected. time: [4.4720 µs 4.5696 µs 4.6742 µs]
change: [−4.7448% −1.4587% +1.6001%] (p = 0.39 > 0.05)
transfer/pacing-false/varying-seeds/wallclock-time/run: Change within noise threshold. time: [26.885 ms 26.935 ms 26.986 ms]
change: [+0.8013% +1.0582% +1.3165%] (p = 0.00 < 0.05)
transfer/pacing-false/varying-seeds/simulated-time/run: No change in performance detected. time: [25.119 s 25.157 s 25.194 s]
thrpt: [162.58 KiB/s 162.82 KiB/s 163.06 KiB/s]
change:
time: [−0.1834% +0.0272% +0.2308%] (p = 0.81 > 0.05)
thrpt: [−0.2302% −0.0272% +0.1837%]
transfer/pacing-true/varying-seeds/wallclock-time/run: Change within noise threshold. time: [27.463 ms 27.542 ms 27.624 ms]
change: [+0.2344% +0.6146% +0.9858%] (p = 0.00 < 0.05)
transfer/pacing-true/varying-seeds/simulated-time/run: No change in performance detected. time: [24.949 s 24.991 s 25.033 s]
thrpt: [163.62 KiB/s 163.90 KiB/s 164.17 KiB/s]
change:
time: [−0.0959% +0.1462% +0.3802%] (p = 0.23 > 0.05)
thrpt: [−0.3787% −0.1459% +0.0960%]
transfer/pacing-false/same-seed/wallclock-time/run: Change within noise threshold. time: [26.490 ms 26.525 ms 26.576 ms]
change: [+1.6328% +1.8750% +2.1157%] (p = 0.00 < 0.05)
transfer/pacing-false/same-seed/simulated-time/run: No change in performance detected. time: [25.152 s 25.152 s 25.152 s]
thrpt: [162.85 KiB/s 162.85 KiB/s 162.85 KiB/s]
change:
time: [+0.0000% +0.0000% +0.0000%] (p = NaN > 0.05)
thrpt: [+0.0000% +0.0000% +0.0000%]
transfer/pacing-true/same-seed/wallclock-time/run: No change in performance detected. time: [27.849 ms 27.876 ms 27.904 ms]
change: [−0.1251% +0.0015% +0.1247%] (p = 0.98 > 0.05)
transfer/pacing-true/same-seed/simulated-time/run: No change in performance detected. time: [25.588 s 25.588 s 25.588 s]
thrpt: [160.07 KiB/s 160.07 KiB/s 160.07 KiB/s]
change:
time: [+0.0000% +0.0000% +0.0000%] (p = NaN > 0.05)
thrpt: [+0.0000% +0.0000% +0.0000%]
mxinden left a comment
Thanks for answering my many questions.
Changes the following:
- last_max_cwnd
- cwnd_prior for logging; this will also be used in a later PR for dynamically changing the value of CUBIC_ALPHA

Split off of #2535.
Part of #2967.
Depends on #2968.