fix: Restart PMTUD when full-sized packets are lost #3335

larseggert wants to merge 15 commits into mozilla:main from
Conversation
Merging this PR will degrade performance by 27.99%
Codecov Report

❌ Patch coverage is

Additional details and impacted files

@@ Coverage Diff @@
## main #3335 +/- ##
==========================================
- Coverage 94.26% 94.21% -0.05%
==========================================
Files 125 129 +4
Lines 37973 38685 +712
Branches 37973 38685 +712
==========================================
+ Hits 35795 36449 +654
- Misses 1339 1387 +48
- Partials 839 849 +10
Flags with carried forward coverage won't be shown.
Pull request overview
This PR implements a PMTUD (Path MTU Discovery) black hole detection mechanism to handle MTU reductions mid-connection. When full-sized packets are repeatedly lost while smaller packets succeed, PMTUD restarts from the minimum MTU to re-discover the correct path MTU. This addresses Firefox loss rate increases caused by network path changes (e.g., VPNs) that reduce MTU without triggering IP address changes.
Changes:
- Added BlackHoleDetector to track large packet losses and trigger PMTUD restarts
- Introduced DynamicMtu test fixture to simulate mid-connection MTU changes
- Added comprehensive tests for black hole detection and false-positive prevention
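To make the mechanism described above concrete, here is a minimal sketch of how such a detector could work: count consecutive losses of full-sized packets while smaller packets keep getting ACKed, and signal a PMTUD restart once a threshold is crossed. Apart from the `BlackHoleDetector` name, all field names, method names, and the threshold value are illustrative assumptions, not the PR's actual API.

```rust
/// Sketch of a PMTUD black-hole detector (names and threshold are
/// illustrative, not neqo's actual API). The core idea: repeated loss
/// of full-sized packets while smaller packets succeed suggests the
/// path MTU dropped, so PMTUD should restart from the minimum MTU.
#[derive(Default)]
struct BlackHoleDetector {
    /// Consecutive full-sized packets declared lost.
    large_lost: usize,
    /// Whether any smaller packet was ACKed recently.
    small_acked: bool,
}

/// Illustrative threshold for declaring a black hole.
const BLACK_HOLE_THRESHOLD: usize = 3;

impl BlackHoleDetector {
    /// Record a lost packet of `len` bytes; returns `true` when PMTUD
    /// should restart from the minimum MTU.
    fn on_packet_lost(&mut self, len: usize, mtu: usize) -> bool {
        if len >= mtu {
            self.large_lost += 1;
            // Require small packets to still get through, to distinguish
            // an MTU black hole from general congestion loss.
            self.large_lost >= BLACK_HOLE_THRESHOLD && self.small_acked
        } else {
            false
        }
    }

    /// Record an ACKed packet of `len` bytes.
    fn on_packet_acked(&mut self, len: usize, mtu: usize) {
        if len >= mtu {
            // A full-sized packet got through: clearly no black hole.
            self.large_lost = 0;
            self.small_acked = false;
        } else {
            self.small_acked = true;
        }
    }
}

fn main() {
    let mut d = BlackHoleDetector::default();
    let mtu = 1500;
    d.on_packet_acked(1200, mtu); // a smaller packet succeeds
    assert!(!d.on_packet_lost(1500, mtu));
    assert!(!d.on_packet_lost(1500, mtu));
    // Third consecutive full-sized loss with small packets ACKed:
    // declare a black hole and restart PMTUD.
    assert!(d.on_packet_lost(1500, mtu));
}
```

This mirrors the structure of the false-positive tests listed above: bulk loss of mixed sizes resets the counter via a full-sized ACK, so only size-selective loss triggers a restart.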
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| neqo-transport/src/pmtud.rs | Core black hole detection logic, refactored PMTUD constants, integrated detector into main PMTUD flow |
| neqo-transport/src/stats.rs | Added pmtud_change counter to track black hole detection events |
| test-fixture/src/sim/mtu.rs | New DynamicMtu node for simulating MTU reductions during tests |
| test-fixture/src/sim/mod.rs | Exported DynamicMtu in network module |
| test-fixture/src/sim/connection.rs | Added test goals to verify PMTUD behavior (change/no-change expectations) |
| neqo-transport/tests/network.rs | Added integration tests for black hole detection and false-positive scenarios |
| neqo-transport/src/connection/tests/ecn.rs | Updated debug output expectation for new pmtud_change stat |
|
I'm going to defer to @mxinden on this one. At least for initial passes. Happy to jump in as needed.
93fb8ef to 768d4bd
martinthomson left a comment
This looks pretty good. I didn't do too deep a review of some of the unit tests, but the high-level tests are good. I don't understand quite enough to give r+; I have some questions, but I don't think this is far off.
My current theory is that there are VPNs or other things in the path that lower the MTU below what the outgoing interface MTU is but that don't cause IP address changes that would trigger path challenges.
Do we have evidence that supports this theory? Why would a VPN not cause an IP address change? Isn't that the reason why most people use a VPN? Do we expect such malfunctioning VPNs to be so widely deployed among Firefox Nightly users, that switching them has an impact on the 75th percentile?
Without any evidence that the cause for the high packet loss when enabling PMTUD is due to MTU black holes, does it make sense to merge a solution?
Alternative theory that would explain the high packet loss, but is not fixed by the BlackHoleDetector:
- The DONTFRAGMENT bit is not properly propagated on IPv4 connections, e.g. by a VPN that is not properly configured.
- PMTUD increases the MTU beyond the path MTU.
- Lower level IP stack fragments.
- Fragmented packets are more likely to be lost intermittently, as the whole is declared lost even if actually only a single fragment went missing.
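The last point of the alternative theory above can be quantified: if the IP layer splits a datagram into n fragments and any single lost fragment loses the whole datagram, the effective loss rate rises from p to 1 - (1 - p)^n. A small sketch (the function name and example loss rate are illustrative):

```rust
/// Effective loss rate of a datagram split into `fragments` pieces,
/// given a per-fragment loss rate `p`: losing any one fragment loses
/// the whole datagram, so P(loss) = 1 - (1 - p)^n.
fn datagram_loss(p: f64, fragments: u32) -> f64 {
    1.0 - (1.0 - p).powi(fragments as i32)
}

fn main() {
    let p = 0.01; // assume 1% per-fragment loss
    // Unfragmented: 1% loss. Two fragments: ~1.99% loss.
    println!("1 fragment:  {:.4}", datagram_loss(p, 1));
    println!("2 fragments: {:.4}", datagram_loss(p, 2));
    assert!(datagram_loss(p, 2) > datagram_loss(p, 1));
}
```

This is why fragmentation below the QUIC layer would show up in telemetry as an increased packet loss rate even when the network itself is no lossier.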
Why is this not fixed by this patch? If sufficient fragments get lost to trigger the ACK-based case, it will.
It would run into an infinite cycle:
(I might be missing something in the current implementation.)
I think you're on to something. How should we deal with this? We could stick to the minimum MTU after a black hole on a connection, or we could tie into the raise timer (10 min) to try again? Anything smarter?
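The raise-timer option mentioned above could look roughly like this: after a detected black hole, pin the MTU and only allow PMTUD to probe upward again once a cooldown (the 10-minute raise timer mentioned) has elapsed. This is a sketch of one of the options under discussion, not neqo's actual implementation; all names are assumed.

```rust
use std::time::{Duration, Instant};

/// Illustrative raise timer, matching the 10 minutes mentioned above.
const RAISE_TIMER: Duration = Duration::from_secs(600);

/// Sketch: gate upward PMTUD probing on time since the last black hole,
/// which bounds the restart cycle to one attempt per timer interval.
struct PmtudState {
    last_black_hole: Option<Instant>,
}

impl PmtudState {
    fn on_black_hole(&mut self, now: Instant) {
        self.last_black_hole = Some(now);
    }

    /// May PMTUD probe a larger MTU again?
    fn may_probe_up(&self, now: Instant) -> bool {
        match self.last_black_hole {
            None => true,
            Some(t) => now.duration_since(t) >= RAISE_TIMER,
        }
    }
}

fn main() {
    let start = Instant::now();
    let mut s = PmtudState { last_black_hole: None };
    assert!(s.may_probe_up(start));
    s.on_black_hole(start);
    // Shortly after the black hole: stay at the reduced MTU.
    assert!(!s.may_probe_up(start + Duration::from_secs(60)));
    // Once the raise timer expires: probing upward is allowed again.
    assert!(s.may_probe_up(start + RAISE_TIMER));
}
```

The trade-off versus sticking to the minimum MTU permanently is that a transient MTU reduction (e.g. a VPN that is later disabled) can still be recovered from, at the cost of one probe cycle per timer interval.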
I'm glad to see that the performance tests show that wall clock time regresses with this change. That's expected, especially since our tests are highly sensitive to PMTU. It's disappointing to see that the noise on take_ranges continues unabated.
Turning on PMTUD in Firefox causes loss rates to go up. My current theory is that there are VPNs or other things in the path that lower the MTU below what the outgoing interface MTU is but that don't cause IP address changes that would trigger path challenges. This mechanism adds a heuristic to restart PMTUD when we see loss of full-sized packets while smaller packets get ACKed. We had a version of this in the past that mozilla#3200 removed due to it triggering during normal bulk uploads.
d77caf1 to 33ed470
@larseggert can you explain how the current version of this pull request addresses the issue above? Say that the actual path MTU is
This is handled: we repeatedly lower the MTU. See the
Thanks. That makes sense. If I understand correctly, this will require 3 packet train losses per search table entry (assuming no positive

I am still against merging here.

I think the best next step is to find a setup that actually suffers from the high packet loss we see in telemetry, to understand (one of) the root cause(s). I don't mean to be the gatekeeper here. If you want to proceed, feel free to overrule.
Failed Interop Tests
QUIC Interop Runner, client vs. server, differences relative to

All results

Succeeded Interop Tests
QUIC Interop Runner, client vs. server
neqo-pr as client
neqo-pr as server

Unsupported Interop Tests
QUIC Interop Runner, client vs. server
neqo-pr as client
neqo-pr as server
Client/server transfer results
Performance differences relative to 26711da. Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.

Table above only shows statistically significant changes. See all results below.

All results
Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.

Download data for
Benchmark results
Significant performance differences relative to 26711da.

transfer/1-conn/10_000-parallel-1b-resp (aka. RPS)/mtu-1504: 💔 Performance has regressed by +3.3490%. time: [281.27 ms 283.57 ms 285.88 ms]
thrpt: [34.980 Kelem/s 35.264 Kelem/s 35.553 Kelem/s]
change:
time: [+2.2857% +3.3490% +4.4269%] (p = 0.00 < 0.05)
thrpt: [-4.2393% -3.2405% -2.2347%]
Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild

RxStreamOrderer::inbound_frame(): 💚 Performance has improved by -1.3003%. time: [108.59 ms 108.69 ms 108.79 ms]
change: [-1.4950% -1.3003% -1.1304] (p = 0.00 < 0.05)
Performance has improved.

transfer/walltime/pacing-true/same-seed: 💔 Performance has regressed by +3.2799%. time: [24.234 ms 24.264 ms 24.308 ms]
change: [+3.1178% +3.2799% +3.4666] (p = 0.00 < 0.05)
Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe

All results

transfer/1-conn/1-100mb-resp (aka. Download)/mtu-1504: Change within noise threshold. time: [198.89 ms 199.33 ms 199.87 ms]
thrpt: [500.32 MiB/s 501.68 MiB/s 502.79 MiB/s]
change:
time: [-1.1534% -0.7928% -0.4725] (p = 0.00 < 0.05)
thrpt: [+0.4747% +0.7991% +1.1668]
Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe

transfer/1-conn/10_000-parallel-1b-resp (aka. RPS)/mtu-1504: 💔 Performance has regressed by +3.3490%. time: [281.27 ms 283.57 ms 285.88 ms]
thrpt: [34.980 Kelem/s 35.264 Kelem/s 35.553 Kelem/s]
change:
time: [+2.2857% +3.3490% +4.4269%] (p = 0.00 < 0.05)
thrpt: [-4.2393% -3.2405% -2.2347%]
Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild

transfer/1-conn/1-1b-resp (aka. HPS)/mtu-1504: No change in performance detected. time: [38.663 ms 38.843 ms 39.045 ms]
thrpt: [25.612 B/s 25.745 B/s 25.865 B/s]
change:
time: [-0.1642% +0.4253% +0.9804] (p = 0.15 > 0.05)
thrpt: [-0.9709% -0.4235% +0.1644]
No change in performance detected.
Found 10 outliers among 100 measurements (10.00%)
1 (1.00%) low mild
9 (9.00%) high severe

transfer/1-conn/1-100mb-req (aka. Upload)/mtu-1504: Change within noise threshold. time: [200.86 ms 201.35 ms 201.99 ms]
thrpt: [495.08 MiB/s 496.64 MiB/s 497.86 MiB/s]
change:
time: [-1.2474% -0.9301% -0.5852] (p = 0.00 < 0.05)
thrpt: [+0.5886% +0.9388% +1.2632]
Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe

decode 4096 bytes, mask ff: No change in performance detected. time: [4.5115 µs 4.5186 µs 4.5262 µs]
change: [-0.5514% -0.1361% +0.4081] (p = 0.63 > 0.05)
No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
5 (5.00%) high mild
2 (2.00%) high severe

decode 1048576 bytes, mask ff: No change in performance detected. time: [1.1575 ms 1.1599 ms 1.1638 ms]
change: [-0.5416% -0.1072% +0.3690] (p = 0.66 > 0.05)
No change in performance detected.
Found 16 outliers among 100 measurements (16.00%)
9 (9.00%) low severe
1 (1.00%) low mild
2 (2.00%) high mild
4 (4.00%) high severe

decode 4096 bytes, mask 7f: No change in performance detected. time: [5.7943 µs 5.8035 µs 5.8131 µs]
change: [-0.4728% -0.1212% +0.2057] (p = 0.50 > 0.05)
No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
4 (4.00%) high mild
1 (1.00%) high severe

decode 1048576 bytes, mask 7f: No change in performance detected. time: [1.4869 ms 1.4903 ms 1.4951 ms]
change: [-0.6072% +0.0290% +0.5635] (p = 0.93 > 0.05)
No change in performance detected.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe

decode 4096 bytes, mask 3f: No change in performance detected. time: [5.5340 µs 5.5422 µs 5.5508 µs]
change: [-0.2734% +0.0106% +0.2959] (p = 0.95 > 0.05)
No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
3 (3.00%) high mild
2 (2.00%) high severe

decode 1048576 bytes, mask 3f: No change in performance detected. time: [1.4140 ms 1.4161 ms 1.4182 ms]
change: [-0.3302% -0.0838% +0.1510] (p = 0.50 > 0.05)
No change in performance detected.

streams/simulated/1-streams/each-1000-bytes: No change in performance detected. time: [129.68 ms 129.68 ms 129.69 ms]
thrpt: [7.5302 KiB/s 7.5304 KiB/s 7.5307 KiB/s]
change:
time: [-0.0023% +0.0014% +0.0051] (p = 0.49 > 0.05)
thrpt: [-0.0051% -0.0014% +0.0023]
No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild

streams/simulated/1000-streams/each-1-bytes: No change in performance detected. time: [2.5363 s 2.5366 s 2.5369 s]
thrpt: [394.18 B/s 394.23 B/s 394.28 B/s]
change:
time: [-0.0019% +0.0152% +0.0329] (p = 0.09 > 0.05)
thrpt: [-0.0329% -0.0152% +0.0019]
No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild

streams/simulated/1000-streams/each-1000-bytes: Change within noise threshold. time: [6.5820 s 6.5872 s 6.5931 s]
thrpt: [148.12 KiB/s 148.25 KiB/s 148.37 KiB/s]
change:
time: [-0.3997% -0.2225% -0.0602] (p = 0.01 < 0.05)
thrpt: [+0.0602% +0.2230% +0.4014]
Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe

streams/walltime/1-streams/each-1000-bytes: Change within noise threshold. time: [591.09 µs 593.48 µs 596.10 µs]
change: [+0.7873% +1.3855% +1.9553] (p = 0.00 < 0.05)
Change within noise threshold.
Found 12 outliers among 100 measurements (12.00%)
1 (1.00%) high mild
11 (11.00%) high severe

streams/walltime/1000-streams/each-1-bytes: No change in performance detected. time: [12.494 ms 12.514 ms 12.534 ms]
change: [-0.6355% -0.1900% +0.1392] (p = 0.40 > 0.05)
No change in performance detected.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild

streams/walltime/1000-streams/each-1000-bytes: No change in performance detected. time: [45.362 ms 45.467 ms 45.619 ms]
change: [-0.0558% +0.2274% +0.5790] (p = 0.18 > 0.05)
No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
3 (3.00%) high mild
2 (2.00%) high severe

coalesce_acked_from_zero 1+1 entries: No change in performance detected. time: [92.324 ns 92.797 ns 93.293 ns]
change: [-0.4052% +0.0917% +0.5384] (p = 0.72 > 0.05)
No change in performance detected.
Found 12 outliers among 100 measurements (12.00%)
8 (8.00%) high mild
4 (4.00%) high severe

coalesce_acked_from_zero 3+1 entries: No change in performance detected. time: [109.81 ns 110.16 ns 110.53 ns]
change: [-1.3716% -0.5143% +0.2051] (p = 0.22 > 0.05)
No change in performance detected.
Found 12 outliers among 100 measurements (12.00%)
2 (2.00%) high mild
10 (10.00%) high severe

coalesce_acked_from_zero 10+1 entries: No change in performance detected. time: [109.31 ns 109.77 ns 110.31 ns]
change: [-1.9584% -0.4737% +0.6878] (p = 0.55 > 0.05)
No change in performance detected.
Found 15 outliers among 100 measurements (15.00%)
1 (1.00%) low severe
3 (3.00%) low mild
2 (2.00%) high mild
9 (9.00%) high severe

coalesce_acked_from_zero 1000+1 entries: No change in performance detected. time: [94.840 ns 94.983 ns 95.155 ns]
change: [+0.0041% +0.5570% +1.1621] (p = 0.05 > 0.05)
No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
3 (3.00%) high mild
6 (6.00%) high severe

RxStreamOrderer::inbound_frame(): 💚 Performance has improved by -1.3003%. time: [108.59 ms 108.69 ms 108.79 ms]
change: [-1.4950% -1.3003% -1.1304] (p = 0.00 < 0.05)
Performance has improved.

sent::Packets::take_ranges: No change in performance detected. time: [4.4565 µs 4.5573 µs 4.6543 µs]
change: [-4.6891% -1.1965% +2.1629] (p = 0.50 > 0.05)
No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
3 (3.00%) high mild

transfer/simulated/pacing-false/varying-seeds: No change in performance detected. time: [23.941 s 23.941 s 23.941 s]
thrpt: [171.09 KiB/s 171.09 KiB/s 171.09 KiB/s]
change:
time: [+0.0000% +0.0000% +0.0000] (p = NaN > 0.05)
thrpt: [+0.0000% +0.0000% +0.0000]
No change in performance detected.

transfer/simulated/pacing-true/varying-seeds: No change in performance detected. time: [23.676 s 23.676 s 23.676 s]
thrpt: [173.01 KiB/s 173.01 KiB/s 173.01 KiB/s]
change:
time: [+0.0000% +0.0000% +0.0000] (p = NaN > 0.05)
thrpt: [+0.0000% +0.0000% +0.0000]
No change in performance detected.

transfer/simulated/pacing-false/same-seed: No change in performance detected. time: [23.941 s 23.941 s 23.941 s]
thrpt: [171.09 KiB/s 171.09 KiB/s 171.09 KiB/s]
change:
time: [+0.0000% +0.0000% +0.0000] (p = NaN > 0.05)
thrpt: [+0.0000% +0.0000% +0.0000]
No change in performance detected.

transfer/simulated/pacing-true/same-seed: No change in performance detected. time: [23.676 s 23.676 s 23.676 s]
thrpt: [173.01 KiB/s 173.01 KiB/s 173.01 KiB/s]
change:
time: [+0.0000% +0.0000% +0.0000] (p = NaN > 0.05)
thrpt: [+0.0000% +0.0000% +0.0000]
No change in performance detected.

transfer/walltime/pacing-false/varying-seeds: Change within noise threshold. time: [23.420 ms 23.442 ms 23.467 ms]
change: [+2.4973% +2.6326% +2.7624] (p = 0.00 < 0.05)
Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe

transfer/walltime/pacing-true/varying-seeds: Change within noise threshold. time: [23.978 ms 24.002 ms 24.026 ms]
change: [+2.3716% +2.4877% +2.6182] (p = 0.00 < 0.05)
Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild

transfer/walltime/pacing-false/same-seed: Change within noise threshold. time: [23.503 ms 23.525 ms 23.549 ms]
change: [+1.9992% +2.1166% +2.2392] (p = 0.00 < 0.05)
Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
1 (1.00%) low mild
2 (2.00%) high mild
1 (1.00%) high severe

transfer/walltime/pacing-true/same-seed: 💔 Performance has regressed by +3.2799%. time: [24.234 ms 24.264 ms 24.308 ms]
change: [+3.1178% +3.2799% +3.4666] (p = 0.00 < 0.05)
Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe

Download data for
Nothing against finding a reproducer, I just think we won't be able to...
Turning on PMTUD in Firefox causes loss rates to go up. My current theory is that there are VPNs or other things in the path that lower the MTU below what the outgoing interface MTU is but that don't cause IP address changes that would trigger path challenges.
This mechanism adds a heuristic to restart PMTUD when we see loss of full-sized packets while smaller packets get ACKed, or when a PTO happens.
We had a version of this in the past that #3200 removed due to it triggering during normal bulk uploads.