feat(http3): optimize stream sending to avoid blocked streams #3015

larseggert wants to merge 5 commits into mozilla:main from fix-261
Conversation
…ed streams

Implements optimizations to address wasteful attempts to send data on flow-control blocked streams and fixes iteration patterns that could lose streams during re-addition.

Key improvements:

- Add a `blocked_streams` HashSet to track flow-control blocked streams separately
- Modify `stream_has_pending_data()` to skip adding blocked streams to the pending queue
- Replace `mem::take()` with single-stream iteration to handle re-additions properly
- Add a `check_blocked_streams()` method to automatically move unblocked streams back
- Use `HashSet::retain()` for zero-allocation in-place stream filtering
- Clean up both the pending and blocked sets when streams are removed

Benefits:

- Reduces CPU usage by eliminating repeated attempts on blocked streams
- Improves responsiveness by automatically detecting when streams become unblocked

Closes mozilla#261
Pull Request Overview
This PR optimizes HTTP/3 stream sending by introducing flow-control aware stream management to avoid repeatedly attempting to send data on blocked streams.
- Adds a separate `blocked_streams` HashSet to track flow-control blocked streams
- Implements a `check_blocked_streams()` method to automatically detect when blocked streams become unblocked
- Refactors stream iteration from `mem::take()` to single-stream processing to handle re-additions properly
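The bookkeeping described above can be sketched as follows. This is a minimal illustration with hypothetical names mirroring the PR description (`SendTracker`, `mark_blocked`, `mark_unblocked`), not the actual neqo types:

```rust
use std::collections::HashSet;

/// Hypothetical simplification of the two-set tracking described in the PR.
#[derive(Default)]
struct SendTracker {
    streams_with_pending_data: HashSet<u64>,
    blocked_streams: HashSet<u64>,
}

impl SendTracker {
    /// A stream with data to send goes to the pending set, unless it is
    /// known to be flow-control blocked.
    fn stream_has_pending_data(&mut self, stream_id: u64) {
        if !self.blocked_streams.contains(&stream_id) {
            self.streams_with_pending_data.insert(stream_id);
        }
    }

    /// When sending hits a flow-control limit, move the stream aside so
    /// later send passes skip it.
    fn mark_blocked(&mut self, stream_id: u64) {
        self.streams_with_pending_data.remove(&stream_id);
        self.blocked_streams.insert(stream_id);
    }

    /// When flow control opens up, move the stream back to pending.
    fn mark_unblocked(&mut self, stream_id: u64) {
        if self.blocked_streams.remove(&stream_id) {
            self.streams_with_pending_data.insert(stream_id);
        }
    }

    /// Both sets are cleaned up when a stream is removed.
    fn remove_stream(&mut self, stream_id: u64) {
        self.streams_with_pending_data.remove(&stream_id);
        self.blocked_streams.remove(&stream_id);
    }
}

fn main() {
    let mut t = SendTracker::default();
    t.stream_has_pending_data(0);
    t.mark_blocked(0);
    // A blocked stream is skipped when it reports pending data again.
    t.stream_has_pending_data(0);
    assert!(!t.streams_with_pending_data.contains(&0));
    t.mark_unblocked(0);
    assert!(t.streams_with_pending_data.contains(&0));
    t.remove_stream(0);
    assert!(t.streams_with_pending_data.is_empty());
}
```

The key property is that a blocked stream never re-enters the pending set until something explicitly unblocks it, which is what eliminates the repeated failed send attempts.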
Codecov Report

❌ Patch coverage is …

Additional details and impacted files:

@@            Coverage Diff             @@
##             main    #3015      +/-   ##
==========================================
- Coverage   93.41%   93.34%    -0.07%
==========================================
  Files         124      124
  Lines       36234    36350      +116
==========================================
+ Hits        33847    33930       +83
- Misses       1540     1575       +35
+ Partials      847      845        -2
…²) performance

## Summary

Implements optimizations to eliminate wasteful attempts to send data on flow-control blocked streams and addresses performance issues identified during code review.

## Key Improvements

### 1. Blocked Stream Tracking

- Add `blocked_streams` HashSet to track flow-control blocked streams separately
- Modify `stream_has_pending_data()` to skip adding blocked streams to the pending queue
- Add `check_blocked_streams()` method to automatically move unblocked streams back to pending
- Clean up both pending and blocked sets when streams are removed

### 2. Performance Optimizations

- Use the `mem::take()` pattern in `send_non_control_streams()` to avoid O(n²) iteration
- Implement batching in `check_blocked_streams()` to avoid expensive per-stream calls during iteration
- Properly handle streams that get re-added during processing

## Benefits

- **Reduces CPU usage** by eliminating repeated attempts on blocked streams
- **Improves responsiveness** by automatically detecting when streams become unblocked
- **Fixes algorithmic performance issues** in stream-processing hot paths
- **Maintains backward compatibility**, with all existing tests passing

Closes mozilla#261
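The `mem::take()` pattern mentioned above can be sketched as follows. This is a hypothetical simplification, not the actual neqo code: taking ownership of the pending set means the loop iterates over a fixed snapshot, so streams re-added during processing land in the (now empty) live set and are picked up on the next pass rather than being lost or reprocessed:

```rust
use std::collections::HashSet;

/// Drain the pending set once. `try_send` is a stand-in for the real
/// per-stream send logic; it returns true when the stream was fully sent.
fn send_pending(pending: &mut HashSet<u64>, mut try_send: impl FnMut(u64) -> bool) {
    // `take` leaves an empty set behind; any re-additions during the loop
    // go into `pending`, not into the snapshot we iterate over.
    let snapshot = std::mem::take(pending);
    for stream_id in snapshot {
        if !try_send(stream_id) {
            // Not fully sent: keep the stream pending for the next pass.
            pending.insert(stream_id);
        }
    }
}

fn main() {
    let mut pending: HashSet<u64> = [1, 2, 3].into_iter().collect();
    // Pretend only stream 2 cannot make progress this pass.
    send_pending(&mut pending, |id| id != 2);
    assert_eq!(pending.len(), 1);
    assert!(pending.contains(&2));
}
```

Compared with repeatedly scanning a set that is mutated mid-iteration, each stream is visited at most once per pass, which is what removes the O(n²) behavior.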
Pull Request Overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 3 comments.
```rust
let mut unblocked = Vec::new();
#[expect(
    clippy::iter_over_hash_type,
    reason = "OK to loop over streams in an undefined order."
)]
for &stream_id in &self.blocked_streams {
    match conn.stream_avail_send_space(stream_id) {
        Ok(0) => {
            // Still blocked, do nothing.
        }
        Ok(_) => {
            // Unblocked, collect for removal and move to pending data if needed.
            unblocked.push(stream_id);
        }
        Err(_) => {
            // Stream no longer exists, collect for removal.
            unblocked.push(stream_id);
        }
    }
}
// Remove all unblocked streams from blocked_streams.
for stream_id in unblocked {
    self.blocked_streams.remove(&stream_id);
    if let Some(stream) = self.send_streams.get(&stream_id) {
        if stream.has_data_to_send() {
            self.streams_with_pending_data.insert(stream_id);
        }
    }
}
```
The method iterates over all blocked streams on every call, which could be expensive with many blocked streams. Consider optimizing by only checking blocked streams when flow control updates are received, or implementing a more efficient unblocking mechanism.
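The event-driven alternative the reviewer suggests could look roughly like this. This is a hypothetical sketch (`on_flow_control_update` is an invented hook, not a neqo API): instead of scanning every blocked stream on each send pass, react once per flow-control update for the specific stream it concerns:

```rust
use std::collections::HashSet;

/// Hypothetical simplification of the blocked/pending bookkeeping.
struct Tracker {
    blocked: HashSet<u64>,
    pending: HashSet<u64>,
}

impl Tracker {
    /// Invented hook: called when the peer raises the send limit for
    /// `stream_id`. This is O(1) per update, instead of O(blocked)
    /// work on every send pass.
    fn on_flow_control_update(&mut self, stream_id: u64) {
        if self.blocked.remove(&stream_id) {
            self.pending.insert(stream_id);
        }
    }
}

fn main() {
    let mut t = Tracker {
        blocked: [7].into_iter().collect(),
        pending: HashSet::new(),
    };
    t.on_flow_control_update(7);
    assert!(t.pending.contains(&7));
    assert!(t.blocked.is_empty());
}
```

The trade-off is that the unblocking logic must be wired into the flow-control event path, which couples two previously independent code paths; the polling approach in the PR keeps them separate at the cost of the per-call scan.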
mxinden left a comment:
This increases the amount of state tracking, i.e., it does not come for free, both in terms of complexity and potentially in terms of compute. Unless we have a benchmark showing that this improves performance, i.e., that the additional tracking is cheaper than the continuous attempts to send on a blocked stream, I don't think we should merge this.
Pull Request Overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
```rust
for stream_id in unblocked {
    self.blocked_streams.remove(&stream_id);
    if let Some(stream) = self.send_streams.get(&stream_id) {
        if stream.has_data_to_send() {
```
This creates a potential infinite loop. When mark_stream_for_sending() is called, it adds the stream to streams_with_pending_data, which will cause it to be processed again in send_streams_with_pending_data(). If the stream becomes blocked again, it will be moved back to blocked_streams, then check_blocked_streams() will move it back to pending, creating a cycle.
Suggested change:

```diff
-if stream.has_data_to_send() {
+if stream.has_data_to_send() && !self.streams_with_pending_data.contains(&stream_id) {
```
```rust
match conn.stream_avail_send_space(stream_id) {
    Ok(0) => {
        // No space available, stream is likely blocked.
        self.blocked_streams.insert(stream_id);
    }
    Ok(_) => {
        // Space available, stream can continue sending.
        self.streams_with_pending_data.insert(stream_id);
    }
```
This logic will re-add a stream that was just processed back to streams_with_pending_data in the same iteration, potentially causing it to be processed again immediately. This could lead to inefficient repeated processing of the same stream within a single send_streams_with_pending_data() call.
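One way to avoid reprocessing a stream within the same pass, sketched here under stated assumptions (`avail` stands in for `stream_avail_send_space`; this is not the PR's actual fix), is to classify streams into fresh sets and merge them into the live sets only after the loop finishes:

```rust
use std::collections::HashSet;

/// Classify a snapshot of stream IDs into blocked and pending sets.
/// `avail` mimics `stream_avail_send_space`: Ok(0) means no send space,
/// Ok(n) means n bytes available, Err means the stream no longer exists.
fn classify(
    snapshot: impl IntoIterator<Item = u64>,
    avail: impl Fn(u64) -> Result<usize, ()>,
) -> (HashSet<u64>, HashSet<u64>) {
    let mut blocked = HashSet::new();
    let mut pending = HashSet::new();
    for id in snapshot {
        match avail(id) {
            Ok(0) => {
                // No send space: park the stream as blocked.
                blocked.insert(id);
            }
            Ok(_) => {
                // Space left: keep the stream pending for the *next* pass,
                // not for re-processing within this one.
                pending.insert(id);
            }
            Err(()) => {
                // Stream is gone: drop it from both sets.
            }
        }
    }
    (blocked, pending)
}

fn main() {
    let (blocked, pending) = classify([1, 2, 3], |id| match id {
        1 => Ok(0),
        2 => Ok(100),
        _ => Err(()),
    });
    assert!(blocked.contains(&1));
    assert!(pending.contains(&2));
    assert_eq!(blocked.len() + pending.len(), 2);
}
```

Because the loop only writes to the local sets, a stream classified as pending cannot be picked up again by the same iteration, which addresses the repeated-processing concern above.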
| Branch | fix-261 |
| Testbed | On-prem |
All benchmark results (values in nanoseconds):

| Benchmark | Result (Δ%) | Baseline | Upper Boundary (Limit %) |
|---|---|---|---|
| 1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client | 208,660,000.00 ns (+0.92%) | 206,766,840.58 ns | 216,469,067.65 ns (96.39%) |
| 1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client | 204,780,000.00 ns (+1.93%) | 200,912,115.94 ns | 211,182,472.97 ns (96.97%) |
| 1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client | 38,595,000.00 ns (+21.90%) | 31,661,324.64 ns | 42,922,891.79 ns (89.92%) |
| 1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client | 290,440,000.00 ns (-0.32%) | 291,370,840.58 ns | 303,634,212.32 ns (95.65%) |
| 1-streams/each-1000-bytes/simulated-time | 118,970,000.00 ns (+0.25%) | 118,674,550.72 ns | 120,712,253.92 ns (98.56%) |
| 1-streams/each-1000-bytes/wallclock-time | 586,310.00 ns (-1.07%) | 592,642.00 ns | 615,476.24 ns (95.26%) |
| 1000-streams/each-1-bytes/simulated-time | 2,333,300,000.00 ns (-82.70%) | 13,490,980,579.71 ns | 23,068,540,082.26 ns (10.11%) |
| 1000-streams/each-1-bytes/wallclock-time | 12,511,000.00 ns (-9.61%) | 13,841,753.62 ns | 15,084,963.56 ns (82.94%) |
| 1000-streams/each-1000-bytes/simulated-time | 16,476,000,000.00 ns (-11.70%) | 18,659,620,289.86 ns | 20,618,474,175.45 ns (79.91%) |
| 1000-streams/each-1000-bytes/wallclock-time | 49,854,000.00 ns (-1.61%) | 50,669,866.67 ns | 57,072,130.62 ns (87.35%) |
| RxStreamOrderer::inbound_frame() | 110,340,000.00 ns (+0.59%) | 109,696,956.52 ns | 111,590,965.33 ns (98.88%) |
| coalesce_acked_from_zero 1+1 entries | 89.64 ns (+0.87%) | 88.87 ns | 90.13 ns (99.45%) |
| coalesce_acked_from_zero 10+1 entries | 105.74 ns (-0.32%) | 106.07 ns | 107.22 ns (98.62%) |
| coalesce_acked_from_zero 1000+1 entries | 92.08 ns (+1.94%) | 90.32 ns | 94.94 ns (96.98%) |
| coalesce_acked_from_zero 3+1 entries | 106.54 ns (-0.05%) | 106.59 ns | 107.68 ns (98.95%) |
| decode 1048576 bytes, mask 3f | 1,760,900.00 ns (+7.70%) | 1,634,959.71 ns | 1,809,989.32 ns (97.29%) |
| decode 1048576 bytes, mask 7f | 5,048,300.00 ns (-0.36%) | 5,066,487.54 ns | 5,113,021.54 ns (98.73%) |
| decode 1048576 bytes, mask ff | 3,003,000.00 ns (-0.87%) | 3,029,363.77 ns | 3,053,866.08 ns (98.33%) |
| decode 4096 bytes, mask 3f | 6,252.20 ns (-15.02%) | 7,357.02 ns | 10,374.04 ns (60.27%) |
| decode 4096 bytes, mask 7f | 19,633.00 ns (-0.86%) | 19,802.48 ns | 20,464.83 ns (95.94%) |
| decode 4096 bytes, mask ff | 11,341.00 ns (-0.21%) | 11,365.41 ns | 12,521.98 ns (90.57%) |
| sent::Packets::take_ranges | 4,560.80 ns (-3.38%) | 4,720.32 ns | 4,959.89 ns (91.95%) |
| transfer/pacing-false/same-seed/simulated-time/run | 25,234,000,000.00 ns (-0.69%) | 25,410,201,166.18 ns | 26,029,155,392.72 ns (96.95%) |
| transfer/pacing-false/same-seed/wallclock-time/run | 24,777,000.00 ns (-3.53%) | 25,684,667.64 ns | 27,106,786.37 ns (91.41%) |
| transfer/pacing-false/varying-seeds/simulated-time/run | 25,190,000,000.00 ns (+0.06%) | 25,175,405,247.81 ns | 25,224,533,916.06 ns (99.86%) |
| transfer/pacing-false/varying-seeds/wallclock-time/run | 25,040,000.00 ns (-2.83%) | 25,768,239.07 ns | 27,412,870.51 ns (91.34%) |
| transfer/pacing-true/same-seed/simulated-time/run | 25,301,000,000.00 ns (-1.08%) | 25,577,483,965.01 ns | 25,883,944,847.15 ns (97.75%) |
| transfer/pacing-true/same-seed/wallclock-time/run | 26,401,000.00 ns (-2.33%) | 27,030,889.21 ns | 28,602,554.35 ns (92.30%) |
| transfer/pacing-true/varying-seeds/simulated-time/run | 25,016,000,000.00 ns (+0.08%) | 24,995,355,685.13 ns | 25,044,029,996.46 ns (99.89%) |
| transfer/pacing-true/varying-seeds/wallclock-time/run | 25,322,000.00 ns (-3.61%) | 26,269,545.19 ns | 27,994,757.32 ns (90.45%) |
Signed-off-by: Lars Eggert <lars@eggert.org>
CodSpeed Performance Report: merging #3015 will improve performance by 11.16%.
Benchmarks breakdown
Failed Interop Tests

QUIC Interop Runner, client vs. server, differences relative to b9c32c7.

- neqo-latest as client
- neqo-latest as server

Succeeded Interop Tests

QUIC Interop Runner, client vs. server.

- neqo-latest as client
- neqo-latest as server

Unsupported Interop Tests

QUIC Interop Runner, client vs. server.

- neqo-latest as client
- neqo-latest as server
Benchmark results

Performance differences relative to b9c32c7.

1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: 💔 Performance has regressed. time: [204.47 ms 204.78 ms 205.09 ms]
thrpt: [487.58 MiB/s 488.32 MiB/s 489.06 MiB/s]
change:
time: [+1.6604% +1.8996% +2.1422%] (p = 0.00 < 0.05)
thrpt: [-2.0972% -1.8642% -1.6333%]
1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: 💔 Performance has regressed. time: [288.57 ms 290.44 ms 292.34 ms]
thrpt: [34.207 Kelem/s 34.431 Kelem/s 34.654 Kelem/s]
change:
time: [+1.0774% +1.9952% +2.9143%] (p = 0.00 < 0.05)
thrpt: [-2.8318% -1.9561% -1.0659%]
1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: No change in performance detected. time: [38.443 ms 38.595 ms 38.770 ms]
thrpt: [25.793 B/s 25.910 B/s 26.013 B/s]
change:
time: [-0.7411% -0.1602% +0.4553%] (p = 0.60 > 0.05)
thrpt: [-0.4532% +0.1604% +0.7467%]
1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client: Change within noise threshold. time: [208.37 ms 208.66 ms 208.99 ms]
thrpt: [478.50 MiB/s 479.25 MiB/s 479.91 MiB/s]
change:
time: [-0.7270% -0.5355% -0.3313%] (p = 0.00 < 0.05)
thrpt: [+0.3324% +0.5384% +0.7323%]
decode 4096 bytes, mask ff: No change in performance detected. time: [11.308 µs 11.341 µs 11.380 µs]
change: [-1.0832% -0.4440% +0.0548%] (p = 0.14 > 0.05)
decode 1048576 bytes, mask ff: No change in performance detected. time: [2.9936 ms 3.0030 ms 3.0141 ms]
change: [-0.3209% +0.1246% +0.5597%] (p = 0.61 > 0.05)
decode 4096 bytes, mask 7f: No change in performance detected. time: [19.582 µs 19.633 µs 19.691 µs]
change: [-0.1588% +0.2884% +0.8085%] (p = 0.25 > 0.05)
decode 1048576 bytes, mask 7f: No change in performance detected. time: [5.0337 ms 5.0483 ms 5.0657 ms]
change: [-0.2913% +0.1011% +0.5009%] (p = 0.63 > 0.05)
decode 4096 bytes, mask 3f: No change in performance detected. time: [6.2161 µs 6.2522 µs 6.2960 µs]
change: [-0.0048% +0.4681% +1.0177%] (p = 0.08 > 0.05)
decode 1048576 bytes, mask 3f: Change within noise threshold. time: [1.7581 ms 1.7609 ms 1.7664 ms]
change: [-1.9950% -1.0072% -0.2316%] (p = 0.02 < 0.05)
1-streams/each-1000-bytes/wallclock-time: No change in performance detected. time: [585.12 µs 586.31 µs 587.71 µs]
change: [-0.4693% +0.0600% +0.5569%] (p = 0.82 > 0.05)
1000-streams/each-1-bytes/wallclock-time: 💚 Performance has improved. time: [12.476 ms 12.511 ms 12.546 ms]
change: [-2.5823% -2.1537% -1.7459%] (p = 0.00 < 0.05)
1000-streams/each-1000-bytes/wallclock-time: 💚 Performance has improved. time: [49.733 ms 49.854 ms 49.977 ms]
change: [-1.9723% -1.4463% -1.0104%] (p = 0.00 < 0.05)
1000-streams/each-1000-bytes/simulated-time: No change in performance detected. time: [16.214 s 16.476 s 16.737 s]
thrpt: [58.346 KiB/s 59.271 KiB/s 60.229 KiB/s]
change:
time: [-1.8031% +0.3896% +2.6943%] (p = 0.72 > 0.05)
thrpt: [-2.6236% -0.3881% +1.8362%]
coalesce_acked_from_zero 1+1 entries: No change in performance detected. time: [89.258 ns 89.640 ns 90.018 ns]
change: [-0.1880% +0.2282% +0.6918%] (p = 0.32 > 0.05)
coalesce_acked_from_zero 3+1 entries: No change in performance detected. time: [106.18 ns 106.54 ns 106.93 ns]
change: [-0.5614% +0.0503% +0.6192%] (p = 0.87 > 0.05)
coalesce_acked_from_zero 10+1 entries: No change in performance detected. time: [105.41 ns 105.74 ns 106.15 ns]
change: [-1.1771% -0.3829% +0.3972%] (p = 0.35 > 0.05)
coalesce_acked_from_zero 1000+1 entries: No change in performance detected. time: [91.929 ns 92.075 ns 92.238 ns]
change: [-0.0523% +0.6052% +1.3694%] (p = 0.09 > 0.05)
RxStreamOrderer::inbound_frame(): Change within noise threshold. time: [110.14 ms 110.34 ms 110.66 ms]
change: [+0.5859% +0.7739% +1.0999%] (p = 0.00 < 0.05)
sent::Packets::take_ranges: No change in performance detected. time: [4.4608 µs 4.5608 µs 4.6527 µs]
change: [-2.3333% +0.9104% +4.5284%] (p = 0.62 > 0.05)
transfer/pacing-false/varying-seeds/wallclock-time/run: Change within noise threshold. time: [24.988 ms 25.040 ms 25.101 ms]
change: [+0.9575% +1.2453% +1.5561%] (p = 0.00 < 0.05)
transfer/pacing-false/varying-seeds/simulated-time/run: No change in performance detected. time: [25.155 s 25.190 s 25.225 s]
thrpt: [162.38 KiB/s 162.61 KiB/s 162.83 KiB/s]
change:
time: [-0.1498% +0.0425% +0.2387%] (p = 0.67 > 0.05)
thrpt: [-0.2381% -0.0424% +0.1500%]
transfer/pacing-true/varying-seeds/wallclock-time/run: Change within noise threshold. time: [25.262 ms 25.322 ms 25.385 ms]
change: [-1.0386% -0.7009% -0.3595%] (p = 0.00 < 0.05)
transfer/pacing-true/varying-seeds/simulated-time/run: No change in performance detected. time: [24.980 s 25.016 s 25.052 s]
thrpt: [163.50 KiB/s 163.74 KiB/s 163.97 KiB/s]
change:
time: [-0.0809% +0.1287% +0.3384%] (p = 0.21 > 0.05)
thrpt: [-0.3373% -0.1285% +0.0810%]
transfer/pacing-false/same-seed/wallclock-time/run: Change within noise threshold. time: [24.746 ms 24.777 ms 24.823 ms]
change: [+1.5039% +1.7092% +1.9363%] (p = 0.00 < 0.05)
transfer/pacing-false/same-seed/simulated-time/run: No change in performance detected. time: [25.234 s 25.234 s 25.234 s]
thrpt: [162.32 KiB/s 162.32 KiB/s 162.32 KiB/s]
change:
time: [+0.0000% +0.0000% +0.0000%] (p = NaN > 0.05)
thrpt: [+0.0000% +0.0000% +0.0000%]
transfer/pacing-true/same-seed/wallclock-time/run: No change in performance detected. time: [26.386 ms 26.401 ms 26.417 ms]
change: [-0.0512% +0.1611% +0.3147%] (p = 0.08 > 0.05)
transfer/pacing-true/same-seed/simulated-time/run: No change in performance detected. time: [25.301 s 25.301 s 25.301 s]
thrpt: [161.89 KiB/s 161.89 KiB/s 161.89 KiB/s]
change:
time: [+0.0000% +0.0000% +0.0000%] (p = NaN > 0.05)
thrpt: [+0.0000% +0.0000% +0.0000%]
Client/server transfer results

Performance differences relative to b9c32c7. Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.