
fetch payload of multiple datagrams at once#651

Open
kazuho wants to merge 25 commits into master from kazuho/scatter-stream4

kazuho commented Jan 29, 2026

Problem: Up until now, the on_send_emit callback has been invoked once for each STREAM frame being built. This has become a bottleneck for two reasons:

  • Applications might incur a high fixed cost for generating each payload. For example, they might be calling pread on every call to on_send_emit.
  • Running accounting and prioritization logic for each packet being built is also expensive.

Solution:

Earlier attempts (#609 and #648) tried to reduce the overhead by fetching the payload for multiple QUIC packets in one go—either via preadv, or via pread followed by memmove so the payload ends up exactly where it will be encrypted in place.

The drawback of those approaches is that the payload has to be scattered per packet before being encrypted, which is inefficient.

This PR takes a different approach. Assuming the buffer passed to quicly_send has some extra space beyond the region where packets are built, it:

  • Switches to out-of-place encryption: frames (including application payload) are written starting roughly one MTU after the packet-build area.
  • When quicly_send_stream runs, it reads payload for multiple QUIC packets contiguously into the frame area (i.e., without scattering). It then repeatedly builds a full-sized QUIC packet from the first bytes of the frame data, and prepends a STREAM header to the remaining payload, until the remaining frame data becomes smaller than one MTU.
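The loop described above can be sketched as a small, self-contained model. This is not quicly's code: the `toy_` names, the 16-byte MTU, the fixed 4-byte header, and the XOR "encryption" are all assumptions chosen to keep the example short (real STREAM headers are variable-length and real packets are sealed with an AEAD). What it tries to show is that the payload stays contiguous, and that each new header is written over the tail of bytes that have already been sealed out of place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy sizes, assumed for illustration only. */
enum { TOY_MTU = 16, TOY_HDR = 4 };

/* Toy fixed-size header: a type byte plus a 24-bit stream offset. */
static void toy_write_header(uint8_t *dst, uint32_t offset)
{
    dst[0] = 0x08;
    dst[1] = (uint8_t)(offset >> 16);
    dst[2] = (uint8_t)(offset >> 8);
    dst[3] = (uint8_t)offset;
}

/* Stand-in for out-of-place encryption: reads src, writes dst. */
static void toy_seal(uint8_t *dst, const uint8_t *src, size_t len)
{
    for (size_t i = 0; i < len; ++i)
        dst[i] = (uint8_t)(src[i] ^ 0x5a);
}

/*
 * `frames` holds TOY_HDR bytes of headroom followed by `payload_len` bytes of
 * contiguous payload (as if fetched by a single callback invocation).
 * `packets` must have room for payload_len / (TOY_MTU - TOY_HDR) packets.
 * Returns the number of full-sized packets built; `*leftover` receives the
 * number of frame bytes (header included) that did not fill a packet.
 */
static size_t toy_build_packets(uint8_t *frames, size_t payload_len, uint32_t stream_off,
                                uint8_t *packets, size_t *leftover)
{
    uint8_t *cur = frames;                    /* start of the current frame */
    size_t remaining = TOY_HDR + payload_len; /* frame bytes from cur to the end */
    size_t built = 0;

    assert(frames != NULL && packets != NULL && leftover != NULL);

    toy_write_header(cur, stream_off);
    while (remaining >= TOY_MTU) {
        /* Build one full-sized packet from the first bytes of the frame data. */
        toy_seal(packets + built * TOY_MTU, cur, TOY_MTU);
        ++built;
        /* Advance past the payload just sent, minus room for the next header. */
        stream_off += TOY_MTU - TOY_HDR;
        cur += TOY_MTU - TOY_HDR;
        remaining -= TOY_MTU - TOY_HDR;
        /* Prepend a header to the remaining payload; this overwrites bytes
         * that were already sealed into the packet area. */
        toy_write_header(cur, stream_off);
    }
    *leftover = remaining;
    return built;
}
```

With 30 bytes of payload, this model builds two full packets and leaves 10 frame bytes (a 4-byte header plus 6 bytes of payload) for whatever is built next.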

When running at full speed, quicly_send typically builds multiple packets in a batch (often up to ten). Because the packet and frame construction regions overlap within the same buffer, the additional L1 cache footprint caused by out-of-place encryption is minimized.

Separately, this pull request allows the on_send_emit callback to return a length of zero to indicate that payload is not immediately available. In that case, quicly_send_stream bails out with the error code QUICLY_ERROR_SEND_EMIT_BLOCKED. When an on_send_emit callback behaves this way, the stream scheduler should handle the new error code and refrain from calling quicly_send_stream until the payload becomes available.
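A scheduler following that advice might look roughly like the mock below. Everything here is a stand-in: `toy_send_stream` plays the role of quicly_send_stream driving a callback that reports zero length, `TOY_ERROR_SEND_EMIT_BLOCKED` is an invented value standing in for QUICLY_ERROR_SEND_EMIT_BLOCKED, and the `toy_stream` struct is hypothetical. The sketch only illustrates the control flow: park a stream when it reports "blocked", and resume it once the application signals that payload is ready.

```c
#include <stddef.h>

/* Invented placeholder; the real constant is defined by quicly. */
#define TOY_ERROR_SEND_EMIT_BLOCKED 0xff01

/* Hypothetical per-stream state kept by the application's scheduler. */
struct toy_stream {
    size_t available; /* bytes the application can emit right now */
    int blocked;      /* nonzero once a send attempt found no payload */
    size_t sent;      /* total bytes handed to the transport */
};

/* Stand-in for quicly_send_stream with a callback that may report zero length. */
static int toy_send_stream(struct toy_stream *s)
{
    if (s->available == 0)
        return TOY_ERROR_SEND_EMIT_BLOCKED;
    s->sent += s->available;
    s->available = 0;
    return 0;
}

/* One scheduler pass. Returns the number of streams on which a send was attempted. */
static size_t toy_scheduler_run(struct toy_stream *streams, size_t n)
{
    size_t attempts = 0;
    for (size_t i = 0; i < n; ++i) {
        if (streams[i].blocked)
            continue; /* refrain from calling send until payload is available */
        ++attempts;
        if (toy_send_stream(&streams[i]) == TOY_ERROR_SEND_EMIT_BLOCKED)
            streams[i].blocked = 1;
    }
    return attempts;
}

/* Called by the application when payload for a parked stream arrives. */
static void toy_payload_ready(struct toy_stream *s, size_t len)
{
    s->available += len;
    s->blocked = 0;
}
```

The point of the `blocked` flag is that a parked stream costs nothing on later passes; only `toy_payload_ready` puts it back in rotation.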

Comment thread lib/quicly.c Outdated
@kazuho kazuho changed the title from "Kazuho/scatter stream4" to "fetch payload of multiple datagrams at once" on Jan 29, 2026
@kazuho kazuho force-pushed the kazuho/scatter-stream4 branch from 5ee458a to 4b7df80 on January 29, 2026 07:59