feat: add Large Message Segmentation extension for GossipSub v1.3 #1323

Open: shivv23 wants to merge 2 commits into libp2p:main from shivv23:feat/large-message-segmentation
@shivv23 shivv23 commented May 8, 2026

Summary

Implement the Large Message Segmentation extension for GossipSub v1.3 in py-libp2p. This allows nodes to transparently split oversize pubsub payloads (>256 KiB) into ordered segments, propagate them independently through the mesh, and reassemble them on the receiving side — without any changes to the application-layer API.

The implementation follows the experimental spec draft from seetadev/specs#2 and integrates with the existing GossipSub v1.3 Extensions Control Message framework (gossipsub-v1.3.md).

Why this matters: Several emerging libp2p workloads — distributed ML training checkpoints, Ethereum blob propagation (EIP-7594 / PeerDAS), large event logs — routinely produce pubsub messages in the megabyte range. Currently, py-libp2p silently drops any message whose serialized RPC exceeds 1 MiB (see RpcQueue.split_rpc). This PR closes that gap.


Design

Wire format

A new experimental protobuf message is added to rpc.proto:

message LargeMessageSegmentationExtension {
  optional bytes  messageID     = 1;   // groups segments of the same original message
  optional uint32 segmentIndex  = 2;   // ordinal position (0-based)
  optional uint32 totalSegments = 3;   // total segment count
  optional bytes  payload       = 4;   // raw bytes for this segment
  optional bytes  checksum      = 5;   // SHA-256 of the full reassembled message (receiver verifies after reassembly)
}

Registered on the RPC message at field 6492435 (experimental range, >0x200000 per v1.3 spec).

Extension advertisement

Peers advertise support via the v1.3 ControlExtensions handshake. PeerExtensions gains a large_message_segmentation: bool field. The ExtensionsState.both_support_large_message_segmentation(peer_id) query gates segmentation to peers that have mutually negotiated the extension — unmodified peers receive the full message as before.
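
The gating rule above can be sketched as follows. This is a minimal illustration, not the PR's code: the class and method names come from the description, but the fields, constructor, and `record_peer` helper are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class PeerExtensions:
    # Stand-in for the PR's PeerExtensions; only the one flag is modeled here.
    large_message_segmentation: bool = False


class ExtensionsState:
    """Tracks what we advertise locally and what each peer advertised back."""

    def __init__(self, local: PeerExtensions) -> None:
        self.local = local
        self.peers: Dict[str, PeerExtensions] = {}

    def record_peer(self, peer_id: str, ext: PeerExtensions) -> None:
        # Hypothetical helper: called when a peer's ControlExtensions arrives.
        self.peers[peer_id] = ext

    def both_support_large_message_segmentation(self, peer_id: str) -> bool:
        # Segmentation is allowed only if both sides advertised the extension.
        remote = self.peers.get(peer_id)
        return bool(
            self.local.large_message_segmentation
            and remote is not None
            and remote.large_message_segmentation
        )
```

A peer that never completed the handshake is treated the same as one that advertised no support, so the sender falls back to the unsegmented message.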

Segmentation (sender side, in GossipSub.publish)

  1. Compute pubsub_msg.SerializeToString(). If the serialized size ≤ segment_size (default 256 KiB), send as-is.
  2. Otherwise, generate a unique messageID (reusing seqno when available), then split the serialized message into ceiling(len(data) / segment_size) chunks.
  3. For each chunk, emit a standalone RPC containing only the LargeMessageSegmentationExtension — no publish, control, or subscriptions fields.
  4. Send segments only to peers that advertised largeMessageSegmentation support. All other peers receive the original message (which may be dropped at the transport if >1 MiB, but the protocol remains correct).
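
The sender-side steps can be sketched roughly as below. The names `Segment`, `segment_message()`, and `should_segment()` appear in the files-changed list, but the signatures and field names here are assumptions chosen for illustration, and plain dataclasses stand in for the generated protobuf types.

```python
import hashlib
from dataclasses import dataclass
from typing import List

DEFAULT_SEGMENT_SIZE = 256 * 1024  # 256 KiB, the default quoted in the description


@dataclass(frozen=True)
class Segment:
    message_id: bytes
    segment_index: int   # 0-based position
    total_segments: int
    payload: bytes
    checksum: bytes      # SHA-256 of the full, unsegmented message


def should_segment(data: bytes, segment_size: int = DEFAULT_SEGMENT_SIZE) -> bool:
    # Messages at or below the threshold are sent as-is.
    return len(data) > segment_size


def segment_message(
    data: bytes, message_id: bytes, segment_size: int = DEFAULT_SEGMENT_SIZE
) -> List[Segment]:
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    checksum = hashlib.sha256(data).digest()
    # Ceiling division without floats; empty input still yields one segment.
    total = max(1, -(-len(data) // segment_size))
    return [
        Segment(
            message_id=message_id,
            segment_index=i,
            total_segments=total,
            payload=data[i * segment_size : (i + 1) * segment_size],
            checksum=checksum,
        )
        for i in range(total)
    ]
```

Under these assumptions a 600 KiB message yields three segments: two of 256 KiB and a final one of 88 KiB.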

Reassembly (receiver side, via ReassemblyBuffer in segmentation.py)

  1. continuously_read_stream detects the largeMessageSegmentation field on the incoming RPC and routes it to the ReassemblyBuffer.
  2. The buffer indexes segments by messageID. It tracks total expected count, received indices, and a first-seen timestamp.
  3. When len(received) == totalSegments, the segments are sorted by index, concatenated, and the SHA-256 checksum is verified.
  4. The reconstructed bytes are parsed as an rpc_pb2.Message and injected into the normal push_msg validation pipeline. The application sees a normal pubsub message — no API changes required.
  5. Incomplete sets older than 120 seconds are garbage-collected to prevent memory exhaustion.
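
A simplified sketch of the reassembly logic described above follows. It is not the PR's `ReassemblyBuffer`: the method names `add` and `evict_expired`, the injectable clock, and the handling of inconsistent segments are assumptions, and the max-cap eviction mentioned in the test plan is left out.

```python
import hashlib
import time
from typing import Callable, Dict, Optional, Tuple


class ReassemblyBuffer:
    def __init__(
        self, timeout: float = 120.0, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self.timeout = timeout
        self.clock = clock
        # message_id -> (total, checksum, first_seen, {index: payload})
        self._pending: Dict[bytes, Tuple[int, bytes, float, Dict[int, bytes]]] = {}

    def add(
        self, message_id: bytes, index: int, total: int, payload: bytes, checksum: bytes
    ) -> Optional[bytes]:
        """Store one segment; return the full message once complete and verified."""
        if total <= 0 or not 0 <= index < total:
            return None  # malformed segment
        entry = self._pending.get(message_id)
        if entry is None:
            entry = (total, checksum, self.clock(), {})
            self._pending[message_id] = entry
        expected_total, expected_checksum, _, parts = entry
        if total != expected_total or checksum != expected_checksum:
            return None  # disagrees with earlier segments; ignored in this sketch
        parts.setdefault(index, payload)  # duplicates keep the first copy
        if len(parts) < expected_total:
            return None
        del self._pending[message_id]
        data = b"".join(parts[i] for i in range(expected_total))
        if hashlib.sha256(data).digest() != expected_checksum:
            return None  # checksum mismatch: drop the whole message
        return data

    def evict_expired(self) -> int:
        """Drop incomplete sets older than the timeout; return how many were dropped."""
        now = self.clock()
        stale = [
            mid
            for mid, (_, _, first_seen, _) in self._pending.items()
            if now - first_seen > self.timeout
        ]
        for mid in stale:
            del self._pending[mid]
        return len(stale)
```

In the flow described above, the bytes returned by `add` would then be parsed as an `rpc_pb2.Message` and handed to `push_msg`; that step is omitted here to keep the example self-contained.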

Backward compatibility

| Scenario | Behavior |
| --- | --- |
| New sender → New receiver | Segmented |
| New sender → Old receiver | Full message sent (fallback) |
| Old sender → New receiver | Normal message; no segment field → no-op |
| Mixed mesh | Each peer pair independently negotiates |

Files changed

| File | Change |
| --- | --- |
| libp2p/pubsub/pb/rpc.proto | Added LargeMessageSegmentationExtension message + largeMessageSegmentation field on RPC and ControlExtensions |
| libp2p/pubsub/segmentation.py | New: Segment, ReassemblyBuffer, segment_message(), reassemble_segments(), should_segment() |
| libp2p/pubsub/extensions.py | Added large_message_segmentation to PeerExtensions + query methods |
| libp2p/pubsub/gossipsub.py | Segmentation check in publish(); segment intercept in handle_rpc(); max_msg_size / segment_size params |
| libp2p/pubsub/pubsub.py | max_msg_size plumbing → RpcQueue; segment intercept in continuously_read_stream |

Test plan

  • Unit tests for segment_message / reassemble_segments (happy path, checksum mismatch, empty input)
  • Unit tests for ReassemblyBuffer (single segment, out-of-order delivery, partial set, timeout eviction, max cap eviction)
  • Integration test: py-libp2p sender → py-libp2p receiver with 4 MiB payload
  • Interop test: py-libp2p sender → nim-libp2p receiver (requires companion PR in nim-libp2p)

Open questions for reviewers

  1. messageID collision resistance: Currently derived from seqno which is (peer_id, counter). Is this sufficient, or should we include a topic hash to avoid cross-topic collisions?
  2. Segment backpressure: Should segments be queued with priority below normal pubsub messages to avoid starving small messages?
  3. Segment-level flow control: The spec draft doesn't define a window/ACK mechanism. For very large messages (>10 MiB), should we recommend application-level pacing?

Implements transparent segmentation of large pubsub payloads (>256 KiB)
in py-libp2p. Messages exceeding the segment size threshold are split into
ordered chunks, propagated independently through the mesh, and reassembled
on the receiving side.

Key components:
- rpc.proto: LargeMessageSegmentationExtension message + capability flag
- segmentation.py: segment_message(), reassemble_segments(), ReassemblyBuffer
- extensions.py: large_message_segmentation field in PeerExtensions
- gossipsub.py: segmentation in publish(), intercept in handle_rpc()
- pubsub.py: max_msg_size config, segment routing in read loop
- tests: 20 unit tests for segmentation and reassembly

Spec: seetadev/specs#2
shivv23 commented May 8, 2026

@shivv23: C4GT DMP 2026 contributor, working on GossipSub 1.4 Large Message Handling

👋 Hey @seetadev, @johannamoran, @theUtkarshRaj, @mhchia, @ralexstokes — this PR implements the Large Message Segmentation extension for py-libp2p's GossipSub v1.3.

Context: This is part of the GossipSub 1.4 spec effort (experimental draft at seetadev/specs#2). The spec defines the wire format and protocol semantics; this PR is the py-libp2p reference implementation.

Design decisions I'd particularly like feedback on:

  • The extension is negotiated via the v1.3 ControlExtensions handshake — unmodified peers fall back to receiving the full message. Does this approach to backward compat look right?
  • Segment-level flow control isn't in the spec draft yet. For payloads in the 10+ MiB range, do you see a need for a window/ACK mechanism, or is application-level pacing sufficient?
  • I've registered largeMessageSegmentation at field number 6492435 (experimental range, >0x200000). Happy to change this if there's a canonical allocation process I should follow.

Happy to iterate on any of this — design, style, test coverage, whatever makes sense for the project.

@theUtkarshRaj
Hey @shivv23, this is great to see — the spec PR was sitting in a vacuum without an implementation tracking it, so this helps a lot. Your approach (handshake-gated fallback through v1.3 ControlExtensions) is basically what I had in mind when drafting it.

Couple of things on the spec side worth aligning on:

Field numbers — I went with 8473921 in the spec, you went with 6492435. Both are in the experimental range so neither is wrong, but we should pick one. I don't really care which — let me know if 6492435 works for you and I'll update the spec, or I can keep 8473921 and you can adjust. Just need them to match.

Segment size — you defaulted to 256 KiB, I have 1 MiB as the max in the spec. These don't actually conflict, 256 KiB is fine as an implementation default under a 1 MiB ceiling, but I should probably make that more explicit in the spec text so it doesn't look like we disagree.

On your open questions:

  1. messageID collisions — I went with SHA-256(publisherPeerID || topic || nonce)[:16] in the spec for exactly the reason you raised. Seqno alone is fine within a single publisher but doesn't cover cross-topic, so I'd suggest pulling topic into the derivation on your end too.

  2. Backpressure / segment priority — honestly I don't think this belongs in the spec at all. It's an implementation choice. The spec already says segments share the parent mesh, beyond that I think queueing strategy should be left to the implementer. Your idea of deprioritizing segments below small messages sounds reasonable.

  3. Flow control / ACK window — I deliberately left retry out of the current draft because it gets messy and is better answered with real interop data. For very large payloads I'd lean on application-level pacing in v1.4. If we hit limits in testing we can revisit, maybe as a separate spec proposal.
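
As one possible shape for the segment-deprioritization idea in point 2, here is a small sketch of a two-level outbound queue. Everything in it (`OutboundQueue`, the priority constants, the `is_segment` flag) is hypothetical and not taken from py-libp2p's `RpcQueue`.

```python
import heapq
import itertools
from typing import List, Tuple

PRIORITY_NORMAL = 0   # ordinary pubsub RPCs
PRIORITY_SEGMENT = 1  # segment RPCs drain after normal traffic


class OutboundQueue:
    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, bytes]] = []
        # Monotonic counter keeps FIFO order within a priority level.
        self._counter = itertools.count()

    def push(self, rpc: bytes, is_segment: bool = False) -> None:
        priority = PRIORITY_SEGMENT if is_segment else PRIORITY_NORMAL
        heapq.heappush(self._heap, (priority, next(self._counter), rpc))

    def pop(self) -> bytes:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)
```

Strict priority like this can delay segments indefinitely under sustained small-message load, so a real implementation would likely need some fairness bound; that trade-off is left open here, as it is in the thread.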

I'll push a small spec update soon to clarify the segment-size point and tighten the messageID wording. For interop testing — once you have the test plan landing, happy to coordinate on the nim-libp2p side.

shivv23 commented May 8, 2026

Thanks @theUtkarshRaj — this is exactly the kind of alignment I was hoping for. Let me lock down the action items:

Field numbers: Let's go with 6492435 since it's already in the code here. If you update the spec to match, we're consistent on day one. I can open a PR against your spec draft with the number change and the segment-size clarification if that saves you a cycle — just say the word.

messageID derivation: You're right that seqno alone is too narrow. I'll switch to SHA-256(publisherPeerID || topic || nonce)[:16] in the next push. The topics are available in GossipSub.publish() where segmentation happens, so it's a straightforward change — and it covers the cross-topic collision case cleanly.
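
For illustration, the derivation could look roughly like this. The helper name matches the `compute_message_id()` mentioned in the follow-up commit, but the signature, the UTF-8 topic encoding, and the plain byte concatenation are assumptions; the thread does not spell out how the three inputs are encoded or delimited.

```python
import hashlib


def compute_message_id(publisher_peer_id: bytes, topic: str, nonce: bytes) -> bytes:
    # SHA-256(publisherPeerID || topic || nonce), truncated to the first 16 bytes.
    # Assumption: "||" is plain concatenation with no length prefixes.
    digest = hashlib.sha256(publisher_peer_id + topic.encode("utf-8") + nonce).digest()
    return digest[:16]
```

With unprefixed concatenation, different (peer ID, topic, nonce) triples can in principle serialize to the same byte string, so the exact encoding is worth pinning down in the spec text.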

Segment size / ceiling: Agreed — I'll leave the code default at 256 KiB and note in a comment that the spec ceiling is 1 MiB. Your clarification in the spec text would be helpful for other implementers reading it.

Interop: Once the next iteration is pushed (messageID fix + any review feedback), I'll draft the interop test plan. Would be good to set up a quick sync — if there's a channel or meeting cadence for this project, loop me in.

One thing I'd like your eyes on specifically: the ReassemblyBuffer in segmentation.py uses a flat timeout of 120s for GC. Does nim-libp2p's prototype use the same window, or should I parameterize this per-implementation?

@theUtkarshRaj
Thanks @shivv23 — sounds good, this is roughly how I was hoping the spec and implementation would converge.

Yes, please go ahead and open the PR against my spec draft with the field number update (8473921 → 6492435) and the segment-size ceiling note. Saves me a cycle and it's cleaner for reviewers to see the spec and code align in the same iteration.

On the messageID change — should be a small diff in your publish path. Once it lands I'll cross-reference your implementation from the spec text as the reference derivation.

On the reassembly timeout: I think this is worth pinning down in the spec rather than leaving fully per-implementation. 120s is a reasonable default but the right number depends on expected payload size and mesh propagation latency. My lean is to parameterize it as a configurable per-topic value with a recommended default in the 60–120s range — implementations expose the knob, applications tune it. nim-libp2p's prototype hasn't standardized this either as far as I've seen, so we have room to set the precedent. I'll add a reassembly lifecycle section to the spec in the next push covering this alongside the buffer caps from the security considerations.
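
A per-topic knob with a recommended default might be exposed along these lines. `ReassemblyConfig`, its fields, and the topic names in the usage note are hypothetical, intended only to show the shape of the idea.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ReassemblyConfig:
    # Fallback used when a topic has no override (120s is the PR's current default).
    default_timeout: float = 120.0
    # Hypothetical per-topic overrides, in seconds.
    per_topic_timeout: Dict[str, float] = field(default_factory=dict)

    def timeout_for(self, topic: str) -> float:
        return self.per_topic_timeout.get(topic, self.default_timeout)
```

An application carrying large checkpoints on one topic could then set, for example, `ReassemblyConfig(per_topic_timeout={"checkpoints": 300.0})` while other topics keep the default.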

On sync — there's no formal channel set up that I'm aware of. For now I'd suggest using the PR comment threads as the primary surface and pinging @seetadev / @johannamoran when we hit decisions that need mentor input. Closer to interop testing it'll probably make sense to set up a real-time sync, happy to organize that when we get there.

…field alignment

- Switch messageID from raw seqno to SHA-256(publisherID || topic || nonce)[:16]
  per spec draft recommendation (avoids cross-publisher/cross-topic collisions)
- Add compute_message_id() helper to segmentation.py
- Add reassembly_timeout config parameter (default 120s)
- Align extensions.proto field number 8473921 -> 6492435
  (coordination PR: theUtkarshRaj/specs#1)
shivv23 commented May 8, 2026

All aligned and pushed:

Spec PR: theUtkarshRaj/specs#1 — field number updated to 6492435, RPC.largeSegmentation renamed to RPC.largeMessageSegmentation, and added a note about the 256 KiB default under the 1 MiB ceiling.

py-libp2p update (c05a75b):

  • messageID now uses SHA-256(publisherPeerID || topic || nonce)[:16] — exactly what the spec draft prescribes
  • reassembly_timeout is now configurable (default 120s), surfaced through GossipSub.__init__
  • ready for the reassembly lifecycle section when your spec update lands

theUtkarshRaj pushed a commit to theUtkarshRaj/specs that referenced this pull request May 8, 2026
- Change largeMessageSegmentation field from 8473921 to 6492435
  in both ControlExtensions and RPC to match py-libp2p (PR libp2p/py-libp2p#1323)
- Rename RPC.largeSegmentation to RPC.largeMessageSegmentation for consistency
- Note py-libp2p's 256 KiB default segment size under Open Question 2