From 05a41cac5e671690ee32d2f2c0a3efdac10dc948 Mon Sep 17 00:00:00 2001 From: theutkarshraj Date: Wed, 6 May 2026 23:49:47 +0530 Subject: [PATCH 1/4] gossipsub: add experimental large message segmentation draft Co-authored-by: Cursor --- .../large-message-segmentation.md | 84 +++++++++++++++++++ pubsub/gossipsub/extensions/extensions.proto | 16 ++++ 2 files changed, 100 insertions(+) create mode 100644 pubsub/gossipsub/extensions/experimental/large-message-segmentation.md diff --git a/pubsub/gossipsub/extensions/experimental/large-message-segmentation.md b/pubsub/gossipsub/extensions/experimental/large-message-segmentation.md new file mode 100644 index 000000000..eaee8a83c --- /dev/null +++ b/pubsub/gossipsub/extensions/experimental/large-message-segmentation.md @@ -0,0 +1,84 @@ +# Large Message Segmentation Extension + +| Lifecycle Stage | Maturity | Status | Latest Revision | +| --------------- | ------------- | ------ | --------------- | +| 1A | Working Draft | Active | r0, 2026-05-06 | + +Authors: [@theUtkarshRaj] + +Interest Group: [@seetadev], [@johannamoran] + +[@theUtkarshRaj]: https://github.com/theUtkarshRaj +[@seetadev]: https://github.com/seetadev +[@johannamoran]: https://github.com/johannamoran + +See the [lifecycle document][lifecycle-spec] for context about the maturity level +and spec status. + +[lifecycle-spec]: https://github.com/libp2p/specs/blob/master/00-framework-01-spec-lifecycle.md + +## Overview + +This draft explores transparent segmentation of large payloads at the Gossipsub +layer so implementations can propagate data that may not fit practical message +size expectations in a single unit. This differs from the Partial Messages +extension: partial-messages optimizes for the case where a peer already holds +most of a message, while segmentation handles the case where no peer holds the +full payload yet and it must be chunked, propagated, and reassembled. 
The two +approaches are complementary, not competing, and future interoperability +testing between py-libp2p and nim-libp2p will help validate the boundary. + +## Motivation + +One motivation is emerging workloads where single logical payloads are often +large, such as distributed AI model updates, large event logs, and state +snapshots. One approach may be to segment these payloads for transport while +preserving existing pubsub topic behavior. + +## Segment Structure + +One approach may be to encode each segment with a compact envelope: + +The `messageID` identifies which segments belong together across the mesh. +The `segmentIndex` communicates the ordering position for reassembly. +The `totalSegments` tells a receiver when a full set is present. +The `payload` carries the raw bytes for this segment. +The `checksum` enables integrity verification before or after reassembly. + +## Reconstruction + +Receivers buffer segments by `messageID` until all expected indexes are +available. Once all segments are present, implementations reassemble in index +order and pass the reconstructed message through existing validation flows. +Incomplete segment sets are discarded after a configurable window. + +## Interaction with Peer Scoring + +This draft explores scoring at the reconstructed message level rather than the +segment level. For the P3 question specifically, a delivery is counted only +when a complete message is successfully reassembled. Segments that arrive but +never form a complete set are not counted as successful deliveries. If the +delivery window expires before reconstruction completes, one approach may be to +treat that outcome as a missed delivery for scoring purposes. + +## Open Questions + +1. Should `messageID` be application-provided or protocol-generated? +2. What is the recommended maximum segment payload size, and should this be + fixed in the spec or left to implementations? 
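The envelope fields and the reconstruction step described above can be sketched in a few lines of Python. This is a minimal illustration, not the py-libp2p implementation. The names `Segment`, `split_payload`, `reassemble`, and `segment_checksum` are hypothetical. The checksum follows the later revision in this series (SHA-256 over `messageID || segmentIndex || payload`). Encoding `segmentIndex` as a 4-byte big-endian integer is an assumption, since the draft does not fix an encoding.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical in-memory mirror of LargeMessageSegmentationExtension."""
    message_id: bytes
    segment_index: int
    total_segments: int
    payload: bytes
    checksum: bytes


def segment_checksum(message_id: bytes, segment_index: int, payload: bytes) -> bytes:
    # Assumption: segmentIndex is hashed as a 4-byte big-endian integer.
    index_bytes = segment_index.to_bytes(4, "big")
    return hashlib.sha256(message_id + index_bytes + payload).digest()


def split_payload(message_id: bytes, data: bytes, max_segment_size: int) -> list[Segment]:
    if max_segment_size <= 0:
        raise ValueError("max_segment_size must be positive")
    # An empty payload still yields one empty segment so a complete set exists.
    chunks = [data[i:i + max_segment_size]
              for i in range(0, len(data), max_segment_size)] or [b""]
    return [
        Segment(message_id, i, len(chunks), chunk,
                segment_checksum(message_id, i, chunk))
        for i, chunk in enumerate(chunks)
    ]


def reassemble(segments: list[Segment]) -> bytes:
    if not segments:
        raise ValueError("no segments")
    first = segments[0]
    by_index: dict[int, Segment] = {}
    for seg in segments:
        # All segments must agree on the parent message and the set size.
        if seg.message_id != first.message_id or seg.total_segments != first.total_segments:
            raise ValueError("inconsistent segment set")
        if seg.checksum != segment_checksum(seg.message_id, seg.segment_index, seg.payload):
            raise ValueError("checksum mismatch")
        by_index[seg.segment_index] = seg
    # Every index 0..totalSegments-1 must be present before joining in order.
    if sorted(by_index) != list(range(first.total_segments)):
        raise ValueError("incomplete segment set")
    return b"".join(by_index[i].payload for i in range(first.total_segments))
```

Arrival order does not matter in this sketch: segments are keyed by `segmentIndex` and joined in index order only once the full set is present.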
+ +## Protobuf + +Refer to the protobuf registry at ./extensions/extensions.proto + +```protobuf +syntax = "proto2"; + +message LargeMessageSegmentationExtension { + optional bytes messageID = 1; + optional uint32 segmentIndex = 2; + optional uint32 totalSegments = 3; + optional bytes payload = 4; + optional bytes checksum = 5; +} +``` diff --git a/pubsub/gossipsub/extensions/extensions.proto b/pubsub/gossipsub/extensions/extensions.proto index 04f539d58..44beb1510 100644 --- a/pubsub/gossipsub/extensions/extensions.proto +++ b/pubsub/gossipsub/extensions/extensions.proto @@ -8,6 +8,10 @@ message ControlExtensions { optional bool testExtension = 6492434; + // Experimental: Large Message Segmentation + // Spec: ./experimental/large-message-segmentation.md + optional bool largeMessageSegmentation = 8473921; + } message ControlMessage { @@ -47,6 +51,10 @@ message RPC { optional TestExtension testExtension = 6492434; + // Experimental: Large Message Segmentation + // Spec: ./experimental/large-message-segmentation.md + optional LargeMessageSegmentationExtension largeSegmentation = 8473921; + } message PartialMessagesExtension { @@ -59,3 +67,11 @@ message PartialMessagesExtension { // An encoded representation of the parts a peer has and wants. 
optional bytes partsMetadata = 4; } + +message LargeMessageSegmentationExtension { + optional bytes messageID = 1; + optional uint32 segmentIndex = 2; + optional uint32 totalSegments = 3; + optional bytes payload = 4; + optional bytes checksum = 5; +} From f909f550789b768198c588cafa260fe076b93cc3 Mon Sep 17 00:00:00 2001 From: theutkarshraj Date: Fri, 8 May 2026 17:09:34 +0530 Subject: [PATCH 2/4] spec: add security considerations, wire format details, and gossipsub interaction --- .../large-message-segmentation.md | 46 ++++++++++++++++++- 1 file changed, 45 insertions(+), 1 deletion(-) diff --git a/pubsub/gossipsub/extensions/experimental/large-message-segmentation.md b/pubsub/gossipsub/extensions/experimental/large-message-segmentation.md index eaee8a83c..74f4cf9bf 100644 --- a/pubsub/gossipsub/extensions/experimental/large-message-segmentation.md +++ b/pubsub/gossipsub/extensions/experimental/large-message-segmentation.md @@ -43,7 +43,7 @@ The `messageID` identifies which segments belong together across the mesh. The `segmentIndex` communicates the ordering position for reassembly. The `totalSegments` tells a receiver when a full set is present. The `payload` carries the raw bytes for this segment. -The `checksum` enables integrity verification before or after reassembly. +The `checksum` is the SHA-256 hash of `messageID || segmentIndex || payload`, used for per-segment integrity verification before reassembly. ## Reconstruction @@ -52,6 +52,17 @@ available. Once all segments are present, implementations reassemble in index order and pass the reconstructed message through existing validation flows. Incomplete segment sets are discarded after a configurable window. +## Interaction with Existing Gossipsub Mechanics + +Segments propagate through the same mesh as their parent topic. Each segment is +itself a gossipsub message and is subject to the standard `MessageID` +computation, IHAVE/IWANT, and IDONTWANT mechanics at the segment level. 
The +`messageID` field defined in this extension identifies the parent payload only +and is distinct from gossipsub's per-message ID computed over the segment +envelope. Implementations MUST NOT forward a duplicate segment (same `messageID` +plus `segmentIndex`) and SHOULD treat duplicates as a signal for IDONTWANT +propagation. + ## Interaction with Peer Scoring This draft explores scoring at the reconstructed message level rather than the @@ -61,11 +72,44 @@ never form a complete set are not counted as successful deliveries. If the delivery window expires before reconstruction completes, one approach may be to treat that outcome as a missed delivery for scoring purposes. +## Security Considerations + +**Reassembly buffer exhaustion.** A malicious peer can announce large +`totalSegments` values and send only a subset, forcing receivers to buffer +indefinitely. Mitigations: per-peer cap on outstanding incomplete reassembly +buffers; per-messageID memory cap derived from `totalSegments * maxSegmentSize`; +configurable reassembly timeout after which buffers are evicted. + +**Segment flooding under forged messageID.** Without binding `messageID` to the +publisher, an attacker can pollute reassembly buffers with junk segments sharing +a victim's `messageID`. Mitigation: derive `messageID` deterministically from +publisher identity, or require segments to carry the same publisher signature as +the parent gossipsub message. + +**Last-segment withholding.** A peer can deliver `totalSegments - 1` segments +and withhold the final one to grief reassembly. Mitigations: reassembly timeout +with eviction; peer scoring penalty for repeatedly contributing to incomplete +reassemblies. + +**Inconsistent totalSegments.** Two segments claiming the same `messageID` but +different `totalSegments` indicate forgery or implementation bug. Receivers MUST +discard the entire reassembly buffer for that `messageID` on detection. + ## Open Questions 1. 
Should `messageID` be application-provided or protocol-generated? + Tentative answer: protocol-generated as + `SHA-256(publisherPeerID || topic || nonce)[:16]`, set by the publisher. + This avoids cross-publisher collisions and lets receivers index reassembly + buffers without trusting application semantics. Application-provided IDs + remain a possible alternative where publisher-side determinism is required. + 2. What is the recommended maximum segment payload size, and should this be fixed in the spec or left to implementations? + Tentative answer: maximum of 1 MiB matching common gossipsub + `MaxMessageSize` defaults, with publishers free to choose any size at or + below the maximum. A single fixed size is rejected because optimal sizing + depends on MTU, topic semantics, and bandwidth. ## Protobuf From fa824f2e0a50c35ac589cedcf3b87053368e2c03 Mon Sep 17 00:00:00 2001 From: shivv23 Date: Fri, 8 May 2026 22:47:40 +0530 Subject: [PATCH 3/4] align field number and naming with py-libp2p reference implementation - Change largeMessageSegmentation field from 8473921 to 6492435 in both ControlExtensions and RPC to match py-libp2p (PR libp2p/py-libp2p#1323) - Rename RPC.largeSegmentation to RPC.largeMessageSegmentation for consistency - Note py-libp2p's 256 KiB default segment size under Open Question 2 --- .../extensions/experimental/large-message-segmentation.md | 2 ++ pubsub/gossipsub/extensions/extensions.proto | 4 ++-- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/pubsub/gossipsub/extensions/experimental/large-message-segmentation.md b/pubsub/gossipsub/extensions/experimental/large-message-segmentation.md index 74f4cf9bf..3cacd5804 100644 --- a/pubsub/gossipsub/extensions/experimental/large-message-segmentation.md +++ b/pubsub/gossipsub/extensions/experimental/large-message-segmentation.md @@ -110,6 +110,8 @@ discard the entire reassembly buffer for that `messageID` on detection. 
`MaxMessageSize` defaults, with publishers free to choose any size at or below the maximum. A single fixed size is rejected because optimal sizing depends on MTU, topic semantics, and bandwidth. + The py-libp2p reference implementation defaults to 256 KiB per segment + payload as a practical starting point for implementations. ## Protobuf diff --git a/pubsub/gossipsub/extensions/extensions.proto b/pubsub/gossipsub/extensions/extensions.proto index 44beb1510..d9368cbd0 100644 --- a/pubsub/gossipsub/extensions/extensions.proto +++ b/pubsub/gossipsub/extensions/extensions.proto @@ -10,7 +10,7 @@ message ControlExtensions { // Experimental: Large Message Segmentation // Spec: ./experimental/large-message-segmentation.md - optional bool largeMessageSegmentation = 8473921; + optional bool largeMessageSegmentation = 6492435; } @@ -53,7 +53,7 @@ message RPC { // Experimental: Large Message Segmentation // Spec: ./experimental/large-message-segmentation.md - optional LargeMessageSegmentationExtension largeSegmentation = 8473921; + optional LargeMessageSegmentationExtension largeMessageSegmentation = 6492435; } From af1d3a1cc6dad07a34b027ae5866066833645907 Mon Sep 17 00:00:00 2001 From: theutkarshraj Date: Sat, 9 May 2026 14:39:58 +0530 Subject: [PATCH 4/4] spec: add reassembly lifecycle section (per-peer caps, timeouts, eviction) Addresses the implementer-raised gap from py-libp2p#1323. Defines normative MUSTs/SHOULDs for per-peer caps, per-messageID memory bounds, timeouts, inconsistency handling, successful reassembly, and eviction. Promotes existing security mitigations from inferred to normative. 
--- .../large-message-segmentation.md | 45 +++++++++++++++---- 1 file changed, 37 insertions(+), 8 deletions(-) diff --git a/pubsub/gossipsub/extensions/experimental/large-message-segmentation.md b/pubsub/gossipsub/extensions/experimental/large-message-segmentation.md index 3cacd5804..40884c896 100644 --- a/pubsub/gossipsub/extensions/experimental/large-message-segmentation.md +++ b/pubsub/gossipsub/extensions/experimental/large-message-segmentation.md @@ -72,13 +72,43 @@ never form a complete set are not counted as successful deliveries. If the delivery window expires before reconstruction completes, one approach may be to treat that outcome as a missed delivery for scoring purposes. +## Reassembly Lifecycle + +**Per-peer cap on incomplete reassemblies.** A peer MUST limit the number of +concurrent incomplete reassemblies tracked per remote peer. The RECOMMENDED +default is 16 per peer. This prevents resource exhaustion attacks where a single +peer floods with partial messages that never complete. + +**Per-messageID memory cap.** For each in-progress reassembly, a peer MUST +bound memory usage to `totalSegments × maxSegmentSize`. If the announced +`totalSegments` value would cause this bound to exceed an implementation-defined +ceiling, the message MUST be rejected at the first segment. + +**Reassembly timeout.** An incomplete reassembly MUST be evicted if no new +segments arrive within a configurable timeout. The RECOMMENDED range is 60–120 +seconds, parameterizable per-topic. This mitigates last-segment-withholding +attacks noted in Security Considerations. + +**MUST-discard on inconsistency.** If two segments for the same `messageID` +announce different `totalSegments` values, the entire reassembly MUST be +discarded and the `messageID` SHOULD be added to a short-lived deny list to +prevent re-attack. 
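The per-peer cap, memory bound, timeout, and inconsistency rules above could be combined roughly as follows. This is a sketch under stated assumptions, not a normative algorithm. `ReassemblyTable` and all constants other than the 16-per-peer default and the 60-second timeout are hypothetical. Charging a buffer to the peer that delivered its first segment is an assumption. Checksum verification is left out to keep the example short.

```python
from __future__ import annotations

import time
from dataclasses import dataclass, field

MAX_INCOMPLETE_PER_PEER = 16             # RECOMMENDED default from the text
REASSEMBLY_TIMEOUT_S = 60.0              # lower end of the RECOMMENDED range
MAX_SEGMENT_SIZE = 256 * 1024            # assumption for this sketch
MAX_REASSEMBLY_BYTES = 64 * 1024 * 1024  # assumption: implementation-defined ceiling
DENY_TTL_S = 300.0                       # assumption: "short-lived" deny list


@dataclass
class Reassembly:
    owner: bytes            # peer charged for this buffer (first sender; an assumption)
    total_segments: int
    last_activity: float
    segments: dict[int, bytes] = field(default_factory=dict)


class ReassemblyTable:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries: dict[bytes, Reassembly] = {}
        self._denied_until: dict[bytes, float] = {}

    def evict_expired(self) -> None:
        now = self._clock()
        stale = [m for m, e in self._entries.items()
                 if now - e.last_activity > REASSEMBLY_TIMEOUT_S]
        for mid in stale:
            del self._entries[mid]
        for mid in [m for m, t in self._denied_until.items() if t <= now]:
            del self._denied_until[mid]

    def add(self, peer: bytes, message_id: bytes, index: int, total: int,
            payload: bytes) -> bytes | None:
        """Return the reassembled payload once complete, otherwise None."""
        self.evict_expired()
        now = self._clock()
        if message_id in self._denied_until:
            return None
        if not 0 <= index < total or len(payload) > MAX_SEGMENT_SIZE:
            return None
        # Reject at the first segment if the announced size exceeds the ceiling.
        if total * MAX_SEGMENT_SIZE > MAX_REASSEMBLY_BYTES:
            return None
        entry = self._entries.get(message_id)
        if entry is None:
            owned = [m for m, e in self._entries.items() if e.owner == peer]
            if len(owned) >= MAX_INCOMPLETE_PER_PEER:
                # LRU eviction of this peer's least recently active buffer.
                lru = min(owned, key=lambda m: self._entries[m].last_activity)
                del self._entries[lru]
            entry = self._entries[message_id] = Reassembly(peer, total, now)
        elif entry.total_segments != total:
            # Inconsistent totalSegments: discard and deny for a short time.
            del self._entries[message_id]
            self._denied_until[message_id] = now + DENY_TTL_S
            return None
        if index not in entry.segments:
            entry.segments[index] = payload
            entry.last_activity = now  # only new segments extend the lifetime
        if len(entry.segments) < total:
            return None
        del self._entries[message_id]  # release state upon delivery
        return b"".join(entry.segments[i] for i in range(total))
```

The injectable `clock` exists only so the timeout can be exercised deterministically in tests. A real implementation would likely also run eviction from a periodic task rather than only on segment arrival.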
+ +**Successful reassembly.** Upon receiving the final outstanding segment, a peer +MUST verify the checksum of each segment, MUST verify segment count consistency +against `totalSegments`, and SHOULD deliver the reconstructed message to the +application layer atomically. Reassembly state MUST be released upon delivery. + +**Eviction policy.** When the per-peer cap is reached, implementations MAY use +LRU eviction to discard the least-recently-active incomplete reassembly. Evicted +reassemblies MUST NOT be silently restarted by the receiver; the publisher must +re-segment and retransmit if needed. + ## Security Considerations **Reassembly buffer exhaustion.** A malicious peer can announce large `totalSegments` values and send only a subset, forcing receivers to buffer -indefinitely. Mitigations: per-peer cap on outstanding incomplete reassembly -buffers; per-messageID memory cap derived from `totalSegments * maxSegmentSize`; -configurable reassembly timeout after which buffers are evicted. +indefinitely. See §Reassembly Lifecycle for normative mitigations. **Segment flooding under forged messageID.** Without binding `messageID` to the publisher, an attacker can pollute reassembly buffers with junk segments sharing @@ -87,13 +117,12 @@ publisher identity, or require segments to carry the same publisher signature as the parent gossipsub message. **Last-segment withholding.** A peer can deliver `totalSegments - 1` segments -and withhold the final one to grief reassembly. Mitigations: reassembly timeout -with eviction; peer scoring penalty for repeatedly contributing to incomplete -reassemblies. +and withhold the final one to grief reassembly. See §Reassembly Lifecycle for +normative mitigations. **Inconsistent totalSegments.** Two segments claiming the same `messageID` but -different `totalSegments` indicate forgery or implementation bug. Receivers MUST -discard the entire reassembly buffer for that `messageID` on detection. 
+different `totalSegments` indicate forgery or an implementation bug. See §Reassembly +Lifecycle for the normative MUST-discard rule. ## Open Questions
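The tentative answer to Open Question 1 derives `messageID` as `SHA-256(publisherPeerID || topic || nonce)[:16]`. A minimal sketch of that derivation follows. The function name `derive_message_id` is hypothetical, and the draft does not specify how the fields are encoded. Raw concatenation with a UTF-8 topic is an assumption made here.

```python
import hashlib
import os


def derive_message_id(publisher_peer_id: bytes, topic: str, nonce: bytes) -> bytes:
    # Assumption: the three fields are concatenated as raw bytes, topic as UTF-8.
    digest = hashlib.sha256(publisher_peer_id + topic.encode("utf-8") + nonce).digest()
    # Keep the first 16 bytes, matching the [:16] truncation in the text.
    return digest[:16]


# Example usage with a placeholder peer ID and a random 8-byte nonce.
example_id = derive_message_id(b"example-peer-id", "example-topic", os.urandom(8))
```

Raw concatenation leaves field boundaries ambiguous, so two different `(publisherPeerID, topic, nonce)` triples could in principle feed identical bytes to the hash. Length-prefixing each field is one possible way to avoid this, should the draft choose to pin down an encoding.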