docs: add recommended metadata encodings for Partial Messages extension #13

Open
adarsh-7-satyam wants to merge 1 commit into seetadev:master from adarsh-7-satyam:docs/gossipsub-metadata-encodings


Description

The Partial Messages extension for Gossipsub 1.4 introduces the partsMetadata field as a mechanism for peers to communicate which parts of a large message they hold. However, the current partial-messages.md specification deliberately leaves the encoding of this field as "application defined", providing no concrete guidance on what encoding formats to use. While this flexibility is intentional, it creates a practical problem: when two independent implementations like py-libp2p and nim-libp2p attempt to exchange partial messages, they must agree on how partsMetadata is encoded, but the specification gives them no shared vocabulary to do so.

Problem

Without standard encoding recommendations, every application and implementation team invents its own encoding independently. This fragments the ecosystem, makes interoperability testing between implementations significantly harder, and forces every new implementer to rediscover the same tradeoffs from scratch. The specification currently mentions bitmaps, ranges, and bloom filters only as passing examples in a single sentence, with no technical detail on how to implement them or choose between them.

Changes Made

Added a new document pubsub/gossipsub/metadata-encodings.md that provides detailed technical documentation for three recommended standard encoding approaches for partsMetadata:

  1. Bitmask Encoding — Represents part availability as a fixed-length bit array where each bit position maps to a part index. Includes a concrete 8-part worked example in binary and hex, wire format specification using big-endian byte packing, and tradeoff analysis highlighting its suitability for Ethereum DAS columns with 32-64 parts.

  2. Range-based Encoding — Represents contiguous blocks of held parts as start/length pairs encoded as unsigned varints. Includes a concrete 100-part worked example, wire format using the multiformats unsigned varint spec, and tradeoff analysis noting its efficiency for streaming applications where data arrives in ordered chunks.

  3. Bloom Filter Encoding — A probabilistic data structure for representing very large sparse part sets. Includes a concrete worked example with m=8 bits and k=2 hash functions, tradeoff analysis covering false positive rates and the impossibility of false negatives, and guidance on when the size savings justify the probabilistic accuracy tradeoff.
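For reviewers who want a quick feel for the three encodings listed above, here is a minimal Python sketch. It is not taken from metadata-encodings.md, and several details are assumptions: the bitmask bit order (part 0 in the most significant bit of the first byte), the function names, the sample inputs, and the bloom filter hash family (SHA-256 over a seed byte plus the part index). Only the unsigned varint layout follows the multiformats definition (7 data bits per byte, least significant group first, high bit set on every byte except the last).

```python
import hashlib


def encode_bitmask(held, total_parts):
    """Pack held part indices into bytes (assumed order: part 0 is the MSB of byte 0)."""
    out = bytearray((total_parts + 7) // 8)
    for i in held:
        if not 0 <= i < total_parts:
            raise ValueError(f"part index {i} out of range")
        out[i // 8] |= 0x80 >> (i % 8)
    return bytes(out)


def encode_uvarint(n):
    """Unsigned varint: 7 bits per byte, low group first, high bit marks continuation."""
    if n < 0:
        raise ValueError("uvarint cannot encode negative numbers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_ranges(held):
    """Collapse held indices into contiguous (start, length) runs of uvarints."""
    out = bytearray()
    run_start = prev = None
    for i in sorted(set(held)):
        if prev is not None and i == prev + 1:
            prev = i  # still inside the current run
            continue
        if run_start is not None:
            out += encode_uvarint(run_start) + encode_uvarint(prev - run_start + 1)
        run_start = prev = i
    if run_start is not None:
        out += encode_uvarint(run_start) + encode_uvarint(prev - run_start + 1)
    return bytes(out)


def bloom_positions(index, m, k):
    """Hypothetical hash family: SHA-256(seed byte || 8-byte index) mod m, for k seeds."""
    return [
        int.from_bytes(
            hashlib.sha256(bytes([seed]) + index.to_bytes(8, "big")).digest()[:8], "big"
        ) % m
        for seed in range(k)
    ]


def encode_bloom(held, m=8, k=2):
    """Set k bit positions per held part in an m-bit filter."""
    bits = bytearray((m + 7) // 8)
    for i in held:
        for pos in bloom_positions(i, m, k):
            bits[pos // 8] |= 0x80 >> (pos % 8)
    return bytes(bits)


def bloom_maybe_contains(filter_bytes, index, m=8, k=2):
    """True means 'possibly held' (false positives happen); False means 'definitely not held'."""
    return all(
        filter_bytes[pos // 8] & (0x80 >> (pos % 8))
        for pos in bloom_positions(index, m, k)
    )


print(encode_bitmask([0, 2, 3, 7], 8).hex())                # b1
print(encode_ranges([0, 1, 2, 3, 4, 50, 51, 52]).hex())     # 00053203
```

With these assumptions, holding parts 0, 2, 3 and 7 of 8 gives the single byte 0xB1, and two runs (0 to 4 and 50 to 52) cost four bytes as ranges. The bloom filter never reports a held part as missing, but with m=8 it will frequently report parts that are not held, which is the accuracy tradeoff described in item 3.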

The document also includes an Encoding Selection Guide table that maps message part count, distribution pattern, and accuracy requirements to the recommended encoding. An Interoperability Considerations section explains that implementations MUST treat partsMetadata as opaque bytes at the Gossipsub layer and that topic-level specifications SHOULD document their chosen encoding to ensure cross-implementation compatibility.
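One way to picture the opaque-bytes rule is a per-topic decoder registry that lives in the application, not in the pubsub layer. The sketch below is illustrative only: `DECODERS`, `register_decoder`, `held_parts`, the topic name, and the bitmask bit order (part 0 in the most significant bit) are all hypothetical and do not correspond to any py-libp2p or nim-libp2p API.

```python
from typing import Callable, Dict, Set

Decoder = Callable[[bytes], Set[int]]

# Hypothetical registry: topic name -> decoder for the encoding that topic documents.
DECODERS: Dict[str, Decoder] = {}


def register_decoder(topic: str, decoder: Decoder) -> None:
    """Record which decoder the application uses for a topic's partsMetadata."""
    DECODERS[topic] = decoder


def decode_bitmask(data: bytes) -> Set[int]:
    """Assumed bit order: part 0 is the most significant bit of the first byte."""
    return {
        byte_idx * 8 + bit
        for byte_idx, byte in enumerate(data)
        for bit in range(8)
        if byte & (0x80 >> bit)
    }


def held_parts(topic: str, parts_metadata: bytes) -> Set[int]:
    """Application-layer decode; the pubsub layer would forward parts_metadata unchanged."""
    try:
        decoder = DECODERS[topic]
    except KeyError:
        raise LookupError(f"no partsMetadata encoding registered for topic {topic!r}") from None
    return decoder(parts_metadata)


register_decoder("example-topic", decode_bitmask)
print(sorted(held_parts("example-topic", bytes([0xB1]))))  # [0, 2, 3, 7]
```

Keeping the lookup keyed by topic mirrors the SHOULD in the paragraph above: two implementations interoperate when they register the same documented encoding for the same topic, and the router itself never inspects the bytes.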

How It Solves The Problem

By providing concrete technical documentation, worked examples, wire format specifications, and a selection guide, this document gives implementers a shared reference point for encoding decisions. Teams building py-libp2p and nim-libp2p implementations can now refer to the same document to ensure their encoding choices are compatible without requiring out-of-band coordination for common use cases.

Impact

  • Reduces interoperability friction between Gossipsub 1.4 implementations by providing a shared encoding vocabulary
  • Lowers the barrier for new implementers by documenting encoding tradeoffs that would otherwise require independent research
  • Strengthens the Gossipsub 1.4 specification's readiness for Working Draft and Candidate Recommendation status by addressing a known gap between the protocol layer and the application layer
  • Directly supports the goal of validating interoperability between py-libp2p and nim-libp2p as outlined in the DMP 2026 project

Related Issue

Closes #12

Signed-off-by: Adarsh Satyam <adarsh5.satyam@gmail.com>