BIPs: SwiftSync Specification #2152

Open
rustaceanrob wants to merge 3 commits into bitcoin:master from rustaceanrob:swiftsync-bips

Conversation

@rustaceanrob

Member

SwiftSync is a protocol for clients to parallelize initial block download, based on the original writeup.

@murchandamus murchandamus left a comment

Member

Just a quick first glance, but could you please break your text into shorter lines? That makes it easier to leave review and track what changed between commits. Either 100 or 120 characters per line seems to work well enough.

Comment thread bip-xxxx-swiftsync.md
Comment thread bip-xxxx-swiftsync.md Outdated
@murchandamus murchandamus added the New BIP and PR Author action required (needs updates, has unaddressed review comments, or is otherwise waiting for PR author) labels May 6, 2026
@jonatack jonatack changed the title from "SwiftSync Specification" to "BIP drafts: SwiftSync Specification" May 6, 2026
@jonatack

jonatack commented May 6, 2026

Member

FWIW, I don't mind the unbroken lines and even prefer them. Avoids rejigging line lengths to keep them consistent when updating or having lines with very different lengths.

@danielabrozzoni danielabrozzoni left a comment

Member

I did an initial pass and left some comments. I read the BIPs in the commit order (block undo -> histfile -> swiftsync) and it was pretty easy to follow.

Comment thread bip-xxxx-block-undo.md Outdated
Comment thread bip-xxxx-block-undo.md Outdated
Comment thread bip-xxxx-block-undo.md Outdated
Comment thread bip-xxxx-block-undo.md Outdated
Comment thread bip-xxxx-block-undo.md Outdated
Comment thread bip-xxxx-block-undo.md Outdated
Comment thread bip-xxxx-block-undo.md Outdated
Comment thread bip-xxxx-block-undo.md Outdated
Comment thread bip-xxxx-block-undo.md Outdated

@jurraca jurraca left a comment

some writing nits but overall the concept is clear enough.

Comment thread bip-xxxx-block-undo.md Outdated
Comment thread bip-xxxx-block-undo.md Outdated
Comment thread bip-xxxx-block-undo.md Outdated
@rustaceanrob rustaceanrob force-pushed the swiftsync-bips branch 2 times, most recently from 92093e1 to f4cd99a May 10, 2026 09:31
@murchandamus

Member

Thanks for the review, @danielabrozzoni and @jurraca, as well as the quick turnaround @rustaceanrob. I notice that this pull request is still marked as a Draft PR. Are you still planning significant changes? If your submission is ready for another BIP Editor review, please mark the PR as "ready for review".

@rustaceanrob

Member Author

I will keep these as a draft as the hintsfile format is subject to change.

Comment thread bip-xxxx-block-undo.md
Comment thread bip-xxxx-block-undo.md Outdated
Comment thread bip-xxxx-block-undo.md Outdated
Comment thread bip-xxxx-block-undo.md
| Amount | 64 bit unsigned integer | Defined above | Satoshi denominated value |
### Messages

#### MSG_GET_SPENT_COINS

Contributor

Is the idea that a peer would issue this request for every block in the chain? If we assume mainnet at height, and a 150 ms round trip time, then a peer would spend nearly 80 hours just downloading this undo data.

You may want to consider a batched variant, similar to the way messages like getheaders work.

@rustaceanrob rustaceanrob Jun 22, 2026

Member Author

We've found that bandwidth throughput is the limiting factor when downloading blocks in parallel. Not all spent coins have to be downloaded if a client keeps a cache, as this document describes. In the batched variant, the cache is not possible and the bandwidth requirement increases significantly.

Comment thread bip-xxxx-block-undo.md Outdated

| Field | Value |
| :----------------- | :---------- |
| `NODE_BLOCK_UNDO` | `1 << ???` |

Contributor

Rationale should be added for the choice of a new node version over the more commonplace (as of the past few years) exchange of a sendX message during the version handshake.

IMO a version makes sense here, as it can be used to filter out peers upfront that support sending this undo data over the network.

Member Author

I opted for a BIP-434 feature message, which has a mechanism similar to that of the sendX messages.

Comment thread bip-xxxx-block-undo.md

#### MSG_SPENT_COINS

`MSG_SPENT_COINS` defines the data structure for inputs of a block.

Contributor

Probably not really y'all's intended use case, but if you optionally make it possible to include merkle proofs for the set of coins, then this message can be used to obtain a proof that an output was spent in a given block.

Contributor

It would actually also be useful for BIP 157+158 peers, as the final version that shipped includes the script spent (instead of the outpoint), which means that if you're using the filters to find a block where a given script has been spent, you need to make some assumptions about what the prev script is for a given transaction.

@rustaceanrob rustaceanrob Jun 23, 2026

Member Author

The most recent response on this mailing list post mentions commitment to the UTXO set as part of the block header. There are additional ways to do this outside of a soft fork as well, i.e. utreexo proofs. For now I think it best to leave this unspecified in this version of the message while the community shares ideas, but I do think this is interesting.

Comment thread bip-xxxx-swiftsync.md Outdated
@rustaceanrob

Member Author

Wrapped text and standardized formatting with mdformat, will address outstanding feedback soon.

@murchandamus

Member

Great, thanks! I’ll give it a read when you’re done with that.

melvincarvalho added a commit to bitcoin-kernel/kernel that referenced this pull request Jun 20, 2026
hintsfile.js implements the SwiftSync hintsfile (bitcoin/bips#2152 'Hints for
unspent coins'): per-block unspent output indices encoded with Elias-Fano
(CompactSize(n) || CompactSize(m) || L || H; low bits LSB-first, unary gap high
bits), plus the 'UTXO' magic/version/height/vector container.

This is the ONE cross-compatibility artifact (per Somsen), so it's validated
byte-for-byte against the BIP's own elias_fano.json vectors — all three match,
plus round-trips, edge cases, and a container round-trip. Exported from index +
package exports (./hintsfile).
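
For readers unfamiliar with the encoding named in this commit message, here is a minimal Python sketch of generic Elias-Fano coding. The function names, the choice of `l = floor(log2(m / n))` low bits, and the zeros-then-one unary convention are assumptions made for illustration. The sketch keeps bits in plain lists and is not claimed to be byte-compatible with the BIP's `elias_fano.json` vectors or with hintsfile.js.

```python
import math


def ef_encode(values, m):
    """Encode a sorted list of integers in [0, m) as (l, low, high).

    `low` holds the l least-significant bits of each value; `high` is a bit
    list where each value contributes a unary gap (zeros) followed by a one.
    """
    n = len(values)
    if n == 0:
        return 0, [], []
    # Assumed choice of low-bit width: floor(log2(m / n)), never negative.
    l = max(0, (m // n).bit_length() - 1)
    mask = (1 << l) - 1
    low = [v & mask for v in values]
    high = []
    prev_bucket = 0
    for v in values:
        bucket = v >> l
        high.extend([0] * (bucket - prev_bucket))  # unary gap to this bucket
        high.append(1)                             # one marker per element
        prev_bucket = bucket
    return l, low, high


def ef_decode(l, low, high):
    """Invert ef_encode: walk the high bits, counting zeros as bucket steps."""
    out, bucket, i = [], 0, 0
    for bit in high:
        if bit == 0:
            bucket += 1
        else:
            out.append((bucket << l) | low[i])
            i += 1
    return out


def ef_bits(l, low, high):
    """Total payload bits used by the sketch (excluding any n/m header)."""
    return l * len(low) + len(high)


def ef_bound(n, m):
    """The 2n + n*ceil(log2(m/n)) bound, for m >= n > 0."""
    return 2 * n + n * math.ceil(math.log2(m / n))
```

A round trip such as `ef_decode(*ef_encode([2, 3, 5, 7, 11, 13, 24], 32))` returns the original list, and `ef_bits` for that input stays under `ef_bound(7, 32)`.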
melvincarvalho added a commit to bitcoin-kernel/kernel that referenced this pull request Jun 20, 2026
…matches BIP vectors

undo.js implements the full-validation spent-coin data (bitcoin/bips#2152 'Peer
sharing of block spent coins'): Core's CompressAmount/DecompressAmount, the
reconstructable-script prefix table (P2PKH/P2SH/P2PK/P2WPKH/P2WSH/P2TR/raw), the
height code (height<<1|coinbase), and a spent-coin record. Amount + script
compression validated byte-for-byte against the BIP's compressed_amount.json and
reconstructable_script.json vectors (+ round-trips).

Also factored CompactSize/concat into varint.js, shared by hintsfile + undo.
Full suite: 21 pass / 0 fail.
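
As a reference point for the amount compression mentioned in this commit message, the following Python sketch mirrors the logic of Bitcoin Core's `CompressAmount` and `DecompressAmount`. The Python function names are illustrative, and the sketch has not been checked here against the BIP's `compressed_amount.json` vectors.

```python
def compress_amount(n: int) -> int:
    """Compress a satoshi amount by factoring out trailing decimal zeros."""
    if n == 0:
        return 0
    e = 0
    # Strip up to nine trailing zeros; e records how many were removed.
    while n % 10 == 0 and e < 9:
        n //= 10
        e += 1
    if e < 9:
        d = n % 10  # last digit is non-zero here, so d is in 1..9
        n //= 10
        return 1 + (n * 9 + d - 1) * 10 + e
    # e == 9: the remaining n may end in any digit, so it is stored whole.
    return 1 + (n - 1) * 10 + 9


def decompress_amount(x: int) -> int:
    """Invert compress_amount."""
    if x == 0:
        return 0
    x -= 1
    e = x % 10
    x //= 10
    if e < 9:
        d = (x % 9) + 1
        x //= 9
        n = x * 10 + d
    else:
        n = x + 1
    return n * 10 ** e
```

Round amounts compress well under this scheme: one full coin (100,000,000 satoshi) maps to the single small integer 9, which then serializes in one byte.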
@rustaceanrob rustaceanrob changed the title from "BIP drafts: SwiftSync Specification" to "BIPs: SwiftSync Specification" Jun 22, 2026
@rustaceanrob rustaceanrob marked this pull request as ready for review June 22, 2026 08:54
@rustaceanrob

Member Author

Given there are a few clients that have started implementations of SwiftSync, and new hintsfile encodings may simply increment the file version, I am moving these out of draft. Some outstanding comments addressed, others require some additional thought.

@rustaceanrob rustaceanrob force-pushed the swiftsync-bips branch 5 times, most recently from 24ae4e7 to ecbda2a June 23, 2026 09:00
@jonatack jonatack removed the PR Author action required (needs updates, has unaddressed review comments, or is otherwise waiting for PR author) label Jun 23, 2026

@edilmedeiros edilmedeiros left a comment

Thanks for documenting the protocol in this draft.

Did a deep dive together with the guys from @vinteumorg and left many comments concerning conceptual aspects of the BIPs. I have many editing suggestions, but left them for a second round after the higher-level aspects are more mature.

Comment thread bip-xxxx-hintsfile.md Outdated
Comment thread bip-xxxx-hintsfile.md Outdated
Comment thread bip-xxxx-hintsfile.md Outdated
in Bitcoin Core, and reasonable for most clients to hold directly in memory. This encoding represents elements in $2n +
n \\lceil \\log_2(m/n) \\rceil$ bits, which is within a reasonable bound of the theoretical optimum.

Partitioning the hints by block is an intuitive choice, and allows for efficient random access of hints. Groupings of

> Partitioning the hints by block is an intuitive choice, and allows for efficient random access of hints.

I don't see how this can be true: the bitstream has a header (magic, version, height) followed by a sequence of EliasFano items, each of which is composed of N, M (fixed size info) and L, H (variable size info).

So, imagine I have a hintsfile. To find data for block k, one needs to (minimally) process data for block 1 to discover the size of the first EliasFano item (because of the variable size parts), then block 2, and so forth, until the intended target block. Random access would be possible if the EliasFano items had a fixed size across all blocks to allow for offset arithmetic, but the amount of padding would be prohibitively high.

Thus, the hints payload works more like a List<EliasFano> (requires sequential access) than a vector<EliasFano> (allows random access). Of course, the decoder could create an index of offsets, but this is not only an implementation detail, it also adds to the resources required to process the hintsfile.
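
To make the sequential-access point concrete, here is a small Python sketch of the offset index a decoder would have to build. The 2-byte length prefix is purely a hypothetical stand-in for "a record whose size is only known after reading its own header"; the actual hintsfile layout differs, and the names `read_record`, `build_offset_index`, and `read_block` are illustrative.

```python
import io


def read_record(stream):
    """Read one variable-size record, or return None at end of stream.

    Hypothetical framing: a 2-byte little-endian length, then the payload.
    """
    header = stream.read(2)
    if len(header) < 2:
        return None
    size = int.from_bytes(header, "little")
    return stream.read(size)


def build_offset_index(stream):
    """One full sequential pass: record where each block's data starts."""
    offsets = []
    while True:
        pos = stream.tell()
        if read_record(stream) is None:
            break
        offsets.append(pos)
    return offsets


def read_block(stream, offsets, k):
    """Random access to record k, possible only once the index exists."""
    stream.seek(offsets[k])
    return read_record(stream)
```

The index costs one pass over the whole file plus one stored offset per block, which is the extra resource use described above.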

Member Author

Good catch. The original version had a header section, but it was removed as it could be reconstructed as you described. I will remove that note.

I wonder whether taking more blocks together would noticeably improve compression (I'll experiment with it). We can add a counter in the bitstream to allow the encoder to choose the grouping freely (at the cost of having more bytes in the bitstream), but we potentially gain:

  1. Fewer H, L pairs in the bitstream.
  2. More data to feed to the Elias-Fano process (tends to push it closer to the theoretical entropy).

Comment thread bip-xxxx-hintsfile.md Outdated
Comment thread bip-xxxx-hintsfile.md Outdated
Comment thread bip-xxxx-swiftsync.md Outdated
Comment thread bip-xxxx-swiftsync.md Outdated
Comment thread bip-xxxx-swiftsync.md Outdated
Comment thread bip-xxxx-swiftsync.md Outdated
Comment thread bip-xxxx-swiftsync.md Outdated

@johnnyasantoss johnnyasantoss left a comment

I participated with @edilmedeiros in the review and had a few concept concerns, most linked to the undo data, its tradeoff (the worst case scenario seems to be really bad) and the optional 5-block index.

Comment thread bip-xxxx-block-undo.md
Comment thread bip-xxxx-block-undo.md
Comment thread bip-xxxx-block-undo.md
Comment on lines +146 to +151
The lifetime, or interval between creation and spending height, of the coins on the Bitcoin blockchain demonstrate an
empirical phenomena that the majority of coins are spent within 100 blocks. In fact, approximately 41 percent of coins
are spent within 10 blocks at the time of writing[^1]. Clients may leverage this to reduce the bandwidth required to
fetch undo data by using an in-memory cache. For example, a client may store coins that were created in a 5 block
window, and request only coins that are older than this height via the `cutoff` filter. This results in a significant
bandwidth reduction at the cost of a cache that can be set dynamically by the client depending on available memory.

This cutoff cache optimization seems to nudge implementors back to sequentially processing blocks with the added burden of requesting extra data over the wire.

Also with the current messages I still need to get the data for the block (even if there's only one unspent cache miss?), right?
If that's true, wouldn't a parameter for inputs of interest (delta encoded index) help here?

At 150ms RTT * 955233 blocks that's ~39.8 hrs of round-trip latency for uncached requests alone, before counting download time (as Roasbeef noted in an earlier review). It seems to me that the cache mitigates this but at the cost of reintroducing the very sequentiality it aims to eliminate.

Is this understanding correct?

@RubenSomsen RubenSomsen Jun 25, 2026

Contributor

> seems to nudge implementors back to sequentially processing blocks

Caching requires sequential processing, but you can have multiple sequential threads in parallel.

> added burden of requesting extra data over the wire

You're going to have to request the undo data regardless for non-assumevalid SwiftSync - it is not related to caching.

> I still need to get the data for the block (even if there's only one unspent cache miss?), right?

> It seems to me that the cache mitigates [round-trip latency]

Caching does not prevent the need for requesting undo data. You can safely assume pretty much every block has cache misses. No cache misses is equivalent to having the full UTXO set (and impossible with multiple sequential threads), which defeats the point.

> round-trip latency for uncached requests

I have no strong opinion on batching, but round-trip latency won't add up sequentially if requests are sent out in parallel.
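
A back-of-the-envelope sketch of that arithmetic, in Python. The `in_flight` parameter (number of requests outstanding at once) and the function name are assumptions for illustration; the model ignores download time and assumes every request costs exactly one round trip.

```python
def round_trip_hours(blocks: int, rtt_seconds: float, in_flight: int = 1) -> float:
    """Hours spent waiting on round trips when `in_flight` requests overlap."""
    if in_flight < 1:
        raise ValueError("in_flight must be at least 1")
    return blocks * rtt_seconds / in_flight / 3600


# Strictly sequential requests, using the figures quoted earlier in this thread.
sequential = round_trip_hours(955_233, 0.150)

# The same workload with a hypothetical 16 requests outstanding at a time.
overlapped = round_trip_hours(955_233, 0.150, in_flight=16)
```

With one request at a time this reproduces the roughly 39.8 hours quoted above; the waiting time shrinks in proportion to the number of overlapping requests.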

Concrete example: Let's say you're starting another sequential thread from block height 1001 and you intend to cache the last 5 blocks worth of outputs. For the first block you'd request the full undo data. For block 1002 until 1005 you'd request everything created until block height 1000. From height 1006 onwards your 5-block window starts to shift so you'd request everything created until block height 1001, and so on.

All this data can be requested in parallel. As long as your caching strategy is not based on what you witnessed during the previous block, at no point do you have to wait for one block to finish processing before requesting the data for upcoming blocks.
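
The schedule in the concrete example can be written down as a small Python sketch. The function name and the reading of the cutoff as "request coins created at or below this height" are assumptions for illustration; the exact semantics of the `cutoff` field are defined by the BIP, not by this snippet.

```python
def cutoff_for(height: int, thread_start: int, window: int = 5) -> int:
    """Highest creation height whose coins must still be requested.

    A thread that begins at `thread_start` has cached nothing created before
    it, so the cutoff never drops below thread_start - 1. Once the thread has
    processed `window` blocks, the cutoff trails the current height by `window`.
    """
    return max(thread_start - 1, height - window)


# Every cutoff depends only on the height, so all requests for a thread
# can be prepared and sent before any block has been processed.
schedule = {h: cutoff_for(h, thread_start=1001) for h in range(1001, 1008)}
```

For heights 1001 through 1005 the cutoff stays at 1000, it becomes 1001 at height 1006, and it advances by one per block after that, matching the example above.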

@rustaceanrob rustaceanrob force-pushed the swiftsync-bips branch 13 times, most recently from 27c42b6 to 2606bd8 June 26, 2026 08:52

@murchandamus murchandamus left a comment

Member

I read the first document "Peer sharing of block spent coins". Given my prior knowledge it’s pretty clear what’s going on, but I think people reading about the topic for the first time could use more context in some passages. I noticed a couple sections with potential for improvement.

  • The Abstract and first sentences of the Motivation are a bit repetitive.
  • For the Definitions and Data Structures sections, I could have used a little more context. What will I be shown? Why? How is the table to be read? What do the columns mean?

Comment thread bip-xxxx-block-undo.md Outdated
Comment thread bip-xxxx-block-undo.md

## Motivation

A current limitation of IBD is that it must be done sequentially. This is a result of the height, coinbase flag, input

Member

> A current limitation of IBD is that it must be done sequentially.

Given the postulation or existence of alternative syncing models, that feels a bit loaded. Maybe mention that this specifically refers to Bitcoin Core or alternatively consider something along the lines of: "The common approach to IBD is to process blocks sequentially as that ensures the existence of TXO details when input validation requires them to be available."

> This is a result of the height, coinbase flag, input script, and amount of the block inputs being omitted from the data committed to by proof of work in the current block

This is jumping several steps from the prior statement at once. Maybe you could segue that a bit more, e.g., by mentioning that fields you introduce are TXO details, before going into them being only implicitly or not at all committed to by transaction inputs, before explaining how that makes it impossible to verify what is provided by a peer.

Comment thread bip-xxxx-block-undo.md
script, and amount of the block inputs being omitted from the data committed to by proof of work in the current block,
and, thus, this data cannot be trusted if received over the wire naively. Using the SwiftSync protocol, a client is able
to verify the correctness of this data, even if served by a potentially untrusted party. This allows a significant
improvement in IBD performance, as block downloads may be done in parallel.

Member

This is a bit imprecise: block download is always done in parallel, it’s just validation that is sequential. Do you mean that block validation can be parallelized?

Comment thread bip-xxxx-block-undo.md Outdated
Comment thread bip-xxxx-block-undo.md
#### Height Code

When validating a block, a client must confirm coinbase outputs are mature, which is given by the height of the coin.
The height and coinbase flag are encoded as a 32 bit integer. To encode the height and flag, binary left shift the

Member

Maybe you could add a footnote why both are stored in one data structure, and/or mention here that even with sacrificing one bit, heights up to 2,147,483,647 can be expressed and 30,000+ years of blocks is plenty planning horizon? :p
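
For context on that limit, here is a minimal Python sketch of a height code, assuming the layout `height << 1 | coinbase` in a 32-bit unsigned integer as summarized elsewhere in this thread. The function names are illustrative and the BIP text remains the authority on the exact encoding.

```python
MAX_HEIGHT = (1 << 31) - 1  # one of the 32 bits is spent on the coinbase flag


def encode_height_code(height: int, is_coinbase: bool) -> int:
    """Pack a block height and coinbase flag into one 32-bit value."""
    if not 0 <= height <= MAX_HEIGHT:
        raise ValueError("height does not fit in 31 bits")
    return (height << 1) | int(is_coinbase)


def decode_height_code(code: int) -> tuple[int, bool]:
    """Unpack a height code into (height, is_coinbase)."""
    if not 0 <= code < (1 << 32):
        raise ValueError("height code does not fit in 32 bits")
    return code >> 1, bool(code & 1)
```

`MAX_HEIGHT` evaluates to 2,147,483,647, the figure mentioned in the comment above.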

Comment thread bip-xxxx-block-undo.md Outdated
Comment thread bip-xxxx-block-undo.md
Comment on lines +147 to +148
window, and request only coins that are older than this height via the `cutoff` filter. This results in a significant
bandwidth reduction at the cost of a cache that can be set dynamically by the client depending on available memory.

Member

Ah cool. I was missing this context above when cutoff was introduced.

Member Author

I added a short note when introducing the request message that the cutoff field is motivated in the rationale section.

Comment thread bip-xxxx-block-undo.md
Comment on lines +153 to +154
11gb reduction in bandwidth is achieved. The application of `VARINT` as opposed to `CompactSize` offers a further
reduction of 4gb, however the `VARINT` primitive is currently a Bitcoin Core implementation detail. Reusing existing

Member

Given the confusing terminology in regard to CompactSize and VARINT and Bitcoin Core, you probably want to define these terms more concretely.

Comment thread bip-xxxx-block-undo.md Outdated
Comment thread bip-xxxx-block-undo.md
@rustaceanrob rustaceanrob force-pushed the swiftsync-bips branch 2 times, most recently from dfab2a7 to 473007d June 27, 2026 12:18

10 participants