Description
Bitswap is a simple file exchange protocol. Since it was not designed to be a content routing protocol, its feature set was kept to a minimum, which is probably the right call. I believe a lot can be done to improve its performance, and this project is certainly heading in the right direction, but I think we should carefully categorise the nature of the changes we're proposing so that we can evaluate the significance of the results we get out of the study.
Here are some thoughts on the RFCs currently being worked on. The current set of RFCs can be split into two categories: i) those that improve content fetching, and ii) those that improve content discovery.
Content Fetching RFCs
RFC|BB|L2-03A: Use of compression and adjustable block size
RFC|BB|L2-07: Request minimum piece size and content protocol extension
RFC|BB|L12-01: Bitswap/Graphsync exchange messages extension and transmission choice
I suggest integrating the first two RFCs above under the umbrella of the third, which is the most sophisticated one IMO. The reasoning is that it is difficult to set global standards or values for optimisation, e.g., that all content transferred by Bitswap is compressed, or that all applications use the same block/chunk size. These decisions depend largely on the application, and our approach should be to give applications all the knobs they need to optimise for their own case.
Content Discovery RFCs
RFC|BB|L1-04: Track WANT Message for future queries
RFC|BB|L1-02: TTLs for rebroadcasting WANT messages
These RFCs move towards content routing. I suggest that we implement them and evaluate their performance, but we should target situations where Bitswap is used within the remit of a specific application, rather than globally across, say, the whole IPFS network. My fear is that content discovery with a simple protocol won't scale as the network grows, and we'll end up reinventing the wheel by optimising in directions that DHTs or pubsub protocols already cover out of the box.
Consider the following. Assume each node has ~1k connections. With TTL=1, if every node sends a WANT message to all of its connections and each recipient forwards it once more, we effectively flood the network at its current size: the message is propagated to up to 1k × 1k = 1M nodes (larger than the entire IPFS network right now). It might actually be OK to flood at the current network size in order to improve content discovery performance (this would be something cool to evaluate and propose), but it won't take us far as the network grows (i.e., too much overhead).
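To make the arithmetic above concrete, here is a rough sketch of the per-hop fan-out under flooding. It assumes every node has the same number of connections, that no two peers share a neighbour (so the numbers are upper bounds), and that TTL counts how many times a recipient rebroadcasts, which is my reading of the TTL=1 example. The function name `flood_fanout` is made up for illustration and is not part of any Bitswap implementation.

```python
def flood_fanout(degree, ttl):
    """Upper bound on WANT messages delivered at each hop of a flood.

    Assumptions (illustrative only): every node has `degree` connections,
    peer sets do not overlap, and `ttl` is the number of rebroadcasts
    after the initial send, so ttl=0 reaches direct peers only.
    """
    if degree < 0 or ttl < 0:
        raise ValueError("degree and ttl must be non-negative")
    per_hop = []
    senders = 1  # the requestor
    for _ in range(ttl + 1):
        delivered = senders * degree  # every sender contacts all its peers
        per_hop.append(delivered)
        senders = delivered  # every recipient forwards at the next hop
    return per_hop


if __name__ == "__main__":
    # ~1k connections per node, TTL=1: direct peers, then their peers.
    print(flood_fanout(1000, 1))
```

In a real network peers overlap heavily, so the number of distinct nodes reached would be lower than the last entry, but the message overhead is still driven by these counts.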
Furthermore, the evaluation here needs to take content popularity into account. For instance, we should not try to optimise for content that has only one copy in the entire network. There is a tradeoff between the request span we impose and the success rate it returns. In other words, given that we now span across the connections of many peers, we could limit the number of peers each node reaches out to. We could envision a scheme where the requestor node sends the request to 75% of its peers, those nodes forward it to 50% of their peers, and the final-hop nodes send it to 25% of their own peers. Again, these percentages can be tuned by the application. Applications are incentivised to set the values right in order to avoid overloading their own users.
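The tapered scheme could be sketched roughly as follows. This is only an illustration under simplifying assumptions: peers are picked uniformly at random, every node has the same number of connections, and peer sets do not overlap. The names `pick_forward_targets` and `tapered_fanout` are hypothetical and do not come from any RFC or Bitswap codebase.

```python
import random


def pick_forward_targets(peers, fraction, rng=random):
    """Return a random subset holding roughly `fraction` of `peers`."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    peers = list(peers)
    count = round(len(peers) * fraction)
    return rng.sample(peers, count)


def tapered_fanout(degree, fractions):
    """Upper bound on messages sent at each hop.

    `fractions[i]` is the share of its peers that a node at hop i
    forwards the WANT to (application-tunable, e.g. 0.75, 0.5, 0.25).
    """
    per_hop = []
    senders = 1  # the requestor
    for fraction in fractions:
        sent = senders * round(degree * fraction)
        per_hop.append(sent)
        senders = sent  # each recipient may forward at the next hop
    return per_hop


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    targets = pick_forward_targets(range(100), 0.75, rng)
    print(len(targets))
    print(tapered_fanout(1000, [0.75, 0.5, 0.25]))
```

Comparing the output of `tapered_fanout` for different percentage settings against the success rate observed in an evaluation would give one way of quantifying the span versus success tradeoff.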
On the "symmetric vs asymmetric" routing options of RFC|BB|L1-02, I would suggest evaluating both, as they can serve different purposes. For instance, "symmetric" routing can improve privacy out of the box (it "emulates" an onion-routing-like approach), which we should not disregard. There is the IPFS principle of not storing content on nodes that have not requested it, which we should keep in mind. However, we can work around it in several ways, e.g., by making it opt-in. We should also remember that Bitswap is a protocol in its own right, so it can deviate slightly, as long as it does not remove the options that IPFS actually requires.