
Commit 147cd51

Added BB RFCs
1 parent 8f82295 commit 147cd51

File tree

9 files changed

+324
-3
lines changed


beyond-bitswap/README.md

Lines changed: 6 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -25,7 +25,12 @@ In short, the aim of the project is two-fold: to drive speed-ups in file-sharing
2525
* [Test Results](https://docs.google.com/document/d/1zPpgnr9ykJr5PAvShJBGhKKRDRbsglb00MPc5eVEU4Q/edit#): This document collects the results of the tests performed in the scope of the project.
2626

2727
### Enhancement RFCs
28-
* [RFC|BB|L1-04: Track WANT messages for future queries](./rfc/rfcbbL104.md)
28+
* [RFC|BB|L1-04: Track WANT messages for future queries](./rfc/rfcBBL104.md)
29+
* [RFC|BB|L2-03A: Use of compression and adjustable block size](./rfc/rfcBBL203A.md)
30+
* [RFC|BB | L2-07: Request minimum piece size and content protocol extension](./rfc/rfcBBL207.md)
31+
* [RFC| BB | L12-01: Bitswap/Graphsync exchange messages extension and transmission choice](./rfc/rfcBBL1201.md)
32+
* [RFC| BB | L1-02: TTLs for rebroadcasting WANT messages](./rfc/rfcBBL102.md)
33+
2934

3035
### Code
3136
* [Testbed and related assets](https://github.com/adlrocha/beyond-bitswap/): This repo includes all the code used for the implementation and other auxiliary testing assets. Additional documentation will be provided in the repo.

beyond-bitswap/rfc/rfcBBL102.md

Lines changed: 65 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,65 @@
1+
# RFC| BB | L1-02: TTLs for rebroadcasting WANT messages
2+
Status: `Draft`
3+
Implementation here:
4+
5+
## Abstract
6+
7+
This RFC proposes setting a TTL on Bitswap WANT messages, together with a TTL ceiling per node, in order to increase the chance of a node finding a provider that has the content without resorting to the DHT. This means that WANT messages need an additional “requester” field so that the receiving node knows whom to dial to deliver a block.
8+
9+
<!-- Full description here: https://docs.google.com/document/d/1zjJCZel8zJzgK3XuHK0YZlNffEHThq7tUOssGgRTryY/edit#heading=h.6qnrq913vou6 -->
10+
11+
12+
13+
## Shortcomings
14+
15+
Bitswap only sends WANT messages to its directly connected peers. This limits the set of peers that can be found holding the content to those directly connected or those returned by a DHT query, which has a cost in time and connectivity.
16+
17+
## Description
18+
19+
The idea is to include a TTL in WANT messages. That way, instead of forwarding WANT messages only to our directly connected peers, we can widen the scope to, for instance, the connected peers of our connected peers (TTL=2). This increases the span of content discovery without having to resort to the DHT. The TTL needs to be limited to a small number to avoid flooding the network with WANT requests. It also complicates the implementation of the protocol, as nodes now need to track sessions not only from their directly connected peers but also from peers x hops away. Several design decisions, listed below, would have to be made in the implementation. Ideally, the best value for these fields will be determined in testing. Additionally, we could make them dynamic according to the state of the network or the developer's preference; this will be explored in future work.
20+
21+
- Max TTL allowed. [This study shows](http://conferences2.sigcomm.org/acm-icn/2015/proceedings/p9-wang.pdf) that a Max TTL = 2 achieves the best performance (for moderately popular content) without a severe impact on latency, so we can consider this as the baseline value. However, the impact and performance of this will depend heavily on how many connections each node maintains.
22+
23+
- Forwarder of discovered blocks: Nodes x-hops away from the source of the requests can send responses following two approaches:
24+
25+
- Symmetric routing: Messages are forwarded to the requestor following the same path followed by the WANT messages.
26+
27+
- Asymmetric routing: Responses do not follow the path taken by the WANT message and are instead sent directly to the original requestor. In this alternative, nodes follow a "fire-and-forget" approach: intermediate nodes only act as relays and don't track the status of sessions, the receiving node X hops away answers the requestor directly, and the only node tracking the state of the session is the originating peer (and perhaps its directly connected peers while the session has not been canceled, so that they can notify it if they see any of the requested blocks). When implementing this approach we also have to bear in mind that establishing connections is expensive, so for it to be efficient we should evaluate when it is worthwhile for nodes to open a dedicated connection to send messages back to the original requestor.
28+
29+
Initially, the protocol will be designed using symmetric routing, and we will explore other routing alternatives in future work. When exploring asymmetric routing we need to bear in mind that, according to IPFS values, nodes shouldn't push content to other peers that haven't requested it.
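
As an illustration of symmetric routing, the following is a minimal in-memory sketch, assuming each node remembers the neighbour from which it first received a given WANT and sends any discovered block back through that neighbour. The `Node` class, its method names, and the `(source, want_id)` key are hypothetical and chosen for this example; they are not part of Bitswap.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """Toy node; every name here is illustrative, not a Bitswap API."""
    name: str
    blocks: set = field(default_factory=set)      # CIDs stored locally
    peers: list = field(default_factory=list)     # directly connected nodes
    came_from: dict = field(default_factory=dict) # (source, want_id) -> previous hop
    received: list = field(default_factory=list)  # (cid, path) delivered to this node

    def want(self, cid, want_id, ttl):
        """Originate a WANT for `cid` with the given TTL."""
        key = (self.name, want_id)
        self.came_from[key] = None  # None marks the originator of the session
        for peer in self.peers:
            peer.on_want(key, cid, ttl, sender=self)

    def on_want(self, key, cid, ttl, sender):
        if key in self.came_from:
            return  # WANT already seen: drop it to avoid loops
        self.came_from[key] = sender  # remember the reverse path
        if cid in self.blocks:
            self.on_block(key, cid, path=[self.name])
        elif ttl > 1:
            for peer in self.peers:
                if peer is not sender:
                    peer.on_want(key, cid, ttl - 1, sender=self)

    def on_block(self, key, cid, path):
        """Send the block back along the path the WANT followed."""
        previous_hop = self.came_from[key]
        if previous_hop is None:
            self.received.append((cid, path))
        else:
            previous_hop.on_block(key, cid, path + [previous_hop.name])


# Line topology A - B - C, where only C stores the block.
a, b, c = Node("A"), Node("B"), Node("C", blocks={"cid1"})
a.peers, b.peers, c.peers = [b], [a, c], [b]
a.want("cid1", want_id=1, ttl=2)
```

With TTL=2 the WANT reaches `C` through `B`, and the block travels back `C -> B -> A`. With TTL=1 the same request would stop at `B`, which matches plain Bitswap behaviour in this toy model.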
30+
31+
Again, this proposal should include schemes to avoid flooding attacks and the forgery of responses. It may also be sensible to include networking information in the request, so that nodes X hops away can easily discover where to forward responses.
32+
33+
## Implementation plan
34+
- [ ] Include TTL in WANT messages. Nodes receiving the WANT message track the session (creating a new one or updating an existing one), decrement the TTL of the WANT message by one, and forward it to their connected peers. Duplicate WANT messages with a lower or equal TTL should be discarded to avoid loops (higher TTLs could represent request updates). WANT sessions should be identified by at least the tuple {SOURCE, WANT_ID}, so that nodes know to whom they need to send discovered blocks.
35+
36+
- [ ] Test the performance and bandwidth overhead of this scheme compared to plain Bitswap for different values of TTL.
37+
38+
- [ ] Evaluate the use of a symmetric and asymmetric routing approach for the forwarding of discovered blocks.
39+
40+
- [ ] Consider the implementation of "smart TTLs" in WANT requests, so that the TTL is determined according to the status of the network, the bandwidth available, the requests alive, the number of connections, or any other useful value.
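
The TTL handling described in the first item of this plan could be sketched as follows. This is only an illustration under stated assumptions: `WantSessions`, `handle`, and `max_ttl` are hypothetical names, sessions are keyed by the `(SOURCE, WANT_ID)` tuple, and the per-node TTL ceiling is applied by clamping the incoming TTL.

```python
from typing import Dict, Optional, Tuple

SessionKey = Tuple[str, int]  # (SOURCE, WANT_ID), as suggested in the plan


class WantSessions:
    """Hypothetical per-node session table for TTL-tagged WANT messages."""

    def __init__(self, max_ttl: int = 2):
        self.max_ttl = max_ttl                 # per-node TTL ceiling (assumed)
        self.seen: Dict[SessionKey, int] = {}  # highest TTL seen per session

    def handle(self, source: str, want_id: int, ttl: int) -> Optional[int]:
        """Track the session and return the TTL to forward with.

        Returns None when the message must not be forwarded, either because
        it is a duplicate or because its TTL is exhausted.
        """
        ttl = min(ttl, self.max_ttl)  # never exceed this node's ceiling
        key = (source, want_id)
        if key in self.seen and ttl <= self.seen[key]:
            return None  # duplicate with lower or equal TTL: discard
        self.seen[key] = ttl  # new session, or an update with a higher TTL
        remaining = ttl - 1   # decrement before forwarding
        return remaining if remaining > 0 else None


sessions = WantSessions(max_ttl=2)
first = sessions.handle("peerA", 1, ttl=2)      # forwarded with TTL 1
duplicate = sessions.handle("peerA", 1, ttl=2)  # discarded
```

A WANT that arrives with TTL=1 is still recorded in `seen`, so the node can answer it, but it is not forwarded any further.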
41+
42+
## Impact
43+
We should expect a latency reduction in the discovery of content, but it may lead to an increase in the bandwidth overhead of the protocol. We do not expect this increase to be substantial, given that response messages are small.
44+
45+
## Evaluation Plan
46+
- [The IPFS File Transfer benchmarks.](https://docs.google.com/document/d/1LYs3WDCwpkrBdfrnB_LE0xsxdMCIhXdCchIkbzZc8OE/edit#heading=h.nxkc23tlbqhl)
47+
48+
- Compare how often a node resorts to the DHT for each TTL used, and the bandwidth overhead due to control messages.
49+
50+
## Prior Work
51+
This RFC was inspired by this proposal. The RFC is based on the assumption that DHT lookups are slow and that it is therefore better to increase our “Bitswap span” than to resort to the DHT. It would be great if we could validate this assumption before considering its implementation.
52+
53+
## Results
54+
55+
56+
## Future Work
57+
Some lines of future work to consider:
58+
59+
- Combine with RFC|BB|L1-04 so that, apart from setting a TTL on WANT messages, every peer receiving a WANT message tracks it in its peer-block registry, also widening the discovery scope through peer-block registry tables.
60+
61+
- With a very high number of connections the network is effectively flooded, which is not something we want. We could envision this technique as an efficient alternative to keeping many (questionable quality) connections. [[slides](http://conferences.sigcomm.org/acm-icn/2015/slides/01-01.pdf)]
62+
63+
- If we end up using request manifests as suggested in RFC | BB | L1/2-01, max TTLs could be specified in the exchange request message or determined according to the total number of connections of a peer, to limit network flooding. It would also be interesting to explore this RFC together with RFC | BB | L1-06, so that, using the GossipSub overlay network as a base, WANT TTLs are determined according to the scores and max connections of peers.
64+
65+
- Evaluate techniques used in GossipSub to fine-tune or enhance the use of WANT TTLs preventing the network from being flooded.

beyond-bitswap/rfc/rfcbbL104.md renamed to beyond-bitswap/rfc/rfcBBL104.md

Lines changed: 7 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
# RFC|BB|L1-04: Track WANT messages for future queries
2-
2+
Status: `Draft`
33

44
## Abstract
55

@@ -45,13 +45,18 @@ Initial explorations indicated that an HAMT with an accumulator like approach ar
4545
- HAMTRegisry updates the key with the new list of peers. There is a maximum number of entries allowed in each key.
4646
- The garbage collection strategy will be defined according to the results of the memory footprint tests and the accumulator ceiling.
4747

48+
![](./images/rfcbbL104.png)
49+
50+
## Impact
51+
We can expect the time to discover content in the network to be reduced.
52+
4853
## Evaluation Plan
4954
- [The IPFS File Transfer Benchmarks](https://docs.google.com/document/d/1LYs3WDCwpkrBdfrnB_LE0xsxdMCIhXdCchIkbzZc8OE/edit#heading=h.nxkc23tlbqhl)
5055
- [x] Create a test case that simulates the interest in a dataset by a growing population of nodes (e.g., use different waves of peers interested in a file). This will create the scenario in which the next wave will benefit from having the knowledge that the first wave might already have the file.
5156
- [ ] Include noise in the test case. Along with the regularly accessed files, nodes request random CIDs to pollute their registries.
5257
- [ ] Clear registries between run counts to remove advantage with files with similar blocks.
5358
- [ ] Track memory footprint of peers.
54-
![](./images/rfcbbL104.png)
59+
5560
## Future Work
5661
- Protocol to share peer-block registries between nodes to increase “local views”.
5762
- A good idea for reducing the scope of the content we keep track of is to somehow monitor the latency to the node and keep track of content that lives nearby.

beyond-bitswap/rfc/rfcBBL1201.md

Lines changed: 85 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,85 @@
1+
# RFC| BB | L12-01: Bitswap/Graphsync exchange messages extension and transmission choice
2+
Status: `Draft`
3+
4+
Implementation here: https://github.com/
5+
6+
## Abstract
7+
This RFC proposes expanding Bitswap and Graphsync exchange messages with additional information. This information is used in content requests so that receivers can clearly understand the content requested, and in responses so that responders can tell the requestor their specific level of fulfillment of the request. With this information the requestor can select the best nodes to perform the actual request, for the best possible transmission performance.
8+
9+
10+
<!-- Full description here: https://docs.google.com/document/d/1zjJCZel8zJzgK3XuHK0YZlNffEHThq7tUOssGgRTryY/edit#heading=h.6qnrq913vou6 -->
11+
12+
13+
## Shortcomings
14+
15+
Bitswap and Graphsync’s current discovery process is blind and optimistic. An IPLD selector or a plain WANT list with the requested CIDs is shared with connected peers in the hope that someone will have the content. When a peer answers that it has the requested block, a subsequent request needs to be performed to get the rest of the blocks belonging to the DAG structure of the requested CID. The idea behind this RFC is to add a way for the requestor and connected peers to give more directed feedback about the result of the request.
16+
17+
## Description
18+
To request content from the network, instead of sending plain WANT messages or an IPLD selector, requests will include the following information:
19+
20+
- Plain legacy request (want list or IPLD selector). This would allow this RFC to be backward compatible with existing exchange interfaces.
21+
22+
- Parameters for the exchange protocol (such as "send blocks directly if you have them", or "send only leaf blocks", "send all the DAG structure for the root CIDs I send", or any other extension we may come up with).
23+
24+
- Specific requirements (such as the latency or the bandwidth desired for the exchange).
25+
26+
- Any additional data that may be useful and that we can act upon at a protocol level.
27+
28+
Nodes receiving this message will respond with their level of fulfillment of the request (the number/range of blocks belonging to the request that the node stores, and whether or not they fulfill the specified transmission requirements). This request can also include the list of blocks under the CID/IPLD selector that the request will eventually look for. No blocks are shared in this exchange (unless explicitly specified); it is only used as a way of "polling the surroundings" for the content.
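
To make the shape of these messages more concrete, here is a minimal sketch of an exchange request and its response. All class, field, and parameter names (`ExchangeRequest`, `ExchangeResponse`, `respond`, `min_bandwidth_mbps`, and so on) are assumptions made for illustration; the RFC does not define a wire format, and the IPLD selector is treated as an opaque string.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class ExchangeRequest:
    """Hypothetical exchange request; field names are illustrative only."""
    wantlist: List[str] = field(default_factory=list)  # plain legacy want list (CIDs)
    selector: Optional[str] = None                      # legacy IPLD selector, opaque here
    params: Dict[str, bool] = field(default_factory=dict)         # exchange parameters
    requirements: Dict[str, float] = field(default_factory=dict)  # e.g. bandwidth wanted
    extra: Dict[str, str] = field(default_factory=dict)           # any additional data


@dataclass
class ExchangeResponse:
    """Hypothetical response: which blocks the peer stores, no block data."""
    peer: str
    have: List[str]
    meets_requirements: bool

    def fulfillment(self, request: ExchangeRequest) -> float:
        """Fraction of the requested CIDs that this peer stores."""
        wanted = set(request.wantlist)
        if not wanted:
            return 0.0
        return len(wanted & set(self.have)) / len(wanted)


def respond(peer: str, store: Set[str], request: ExchangeRequest,
            bandwidth_mbps: float) -> ExchangeResponse:
    """Answer a request with the level of fulfillment, without sending blocks."""
    have = sorted(cid for cid in request.wantlist if cid in store)
    required = request.requirements.get("min_bandwidth_mbps", 0.0)
    return ExchangeResponse(peer, have, bandwidth_mbps >= required)


request = ExchangeRequest(
    wantlist=["cidA", "cidB", "cidC", "cidD"],
    params={"send_blocks_directly": False},
    requirements={"min_bandwidth_mbps": 10.0},
)
response = respond("peer1", {"cidA", "cidB", "cidZ"}, request, bandwidth_mbps=20.0)
```

Keeping the legacy want list and selector as ordinary fields is one way to express the backward-compatibility goal: a node that only understands those fields could ignore the rest.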
29+
30+
With this information, the requestor inspects the characteristics and percentage of fulfillment of all the responses and chooses the best peers to request the blocks from, distributing the load depending on the nodes it is connected to and parallelizing the exchange as much as possible. This offers peers an opportunity to try to find the distribution of block requests that maximizes throughput. The transmission flow with the chosen peers is triggered through a TRANSFER message, where the desired blocks and the transmission parameters are specified (this opens the door to the use of compression, network coding and other schemes in the transmission phase).
31+
32+
While the requester is receiving blocks through different transmission flows, it can trigger new rounds of discovery by sending additional request messages to connected peers or to selected peers in the DHT, in order to increase the overall level of fulfillment or find better transmission candidates. The discovery and transmission loops will be in constant communication.
33+
34+
### Implementation
35+
36+
Nodes receiving this manifest will answer with the level of fulfillment of the request. Upon reception of these responses, the node can start transmission requests to all the desired nodes. Meanwhile, we can resort to the DHT to send these exchange requests to peers we are not directly connected to. The flow of the protocol would be:
37+
38+
- Send exchange requests (IPLD selector/list of blocks, network conditions, node conditions) to connected peers.
39+
40+
- Receive responses: R1=50% fulfillment; R2=30% fulfillment; R3=5% fulfillment. We select the peers that lead to the largest level of fulfillment, U(R1, R2)=75% fulfillment, and request the start of a transmission flow with them. Meanwhile, we resort to the DHT or perform an additional lookup to find the data still pending for complete fulfillment of the request. All of these phases should be in constant contact, so that if we receive better responses from peers we can act on them by starting new transmissions or adapting to the conditions of the network.
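
The selection step in this flow can be illustrated with a small sketch. It assumes that each response lists the blocks the peer stores, and it uses a greedy heuristic (repeatedly pick the peer that adds the most missing blocks). The greedy choice and the `select_peers` name are assumptions for this example, not a method prescribed by the RFC. The block sets are made up so that they reproduce the 50%, 30%, 5% and 75% figures above.

```python
from typing import Dict, List, Set, Tuple


def select_peers(wanted: Set[str],
                 offers: Dict[str, Set[str]]) -> Tuple[List[str], float]:
    """Greedily choose peers until no peer adds any missing block.

    Returns the chosen peers and the combined fraction of `wanted` they cover.
    """
    remaining = set(wanted)
    chosen: List[str] = []
    while remaining:
        # Peer contributing the most blocks that are still missing.
        best = max(offers, key=lambda p: len(offers[p] & remaining), default=None)
        if best is None or not (offers[best] & remaining):
            break  # nobody can improve the level of fulfillment
        chosen.append(best)
        remaining -= offers[best]
    covered = len(wanted) - len(remaining)
    return chosen, (covered / len(wanted) if wanted else 0.0)


# 20 requested blocks; R1 stores 50%, R2 stores 30%, R3 stores 5%.
wanted = {f"b{i}" for i in range(20)}
offers = {
    "R1": {f"b{i}" for i in range(0, 10)},   # b0..b9
    "R2": {f"b{i}" for i in range(9, 15)},   # b9..b14, overlaps R1 on b9
    "R3": {"b3"},                            # already covered by R1
}
chosen, fulfillment = select_peers(wanted, offers)
```

Here R1 and R2 are selected and together cover 75% of the request, while R3 adds nothing new. The blocks left in `remaining` are the ones for which a DHT lookup or a further discovery round would be needed.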
41+
42+
The above proposal has a few shortcomings that we would need to address with additional schemes, such as:
43+
44+
- Reducing the number of RTTs when the number of blocks requested and their size is small. We need to include a way of merging the discovery and transmission phases to minimize the RTTs when appropriate.
45+
46+
- For large files send only the first 2 layers in the response before the requestor triggers the transmission phase.
47+
48+
- Use of accumulators in the level of fulfilment in responses to improve checks and the time between request and transmission phase.
49+
50+
- Avoid response forgery. This is out of the scope of this RFC but is something worth exploring in the future work.
51+
52+
## Implementation plan
53+
- [ ] Include additional information for exchange requests in WANT messages.
54+
55+
- [ ] Determine the basic structure of exchange requests, the information included in them, and how it will be leveraged by nodes. When designing these messages we need to ensure that they are compatible with existing WANT messages for backward compatibility, so that an outdated Bitswap node that receives an exchange request still knows how to interpret it. The TRANSFER message should be designed along with the exchange request.
56+
57+
- [ ] Use of Graphsync selectors in WANT messages.
58+
59+
- [ ] Design and implement the message exchange protocol for the content discovery and negotiation phases:
60+
61+
- [ ] 1\. Send exchange requests and collect responses for content availability and network status.
62+
63+
- [ ] 2\. Start transmission channels (TRANSFER) with peers fulfilling the request and keep the content discovery loop open in case better content servers appear (either because they are found through the exchange request broadcast, or because we chose to extend the lookup through the DHT and found better peers).
64+
65+
- [ ] 3\. Fine-tune peer interaction for best performance.
66+
67+
- [ ] Performance benchmark with plain Bitswap to fine-tune protocol configuration.
68+
69+
- [ ] Implement more complex queries in request messages.
70+
71+
- [ ] Use a utility function / score to evaluate the "best peers" for content discovery and transmission.
72+
73+
## Impact
74+
Adding these exchange request and negotiation phases opens the door to a clear separation between content discovery and transmission. This enables the inclusion of new schemes to optimize both levels according to the needs of an application. It will also enable the parallelization of many processes in current exchange interfaces, and gives clients a way to influence the operation of the protocol.
75+
76+
77+
## Evaluation Plan
78+
- [The IPFS File Transfer benchmarks.](https://docs.google.com/document/d/1LYs3WDCwpkrBdfrnB_LE0xsxdMCIhXdCchIkbzZc8OE/edit#heading=h.nxkc23tlbqhl)
79+
80+
## Prior Work
81+
82+
## Results
83+
84+
85+
## Future Work
