File: beyond-bitswap/README.md
+24 -9 (24 additions & 9 deletions)
@@ -12,7 +12,12 @@ In high level, this project is about:
12
12
* Research and prototype new strategies to acquire new speed ups.
13
13
* Acquire leverage by exposing the harness to the whole Open Source and Research community, in a way that others feel compelled to join the effort and try their own strategies.
14
14
15
-
In short, the aim of the project is two-fold: to drive speed-ups in file-sharing for IPFS and other P2P networks; and to enable a framework for anyone to join the quest of designing, implementing and evaluating brand new file-sharing strategies in P2P networks.
15
+
In short, the aim of the project is two-fold: to drive speed-ups in file-sharing for IPFS and other P2P networks; and to enable a framework for anyone to join the quest of designing, implementing and evaluating brand new file-sharing strategies in P2P networks.
16
+
17
+
## Why this project code name?
18
+
Bitswap has been the file-sharing subsystem within IPFS for some time; Graphsync then came to propose a new way of approaching file-sharing on IPFS. The scope of the project is not only to improve Bitswap's performance, but file-sharing in P2P networks as a whole. We don't restrict ourselves exclusively to Bitswap or IPFS for our exploration.
19
+
20
+
That said, because IPFS already had an infrastructure in place to start testing our ideas, and Bitswap is its file-sharing module, we started our initial explorations over Bitswap and IPFS. Our aim, however, is to go much further and improve file-sharing performance with new protocols and proposals that every P2P network can leverage and benefit from. In short, we want to go "Beyond Bitswap". The project can be considered a success if, by the end of it, we have a set of pluggable protocols and modules to achieve file-sharing in P2P environments, along with all the testbeds, tools and benchmarks required to improve these protocols and go _"Beyond Bitswap"_.
16
21
17
22
## Contributions & Results
18
23
@@ -21,24 +26,34 @@ In short, the aim of the project is two-fold: to drive speed-ups in file-sharing
21
26
*[Beyond Bitswap Slides](https://docs.google.com/presentation/d/18_aRTye2t6Xs_VhKwEbhvCYYu9ePaLgamIrJkpUDtfY/edit#slide=id.p): Set of slides introducing the project and summarizing the Related Work document from above. These slides were used to introduce the project in the following [talk]().
22
27
*[Survey of the state of the art](https://docs.google.com/document/d/172q0EQFPDrVrWGt5TiEj2MToTXIor4mP1gCuKv4re5I/edit#heading=h.nxkc23tlbqhl): It summarizes a list of papers on file-sharing strategies in P2P networks used as a groundwork for the projects.
23
28
*[Evaluation Plan](https://docs.google.com/document/d/1LYs3WDCwpkrBdfrnB_LE0xsxdMCIhXdCchIkbzZc8OE/edit#heading=h.nxkc23tlbqhl): Document describing the testbed and evaluation plan designed to test the performance of current implementations of file-sharing systems, and compare it with the improvements implemented within the scope of this work.
24
-
*[Enhancements RFC](https://docs.google.com/document/d/1zjJCZel8zJzgK3XuHK0YZlNffEHThq7tUOssGgRTryY/edit#heading=h.nxkc23tlbqhl): A list of enhancements proposals and ideas to improve file-sharing in IPFS.
29
+
*[Enhancements RFC](https://docs.google.com/document/d/1zjJCZel8zJzgK3XuHK0YZlNffEHThq7tUOssGgRTryY/edit#heading=h.nxkc23tlbqhl): A list of enhancement proposals and ideas to improve file-sharing in IPFS and P2P networks.
25
30
*[Test Results](https://docs.google.com/document/d/1zPpgnr9ykJr5PAvShJBGhKKRDRbsglb00MPc5eVEU4Q/edit#): This document collects the results of the tests performed in the scope of the project.
26
31
27
32
### Enhancement RFCs
28
-
*[RFC|BB|L1-04: Track WANT messages for future queries](./rfc/rfcBBL104.md)
29
-
*[RFC|BB|L2-03A: Use of compression and adjustable block size](./rfc/rfcBBL203A.md)
30
-
*[RFC|BB|L2-07: Request minimum piece size and content protocol extension](./rfc/rfcBBL207.md)
31
-
*[RFC|BB|L12-01: Bitswap/Graphsync exchange messages extension and transmission choice](./rfc/rfcBBL1201.md)
32
-
*[RFC|BB|L1-02: TTLs for rebroadcasting WANT messages](./rfc/rfcBBL102.md)
33
+
This section lists the improvement RFCs that are currently being tackled, discussed and prototyped. Each RFC aims to test a specific idea or assumption. They may initially be implemented over Bitswap, but that doesn't mean the conclusions drawn apply exclusively to the Bitswap protocol. RFCs are grouped by the layers of file-sharing in P2P systems identified in the [Related Work](https://docs.google.com/document/d/14AE8OJvSpkhguq2k1Gfc9h0JvorvLgOUSVrj3CnOkQk/edit#heading=h.nxkc23tlbqhl).
34
+
35
+
> Layer 1 RFCs: Discovery and announcement of content
36
+
*[RFC|BB|L1-04: Track WANT messages for future queries](./rfc/rfcBBL104.md): Evaluates how using information from a node's surroundings can help the discovery and fetching of popular content in the network.
37
+
*[RFC|BB|L1-02: TTLs for rebroadcasting WANT messages](./rfc/rfcBBL102.md): Evaluates how broadcasting exchange requests TTL hops away may help the discovery of content and improve performance.
38
+
39
+
> Layer 2 RFCs: Negotiation and transmission of content.
40
+
*[RFC|BB|L2-07: Request minimum piece size and content protocol extension](./rfc/rfcBBL207.md): Evaluates how the size of the chunks that comprise content requested in a P2P network may affect performance.
41
+
*[RFC|BB|L12-01: Bitswap/Graphsync exchange messages extension and transmission choice](./rfc/rfcBBL1201.md): Proposes dividing the exchange of content in two phases: a negotiation phase used to discover the holders of the different chunks of a file, and a transfer phase to explicitly request blocks from different chunk holders. This opens the door to additional exchange strategies and schemes to improve performance.
42
+
*[RFC|BB|L2-03A: Use of compression and adjustable block size](./rfc/rfcBBL203A.md): Evaluates the potential performance improvements from the use of compression for the exchange of content in P2P networks.
43
+
44
+
Feel free to jump into the discussions around the project or to propose your own RFC opening an issue in the repo.
33
45
34
46
35
47
### Code
36
-
*[Testbed and related assets](https://github.com/adlrocha/beyond-bitswap/): This repo includes all the code used for the implementation and other auxiliary testing assets. Additional documentation will be provided in the repo.
37
-
*[Bitswap fork](https://github.com/adlrocha/go-bitswap): This fork of `go-bitswap` is the one being used to implement some of the RFCs and where additional metrics that want to be tracked in the testbed are being included. Expect RFCs to be imeplemented in different branches.
48
+
*[Testbed and related assets](https://github.com/adlrocha/beyond-bitswap/): This repo includes all the code used for the implementation and other auxiliary testing assets. Additional documentation is provided in the repo.
49
+
*[Bitswap fork](https://github.com/adlrocha/go-bitswap): This fork of `go-bitswap` is the one being used to implement and evaluate some of the RFCs, and it is where the additional metrics to be tracked in the testbed are being included. RFCs are implemented in different branches, each identified by the code of the RFC.
50
+
*[Graphsync fork](): This fork of `go-graphsync` is also being used to test some of the RFCs.
51
+
*[New exchange protocol?](): This is a work in progress to be determined with the result of our explorations.
38
52
39
53
### Talks
40
54
*[Introduction to Beyond Bitswap project](): Introductory talk to show the initial work and motivate the project.
41
55
*[How rfcBBL104 was implemented](https://drive.google.com/file/d/1YS3RoNdeeG1vauJpfvHvKUQzPHr97eHF/view?usp=sharing): Video on how the implementation of rfcBBL104 was approached.
56
+
*[Progress update September 2020](https://drive.google.com/file/d/1vUWnfQMIqz9hoqWB941vbzqkP16-_ydd/view?usp=sharing): Progress update of the project explaining the RFCs implemented, the testbed and some preliminary results.
File: beyond-bitswap/rfc/rfcBBL102.md
+21 -4 (21 additions & 4 deletions)
@@ -16,7 +16,7 @@ Bitswap only sends WANT messages to its directly connected peers. This limits th
16
16
17
17
## Description
18
18
19
-
The idea is to include a TTL to WANT messages. That way instead of forwarding WANT messages to our directly connected peers, we can increase the scope to, for instance, the connected peers of our connected peers (TTL=2). With this, we increase the span of discovery of content without having to resort to the DHT. This TTL needs to be limited to a small number to avoid flooding the network with WANT requests. It also complicates the implementation of the protocol, as now nodes need to track not only sessions from their directly connected peers but also from the ones x-hops away from them. Several design decisions would have to be made in the implementation such as the following (ideally the best value for these fields will be determined in testing. Additionally, we could set them to be dynamic according to the state of the network or the developer's desire. This will be explored in the future work).
19
+
The idea is to include a TTL in WANT messages. That way, instead of forwarding WANT messages only to our directly connected peers, we can increase the scope to, for instance, the connected peers of our connected peers (TTL=1). With this, we increase the span of discovery of content without having to resort to the DHT. This TTL needs to be limited to a small number to avoid flooding the network with WANT requests. It also complicates the implementation of the protocol, as nodes now need to track not only sessions from their directly connected peers but also from the ones x-hops away from them. Ideally, the best values for the fields below will be determined in testing; we could also make them dynamic according to the state of the network or the developer's preference, which will be explored in future work. Several design decisions would have to be made in the implementation, such as the following:
20
20
21
21
- Max TTL allowed. [This study proves](http://conferences2.sigcomm.org/acm-icn/2015/proceedings/p9-wang.pdf) that a Max TTL = 2 achieves the best performance (for moderately popular content) without severe impact on latency, so we can consider this as the baseline value. However, the impact and performance of this will depend heavily on how many connections each node maintains.
22
22
@@ -31,14 +31,30 @@ Initially, the protocol will be designed using symmetric routing, and will explo
31
31
Again, this proposal should include schemes to avoid flooding attacks and the forgery of responses. It may be sensible to include networking information also in the request to allow easy discovery to forward responses X-hop away.
32
32
33
33
## Implementation plan
34
-
-[] Include TTL in WANT messages. Nodes receiving the WANT message track the session using indirect sessions, reduce in one the TTL of the WANT message and forward it to its connected peers. Duplicate WANT messages with lower or equal TTL should be discarded to avoid loops (higher TTLs could represent request updates). WANT sessions should be identified at least with the following tuple: {SOURCE, WANT_ID} so nodes know to whom it needs to send discovered blocks. (See figures below for the proposed implementation of the symmetric approach).
34
+
-[X] Include TTL in WANT messages. Nodes receiving a WANT message track the session using indirect sessions, decrement the TTL of the WANT message by one and forward it to their connected peers. Duplicate WANT messages with a lower or equal TTL should be discarded to avoid loops (higher TTLs could represent request updates). WANT sessions should be identified at least by the tuple {SOURCE, WANT_ID}, so nodes know to whom they need to send discovered blocks. (See the figures below for the proposed implementation of the symmetric approach.)
35
35
36
36
-[ ] Test the performance and bandwidth overhead of this scheme compared to plain Bitswap for different values of TTL.
37
37
38
38
-[ ] Evaluate the use of a symmetric and asymmetric routing approach for the forwarding of discovered blocks.
39
39
40
40
-[ ] Consider the implementation of "smart TTLs" in WANT requests, so according to the status of the network, bandwidth available, requests alive, number of connections or any other useful value, the TTL is determined.
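The duplicate-discarding rule from the first item of the plan can be sketched in Go as follows. This is a minimal illustration rather than the code in the fork: the `wantKey` and `wantRegistry` types, their field types and the `shouldProcess` method are hypothetical names introduced here, and the only behaviour taken from the plan is that WANTs are identified by the {SOURCE, WANT_ID} tuple and that copies with a lower or equal TTL are dropped.

```go
package main

import "fmt"

// wantKey mirrors the {SOURCE, WANT_ID} tuple from the plan.
// Both field types are assumptions made for this sketch.
type wantKey struct {
	Source string // peer ID of the original requester
	WantID string // identifier of the WANT, e.g. the requested CID
}

// wantRegistry (hypothetical) remembers the highest TTL seen per WANT.
type wantRegistry struct {
	seen map[wantKey]int
}

func newWantRegistry() *wantRegistry {
	return &wantRegistry{seen: make(map[wantKey]int)}
}

// shouldProcess reports whether a WANT is new or carries a higher TTL
// than any copy seen so far. Copies with a lower or equal TTL are
// treated as duplicates and rejected, which breaks forwarding loops.
func (r *wantRegistry) shouldProcess(k wantKey, ttl int) bool {
	if prev, ok := r.seen[k]; ok && ttl <= prev {
		return false
	}
	r.seen[k] = ttl
	return true
}

func main() {
	r := newWantRegistry()
	k := wantKey{Source: "peerA", WantID: "cid1"}
	fmt.Println(r.shouldProcess(k, 1)) // first copy of the WANT
	fmt.Println(r.shouldProcess(k, 1)) // same TTL: duplicate
	fmt.Println(r.shouldProcess(k, 0)) // lower TTL: duplicate
	fmt.Println(r.shouldProcess(k, 2)) // higher TTL: request update
}
```

Keeping only the highest TTL per key keeps the registry small. A real implementation would also need to expire entries once the corresponding sessions close.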
41
41
42
+
## Implementation details
43
+
### Basic implementation
44
+
* An additional TTL field has been added to Bitswap WANT entries in Bitswap messages to
45
+
enable the forwarding of exchange requests to peers TTL+1 hops away.
46
+
* Bitswap is set with a default TTL of 1, so corresponding messages will only be forwarded
47
+
to nodes two hops away.
48
+
* Sessions now include a TTL parameter to determine how far their WANT messages can go. Sessions started within the peer (because the peer wants a block) are considered `direct`, while the ones triggered by the reception of a WANT message with enough TTL are referred to as `indirect` (the peer is doing the work on behalf of another peer and is not explicitly interested in the block). An `indirect` flag has also been added to sessions in case a different strategy needs
49
+
to be implemented for indirect sessions in the future (like the use of a degree to limit the number of WANT messages broadcast to connected nodes to prevent flooding the network). Currently direct and indirect sessions follow the exact same strategy for block discovery and transmission.
50
+
51
+
* All the logic around indirect sessions is done in `engine.go`:
52
+
- The engine tracks the number of indirect sessions opened through an `indirectSession` registry.
53
+
- Whenever a peer receives a WANT message for which it doesn't have the block and whose TTL is not zero, it sends a DONT_HAVE right away and triggers a new indirect session for that WANT message with TTL-1.
54
+
- Whenever a new block or HAVE message is received at an intermediate node for an active indirect session, it is forwarded to the source (the initial requester). This action updates the DONT_HAVE status of the intermediate node so it is again included in the session.
55
+
-_We need to be careful: in the current implementation, blocks from indirect sessions are stored in the datastore for convenience, but they should be removed once all the interested indirect sessions for the block are closed and the blocks have been successfully forwarded, to avoid peers storing content they didn't explicitly request._
56
+
- When receiving a HAVE, the indirect session automatically sends the WANT-BLOCK to the corresponding peers. The interest from every peer (including direct ones) has already been identified, so when a peer receives a block for an indirect session it automatically forwards it to the source (there is no need to forward interest for WANT-BLOCKs because this is managed automatically within the indirect sessions). Indirect sessions work in the same way as direct sessions in this first implementation.
57
+
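The handling of an incoming WANT described above can be sketched as follows. This is a toy model under stated assumptions, not the code in `engine.go`: the `want` and `engine` types, the string replies and the registry keyed by source and CID are all hypothetical. The only behaviour taken from the description is that a peer without the block answers DONT_HAVE right away and, if the TTL is not zero, opens an indirect session with TTL-1. Replying DONT_HAVE when the TTL is zero is an assumption of this sketch.

```go
package main

import "fmt"

// want is a simplified WANT entry; the real Bitswap message types differ.
type want struct {
	Source string // original requester
	CID    string // requested block
	TTL    int    // remaining forwarding budget
}

// engine is a toy stand-in for the logic the RFC places in engine.go.
type engine struct {
	blocks   map[string]bool // blocks held locally
	indirect map[string]want // hypothetical indirect-session registry
}

func newEngine(cids ...string) *engine {
	e := &engine{blocks: map[string]bool{}, indirect: map[string]want{}}
	for _, c := range cids {
		e.blocks[c] = true
	}
	return e
}

// handleWant returns the immediate reply and, when an indirect session
// is opened, the WANT to forward to connected peers with TTL-1.
func (e *engine) handleWant(w want) (string, *want) {
	if e.blocks[w.CID] {
		return "BLOCK", nil // we hold the block: serve it directly
	}
	if w.TTL == 0 {
		return "DONT_HAVE", nil // no budget left: do not forward (assumption)
	}
	fwd := want{Source: w.Source, CID: w.CID, TTL: w.TTL - 1}
	e.indirect[w.Source+"/"+w.CID] = fwd // track the indirect session
	return "DONT_HAVE", &fwd
}

func main() {
	e := newEngine("cidA")
	reply, fwd := e.handleWant(want{Source: "peer1", CID: "cidB", TTL: 1})
	fmt.Println(reply, fwd.TTL, len(e.indirect))
}
```

The source peer ID is kept unchanged in the forwarded WANT so that blocks or HAVEs discovered later can be routed back to the initial requester.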
42
58
### Symmetric approach message flows
43
59

44
60

@@ -48,9 +64,10 @@ Again, this proposal should include schemes to avoid flooding attacks and the fo
48
64
We should expect a latency reduction in the discovery of content, but it may lead to an increase in the bandwidth overhead of the protocol. We do not expect the increase in the bandwidth overhead to be substantial, given that response messages are not big in size.
49
65
50
66
## Evaluation Plan
51
-
-[The IPFS File Transfer benchmarks.](https://docs.google.com/document/d/1LYs3WDCwpkrBdfrnB_LE0xsxdMCIhXdCchIkbzZc8OE/edit#heading=h.nxkc23tlbqhl)
67
+
-[ ][The IPFS File Transfer benchmarks.](https://docs.google.com/document/d/1LYs3WDCwpkrBdfrnB_LE0xsxdMCIhXdCchIkbzZc8OE/edit#heading=h.nxkc23tlbqhl)
68
+
- To evaluate the performance of this RFC we need a network where the `MAX_CONNECTION_RATE` of nodes is small, the number of passive nodes in the network (neither seeding nor leeching content) is high, and the number of seeders providing the content is small. This will force content to be several hops away from leechers. Leechers should request the content all at the same time (if done in waves, leechers in one wave would become seeders in the next wave and may add noise to the measurement).
52
69
53
-
- Compare the times a node resorts to DHT according to the TTL used, and the bandwidth overhead due to control messages.
70
+
-[ ] An additional measurement to consider is the number of times a node needs to resort to the DHT to find the content in plain Bitswap compared to the RFC (this would determine how effective the strategy is).
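When building such a test network, it can help to check how many hops separate a leecher from the nearest seeder, and whether a given TTL would reach it. The sketch below is one possible way to do that with a breadth-first search. The adjacency-map representation, the peer names and both function names are assumptions for illustration, and the reach of TTL+1 hops follows the implementation notes above.

```go
package main

import "fmt"

// hopsToNearestSeeder returns the BFS distance from start to the closest
// seeder in the connection graph, or -1 if no seeder is reachable.
func hopsToNearestSeeder(adj map[string][]string, start string, seeders map[string]bool) int {
	visited := map[string]bool{start: true}
	frontier := []string{start}
	for hops := 0; len(frontier) > 0; hops++ {
		var next []string
		for _, p := range frontier {
			if seeders[p] {
				return hops
			}
			for _, n := range adj[p] {
				if !visited[n] {
					visited[n] = true
					next = append(next, n)
				}
			}
		}
		frontier = next
	}
	return -1
}

// reachableWithTTL reports whether a WANT sent with the given TTL can
// reach a peer at the given distance (WANTs travel up to TTL+1 hops).
func reachableWithTTL(hops, ttl int) bool {
	return hops >= 0 && hops <= ttl+1
}

func main() {
	// Chain topology: leecher - p1 - p2 - seeder (two passive nodes).
	adj := map[string][]string{
		"leecher": {"p1"},
		"p1":      {"leecher", "p2"},
		"p2":      {"p1", "seeder"},
		"seeder":  {"p2"},
	}
	hops := hopsToNearestSeeder(adj, "leecher", map[string]bool{"seeder": true})
	fmt.Println("hops:", hops)
	fmt.Println("TTL=1 reaches seeder:", reachableWithTTL(hops, 1))
	fmt.Println("TTL=2 reaches seeder:", reachableWithTTL(hops, 2))
}
```

In this chain the seeder is three hops away, so a TTL of 1 would still require a DHT lookup while a TTL of 2 would not. That is the kind of contrast the measurement above is meant to capture.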
54
71
55
72
## Prior Work
56
73
This RFC was inspired by this proposal. The RFC is based on the assumption that DHT lookups are slow and that it is therefore better to increase our “Bitswap span” than to resort to the DHT. It would be great if we could validate this assumption before considering its implementation.