docs/topics/custom-domain/README.md
Traditional replication methods rely on fixed partitions or discrete addressing.
For example, subscribing to a live feed might involve replicating only the latest segment (e.g., [0.999, 1]). As the stream evolves, the replication range shifts dynamically. Similarly, when buffering a video, you replicate the segments currently being buffered and continuously expand the replication range as playback progresses. See the video below.
docs/topics/sharding/addressing.md
Sharding in Peerbit is based on the content being committed.
Every change in Peerbit is committed with an explicit link to the content it depends on. This means that by following the dependencies to the root, we can get the full state.
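This traversal can be sketched as follows. The `Commit` shape and field names here are illustrative assumptions, not Peerbit's actual data structures:

```typescript
// Each commit links to the commits it depends on, so walking the links
// from any head back to the root yields every commit needed for the
// full state. (Illustrative sketch; not Peerbit's real commit type.)

interface Commit {
  hash: string;
  deps: string[]; // hashes of the commits this one depends on
}

function collectToRoot(head: string, byHash: Map<string, Commit>): string[] {
  const seen = new Set<string>();
  const stack = [head];
  while (stack.length > 0) {
    const hash = stack.pop()!;
    if (seen.has(hash)) continue;
    seen.add(hash);
    for (const dep of byHash.get(hash)?.deps ?? []) stack.push(dep);
  }
  return [...seen];
}

const commits = new Map<string, Commit>([
  ["root", { hash: "root", deps: [] }],
  ["a", { hash: "a", deps: ["root"] }],
  ["b", { hash: "b", deps: ["root"] }],
  ["merge", { hash: "merge", deps: ["a", "b"] }], // a merge of two branches
]);

const history = collectToRoot("merge", commits);
```

Starting from the `merge` head, the walk collects all four commits, so the full state is recoverable from any head.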
Imagine the commit above is made, so that the merged graph gets the label "DOG". How can we choose replicators in a fully connected network in a simple, random way? (Being a replicator means you have the task of storing the log and potentially also making it searchable for peers.)
The first thing we need to do is to hash the labels of the peers (PeerIds) and the DOG label with a hash function (more details on this function later).
The hash function is seeded with the checksum of the content itself, so it changes for every new commit. This means that the results would differ if the content changes. E.g.
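A minimal sketch of this kind of seeded selection, in the style of rendezvous hashing. The FNV-1a hash and the numeric seed below are illustrative stand-ins, not Peerbit's actual hash function or checksum:

```typescript
// Hash each PeerId together with the content label, seeded by a checksum
// of the content, then pick the highest-scoring peers as replicators.
// (Sketch only: the hash and seeding are simplified assumptions.)

// FNV-1a over a string, mixed with a numeric seed.
function seededHash(input: string, seed: number): number {
  let h = (0x811c9dc5 ^ seed) >>> 0;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Choose `count` replicators for a commit label among known peers.
function chooseReplicators(
  peerIds: string[],
  label: string,
  checksum: number, // seed derived from the content itself
  count: number
): string[] {
  return peerIds
    .map((id) => ({ id, score: seededHash(id + "/" + label, checksum) }))
    .sort((a, b) => b.score - a.score) // highest score wins
    .slice(0, count)
    .map((x) => x.id);
}

const peers = ["peerA", "peerB", "peerC", "peerD"];
const replicators = chooseReplicators(peers, "DOG", 42, 2);
```

Because the seed is derived from the content, a different commit reshuffles the scores, so replication work spreads across the network rather than always landing on the same peers.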
docs/topics/sharding/sharding.md
For (A), we already have many solutions that work well but generally do not consider (B) and (C). For instance, in common DHT systems, we can use the identities of the participants to distribute content and pick neighbors to satisfy the minimum replication degree constraint.
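As a sketch of the classic DHT approach mentioned above: order peers by XOR distance between their ID and the content ID, then take the closest `minReplicas` as the replication set. The small numeric IDs here are purely illustrative:

```typescript
// Kademlia-style neighbor selection: the replicators for a piece of
// content are the peers whose IDs are XOR-closest to the content ID.

function closestPeers(
  contentId: number,
  peerIds: number[],
  minReplicas: number
): number[] {
  return [...peerIds]
    .sort((a, b) => (a ^ contentId) - (b ^ contentId)) // ascending XOR distance
    .slice(0, minReplicas);
}

const peerIds = [0b0001, 0b0110, 0b1011, 0b1100];
const chosen = closestPeers(0b0111, peerIds, 2);
```

This satisfies (A) and the minimum-replication-degree constraint, but note it has no notion of per-peer resource limits (B) or sharded access patterns (C), which is exactly the gap the rest of this document addresses.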
For simplicity, we consider that every peer can only have one range.
A piece of data that needs to be stored is placed at a location that depends on its hash. But instead of using the hash directly, we transform it into a number bounded by [0, 1].
If the vertical line intersects a range, that peer will be responsible for replicating this data. A nice consequence of this is that peers can participate with different degrees of trust in how much work others will perform.
By replicating with a factor (width) of 1, every data point will intersect the range, hence the node will always be responsible for every data point. This means that if anyone in the network creates data, it will always be sent to this peer. This is also a useful property if you want to build an app where every peer should always hold the complete state locally.
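The mapping and the intersection test can be sketched as follows. The hash-to-number conversion and the `Range` shape are simplified assumptions, not Peerbit's exact scheme:

```typescript
// Map a content hash to a point in [0, 1) and test which replication
// ranges cover it. (Illustrative sketch; names are not Peerbit's API.)

interface Range {
  peer: string;
  offset: number; // start of the range in [0, 1)
  width: number;  // replication factor; 1 covers the whole space
}

// Interpret the first 4 bytes of a hash as a fraction of 2^32.
function hashToUnit(hash: Uint8Array): number {
  const view = new DataView(hash.buffer, hash.byteOffset);
  return view.getUint32(0) / 2 ** 32;
}

// A range may wrap around 1 (the content space is circular).
function covers(range: Range, point: number): boolean {
  if (range.width >= 1) return true;
  const end = (range.offset + range.width) % 1;
  return range.offset <= end
    ? point >= range.offset && point < end
    : point >= range.offset || point < end;
}

function responsiblePeers(ranges: Range[], point: number): string[] {
  return ranges.filter((r) => covers(r, point)).map((r) => r.peer);
}

const ranges: Range[] = [
  { peer: "full-replica", offset: 0, width: 1 },   // always responsible
  { peer: "half", offset: 0.5, width: 0.5 },       // covers [0.5, 1)
  { peer: "wrapper", offset: 0.9, width: 0.2 },    // wraps past 1
];
```

With this setup, `full-replica` intersects every point, while `half` and `wrapper` are only responsible for data whose transformed hash lands inside their segment.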
Another nice consequence of this is that if you only want to "pin" a specific data point, you only need to make your width as small as floating-point precision allows, so that it covers only that particular data point. *A line is a special case of a curve* and *pinning is a special case of range replication* (a range whose width approaches 0).
This means that even if the longer range is further away when measured from its closest edge, it still needs to replicate the data, because the transformed distance gets shorter as the range gets wider. This property is important because we want to make sure that someone who replicates with width 0 does not get delegated any replication work.
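One way to realize such a width-scaled distance is to divide the edge distance by the width, so that wider ranges appear "closer" and a width of 0 maps to infinite distance. This exact formula is an illustrative assumption, not necessarily the one used in Peerbit:

```typescript
// Width-scaled distance: wider ranges are "closer", width 0 is never
// delegated any work. (Sketch of the idea, not Peerbit's exact formula.)

function transformedDistance(edgeDistance: number, width: number): number {
  return width === 0 ? Infinity : edgeDistance / width;
}

// A wide range further away can still beat a narrow range nearby:
const narrowNearby = transformedDistance(0.01, 0.05); // distance 0.2
const wideFurther = transformedDistance(0.1, 0.9);    // distance ~0.111
// wideFurther < narrowNearby, so the wide range is delegated the work,
// and a pinning peer (width -> 0) is infinitely far from everything.
```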
If you think of the content space as a circle, this would represent a rotation of `360° / min replicas`. So if `min replicas = 2` and the start point is the north pole, the second point would be the south pole.
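On the unit line, this rotation is simply an offset of `i / minReplicas` modulo 1:

```typescript
// The i-th replica point is the start point rotated by i / minReplicas
// around the circular content space [0, 1).

function replicaPoints(start: number, minReplicas: number): number[] {
  return Array.from(
    { length: minReplicas },
    (_, i) => (start + i / minReplicas) % 1
  );
}

// min replicas = 2, start at the "north pole" (0):
// the second point lands at 0.5, the "south pole".
const points = replicaPoints(0, 2);
```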
But we will stick with the line representation because it will be easier to visualize.
With (A) in place, it is now time to consider constraint (B). The innovative step here is that we adjust our width to satisfy any resource constraint. Is memory or CPU usage too high? Reduce the width of your responsibility until it is satisfied. Do you have spare capacity? Then it might help others if you increase your width of responsibility.
We cannot feasibly predict the optimal width for every participant in one go, because we cannot continuously share every node's CPU, memory, and other resource usage with every other node at all times. Additionally, as data is inserted, storage-limited nodes will take up less width over time, so this is a continuous process. Therefore, the solution is to work iteratively: everyone adjusts their widths in small steps, and eventually the system converges to an optimal point.
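A toy version of one such iteration, using a simple proportional step. The step rule, gain, and the assumption that usage grows linearly with width are all illustrative (the regulator actually used is a PID controller, discussed later in this document):

```typescript
// Each step, a peer nudges its width proportionally to how far its
// resource usage is from the target. (Illustrative sketch only.)

function adjustWidth(
  width: number,
  usage: number,  // current resource usage (e.g. bytes stored)
  target: number, // desired usage level
  gain = 0.5
): number {
  // Usage above target -> shrink; below target -> grow.
  const error = target - usage;
  const next = width * (1 + gain * (error / target));
  // Keep a tiny floor so the width can always grow back.
  return Math.min(1, Math.max(1e-6, next));
}

// Assume usage is proportional to width (usage = load * width):
let width = 0.5;
const load = 200; // usage units per unit of width
const target = 50;
for (let i = 0; i < 50; i++) {
  width = adjustWidth(width, load * width, target);
}
// width converges to target / load = 0.25
```

Each peer runs this loop locally against its own measurements, which is why no global view of everyone's resources is needed.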
For clarity, here is what these iterations look like when you update your width over time:
For better understanding, consider this analogy: it's as if we're trying to regulate the temperature in three houses simultaneously, where the thermostat in one house is influenced by the others. But the twist is that if one house requires less heating, another might need to compensate by heating more.
Below are illustrations of how aggregation is executed:
The source code for the aggregation is accessible [here](https://github.com/dao-xyz/peerbit/blob/95420cd37cb8d2ced4733495b6901b2b5e445e01/packages/programs/data/shared-log/src/ranges.ts#L155).
Initially, we observe peers receiving segments within the content space.
When the memory limitation is enabled, the ranges update according to the limit we set. Also note the "Used storage" indicator in the top left and how it changes with the limit.
Upon enabling CPU limitation, it is noticeable that minimizing a client's tab halts data replication. This occurs because a minimized tab is typically subject to significant throttling, constraining its processing capacity. Once the tab is reopened, operations resume at their optimal rate.
Explore this yourself and review the source code [here](https://github.com/dao-xyz/peerbit-examples/tree/master/packages/file-share).
The parameters for the PID regulator might need to be adaptive, depending on network conditions.
### Numerical Optimizers
As previously described, the resource optimization problem was solved with a PID controller, under the assumption that the problem has desirable "convex" properties. While this assumption may hold in many cases, there might be scenarios where more robust (and more resource-intensive) solvers would be preferable. For instance, when non-numerical properties and non-linear features are involved, a [Recurrent Neural Network (RNN)](https://en.wikipedia.org/wiki/Recurrent_neural_network) might perform better.
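For reference, a textbook PID step of the kind described. The gains, the measured quantity, and the convergence target below are illustrative, not Peerbit's tuning:

```typescript
// A minimal PID controller regulating a replication width so that
// resource usage reaches a target. (Sketch; not Peerbit's parameters.)

class PID {
  private integral = 0;
  private prevError = 0;
  constructor(
    private kp: number,
    private ki: number,
    private kd: number
  ) {}

  // error = target - measured; returns the control adjustment.
  step(error: number, dt: number): number {
    this.integral += error * dt;
    const derivative = (error - this.prevError) / dt;
    this.prevError = error;
    return this.kp * error + this.ki * this.integral + this.kd * derivative;
  }
}

// Assume usage is proportional to width (usage = load * width):
const pid = new PID(0.002, 0.0005, 0);
let width = 0.5;
const load = 200;
const target = 50;
for (let i = 0; i < 200; i++) {
  const error = target - load * width;
  width = Math.min(1, Math.max(0, width + pid.step(error, 1)));
}
// width settles near target / load = 0.25
```

For a convex, roughly linear plant like this one, the PID loop converges quickly; the point of the paragraph above is that when those assumptions break (non-numerical or non-linear features), a heavier solver such as an RNN may be worth the extra cost.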