Data Availability Sampling (DAS) is a way for the network to check that data is available without putting too much strain on any individual node. Each node (including non-staking nodes) downloads a small, randomly selected subset of the total data. Successfully downloading the samples confirms with high confidence that all of the data is available. This relies on erasure coding, which expands a dataset with redundant information by fitting a function known as a _polynomial_ over the data and evaluating that polynomial at additional points. The original data can then be recovered from the redundant data when necessary. A consequence of this expansion is that _any_ half of the expanded data is enough to reconstruct all of it, so for even a small part of the original data to be truly unavailable, more than _half_ of the expanded data must be missing. The number of samples downloaded by each node can be tuned so that it is _extremely_ likely that at least one of the fragments sampled by each client will be missing _if_ less than half the data is really available.
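The polynomial extension and the sampling argument can be illustrated with a toy sketch. It is not the scheme any real client runs: the small prime `P`, the function names (`extend`, `recover`, `detection_probability`), and the assumption that the data is doubled in size and sampled uniformly and independently are all chosen here purely for illustration.

```python
P = 65537  # small prime modulus, an illustrative assumption only


def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial (mod P) passing through `points`."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def extend(data):
    """Treat data[i] as the polynomial's value at x = i, then append the
    polynomial's values at n further points, doubling the dataset."""
    n = len(data)
    points = list(enumerate(data))
    return list(data) + [lagrange_eval(points, x) for x in range(n, 2 * n)]


def recover(known, n):
    """Rebuild the n original values from any n known (index, value) pairs."""
    if len(known) < n:
        raise ValueError("fewer than half of the expanded chunks are known")
    return [lagrange_eval(known[:n], x) for x in range(n)]


def detection_probability(samples, available_fraction=0.5):
    """Probability that at least one of `samples` independent, uniformly
    random samples hits a missing chunk."""
    return 1 - available_fraction ** samples


if __name__ == "__main__":
    data = [3, 1, 4, 1]
    expanded = extend(data)
    # Pretend the whole original half was withheld: the redundant half suffices.
    survivors = [(i, expanded[i]) for i in range(4, 8)]
    print(recover(survivors, len(data)) == data)
    print(detection_probability(30))
```

Under these assumptions, a node taking 30 samples fails to notice that half the expanded data is missing with probability of only 0.5 to the power 30, which is why a small number of samples per node is enough.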