Adjust WindowPoSt parameters without damaging network security #537
Replies: 7 comments
-
I'll try to give some context and answer the first 2 points below (cc @irenegia):
Extend WindowPoSt proving period from 24h to 48h
Let me try to give an overview of the whole WindowPoSt mechanism in order to address the points raised.
At the current stage, WindowPoSt relies on the fact that storing a sealed sector is more rational than recomputing (part of) it in order to answer challenges. What we get is that storing less than 80% of a sector leads to an expensive recomputation step in order to answer WindowPoSt challenges with high probability. To give more detail, the analysis shows that, due to the many dependencies each node has in the graph, when storing less than 80% of the sealed sector there is (with some probability) at least one challenged node whose recomputation requires recomputing 80% of one of the layers of the graph. This computation is not parallelizable. We were really conservative when we designed the cost model, and we are convinced that this approach is more secure than considering the “average sealing cost” as proposed here. In essence, this is a lower bound on the computation the prover needs to do in order to answer all challenges correctly.
Let’s call this computation C. The cost analysis (which is based on the cost of SHA and the cost of storage) shows that running C is K times more expensive than storing the entire sealed sector for 24h. If we extended the proving period from 24 to 48 hours, the cost assumption would become “worse” by a factor of 2: running C would now be only K/2 times more expensive than storing the entire sealed sector for 48h, which would make the level of security worse than what we have. Given all the above, we stress that
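To make the K versus K/2 argument concrete, here is a minimal sketch. The function name and the two cost figures are hypothetical placeholders chosen only to show the scaling; they are not values from the actual cost analysis.

```python
def security_margin(recompute_cost, storage_cost_per_hour, proving_period_hours):
    """Ratio between the cost of running C once and the cost of storing
    the whole sealed sector for one proving period (the factor K)."""
    return recompute_cost / (storage_cost_per_hour * proving_period_hours)


# Hypothetical placeholder costs in arbitrary units (assumptions, not measured data).
RECOMPUTE_COST = 240.0
STORAGE_COST_PER_HOUR = 1.0

k_24h = security_margin(RECOMPUTE_COST, STORAGE_COST_PER_HOUR, 24)
k_48h = security_margin(RECOMPUTE_COST, STORAGE_COST_PER_HOUR, 48)

print(f"margin with a 24h period: {k_24h}")
print(f"margin with a 48h period: {k_48h}")
```

Whatever the real costs are, the recomputation cost stays fixed while the storage cost it is compared against doubles, so the margin is halved.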
Modify Challenge count (halving the number of challenges for 32GB sectors)
The suggestion here is to halve the number of challenges we ask for 32GB sectors, given that 64GB sectors and 32GB sectors share the same number of challenges. Unfortunately this is not as straightforward as it seems. Here is why: Let’s recall that we want to catch a provider who is storing less than 80% of a sector (whatever size it has) with probability p (in order to apply our cost assumption). No matter the size of the sector (it is a “percentage check”), with T challenges a provider storing less than 80% of a sector is caught with probability p ≥ 1 - (1 - 0.2)^T where
If now we differentiate between 64GB sectors and 32GB sectors, by asking T/2 challenges for the latter, we would have that:
Basically, if we do so, providers storing less than 80% of a 32GB sector have a higher probability of passing the WindowPoSt step than providers storing less than 80% of a 64GB one. If anything is unclear or needs more clarification, please let us know!
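As a rough illustration of the bound p ≥ 1 - (1 - 0.2)^T, the sketch below evaluates it for T and T/2. The function name is hypothetical, and T = 10 is taken from the window post challenge count quoted in the original proposal.

```python
def detection_lower_bound(challenges, missing_fraction=0.2):
    """Lower bound on the probability that at least one of `challenges`
    independent challenges hits a part of the sector that is not stored."""
    return 1.0 - (1.0 - missing_fraction) ** challenges


T = 10  # window post challenge count quoted in the proposal

p_full = detection_lower_bound(T)       # about 0.893
p_half = detection_lower_bound(T // 2)  # about 0.672

print(f"T   = {T}: p >= {p_full:.4f}")
print(f"T/2 = {T // 2}: p >= {p_half:.4f}")
```

Under this bound, halving T for 32GB sectors lowers the guaranteed detection probability from roughly 89% to roughly 67%, independently of sector size.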
-
Thanks @lucaniz @irenegia for the insightful security information. With all that being said,
@Pythonac - if you believe that the Filecoin network security will not be reduced by the proposed change, could you please provide the supporting analysis and data to back up that statement?
-
On behalf of Seal Storage Technology, an enterprise-focused Storage Provider, we support this initiative, as it would make maintenance easier for infrastructure updates, etc.
-
@lucaniz @jennijuju @salstorage
-
@lucaniz @jennijuju
Consider an SP with storage power P and S active sectors that stores 80% of each sector. The loss L from the failures includes the instant penalty and the potential block reward loss: Supposing the sealing computation cost for each sector is C, the total computation cost each day will be D: So, the total cost of storing only 80% of each sector per day is G: Supposing the storage price is A, the total cost of storing the additional 20% of each sector per day is K:
Compared with storing 80% of each sector, it seems better for malicious nodes to store 80% of the sectors completely. The chart below displays the cost comparison.
Parameter Explanation: AWS Pricing: To sum up, even in the worst case, the cost gap is so big that no one would choose to seal rather than store. Incentives are the only way to guarantee that SPs handle sectors rationally. Based on the cost analysis above, deliberately adjusting the proving period will not impact the incentive model and will not introduce risk to the network.
I guess you’re right about this: reducing T for 32 GiB sectors is not as safe as keeping T for 64 GiB sectors, even though the Window PoSt cost for 32 GiB sectors is double that for 64 GiB sectors. Yet based on the analysis above, reducing the challenge count is safe for both 32 GiB and 64 GiB sectors. Let’s assume SP A has 1 PiB of storage power in 32 GiB sectors and stores less than 80% of each sector. If we reduce the challenge count to 5, about 22031 sectors can't pass Window PoSt. The cost of resealing 22031 sectors is about $31857, while the cost of storing the additional 20% of data per day is $147. I don’t think a rational SP would do this.
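The 22031 figure can be reproduced with a short sketch, assuming each sector fails independently with probability 1 - 0.8^T. The helper name is hypothetical, and the dollar amounts are copied from the comment above rather than derived here.

```python
GIB = 2**30
PIB = 2**50


def expected_failed_sectors(total_sectors, challenges, stored_fraction=0.8):
    """Expected number of sectors that fail Window PoSt when each of
    `challenges` samples lands on stored data with probability `stored_fraction`."""
    return (1 - stored_fraction**challenges) * total_sectors


sectors = PIB // (32 * GIB)                    # 32 GiB sectors in 1 PiB
failed = expected_failed_sectors(sectors, 5)   # challenge count reduced to 5

# Dollar figures quoted in the comment above (not recomputed here).
RESEAL_COST_USD = 31857
EXTRA_STORAGE_COST_USD_PER_DAY = 147
cost_ratio = RESEAL_COST_USD / EXTRA_STORAGE_COST_USD_PER_DAY

print(f"sectors: {sectors}, expected failures: {round(failed)}")
print(f"resealing vs. storing the extra 20%: about {cost_ratio:.0f}x per day")
```

Note that this compares against the full resealing cost, which is the assumption made in this comment.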
-
Hi @Pythonac, There are a couple of details I want to highlight so you can better understand where we are coming from. I understand your point about the number of failed sectors (in expectation: I think there is a typo and it is F = (1 - 0.8^T) * S, but that is really minor :) ). Let me comment on some parts of your analysis that I'd like to specify better. Wrt the following:
unfortunately this is not really the case. Given the security analysis of the construction, the only thing that we can prove is that the cost for a provider who gets caught by challenges is at least the cost of 80% of a single layer of the graph (that is, ~ C/10, if C is the cost of the whole sealing procedure). The real issue is that we do not really know which nodes of the graph he is storing, and we need to be generic, relying on lower bounds that we can prove in a security analysis for "any adversarial strategy". Of course one can restrict to a precise strategy, rely on different assumptions, or take a less conservative approach, but this is not what we did. When considering the following
we actually work under the assumption that either you regenerate or you pay the penalty L. Actually, when dealing with WindowPoSt, we generally consider an adversary who is either storing or regenerating (which means that he passes the proof in both cases). Again, one can consider a less conservative approach or different assumptions, but that is not what we took into account. Coming back to adversarial strategies, as far as I can tell you consider 2 specific strategies: the one where the adversary stores 80% of each sector, and the one where the adversary stores 80% of the sectors entirely and does not store the other 20% of them, incurring the sealing cost for that 20% (in this particular case I agree with you that regeneration cost = sealing cost). Nevertheless, similarly to what I wrote above, unless we prove that those are the 2 optimal strategies, we should not restrict ourselves to those 2 (what if there is a more adaptive strategy that the adversary can exploit with the same "storage budget"? This is actually the reason why we use worst-case bounds, and also the reason why these analyses are so complicated). In terms of cost estimation
when we ran our cost analysis we considered the scenario of providers buying their own hardware and using it at max capacity for its entire lifecycle (which gave us lower costs). With respect to the challenge count, again we should not consider the cost of the "whole" sealing, given that the only thing we can mathematically prove is that a provider who needs to regenerate would incur a cost which is at least that of computing 80% of a single layer of the graph. I hope this helps; let me know if you have any comments or questions. cc @irenegia
-
@lucaniz Sorry for the late reply! It took me a long time to think about your thoughtful feedback, which I much appreciate. Yet I have to put things another way. I understand your concern about network security, but I think you are overestimating the impact of this proposal. The proposed change would only affect a small fraction of miners who are facing temporary difficulties due to external factors, such as network congestion or hardware failures. It would not encourage malicious behavior or reduce the overall quality of service of the Filecoin network. In fact, it would help to maintain a healthy and diverse storage market by preventing unnecessary loss of power and reputation for honest miners. I believe this proposal is aligned with the vision and principles of Filecoin, which aims to create a decentralized, efficient, and robust storage network for humanity’s most important information. Looking forward to your feedback on this perspective. Thank you.
-
Motivation
WindowPoSt is used as a proof that a copy of the data has been continuously maintained over time, which makes it irrational for an SP not to keep a sealed copy of the data (i.e., it is more expensive to seal a copy of the data every time they are asked to answer a WindowPoSt challenge).
The current WindowPoSt parameters (proving period & challenge count) are overly strict. A deliberate adjustment would be beneficial to the Filecoin network.
Goals
Proposals
Extend proving period
Based on current network stats (November 10, 2022):
RawBytePower is about 15.9 EiB (while QualityAdjPower is about 18.7 EiB);
Active miner count is 3,967;
Average RawBytePower for each miner is about 4.1 PiB.
Setting the proving period at 24 hours means that a miner who completes the proofs by resealing instead of storing must have the capacity to seal 4.1 PiB/day. Yet regardless of whether a miner can achieve that capacity, the cost of sealing is much larger than the cost of keeping a copy. The cost gap is so big that slight changes to the proving period would have little impact on network security; only a very large change, for example to 1 week or 1 month, would matter.
At the same time, SubmitWindowedPoSt accounts for about 18% of the total network gas fee. Extending the proving period to 48 hours would save about 10% of network capacity without impacting network security.
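The two headline numbers in this section can be checked with a few lines of arithmetic. This is a sketch that assumes SubmitWindowedPoSt gas scales inversely with the proving period (each partition proven once per period); the function name is hypothetical.

```python
RAW_BYTE_POWER_EIB = 15.9   # network stats quoted above (November 10, 2022)
ACTIVE_MINERS = 3967
POST_GAS_SHARE = 0.18       # share of network gas used by SubmitWindowedPoSt

# 1 EiB = 1024 PiB
avg_power_pib = RAW_BYTE_POWER_EIB * 1024 / ACTIVE_MINERS


def gas_share_saved(current_share, old_period_hours, new_period_hours):
    """Fraction of total network gas freed up, assuming PoSt gas is
    inversely proportional to the proving period (an assumption)."""
    return current_share * (1 - old_period_hours / new_period_hours)


saved = gas_share_saved(POST_GAS_SHARE, 24, 48)

print(f"average RawBytePower per miner: {avg_power_pib:.1f} PiB")
print(f"gas share saved with a 48h period: {saved:.0%}")
```

Under that assumption the saving is 9% of total gas, which matches the "about 10%" stated above.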
Modify challenge count
The 32 GiB & 64 GiB seal proofs share the same parameters for PoSt:
NODE_SIZE at 32;
winning post challenge count at 66;
window post challenge count at 10.
While the leaf count for the 64 GiB seal proof is 2 times that of the 32 GiB seal proof, the challenge count is the same, so the sampling frequency for 32 GiB is doubled. Since the challenge count setting for the 64 GiB seal proof has been proven safe by the network, we should adjust the setting for the 32 GiB seal proof to align with it.
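The "doubled sampling frequency" claim follows directly from the shared parameters listed above. A minimal sketch, with hypothetical helper names:

```python
NODE_SIZE = 32                    # bytes per leaf, shared by both proofs
WINDOW_POST_CHALLENGE_COUNT = 10  # shared by both proofs
GIB = 2**30


def leaf_count(sector_bytes):
    """Number of NODE_SIZE-byte leaves in a sector."""
    return sector_bytes // NODE_SIZE


def sampling_fraction(sector_bytes):
    """Fraction of a sector's leaves challenged in one Window PoSt."""
    return WINDOW_POST_CHALLENGE_COUNT / leaf_count(sector_bytes)


ratio = sampling_fraction(32 * GIB) / sampling_fraction(64 * GIB)

print(f"leaves in a 32 GiB sector: {leaf_count(32 * GIB)}")
print(f"leaves in a 64 GiB sector: {leaf_count(64 * GIB)}")
print(f"sampling frequency ratio (32 GiB / 64 GiB): {ratio}")
```

This only compares how densely leaves are sampled; it says nothing about the per-sector detection probability, which depends on the challenge count rather than on sector size.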
Extend miner actor’s DeclareFaultsRecovered() method
There are many reasons for missing Window PoSt deadlines. For example, the base fee changes drastically; the SubmitWindowedPoSt message can’t be published; a GPU breaks unexpectedly and can’t finish the computation on time; etc. These faults reduce the SP’s power, which leads to block reward loss. SPs are incentivized to fix these faults as quickly as possible, but have to wait for the corresponding deadline in the next proving period to recover the sectors.
We can leverage the existing DeclareFaultsRecovered() method to recover the faults once they are fixed. We can add an “EarlyProvePartition” property to DeclareFaultsRecoveredParams. EarlyProvePartition is an array of <partition, deadline> pairs:
The minimum early proving unit is a partition.
The deadline should be no later than the assigned deadline for the partition and no earlier than the first safe deadline when the message is on chain.
The EarlyProvePartition value can be changed via different messages before taking effect.
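A rough sketch of what the proposed parameter could look like. This is illustrative only and written in Python rather than in the actors' own language; all class, field and function names are hypothetical. Deadlines are modelled as increasing integer positions, which ignores wrap-around between proving periods (a simplifying assumption).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class EarlyProvePartition:
    """One <partition, deadline> pair: prove `partition` early at `deadline`."""
    partition: int
    deadline: int


@dataclass
class DeclareFaultsRecoveredParams:
    # The existing recovery declarations are omitted from this sketch.
    early_prove_partitions: List[EarlyProvePartition] = field(default_factory=list)


def is_valid_early_prove(entry, assigned_deadline, first_safe_deadline):
    """Check the rule above: the requested deadline must be no earlier than
    the first safe deadline and no later than the partition's assigned one."""
    return first_safe_deadline <= entry.deadline <= assigned_deadline


params = DeclareFaultsRecoveredParams(
    early_prove_partitions=[EarlyProvePartition(partition=3, deadline=20)]
)
entry = params.early_prove_partitions[0]
print(is_valid_early_prove(entry, assigned_deadline=40, first_safe_deadline=12))
```

A later message carrying a different pair for the same partition would simply replace the earlier one, as long as it arrives before the requested deadline takes effect.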
Discussion
Some policy constants need to be updated to align with the proving period changes:
To update the challenge count for the 32 GiB seal proof, besides the code change, the Bls12 Groth params need to be regenerated and published.