
Conversation

@rvagg commented Jan 8, 2026

Also tagging @lucaniz for review here; it's based off https://www.notion.so/filecoindev/PDP-security-and-ProofSet-size-28edc41950c180d088dee430e1249a2a but is user-focused: "My specific piece wasn't challenged in the last X days—how do I know it's still safe?"

@rvagg requested a review from timfong888 January 8, 2026 01:01
@FilOzzy added this to FOC Jan 8, 2026
@github-project-automation bot moved this to 📌 Triage in FOC Jan 8, 2026
@BigLep moved this from 📌 Triage to 🔎 Awaiting review in FOC Jan 8, 2026
@rjan90 moved this to 🔎 Awaiting review in PDP Jan 8, 2026
@lucaniz commented Jan 8, 2026

Added some comments (minor). Good to go for me.

@BigLep commented Jan 8, 2026

> Added some comments (minor). Good to go for me.

@lucaniz: FYI, I don't see any comments from you in the PR.

@lucaniz commented Jan 8, 2026

Mmh... weird... don't you see this?

[Screenshot 2026-01-08 at 17 22 24]

@BigLep commented Jan 8, 2026

> Mmh... weird... don't you see this?

@lucaniz : your comment is "pending", which I think means you haven't submitted your review.

@rvagg enabled auto-merge (squash) January 9, 2026 00:33
```
p_T = (1-α)^(K×T)
```

**Example detection rates (K=5 challenges per day):**


For me, let's say we want a commitment that AFR-1 = 99%. How do we communicate this?

A customer has one data set and that's where they continue to feed all of their data (say files), and they get to 1000 files.
Within a year, we want to say they will only lose 1 file. What is the DealBot threshold to know that the SLA can be reached?

@rvagg (author) replied:


According to the numbers, apparently, for 1 file lost out of 1000 (α = 0.1%) the detection probabilities are (see the sketch after this list):

  • Daily detection: 1 - (0.999)^5 ≈ 0.5%
  • 30-day detection: 1 - (0.999)^150 ≈ 14%
  • Annual detection: 1 - (0.999)^1825 ≈ 84%
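
A minimal sketch of that calculation, assuming detection probability is 1 - (1-α)^(K×T) with K = 5 challenges per day (the `detectionProbability` helper is illustrative, not something in this PR):

```go
package main

import (
	"fmt"
	"math"
)

// detectionProbability is the chance that at least one of the K*T random
// challenges lands on a lost piece, given that a fraction alpha of the
// data set is lost, with K challenges per day over T days.
func detectionProbability(alpha float64, challengesPerDay, days int) float64 {
	return 1 - math.Pow(1-alpha, float64(challengesPerDay*days))
}

func main() {
	const k = 5
	alpha := 0.001 // 1 file lost out of 1000
	for _, days := range []int{1, 30, 365} {
		fmt.Printf("%.1f%% loss, %3d days: %.1f%% detection\n",
			alpha*100, days, 100*detectionProbability(alpha, k, days))
	}
	// Prints roughly 0.5%, 13.9% and 83.9%, matching the ~0.5%, 14% and 84% above.
}
```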

@rvagg (author) replied:


I think the thresholds for 1% loss detection would be (see the quick check after this list):

  • 30 days 77.9%
  • 90 days 98.9%
  • 180 days 99.99%
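
As a quick check, the illustrative detectionProbability sketch above with α = 0.01 gives ≈ 77.9% at 30 days, ≈ 98.9% at 90 days, and ≈ 99.99% at 180 days, matching these thresholds.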

@rvagg (author) replied:


@timfong888 I think the problem with matching what PDP can do to what an AFR commitment asks for is that PDP looks backward: "If data is lost, what's the probability we catch it?" But AFR wants more than that: "What's the probability data will be lost in the first place?" We don't control enough of the pipeline and infrastructure to answer that without relying on historical track record and assuming things stay consistent going forward. So maybe we can't, and shouldn't, frame this in terms of AFR?

docs/design.md Outdated

**What this means for individual pieces:**

If a storage provider has lost any significant fraction of a data set, they will be caught with high probability regardless of which specific pieces are missing. The random challenge selection ensures that:


Here's where I could use more clarity using the use case: "significant fraction" will be caught. I am using 1% because that is the AFR we are using as a base case, meaning the Annual Failure Rate is 1%. We want to approve an SP based on a period of time / number of successful proofs. How do we reason about this on the customer's behalf?

@rvagg (author) replied:


If we want 99% confidence in an SP before approving, then we should go with 90 days of successful proofs. That only shows history, though, and we have to judge whether that gives us enough confidence in their ability not to lose data in the future.

For the user, we need to express this in terms of data loss detection, and it's a curve; it looks something like:

  • Large loss (5%+): Caught within days
  • Medium loss (1-5%): Caught within weeks to months
  • Small loss (<1%): May take months to a year

I've made a change on line 194:

As shown in the table above, detection confidence depends on the fraction of data lost and the proving period. For a 1% data loss, detection reaches 77.9% confidence within 30 days and exceeds 99% within 90 days. Larger losses are caught faster—5% loss reaches 99.95% detection in just 30 days.
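
To tie the approval threshold back to the formula, here is a minimal sketch (same assumptions as the earlier one: K = 5 challenges per day, illustrative function name) that solves 1 - (1-α)^(K×T) ≥ confidence for T:

```go
package main

import (
	"fmt"
	"math"
)

// daysToConfidence returns how many days of clean proofs are needed before
// a loss of fraction alpha would have been detected with the target
// confidence, given challengesPerDay challenges per day.
func daysToConfidence(alpha, confidence float64, challengesPerDay int) int {
	days := math.Log(1-confidence) / (float64(challengesPerDay) * math.Log(1-alpha))
	return int(math.Ceil(days))
}

func main() {
	const k = 5
	// 1% loss at 99% confidence: ~92 days, roughly the 90-day approval
	// window discussed above (90 days itself gives ~98.9%).
	fmt.Println(daysToConfidence(0.01, 0.99, k))
	// 5% loss at 99% confidence: ~18 days.
	fmt.Println(daysToConfidence(0.05, 0.99, k))
}
```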

@BigLep added this to the M4: Filecoin Service Liftoff milestone Jan 15, 2026