perf!(ServiceProviderRegistry): Bloom Schema #308

wjmelements · 2025-10-20T22:34:40Z

Reviewer @rvagg
Closes #307
This significantly reduces the size of ServiceProviderRegistry: 21,290 -> 17,751 (-3539)

Motivation

We want all product attributes to be queryable on-chain, so we are removing the ABI-encoded productData. The mandatory schema is only loosely enforced on-chain, using a Bloom filter.

Upgrade guide for synapse and curio

All productInfo that was previously ABI-encoded is removed; everything is now a capability key-value store. Keys are strings; values are bytes. This affects registerProvider, addProduct, and updateProduct. Unsigned integers should be encoded big-endian. Addresses should be encoded as 20 raw bytes. Strings should be encoded as UTF-8. See the examples in test/PDPOffering.sol. These values are not validated on-chain, so clients should throw if a value does not look right.
getPDPOffering is no longer the way to get PDP product info. Use getAllProductCapabilities instead, which returns all keys and values for a product.
The ProviderWithProduct return type now contains productCapabilityValues so you don't have to fetch them separately.
getProductCapabilities no longer returns the exists bool array. This array was misleading because it only indicated whether the bytes were empty.
getProductCapability is removed. Use productCapabilities instead, which only differs in not having exists.
updatePDPServiceWithCapabilities is removed. Use updateProduct.
Capability values cannot be empty. Exclude the key to signal that the product does not have the capability.
getProvidersByProductType and getActiveProvidersByProductType are merged into one method, getProvidersByProductType, which now has a boolean flag parameter onlyActive.
getProduct is replaced by getProviderWithProduct, which returns ProviderWithProduct.
storagePricePerTibPerMonth is now storagePricePerTibPerDay.
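
For client authors, the encoding rules above can be sketched off-chain roughly as follows. This is a minimal Python sketch, not the synapse or curio implementation. The helper names, the minimal-length integer encoding, the single zero byte for 0, and the example capability keys are all assumptions for illustration; test/PDPOffering.sol remains the reference for the real conventions.

```python
def encode_uint(value: int) -> bytes:
    """Encode an unsigned integer big-endian.

    Assumption: minimal-length encoding. The PR only says "big-endian".
    """
    if value < 0:
        raise ValueError("unsigned integers only")
    # Assumption: 0 is sent as one zero byte, because empty values are rejected.
    length = max(1, (value.bit_length() + 7) // 8)
    return value.to_bytes(length, "big")


def encode_address(address: str) -> bytes:
    """Encode a hex address string as its 20 raw bytes."""
    raw = bytes.fromhex(address[2:] if address.startswith("0x") else address)
    if len(raw) != 20:
        raise ValueError("an address must be exactly 20 bytes")
    return raw


def encode_string(text: str) -> bytes:
    """Encode a string as UTF-8."""
    return text.encode("utf-8")


def build_capabilities(entries: dict) -> tuple:
    """Split a {key: bytes} dict into parallel key and value lists.

    Empty values are rejected: leave the key out to signal "not supported".
    """
    keys, values = [], []
    for key, value in entries.items():
        if len(value) == 0:
            raise ValueError(f"capability {key!r} has an empty value")
        keys.append(key)
        values.append(value)
    return keys, values


# Hypothetical capability keys, for illustration only.
keys, values = build_capabilities({
    "serviceURL": encode_string("https://sp.example.com"),
    "storagePricePerTibPerDay": encode_uint(83333333333333333),
    "paymentTokenAddress": encode_address("0x" + "ab" * 20),
})
```

The parallel keys and values lists mirror the string keys and bytes values described above; how they are passed to registerProvider, addProduct, or updateProduct depends on the client library.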

Changes

define and configure BloomSet with k=16
move PDPOffering struct to a testing helper library
redefine required schema as BloomSet16
only enforce required schema probabilistically
ensure synapse has a good method for fetching these keys all at once
remove misleading exists
remove ipni capability flags from required schema
fix tests
add and test BigEndian helper library for encoding and decoding integers
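
To make the probabilistic schema check concrete, here is a rough Python sketch of a 256-bit Bloom set with k=16. It is not the BloomSet.sol code: the hash function (SHA3-256 from hashlib, which is not the EVM's keccak256), the use of one digest byte per bit position, and all function and key names are assumptions for illustration.

```python
import hashlib

K = 16  # bit positions set per key, matching k=16 in this PR


def key_mask(key: str, k: int = K) -> int:
    """Return a 256-bit mask with up to k bits set for one key.

    Assumption: each of the first k digest bytes selects one of 256 bit positions.
    """
    digest = hashlib.sha3_256(key.encode("utf-8")).digest()
    mask = 0
    for i in range(k):
        mask |= 1 << digest[i]
    return mask


def bloom_of(keys) -> int:
    """OR together the masks of all keys."""
    mask = 0
    for key in keys:
        mask |= key_mask(key)
    return mask


def may_contain(found: int, required: int) -> bool:
    """True if every required bit is present in found.

    False means a required key is definitely missing.
    True means the required keys are probably all present.
    """
    return (found & required) == required


# Hypothetical required schema and provided keys.
required = bloom_of(["serviceURL", "minPieceSizeInBytes"])
provided = bloom_of(["serviceURL", "minPieceSizeInBytes", "ipniPeerId"])
ok = may_contain(provided, required)
```

The check can produce false positives but never false negatives, so it catches accidental omissions without storing the required key list on-chain.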

service_contracts/src/lib/BloomSet.sol

service_contracts/src/ServiceProviderRegistry.sol


wjmelements · 2025-10-23T02:10:36Z

I'm changing the type of capabilities values to bytes.

I have also noticed that getProductCapabilities and getProductCapability use the bytes length for exists, which is unhelpful and wrong, because a value's length can be 0. Existence is actually determined by membership in the capability keys array. In fact, we have been using empty values in some places.

            bytes memory value = capabilities[keys[i]];
            if (value.length > 0) {
                exists[i] = true;
                values[i] = value;
            }

wjmelements · 2025-10-23T03:41:49Z

Remaining tasks:

view helper for fetching all of the key-values for a product
blackbox testing of bloom filter failure case

Open questions:

can we make some of the schema fields optional such as the ipni flags?
can we remove the exists bools since they are only reporting whether the returned bytes are empty?

service_contracts/src/ServiceProviderRegistry.sol

rvagg · 2025-10-23T03:52:21Z

I'm fine with bytes, my main concern with bytes has always been:

Ease of decoding in off-chain tooling - SDK, cast, etc. But I think we have the tools we need in those places to do decoding ..?
Consumption by subgraphs - does it make it harder to consume, display and make assumptions about these fields if they are bytes? If someone puts non-utf8 in here, what does a subgraph do and is it disruptive?

peerId is one case of wanting bytes
others where we expect strings, we just need to do a bit more work on the client side to validate, and I'm fine with that but we should document these expectations really well -- you're essentially making a schema with this filter so you should document it very clearly what a client can expect and what a client should do
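
As a rough illustration of the client-side validation being asked for here, a consumer that expects a string capability could decode strictly and fall back to hex for display. This is only a Python sketch; the function names are hypothetical and not part of any SDK or subgraph.

```python
def decode_text_capability(value: bytes) -> str:
    """Decode a capability that the schema says should be a UTF-8 string."""
    if len(value) == 0:
        raise ValueError("capability values cannot be empty")
    try:
        # bytes.decode is strict by default and raises on invalid UTF-8.
        return value.decode("utf-8")
    except UnicodeDecodeError as err:
        raise ValueError(f"capability is not valid UTF-8: 0x{value.hex()}") from err


def display_capability(value: bytes) -> str:
    """Return text when the bytes are valid UTF-8, otherwise a hex string."""
    try:
        return decode_text_capability(value)
    except ValueError:
        return "0x" + value.hex()
```

A genuinely binary field such as peerId would skip the text path and be handled as raw bytes.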


rvagg · 2025-10-24T08:31:31Z

service_contracts/src/ServiceProviderRegistry.sol

        view
        providerExists(providerId)
-        returns (bytes memory productData, string[] memory capabilityKeys, bool isActive)
+        returns (string[] memory capabilityKeys, bool isActive)


can we just return the keys and values here and be done with it? this is one of the most awkward spots - if we don't use key-existence as a signal then we're always going to want the values too

key existence is a signal

how about we just return ProviderWithProduct here? so this is the single version of getProvidersByProductType

key existence is a signal

you're the one arguing to make value necessary even for booleans; in that world I can't think of a case where just getting the keys is useful to me, I just want all of it

I think you still don't understand. If a capability is a boolean, its existence is sufficient. But that existence cannot be signaled with the empty string, because an empty value is indistinguishable from a missing key when doing a single-key lookup in a Solidity mapping.

If a key's absence unambiguously signals the capability is not supported, then the key's presence can unambiguously signal the capability is supported. Any nonzero length is thus truthy.

I don't think we need many of these methods. I agree they aren't useful on or off chain. I will check how we are using them tomorrow.

If a key's absence unambiguously signals the capability is not supported, then the key's presence can unambiguously signal the capability is supported. Any nonzero length is thus truthy.

This is what I've been arguing for here and which is why I want an exists boolean return any time I want to ask for a specific key. I don't want to have to put a value in the value map, I just want to know the key exists and then not care about the value, and to work around the limitations of not having a null or non-zero sentinel in solidity. But we agree that the current implementation of doing that is broken - it should do it properly by iterating over keys that it has and figuring out whether it exists or not. But I also now think we can just do away with that entirely. There may be a case for "tell me the value for this key" or "tell me if you have this key", but with the way this is shaping up, I think all I really ever want out of this is be able to get the full product, keys and values, and deal with it on the client side.

The use case for querying a single key is for onchain lookup. It should not loop over all of the keys to do that.
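
The point about single-key lookups can be modeled off-chain. The Python sketch below imitates a Solidity mapping from string to bytes, where a missing key reads as empty bytes; the class and method names are hypothetical, not the registry's API.

```python
class CapabilityStore:
    """Toy model of a Solidity mapping(string => bytes)."""

    def __init__(self):
        self._values = {}

    def lookup(self, key: str) -> bytes:
        # A mapping has no "missing" state: unknown keys read as empty bytes.
        return self._values.get(key, b"")

    def set(self, key: str, value: bytes) -> None:
        # Empty values are rejected so that "empty" always means "absent".
        if len(value) == 0:
            raise ValueError("capability values cannot be empty")
        self._values[key] = value

    def has(self, key: str) -> bool:
        # One lookup and no loop over the key list: nonzero length is truthy.
        return len(self.lookup(key)) > 0


store = CapabilityStore()
store.set("ipniIpfs", b"\x01")  # hypothetical boolean capability
```

Because empty values cannot be stored, a single lookup is enough to answer both "does the key exist" and "what is its value".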

rvagg · 2025-10-24T08:35:48Z

getActiveProvidersByProductType -> let's just make an onlyActive bool argument to getProvidersByProductType and ditch a method

wjmelements · 2025-10-24T08:37:45Z

getActiveProvidersByProductType -> let's just make an onlyActive bool argument to getProvidersByProductType and ditch a method

I believe I suggested something similar in the original PR. Would you believe that synapse is fetching all of them and then filtering by isActive?

rvagg · 2025-10-24T08:46:40Z

Would you believe

Oh yes I would. This whole registry was done way too quick, on both sides.

rvagg · 2025-10-24T10:14:23Z

Trying out my own feedback as a PR: #328

rvagg · 2025-10-24T11:25:06Z

I'm just coming to terms with per-day here versus the per-month we had before and still have in FWSS; it's a bit odd that we have two versions of this. Now an SP has to divide the per-month charge that everyone talks about by the number of days to get the per-day price. We have a standard for what a "month" is in epochs that we use everywhere; it's not abnormal to encode a month as 30 days.

Anyway, just my 2c, not a big deal but it's jarring as I update Curio to work with this and think through what an SP has to deal with. The default I have to encode is 83333333333333333 to get close to the 2.5 USDFC we have in WarmStorage.
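
The arithmetic behind that default can be checked with a few lines of Python. This sketch assumes USDFC uses 18 decimals and a 30-day month, which is consistent with the number quoted above; the function name is hypothetical.

```python
USDFC_DECIMALS = 18   # assumption: 18-decimal token
DAYS_PER_MONTH = 30   # assumption: a "month" is 30 days


def per_day_from_per_month(per_month: int, days: int = DAYS_PER_MONTH) -> int:
    """Convert a per-month price in base units to a per-day price, rounding down."""
    return per_month // days


monthly = 25 * 10 ** (USDFC_DECIMALS - 1)   # 2.5 USDFC in base units
daily = per_day_from_per_month(monthly)     # 83333333333333333
shortfall = monthly - daily * DAYS_PER_MONTH  # base units lost to rounding per month
```

Integer division rounds down, so 30 days at the per-day rate fall 10 base units short of 2.5 USDFC, which is why the value only gets "close".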

Original thread is #297 (comment)


rvagg · 2025-10-24T11:33:33Z

Curio version on top of #328: filecoin-project/curio#736

service_contracts/src/lib/BloomSet.sol

Co-authored-by: Jakub Sztandera <[email protected]>

Kubuxu

SGWM but needs a rebase

Kubuxu · 2025-10-24T18:59:29Z

service_contracts/src/ServiceProviderRegistry.sol

+            }
+        }
+        // Enforce minimum schema
+        require(BloomSet16.mayContain(foundKeys, requiredKeys), Errors.InsufficientCapabilitiesForProduct(productType));


In theory, one can find a set of keys that match this bloom filter, but it is fine, since the final decision is on the client side and in the approval list.

If the Bloom filter is used only for that verification, the problem could be avoided by requiring keys to be provided in order, then stepping through the required and provided lists in order while allowing extra provided keys.

Yes we are ultimately validating these fields off-chain. The filter will help prevent accidental omissions.

the problem could be avoided by requiring keys to be provided in order

That is a good idea, but this contract's codesize would then scale with the number of required keys rather than the number of products. That cost can be reduced by using a separate validation library per product. We could then have more specialized on-chain validation of known keys.
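
For reference, the ordered-keys alternative discussed in this thread could look roughly like the Python sketch below. It was not adopted in this PR. The function name is hypothetical, and the sketch assumes both lists are sorted in strictly ascending order.

```python
def has_required_in_order(provided, required) -> bool:
    """Check that every required key appears in provided.

    Both lists must be sorted strictly ascending. Extra provided keys are allowed.
    """
    next_required = 0
    previous = None
    for key in provided:
        # Reject unsorted or duplicate input so that one pass is sufficient.
        if previous is not None and key <= previous:
            raise ValueError("provided keys must be strictly ascending")
        previous = key
        if next_required < len(required) and key == required[next_required]:
            next_required += 1
    # Every required key was matched only if the pointer reached the end.
    return next_required == len(required)
```

Unlike the Bloom filter, this check is exact, at the cost of storing or hard-coding each required key.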

* feat(pdp): deal with new ServiceProviderRegistry changes
  Ref: FilOzone/filecoin-services#308
  Ref: FilOzone/filecoin-services#328
* fixup! feat(pdp): deal with new ServiceProviderRegistry changes
* fix: treat key presence as truthy for boolean options
  Co-authored-by: William Morriss <[email protected]>
* feat(pdp): add IpniPeerID to PDPOfferingData
  Signed-off-by: Jakub Sztandera <[email protected]>
* feat(pdp): show IpniPeerID in webui, use IpniPeerID in FSUpdatePDP
  Signed-off-by: Jakub Sztandera <[email protected]>
---------
Signed-off-by: Jakub Sztandera <[email protected]>
Co-authored-by: William Morriss <[email protected]>
Co-authored-by: Jakub Sztandera <[email protected]>


wip bloom schema

ba25df6

wjmelements requested a review from rvagg October 20, 2025 22:34

FilOzzy added this to FS Oct 20, 2025

github-project-automation bot moved this to 📌 Triage in FS Oct 20, 2025

wjmelements commented Oct 20, 2025

service_contracts/src/lib/BloomSet.sol

wjmelements added 2 commits October 20, 2025 17:40

only shift by 8

8a503aa

parameterize K

934f14e

wjmelements commented Oct 20, 2025

service_contracts/src/ServiceProviderRegistry.sol

wjmelements added 6 commits October 21, 2025 00:10

Merge remote-tracking branch 'origin/main' into bloom-schema

393a5a3

test BloomSet16

c08fb7a

Errors.InsufficientCapabilitiesForProduct, and restore prior PDPOffer…

35dd6c3

…ing documentation

Merge remote-tracking branch 'origin/main' into bloom-schema

62fd4b2

must tag as dev

2fb481e

bytes[] capability values

1fdfd06

wjmelements added 4 commits October 22, 2025 21:48

tests build but do not yet pass

11dc0ba

tests pass

e317e43

mv PDPOffering.sol to test

fc80b4b

make update-abi

3d6682d

wjmelements commented Oct 23, 2025

service_contracts/src/ServiceProviderRegistry.sol

wjmelements added 6 commits October 23, 2025 14:07

update subgraph: ProductUpdated and create product

17bd9ec

subgraph: bytes capabilityValues

8061aa6

remove misleading exists bools

621f789

make ipni fields optional and document ipniPeerId

f565a89

getAllProductCapabilities, and add capability values to getActiveProv…

891e3b6

…idersByProductType and getProvidersByProductType

make update-abi

7a572f1

wjmelements changed the title ~~perf(ServiceProviderRegistry): Bloom Schema~~ perf!(ServiceProviderRegistry): Bloom Schema Oct 23, 2025

wjmelements marked this pull request as ready for review October 23, 2025 21:31

rvagg reviewed Oct 24, 2025

rjan90 moved this from 📌 Triage to 🔎 Awaiting review in FS Oct 24, 2025

rvagg added a commit to filecoin-project/curio that referenced this pull request Oct 24, 2025

feat(pdp): deal with new ServiceProviderRegistry changes

8187b23

Ref: FilOzone/filecoin-services#308 Ref: FilOzone/filecoin-services#328

rvagg added a commit to filecoin-project/curio that referenced this pull request Oct 24, 2025

feat(pdp): deal with new ServiceProviderRegistry changes

8dbde91

Ref: FilOzone/filecoin-services#308 Ref: FilOzone/filecoin-services#328

rvagg mentioned this pull request Oct 24, 2025

feat(pdp): deal with new ServiceProviderRegistry changes filecoin-project/curio#736

Merged

feat(registry): provide more useful accessors (#328)

a4d0062

Kubuxu reviewed Oct 24, 2025

service_contracts/src/lib/BloomSet.sol

Update service_contracts/src/lib/BloomSet.sol

1628fde

Co-authored-by: Jakub Sztandera <[email protected]>

Kubuxu approved these changes Oct 24, 2025

github-project-automation bot moved this from 🔎 Awaiting review to ✔️ Approved by reviewer in FS Oct 24, 2025

Kubuxu reviewed Oct 24, 2025

wjmelements added 3 commits October 24, 2025 14:40

Merge remote-tracking branch 'origin/main' into bloom-schema

602b0dc

BigEndian library

3c28451

use big endian encoding in test

94eed74

wjmelements merged commit 1c54968 into main Oct 24, 2025
6 checks passed

wjmelements deleted the bloom-schema branch October 24, 2025 22:44

github-project-automation bot moved this from ✔️ Approved by reviewer to 🎉 Done in FS Oct 24, 2025

wjmelements mentioned this pull request Oct 25, 2025

feat!(ServiceProviderRegistry): add ipniPeerId to PDPOffering #266

Closed

rjan90 mentioned this pull request Oct 27, 2025

build: prep FWSS v1.0.0 release #332

Open

wjmelements mentioned this pull request Oct 27, 2025

feat(ServiceProviderRegistry): KeyValue productData FilOzone/synapse-sdk#356

Closed

rvagg added a commit to filecoin-project/curio that referenced this pull request Oct 29, 2025

feat(pdp): deal with new ServiceProviderRegistry changes

2e354b6

Ref: FilOzone/filecoin-services#308 Ref: FilOzone/filecoin-services#328

rjan90 mentioned this pull request Oct 29, 2025

add PeerID for PDP offering metadata #226

Closed
