-
Notifications
You must be signed in to change notification settings - Fork 32
Backport index and ipni pdp tasks to PDPv0 #670
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: pdpv0
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some DB changes are required to make this work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This task will only produce IPFS style ads. So, there would be no IPNI lookup path for pieces. Which was a requirement confirmed by Jen and Miro both in Dubai.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 That's fine for now. I just want to get ipfs adds working first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
piece CIDs into IPNI is lower priority; if it can be bundled into the same work without too much effort then that's great, but it doesn't buy us very much so it can be treated as a stretch goal
@ZenGround0 : will https://github.com/filecoin-project/curio/blob/feat/index-and-ipni/deps/config/types.go#L116 be updated to include filecoinpin.contact, or will that be somewhere else? |
IPNI: IPNIConfig{ | ||
ServiceURL: []string{"https://cid.contact"}, | ||
DirectAnnounceURLs: []string{"https://cid.contact/ingest/announce"}, | ||
DirectAnnounceURLs: []string{"https://cid.contact/ingest/announce", "https://filecoinpin.contact/announce"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what are the implications of having both in here? I'm guessing it's inviting both of them to hit us? a failure trying to talk to one won't stop us from trying the other?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is common config. So updating here in not an option without making it common between poRep and PDP and publishing all existing PoRep chain to Filecoin pin. I plan to create a tool for basic PDP stuff in Curio, like a init after Curio init or maybe part of guided-setup. We can use that to insert values for PDP only clusters. A more cleaner approach in my opinion and allows for much more complex config changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concretely having both just means we tell both indexers about our latest IPNI head when publishing advertisements. Then I assume both will hit us when looking for retrieval endpoints.
a failure trying to talk to one won't stop us from trying the other?
From reading libipni code and observing a failure to propagate to filecoinpin.contact and success with cid.contact I'm confident that the answer is yes.
A more cleaner approach in my opinion and allows for much more complex config changes.
I don't like this either, I'd rather have people put this into an ipni layer, but I've been told repeatedly to do this for now. We'll happily get rid of it when we go full main branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is common config
Correct me if I'm wrong here @rvagg, but I think that is fine for now. I believe we are assuming someone who is participating in Filecoin Onchain Cloud at this point to be doing so with a Curio instance that is strictly for PDP. It would be unwise to also be doing mainnet PoRep with this instance. So for now, if this common place is the easiest way to specify multiple indexers, lets do it.
what are the implications of having both in here? I'm guessing it's inviting both of them to hit us? a failure trying to talk to one won't stop us from trying the other?
These are good questions. I assume @ZenGround0 or @LexLuthr can answer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(doh, sorry, I typed before seeing @ZenGround0. The key part is answered, in that the curio node us operationally ok with listing two indexers, even if one fails or is unresponsive).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no worries, classic collision
order_number BIGSERIAL PRIMARY KEY, -- Unique increasing order number | ||
ad_cid TEXT NOT NULL, | ||
context_id BYTEA NOT NULL, -- abi.PieceInfo in Curio | ||
context_id BYTEA NOT NULL, -- abi.PieceInfo || PDPIPNIContext in Curio |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... Spark squealing in the distance ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Already discussed with Spark about this. They agreed with the new contextID.
indexStore *indexstore.IndexStore | ||
sc *chunker.ServeChunker | ||
keys map[string]*peerInfo // map[peerID String]Private_Key | ||
latest map[string]cid.Cid // map[peerID String]last published head, used to avoid duplicate announce |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the changes in here wrt latest
look like bugfixes, or is it just the fact that 30 seconds now make this task more onerous so it's worth being spare with what we provide?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes its a blend of both. It would have been good to do this with 10 minute interval and non-real time constraints but now its become important.
} | ||
|
||
// Create a warm storage service viewer. | ||
warmStorageService, err := NewFilecoinWarmStorageServiceStateView(metadataAddr, ethClient) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should not pull in all of warm storage here, have a look at pdp/contract/ListenerServiceWithViewContract.abi
and just do the same with FWSS, call it ListenerServiceWithMetadata
and then in that file just have the function you need:
[
{
"type": "function",
"name": "getDataSetMetadata",
"inputs": [
{
"name": "dataSetId",
"type": "uint256",
"internalType": "uint256"
},
{
"name": "key",
"type": "string",
"internalType": "string"
}
],
"outputs": [
{
"name": "exists",
"type": "bool",
"internalType": "bool"
},
{
"name": "value",
"type": "string",
"internalType": "string"
}
],
"stateMutability": "view"
}
]
generate that, then do a NewListenerServiceWithViewContract(metadataAddr, ethClient)
and if that fails, just silently return false, "", nil
.
one of the goals of ListenerServiceWithViewContract
is to keep Curio relatively agnostic of FWSS, hence the err == nil
fall-through if it exists. We should continue that and make it happy if it finds it's talking to something that doesn't have a metadata key. Maybe an info log would be good at that point, but I don't think we need to be strict about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks I was just being lazy but this is obviously the right thing to do. I hadn't realized it was clean to do this without publishing a whole new contract interface in pdp repo and it sounds like maybe I can avoid that.
// Hard to differenctiate between unsupported listener type OR internal error | ||
// So we log on debug and skip indexing attempt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is what I think you should move up into GetDataSetMetadataAtKey
- we can differentiate so let's just do it there
if err != nil { | ||
return false, xerrors.Errorf("parsing announce address domain: %w", err) | ||
} | ||
if build.BuildType != build.BuildMainnet && build.BuildType != build.BuildCalibnet { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sad .. I've been using non-TLS on calibnet for testing, but I guess we don't have a way of determining whether we should or shouldn't have an https announce address in the config. I was wondering whether I could get away with testing with IPNI without having to sort out certificates but I think this answers it.
PieceCID cid.Cid | ||
|
||
// Payload determines if the IPNI ad is TransportFilecoinPieceHttp or TransportIpfsGatewayHttp | ||
Payload bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
who do we expect to consume it this way @LexLuthr? are we still designing for Spark making sense of our contexts? because PieceCID is going to be different for payload vs piece anyway so it should be unique as it is without having to add the additional uniqueness of the bool. Or are we expecting someone to decode this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Multiple reasons:
- Early on I was told, we should be able to publish IPFS style and piece multihash only ads for each piece in PDP. So, pieceCID will be same in that case. I am not sure what you mean by "PieceCID is going to be different for payload vs piece anyway"
- If we decide to some kind of checker stuff down the line (Spark), this allows decoding and making sense. Better safe then sorry as it will require another ad chain migration which is not cheap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nothing fatal in here, this seems pretty good; but I've left a bunch of inline suggestions, giving approval in the assumption that those will be reasonably considered at least
CIDGravityTokens: []string{}, | ||
}, | ||
IPNI: IPNIConfig{ | ||
ServiceURL: []string{"https://cid.contact"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ZenGround0 : I don't know what this does, but should we also add filecoinpin.contact
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From curio docs this is a curio webUI thing and is not critical for data flowing from place to place. I won't change any defaults here, if curio operators are seeing issues with their view on advertisements they can easily make a config layer including filecoinpin.contact or others.
Co-authored-by: Rod Vagg <[email protected]>
Backporting the indexing and ipni tasks from mkv2 to the pdpv0 branch.
Code is at least not obviously broken but testing is a work in progress.