-
Notifications
You must be signed in to change notification settings - Fork 15
feat: get piece leaf count #249
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Leaf count is nice, and useful for total data set if we can get it, but at the piece level we should have it in the Piece CID because we're using Piece CID v2. The only catch is that the way we currently get the piece list for a data set is a little broken and will be returning the wrong size (but the right multihash). There's two separate issues to resolve here:
Total leaf count for a whole data set though would be useful too if we can get that off the chain. |
My understanding of the situation is that the ball is in @SgtPooki court to:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self review.. some comments pointing reviewers to places where i'm not sure about things
* @returns The number of leaves for this piece | ||
*/ | ||
async getPieceLeafCount(dataSetId: number, pieceId: number): Promise<number> { | ||
// TODO: DO we need to call the contract for leaf count? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
callout here.. not sure if piece.ts leafCount calculation is enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is ok, but I don't think it's all that useful here; let's remove it for now and ignore leaf counts -- they mostly shouldn't be a concern to the user other than their fairly close relationship to size
// TODO: should we call the contract for leaf count? i.e. pdpVerifier.getPieceLeafCount(this._dataSetId, piece.pieceId) | ||
const leafCount = getLeafCount(piece.pieceCid) ?? 0 | ||
// TODO: is there a better way to get the raw size? | ||
const rawSize = getRawSize(piece.pieceCid) ?? 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the only place where we're getting setting rawSize and leafCount now.. should we be calling the contract instead of these helper methods?
// Parse the piece data as a PieceCID | ||
// The contract stores the full PieceCID multihash digest (including height and padding) | ||
// The data comes as a hex string from ethers, we need to decode it as bytes then as a CID | ||
const pieceDataHex = result.pieces[i].data | ||
const pieceDataBytes = ethers.getBytes(pieceDataHex) | ||
|
||
const cid = CID.decode(pieceDataBytes) | ||
const pieceCid = asPieceCID(cid) | ||
if (!pieceCid) { | ||
throw createError( | ||
'StorageContext', | ||
'getAllActivePiecesGenerator', | ||
`Invalid PieceCID returned from contract for piece ${result.pieceIds[i]}` | ||
) | ||
} | ||
|
||
yield { | ||
pieceId: result.pieceIds[i], | ||
pieceCid, | ||
subPieceCid: pieceCid, | ||
subPieceOffset: 0, // TODO: figure out how to get the sub piece offset | ||
} satisfies DataSetPieceData |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: a core contributor should really eyeball this to make sure i'm doing things properly
// Expected leaf count is 2^height where height is calculated from size | ||
const expectedHeight = Size.Unpadded.toHeight(BigInt(size)) | ||
const expectedLeafCount = 2 ** expectedHeight | ||
|
||
assert.isNotNull(leafCount) | ||
assert.strictEqual(leafCount, expectedLeafCount) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this accurate?
// Expected raw size is leaf count * 32 | ||
const expectedHeight = Size.Unpadded.toHeight(BigInt(size)) | ||
const expectedLeafCount = 2 ** expectedHeight | ||
const expectedRawSize = expectedLeafCount * 32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this accurate?
}) | ||
}) | ||
|
||
describe('getAllActivePieces', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lots of tests and mocking setup in here. I tried following existing patterns, but some mock helpers would be nice. especially for contract calls. Something like this would be amazing:
const mockContractContext = createMockContractContext()
mockContractContext.mock(key, singleCallResponse) // key maps to transaction data prefix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
synapse.test.ts has a new mock system that we are trying to move towards
rawSize, | ||
leafCount, | ||
subPieceCid: piece.pieceCid, | ||
subPieceOffset: 0, // TODO: figure out how to get the sub piece offset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how do I fulfill this value accurately?
Couple of comments here:
Anyway i dont feel like i can fully review this properly, we probably need rod to answers some of the question you added |
That makes sense. Can remove.
I'm guessing you're talking about the We still need some methods exposed from synapse-sdk that aren't currently. I'm all ears for what we want to handle in the sdk and can handle other in filecoin-pin.. I think we should at least export these methods:
Will wait for update from @rvagg before doing anything else |
* @param pieceCid - The PieceCID to extract raw size from | ||
* @returns The raw size in bytes or null if invalid | ||
*/ | ||
export function getRawSize(pieceCid: PieceCID | CID | string): number | null { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not quite right because height only gets us the padded size, we need to unpad using the padding that's also encoded. See #283
// TODO: should we call the contract for leaf count? i.e. pdpVerifier.getPieceLeafCount(this._dataSetId, piece.pieceId) | ||
const leafCount = getLeafCount(piece.pieceCid) ?? 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's ditch this
OK, here's my current thoughts about this:
Which leaves us with just "get pieces from contract". So I think we should pivot here slightly, so here's my suggestion:
Then we just consume |
I've just realised that we also need |
|
🎯 Add Piece Details with Leaf Count Information
Overview
This PR adds the ability to retrieve detailed piece information including leaf counts and calculated raw sizes for pieces stored in data sets. This provides users with more granular insights into their stored data, particularly useful for understanding storage costs and data organization.
New Features
1. Enhanced Piece Data Interface
DataSetPieceDataWithLeafCount
interface that extends existingDataSetPieceData
leafCount
(number of leaves in piece's merkle tree) andrawSize
(calculated asleafCount * 32
bytes)2. New API Methods
StorageManager:
getDataSetPieceDataWithLeafCount(dataSetId, pieceId)
- Get leaf count for individual piecesStorageContext:
getDataSetPiecesWithDetails()
- Get all pieces with leaf count and raw size informationWarmStorageService:
getPieceLeafCount(dataSetId, pieceId)
- Core method to fetch leaf count from PDPVerifier contract3. PDPVerifier Integration
getPieceLeafCount(dataSetId, pieceId)
method to interact with the PDPVerifier contractKey Benefits
Usage Examples
Files Changed
Core Implementation:
src/types.ts
- AddedDataSetPieceDataWithLeafCount
interfacesrc/pdp/verifier.ts
- AddedgetPieceLeafCount()
methodsrc/warm-storage/service.ts
- AddedgetPieceLeafCount()
wrapper methodsrc/storage/manager.ts
- Added public API methodsrc/storage/context.ts
- AddedgetDataSetPiecesWithDetails()
methodTesting:
src/test/warm-storage-service.test.ts
- Tests for WarmStorageService methodssrc/test/storage.test.ts
- Tests for StorageManager and StorageContext methodsDocumentation:
utils/example-piece-details.js
- Working example script demonstrating the new functionalityTesting
Example Output
Related filecoin-project/filecoin-pin#50