Conversation

@mverzilli
Contributor

@mverzilli mverzilli commented Jan 9, 2026

Second part of the series started with #19445.

This makes the CapsuleStore work based on staged writes. With this, capsules aren't written to persistent storage until PXE decides to commit the job.
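
For readers who have not followed the series, here is a minimal sketch of the staged-write idea described above. The class and method names (`StagedCapsuleSketch`, `stage`, `readPersisted`, `commit`), the string values, and the in-memory `Map` standing in for persistent storage are illustrative assumptions, not the actual `CapsuleStore` API:

```typescript
// Illustrative only: an in-memory Map stands in for the persistent KV store.
class StagedCapsuleSketch {
  private readonly persisted = new Map<string, string>();
  // jobId -> (key -> staged value)
  private readonly stagedByJob = new Map<string, Map<string, string>>();

  /** Records a write for a job without touching persistent storage. */
  stage(jobId: string, key: string, value: string): void {
    let staged = this.stagedByJob.get(jobId);
    if (staged === undefined) {
      staged = new Map<string, string>();
      this.stagedByJob.set(jobId, staged);
    }
    staged.set(key, value);
  }

  /** Reads what is currently in persistent storage, ignoring staged writes. */
  readPersisted(key: string): string | undefined {
    return this.persisted.get(key);
  }

  /** Flushes a job's staged writes to persistent storage and clears its stage. */
  commit(jobId: string): void {
    const staged = this.stagedByJob.get(jobId);
    if (staged === undefined) {
      return;
    }
    staged.forEach((value, key) => this.persisted.set(key, value));
    this.stagedByJob.delete(jobId);
  }
}

const sketch = new StagedCapsuleSketch();
sketch.stage("job-1", "contract:slot", "capsule-data");
const beforeCommit = sketch.readPersisted("contract:slot"); // nothing persisted yet
sketch.commit("job-1");
const afterCommit = sketch.readPersisted("contract:slot"); // now visible in storage
```

The point of the design is that a job that never commits leaves persistent storage untouched.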

@mverzilli mverzilli added ci-no-squash ci-no-fail-fast (sets NO_FAIL_FAST in the CI so the run is not aborted on the first failure) and removed ci-no-squash labels Jan 9, 2026
@mverzilli mverzilli force-pushed the martin/capsule-store-with-staged-writes branch from e705402 to e770876 Compare January 9, 2026 16:06
@mverzilli mverzilli requested review from Thunkar, benesjan and nventuro and removed request for benesjan January 9, 2026 16:26
@mverzilli mverzilli requested a review from nventuro January 9, 2026 21:09
Contributor

@nventuro nventuro left a comment

I have some more comments, but it's already looking great! Address them and merge as you see fit. Thanks!

* Deletes a capsule on the stage of a job. Note the capsule will still
* exist in storage until the job is committed.
*/
#deleteOnStage(jobId: string, dbSlotKey: string) {
Contributor

Is it 'stage' or 'staged'?

Contributor Author

There are staged capsules, which means they are on stage 🤡. Naming in this whole series of PRs has not been my forte tbh. I'll rethink what I should call this.

Comment on lines +86 to +91
const dataBuffer = await this.#capsules.getAsync(dbSlotKey);
if (!dataBuffer) {
  return null;
}

return dataBuffer;
Contributor

Suggested change
- const dataBuffer = await this.#capsules.getAsync(dbSlotKey);
- if (!dataBuffer) {
-   return null;
- }
- return dataBuffer;
+ return await this.#capsules.getAsync(dbSlotKey) ?? null;
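
A small sketch of why the one-liner behaves the same here: `??` only replaces `null` and `undefined`, and `await` binds more tightly than `??`, so the expression reads as `(await ...) ?? null`. The `FakeCapsuleMap` class is a hypothetical stand-in for `this.#capsules` (assumed to resolve to `undefined` for a missing key), and `Uint8Array` stands in for `Buffer`:

```typescript
// Hypothetical stand-in for the async map behind this.#capsules.
class FakeCapsuleMap {
  private readonly entries = new Map<string, Uint8Array>();

  set(key: string, value: Uint8Array): void {
    this.entries.set(key, value);
  }

  // Assumption: resolves to undefined when the key is missing.
  getAsync(key: string): Promise<Uint8Array | undefined> {
    return Promise.resolve(this.entries.get(key));
  }
}

// Original shape: explicit falsy check.
async function loadVerbose(capsules: FakeCapsuleMap, key: string): Promise<Uint8Array | null> {
  const dataBuffer = await capsules.getAsync(key);
  if (!dataBuffer) {
    return null;
  }
  return dataBuffer;
}

// Suggested shape, with the implicit grouping written out.
async function loadConcise(capsules: FakeCapsuleMap, key: string): Promise<Uint8Array | null> {
  return (await capsules.getAsync(key)) ?? null;
}
```

The two forms could only diverge for falsy values that are not nullish, which cannot happen for a byte-array object: even an empty one is truthy.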

*/
async #getFromStage(jobId: string, dbSlotKey: string): Promise<Buffer | null | undefined> {
  const jobStagedCapsules = this.#getJobStagedCapsules(jobId);
  let staged: Buffer | null | undefined = jobStagedCapsules.get(dbSlotKey);
Contributor

Why do we need explicit types here?

Comment on lines +63 to +67
if (staged === undefined) {
  // If we don't have a staged version of this dbSlotKey, first we check if there's one in DB
  staged = await this.#loadCapsuleFromDb(dbSlotKey);
}
return staged;
Contributor

I'd do

Suggested change
- if (staged === undefined) {
-   // If we don't have a staged version of this dbSlotKey, first we check if there's one in DB
-   staged = await this.#loadCapsuleFromDb(dbSlotKey);
- }
- return staged;
+ if (staged !== undefined) {
+   return staged;
+ } else {
+   // If we don't have a staged version of this dbSlotKey, first we check if there's one in DB
+   return this.#loadCapsuleFromDb(dbSlotKey);
+ }

but maybe that's just me. It feels weird to do staged = db when we don't actually make the db value become staged.
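
The lookup under discussion has three states: `undefined` means the job has nothing staged for the key, `null` means a staged deletion, and a value means a staged write. A sketch of the early-return variant follows, using plain synchronous maps and string values as stand-ins. The free function `getFromStage` and its parameters are hypothetical, not the PR's code:

```typescript
type Staged = string | null; // null marks a staged deletion

function getFromStage(
  staged: Map<string, Staged>,
  db: Map<string, string>,
  key: string,
): string | null {
  const stagedValue = staged.get(key);
  if (stagedValue !== undefined) {
    // Staged write or staged deletion: the stage wins over the DB.
    return stagedValue;
  }
  // Nothing staged for this key: fall back to the DB without staging what we read.
  return db.get(key) ?? null;
}

const db = new Map<string, string>([["a", "db-a"], ["b", "db-b"], ["c", "db-c"]]);
const staged = new Map<string, Staged>([["a", "staged-a"], ["b", null]]);

const overwritten = getFromStage(staged, db, "a"); // staged write shadows the DB
const deleted = getFromStage(staged, db, "b"); // staged deletion hides the DB value
const untouched = getFromStage(staged, db, "c"); // falls through to the DB
```

Returning directly from each branch keeps the variable named `staged` from ever holding a value that was only read from the DB.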

- async deleteCapsule(contractAddress: AztecAddress, slot: Fr): Promise<void> {
-   await this.#capsules.delete(dbSlotToKey(contractAddress, slot));
+ deleteCapsule(contractAddress: AztecAddress, slot: Fr, jobId: string) {
+   // When we commit this, we will interpret null as a deletion, so we'll propagate the delete to the KV store
Contributor

Suggested change
- // When we commit this, we will interpret null as a deletion, so we'll propagate the delete to the KV store

I don't think this needs clarifying at the call site

* exist in storage until the job is committed.
*/
#deleteOnStage(jobId: string, dbSlotKey: string) {
this.#getJobStagedCapsules(jobId).set(dbSlotKey, null);
Contributor

Suggested change
- this.#getJobStagedCapsules(jobId).set(dbSlotKey, null);
+ // A staged null value indicates deletion, and will result in the KV store entry being deleted on commit
+ this.#getJobStagedCapsules(jobId).set(dbSlotKey, null);
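
A sketch of how a staged `null` could be interpreted at commit time, assuming a plain `Map` in place of the KV store and a hypothetical `commitStage` helper (the PR's actual commit logic is not shown in this thread):

```typescript
type StagedValue = string | null; // null is a tombstone: delete on commit

function commitStage(stage: Map<string, StagedValue>, kvStore: Map<string, string>): void {
  stage.forEach((value, key) => {
    if (value === null) {
      // Tombstone: propagate the deletion to the store.
      kvStore.delete(key);
    } else {
      kvStore.set(key, value);
    }
  });
  stage.clear();
}

const kvStore = new Map<string, string>([["slot-1", "old"], ["slot-2", "keep"]]);
const stage = new Map<string, StagedValue>();

stage.set("slot-1", null); // staged deletion; kvStore still holds "old" at this point
stage.set("slot-3", "fresh"); // staged write
const existsBeforeCommit = kvStore.has("slot-1");

commitStage(stage, kvStore);
const existsAfterCommit = kvStore.has("slot-1");
```

Using `null` as the tombstone keeps it distinct from `undefined`, which a `Map` lookup already returns for "nothing staged".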

Comment on lines +191 to +194
// This transactional context gives us "copy atomicity":
// there shouldn't be concurrent writes to what's being copied here.
// Equally important: this in practice is expected to perform thousands of DB operations
// and not using a transaction here would heavily impact performance.
Contributor

Why not use the full 120 line length?

Contributor Author

Because there are places where it hurts my eyes to introduce a line break. Is this a case of conflicting OCDs? :P
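
Line length aside, the "copy atomicity" that the quoted code comment refers to can be sketched as follows. `TinyKv` and its `transaction` method are hypothetical and exist only for this illustration; they are not the KV store used by the PR. The sketch shows the all-or-nothing property only, while the performance benefit of batching depends on the real backing store:

```typescript
// Hypothetical store: writes made inside transaction() are buffered and applied together.
class TinyKv {
  private readonly data = new Map<string, string>();

  get(key: string): string | undefined {
    return this.data.get(key);
  }

  /** Runs fn against a write buffer; applies the buffer only if fn does not throw. */
  transaction(fn: (set: (key: string, value: string) => void) => void): void {
    const pending = new Map<string, string>();
    fn((key, value) => pending.set(key, value));
    // Reached only when fn completed: publish every buffered write in one step.
    pending.forEach((value, key) => this.data.set(key, value));
  }
}

const kv = new TinyKv();
const source = new Map<string, string>([["a", "1"], ["b", "2"]]);

// All-or-nothing copy: either both entries land or none do.
kv.transaction(set => source.forEach((value, key) => set(key, value)));

let failed = false;
try {
  kv.transaction(set => {
    set("c", "3");
    throw new Error("simulated failure mid-copy");
  });
} catch {
  failed = true; // the buffered write to "c" is discarded
}
```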

@mverzilli
Contributor Author

Thanks for the review! I'll merge as is and address the comments in the next PR in line, so I don't miss the integration train any longer.

@mverzilli mverzilli added ci-squash-and-merge and removed ci-no-fail-fast (sets NO_FAIL_FAST in the CI so the run is not aborted on the first failure) labels Jan 12, 2026
@mverzilli mverzilli enabled auto-merge January 12, 2026 18:32
@AztecBot AztecBot force-pushed the martin/capsule-store-with-staged-writes branch from 1de3908 to b30c796 Compare January 12, 2026 18:35
@mverzilli mverzilli added this pull request to the merge queue Jan 12, 2026
@AztecBot
Collaborator

AztecBot commented Jan 12, 2026

Flakey Tests

🤖 says: This CI run detected 2 tests that failed, but were tolerated due to a .test_patterns.yml entry.

FLAKED (http://ci.aztec-labs.com/15957f0318a52988): yarn-project/end-to-end/scripts/run_test.sh web3signer src/composed/web3signer/e2e_multi_validator_node_key_store.test.ts (39s) (code: 1) (Martin Verzilli: refactor: CapsuleStore with staged writes (#19449))
FLAKED (http://ci.aztec-labs.com/e891d50eb40e3950): yarn-project/end-to-end/scripts/run_test.sh simple src/e2e_p2p/gossip_network.test.ts (436s) (code: 1) group:e2e-p2p-epoch-flakes (Martin Verzilli: refactor: CapsuleStore with staged writes (#19449))

Merged via the queue into next with commit 3d84673 Jan 12, 2026
17 checks passed
@mverzilli mverzilli deleted the martin/capsule-store-with-staged-writes branch January 12, 2026 19:18