Skip to content

Conversation

Wondertan
Copy link
Member

@Wondertan Wondertan commented Jul 29, 2025

Allows batch deleting across API boundaries.

Depends on #320
Depends on ipfs/go-datastore#238, but we shouldn't block on it being released upstream. It should be fine to temporarily live with the replace so rollout pruning on time

w/o: {"from_height": 1, "to_height": 6500189, "took": 341.992917359}
w/: {"from_height": 1, "to_height": 6500617, "took": 129.445211431}

@Wondertan Wondertan self-assigned this Jul 29, 2025
Base automatically changed from perf-no-read-delet to main July 29, 2025 14:42
@Wondertan Wondertan marked this pull request as ready for review July 29, 2025 14:46
@vgonkivs
Copy link
Member

should we wait until these changes are merged and released in ipfs?

@Wondertan
Copy link
Member Author

Wondertan commented Jul 30, 2025

but we shouldn't block on it being released upstream. It should be fine to temporarily live with the replace so we rollout pruning on time

@Wondertan
Copy link
Member Author

Wondertan commented Jul 30, 2025

Retesting this with all the recent changes showed significant performance degradation. The test showed ~370s, which is worse than the result without this change ~341s.

The best measurement in the description (~129s) was done with a different hacky poc for batching through context. The supposed to be production version directly in go-datastore, seems to be bringing very high overhead.

@Wondertan
Copy link
Member Author

False alarm. I triple checked, and there was a bug on the integration side in the node, which made this change effectively a noop. The results after the fix are as expected, around ~120s.

Copy link
Member

@walldiss walldiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utack

Copy link
Member

@renaynay renaynay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay w me, guess needs FLUP issue to return go-ds to upstream once released.

@Wondertan Wondertan merged commit 105a106 into main Jul 31, 2025
5 checks passed
@Wondertan Wondertan deleted the context-ds branch July 31, 2025 14:31
Wondertan added a commit that referenced this pull request Aug 1, 2025
This PR shaves another 30s when we prune the whole chain. It achieves
this by applying the same technique as in #328, but for reads.
Previously, the underlying datastore would create a read-only
transaction for every singular read. As shown in pyroscope profiles,
this puts pressure on badger as creating new transactions ain't cheap
and involves complex synchronization to ensure atomicity between
parallel transactions.

With this insight in mind, I also integrated read batching in other
places where this must be beneficial: advancing, receding, range, and
singular header reads.

More noticeable changes:
* Fixes a silent bug that wouldn't delete samples on the Pruner side
* Fixes a bug in the upstream contextds change
* Adds tracking for missing headers that were not found during deletion
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants