refactor(backfill): Backfill Plugin Major Refactor And Improvements #2006
First pass on this one; it will need at least one more. I have not looked at the tests yet.
I left general comments for cleanup, plus some questions.
I feel, however, that we are really missing out by not using the ranged sets; instead we keep a list of ranges and manipulate it manually. I do see ranges being merged by hand, which the ranged sets would do automatically and with better performance. I just do not see why we need a list, which complicates some of what we do, instead of simply using the ranged sets. Since this is such a big rework, now is the time to make these decisions. Something worth thinking about.
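For context, the manual merging mentioned above might look roughly like the sketch below. `LongRange` and `RangeMerging` are hypothetical names for illustration only; this is neither the PR's code nor the ranged-set API, just a minimal example of merging a plain list of inclusive ranges.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical inclusive block range; not a type from the PR. */
record LongRange(long start, long end) {}

final class RangeMerging {
    /** Merges overlapping or adjacent inclusive ranges into a minimal list sorted by start. */
    static List<LongRange> merge(List<LongRange> ranges) {
        List<LongRange> sorted = new ArrayList<>(ranges);
        sorted.sort(Comparator.comparingLong(LongRange::start));
        List<LongRange> merged = new ArrayList<>();
        for (LongRange next : sorted) {
            if (!merged.isEmpty()) {
                LongRange last = merged.get(merged.size() - 1);
                // Overlapping or directly adjacent ranges collapse into one.
                if (next.start() <= last.end() + 1) {
                    long newEnd = Math.max(last.end(), next.end());
                    merged.set(merged.size() - 1, new LongRange(last.start(), newEnd));
                    continue;
                }
            }
            merged.add(next);
        }
        return merged;
    }
}
```

A ranged set would hide this loop behind its add operation, which is the trade-off under discussion: less hand-written code versus a simpler data structure.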
...de/backfill/src/main/java/org/hiero/block/node/backfill/client/proto/block_node_source.proto
block-node/backfill/src/main/java/org/hiero/block/node/backfill/client/BlockNodeClient.java
block-node/backfill/src/main/java/org/hiero/block/node/backfill/client/BlockNodeClient.java
block-node/backfill/src/main/java/org/hiero/block/node/backfill/client/BlockNodeClient.java
block-node/backfill/src/main/java/org/hiero/block/node/backfill/BackfillConfiguration.java
block-node/backfill/src/main/java/org/hiero/block/node/backfill/BackfillPersistenceAwaiter.java
block-node/backfill/src/main/java/org/hiero/block/node/backfill/BackfillRunner.java
block-node/backfill/src/main/java/org/hiero/block/node/backfill/GapDetector.java
...k-node/backfill/src/main/java/org/hiero/block/node/backfill/PriorityHealthBasedStrategy.java
...k-node/backfill/src/main/java/org/hiero/block/node/backfill/PriorityHealthBasedStrategy.java
@ata-nas Thank you for reviewing my PR; I really appreciate and value all the input, time, and effort you put into it 🙏 I've addressed all your comments, mostly by adopting the suggestions. As for the suggestion in your general notes:
I've pondered this and decided that the benefits do not outweigh the effort and added complexity. Our use case is simpler than what the BlockRangeSet implementations cover, so I prefer to keep it simple with a plain List.
0d1abe4 to
b5048d2
Introduce typed gaps to classify detected block gaps as HISTORICAL or LIVE_TAIL for routing to appropriate schedulers. GapDetector now returns TypedGap instances with proper boundary detection. Signed-off-by: Alfredo Gutierrez Grajeda <alfredo@hashgraph.com>
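As an illustration of the typed-gap idea in the commit above, here is a minimal sketch. The record and enum shapes, the `GapClassifier` helper, and the boundary rule (blocks at or below a boundary are historical, blocks above it are live tail, straddling gaps are split) are all assumptions made for this example; the PR's actual boundary detection may differ.

```java
import java.util.ArrayList;
import java.util.List;

/** Gap categories named in the commit message. */
enum GapType { HISTORICAL, LIVE_TAIL }

/** Hypothetical shape: an inclusive block range tagged with its type. */
record TypedGap(long start, long end, GapType type) {}

final class GapClassifier {
    /**
     * Classifies the inclusive gap [start, end] against a boundary block number.
     * Assumption: blocks <= boundary are HISTORICAL, blocks > boundary are LIVE_TAIL.
     */
    static List<TypedGap> classify(long start, long end, long boundary) {
        List<TypedGap> result = new ArrayList<>();
        if (start > end) {
            return result; // empty gap, nothing to classify
        }
        if (end <= boundary) {
            result.add(new TypedGap(start, end, GapType.HISTORICAL));
        } else if (start > boundary) {
            result.add(new TypedGap(start, end, GapType.LIVE_TAIL));
        } else {
            // The gap straddles the boundary: split it so each part can be routed separately.
            result.add(new TypedGap(start, boundary, GapType.HISTORICAL));
            result.add(new TypedGap(boundary + 1, end, GapType.LIVE_TAIL));
        }
        return result;
    }
}
```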
Replace single scheduler with two independent schedulers (historical and live-tail) so live blocks never wait for historical backfill. Each scheduler has bounded queue with discard-on-full semantics. Remove unused BackfillScheduler wrapper and BackfillTask status tracking. Signed-off-by: Alfredo Gutierrez Grajeda <alfredo@hashgraph.com>
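The "bounded queue with discard-on-full" behaviour described above can be sketched with the standard `ArrayBlockingQueue`, whose `offer` returns `false` instead of blocking when the queue is full. The `BoundedGapQueue` class and its discard counter are hypothetical names for this example, not the PR's scheduler.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical bounded queue of inclusive [start, end] gaps that drops work when full. */
final class BoundedGapQueue {
    private final BlockingQueue<long[]> queue;
    private final AtomicLong discarded = new AtomicLong();

    BoundedGapQueue(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Non-blocking submit: returns false and counts a discard if the queue is full. */
    boolean submit(long start, long end) {
        boolean accepted = queue.offer(new long[] {start, end});
        if (!accepted) {
            discarded.incrementAndGet();
        }
        return accepted;
    }

    /** Returns the next queued gap, or null if none is waiting. */
    long[] poll() {
        return queue.poll();
    }

    long discardedCount() {
        return discarded.get();
    }
}
```

Creating one instance for historical gaps and another for live-tail gaps keeps a full historical queue from ever delaying a live-tail submission, which is the stated goal of the dual-scheduler split.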
Update BackfillPlugin to orchestrate two independent schedulers with dedicated executors. Add high-water mark deduplication for live-tail gaps. Add configuration for queue capacities and health penalty settings. Signed-off-by: Alfredo Gutierrez Grajeda <alfredo@hashgraph.com>
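One way to picture high-water mark deduplication for live-tail gaps is the sketch below. It assumes the mark records the highest block number already handed to the live-tail scheduler, and that a newly detected gap is trimmed to the part above that mark. `LiveTailDeduplicator` and `admit` are hypothetical names; the PR's actual logic may differ.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical deduplicator based on a monotonically increasing high-water mark. */
final class LiveTailDeduplicator {
    // Highest block number already scheduled; -1 means nothing scheduled yet.
    private final AtomicLong highWaterMark = new AtomicLong(-1);

    /**
     * Admits the inclusive gap [start, end].
     * Returns the first block that still needs scheduling, or -1 if the whole
     * gap lies at or below the high-water mark and can be skipped.
     */
    long admit(long start, long end) {
        while (true) {
            long mark = highWaterMark.get();
            if (end <= mark) {
                return -1; // already covered by an earlier submission
            }
            // Advance the mark atomically; retry if another thread moved it first.
            if (highWaterMark.compareAndSet(mark, end)) {
                return Math.max(start, mark + 1);
            }
        }
    }
}
```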
Merge gRPC client functionality into BackfillFetcher. Use configurable health penalty and backoff settings. Remove redundant BackfillGrpcClient. Signed-off-by: Alfredo Gutierrez Grajeda <alfredo@hashgraph.com>
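To make the "configurable health penalty and backoff" idea concrete, here is a hedged sketch. The parameter names `healthPenaltyPerFailure` and `maxBackoffMs` mirror the configuration keys listed in the PR description, but the scoring formula, the `NodeHealth` record, and the doubling backoff are assumptions for illustration, not the PR's implementation.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical per-node state: configured priority (lower is preferred) and failure count. */
record NodeHealth(String name, int priority, int consecutiveFailures) {}

final class HealthScoring {
    /** Assumed score: priority plus a fixed penalty per failure; lower is better. */
    static double score(NodeHealth node, double healthPenaltyPerFailure) {
        return node.priority() + node.consecutiveFailures() * healthPenaltyPerFailure;
    }

    /** Picks the node with the lowest score, if any. */
    static Optional<NodeHealth> select(List<NodeHealth> nodes, double healthPenaltyPerFailure) {
        return nodes.stream()
                .min(Comparator.comparingDouble((NodeHealth n) -> score(n, healthPenaltyPerFailure)));
    }

    /** Assumed backoff: doubles per failure, starting at initialDelayMs, capped at maxBackoffMs. */
    static long backoffMs(long initialDelayMs, int failures, long maxBackoffMs) {
        if (failures <= 0) {
            return 0; // a healthy node needs no backoff
        }
        long delay = initialDelayMs;
        for (int i = 1; i < failures && delay < maxBackoffMs; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxBackoffMs);
    }
}
```

With a penalty of 1.0, a priority-1 node that has failed three times scores 4.0 and loses to a healthy priority-2 node scoring 2.0, so traffic shifts away from the failing peer without removing it from the pool.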
Add @Timeout annotations to all test classes (30s for integration, 5s for unit tests) to fail fast if tests hang instead of blocking indefinitely. Signed-off-by: Alfredo Gutierrez Grajeda <alfredo@hashgraph.com>
Update configuration docs with new queue capacity and health settings. Update design docs to reflect dual scheduler architecture. Signed-off-by: Alfredo Gutierrez Grajeda <alfredo@hashgraph.com>
…gic into BackfillPersistenceAwaiter class. Improved logs overall for the plugin. Signed-off-by: Alfredo Gutierrez Grajeda <alfredo@hashgraph.com>
Simplified TypedGap and GapType into nested classes of GapDetector for readability and code reduction Signed-off-by: Alfredo Gutierrez Grajeda <alfredo@hashgraph.com>
…ualityCheck for now as the current PBJ is improve logging on final failed retries to a peer Signed-off-by: Alfredo Gutierrez Grajeda <alfredo@hashgraph.com>
- Use unparsed block builder directly
- Replace mock metrics with real TestUtils.createMetrics()
- Replace var with explicit types
- Split multi-scenario tests into focused single-behavior tests
Signed-off-by: Alfredo Gutierrez Grajeda <alfredo@hashgraph.com>
block-node/backfill/src/main/java/org/hiero/block/node/backfill/BackfillFetcher.java
block-node/backfill/src/main/java/org/hiero/block/node/backfill/BackfillFetcher.java
block-node/backfill/src/main/java/org/hiero/block/node/backfill/BackfillFetcher.java
…tenceAwaiter Signed-off-by: Alfredo Gutierrez Grajeda <alfredo@hashgraph.com>
…t names, use assertSame for reference checks Signed-off-by: Alfredo Gutierrez Grajeda <alfredo@hashgraph.com>
- Simplify node logging using PBJ toString()
- Use log format replacement instead of .formatted()
- Add missing newline at EOF (block_node_source.proto)
Signed-off-by: Alfredo Gutierrez Grajeda <alfredo@hashgraph.com>
block-node/backfill/src/main/java/org/hiero/block/node/backfill/BackfillConfiguration.java
block-node/backfill/src/main/java/org/hiero/block/node/backfill/BackfillFetcher.java
ata-nas left a comment
We are in a good state here. Great work @AlfredoG87! We can proceed with what we have. Looking forward to even more improvements and seeing it work in action!
Yes, I've been doing plenty of local testing, but I'm eager to see it in the wild 💯 Thank you for your hard work 🙏
Summary
Refactors the BackfillPlugin from a monolithic ~650-line class into a modular architecture of focused, testable components.
Key change: Introduces a dual-scheduler design that separates historical and live-tail backfill processing, ensuring recent gaps are never blocked by long-running historical catch-up.
Architecture Overview
BackfillPlugin now acts as an orchestrator coordinating three stages:
1. Detect: GapDetector
2. Schedule: BackfillTaskScheduler
3. Execute: BackfillRunner, PriorityHealthBasedStrategy
New Components
- BackfillRunner
- BackfillTaskScheduler
- BackfillPersistenceAwaiter
- GapDetector
- PriorityHealthBasedStrategy
- NodeSelectionStrategy
Configuration
- backfill.historicalQueueCapacity
- backfill.liveTailQueueCapacity
- backfill.healthPenaltyPerFailure
- backfill.maxBackoffMs
Other Changes
- BlockNodeSource proto with optional NodeId and Name fields (non-breaking)
Review guide (recommended order)
If you want the fastest mental model, I recommend reviewing in this order:
PR Stats
Related Issues
Fixes #1977
Fixes #1550
Fixes #1502
Fixes #1778