
Conversation

@julienrbrt (Member) commented Oct 6, 2025

Overview

The syncing component was not saving headers and data to the goheader (p2p) stores, which was an issue when peers asked for those heights.
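
For context, a minimal sketch of the fix, assuming the syncer holds goheader stores as s.headerStore and s.dataStore and processes events carrying a Header and Data (names follow the review excerpts below; the actual implementation may differ):

// Sketch only: after a block has been validated and synced, append its header
// and data to the goheader (p2p) stores so peers asking for that height can be
// served. Field and variable names are taken from the review excerpts below.
g, ctx := errgroup.WithContext(s.ctx)
g.Go(func() error { return s.headerStore.Append(ctx, event.Header) })
g.Go(func() error { return s.dataStore.Append(ctx, event.Data) })
if err := g.Wait(); err != nil {
    s.logger.Error().Err(err).Msg("failed to append event header and/or data to p2p store")
}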

github-actions bot (Contributor) commented Oct 6, 2025

The latest Buf updates on your PR. Results from workflow CI and Release / buf-check (pull_request).

Build | Format | Lint | Breaking | Updated (UTC)
✅ passed | ⏩ skipped | ✅ passed | ✅ passed | Oct 6, 2025, 8:42 PM

gemini-code-assist bot (Contributor) commented:

Summary of Changes

Hello @julienrbrt, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves an issue where the node's syncing component failed to broadcast newly acquired headers and data to the peer-to-peer network. By introducing a standardized Broadcaster interface and integrating its functionality into the syncer's block processing logic, the changes ensure that all successfully synced blocks are properly propagated across the network, improving network consistency and data availability for other nodes.

Highlights

  • Introduced Broadcaster Interface: A new generic Broadcaster interface was added to abstract P2P broadcasting and header store access, standardizing how headers and data are propagated (a hedged sketch follows this list).
  • Enabled P2P Broadcasting in Syncer: The syncing component now actively broadcasts newly processed headers and data to the P2P network, ensuring that all synced blocks are propagated.
  • Refactored Component Dependencies: Various components, including Syncer and Executor, along with their constructors, were updated to depend on the new Broadcaster interface, replacing direct goheader.Store implementations or internal broadcaster types.
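
A hedged sketch of what such an interface might look like, based only on this summary and on the WriteToStoreAndBroadcast method mentioned in the reviews below; the exact method set in the PR may differ:

// Hypothetical shape of the generic Broadcaster interface described above.
// H is constrained to go-header's Header interface
// (import goheader "github.com/celestiaorg/go-header").
type Broadcaster[H goheader.Header[H]] interface {
    // WriteToStoreAndBroadcast persists the payload to the local goheader store
    // and gossips it to P2P peers (name borrowed from the executor, see reviews below).
    WriteToStoreAndBroadcast(ctx context.Context, payload H) error
    // Store exposes the underlying P2P store so synced heights can be served to peers.
    Store() goheader.Store[H]
}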
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature | Command | Description
Code Review | /gemini review | Performs a code review for the current pull request in its current state.
Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state.
Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help | /gemini help | Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder at the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in sharing feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check its output and use code with caution.

gemini-code-assist bot (Contributor) left a comment


Code Review

This pull request correctly refactors the broadcasting logic by introducing a common Broadcaster interface and moving it to a shared package. This change also enables broadcasting of headers and data from sync nodes, which was the main goal.

I've found one critical issue where the wrong broadcaster was used to get the data height, which would break data syncing. I've also left a medium-severity comment regarding duplicated test mocks, which could be refactored to improve maintainability.

Once the critical issue is addressed, this PR should be good to go.


@julienrbrt marked this pull request as draft on October 6, 2025 at 15:08
@julienrbrt changed the title from "fix(block/syncing): broadcast on sync nodes" to "fix(block/syncing): save data to p2p stores" on Oct 6, 2025
@julienrbrt marked this pull request as ready for review on October 6, 2025 at 15:36

codecov bot commented Oct 6, 2025

Codecov Report

❌ Patch coverage is 57.25806% with 53 lines in your changes missing coverage. Please review.
✅ Project coverage is 62.27%. Comparing base (b646e66) to head (bc2eb11).

Files with missing lines | Patch % | Lines
block/internal/syncing/syncer.go | 57.74% | 25 Missing and 5 partials ⚠️
block/internal/syncing/p2p_handler.go | 52.17% | 20 Missing and 2 partials ⚠️
pkg/sync/sync_service.go | 50.00% | 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2736      +/-   ##
==========================================
- Coverage   62.28%   62.27%   -0.02%     
==========================================
  Files          79       79              
  Lines        8490     8540      +50     
==========================================
+ Hits         5288     5318      +30     
- Misses       2711     2723      +12     
- Partials      491      499       +8     
Flag | Coverage Δ
combined | 62.27% <57.25%> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown.

claude bot (Contributor) commented Oct 6, 2025

Pull Request Review

I've reviewed PR #2736 and here's my feedback:

✅ Code Quality & Best Practices

Positive aspects:

  • Correct bug fix: The PR properly addresses the issue where headers and data were not being saved to P2P stores during syncing, which would cause problems when peers request data by height
  • Proper error handling: Added appropriate error logging with context when Append operations fail
  • Good code placement: The fix is placed in the right location (processHeightEvent) after successful block validation but before returning
  • Test coverage: Updated all affected tests to use proper mocks instead of nil values

Suggestions for improvement:

  1. Error handling flow (block/internal/syncing/syncer.go:407-419): Currently, when Append fails, the function returns early without marking the event as pending. Consider whether failed P2P store operations should prevent the block from being processed. Since the block has already been validated at this point, you might want to:

    • Log the error but continue processing (if P2P store is non-critical)
    • OR ensure the event is re-queued for retry if the P2P store is critical (a sketch of this option follows this list)
  2. Test expectations (syncer_test.go): The mocks use .Maybe() which means the tests don't verify that Append is actually called. Consider using explicit .Once() or .Times(n) to ensure the fix is working as expected:

    mockP2PHeaderStore.On("Append", mock.Anything, mock.Anything).Return(nil).Once()
  3. Return type change (pkg/sync/sync_service.go:120): The change from *goheaderstore.Store[H] to header.Store[H] is good for interface abstraction, but ensure this doesn't break any external callers that depend on the concrete type.
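
For the re-queue option in point 1, a hedged sketch that reuses height, event, and s.cache.SetPendingEvent as they appear in the review excerpts below; whether a failed P2P append should really be retried this way is a design decision for the authors:

// Option: treat a failed P2P append as retryable by re-queuing the event
// instead of only logging. Sketch only; names follow the excerpts below.
g, ctx := errgroup.WithContext(s.ctx)
g.Go(func() error { return s.headerStore.Append(ctx, event.Header) })
g.Go(func() error { return s.dataStore.Append(ctx, event.Data) })
if err := g.Wait(); err != nil {
    s.logger.Error().Err(err).Msg("failed to append to p2p stores; re-queuing event for retry")
    s.cache.SetPendingEvent(height, event)
    return
}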

🐛 Potential Issues

Critical:

  • Order of operations: Headers and data are appended to P2P stores AFTER the block is validated but BEFORE it's saved to the main store. If the main store save fails (line 451), the P2P stores will have inconsistent data. Consider moving the P2P append operations after all critical operations succeed.

Minor:

  • In processHeightEvent, if appending to headerStore succeeds but dataStore fails, we have partial state. Consider either:
    • Making both appends atomic (if possible)
    • Or accepting this as eventual consistency

🔒 Security Concerns

No significant security issues identified. The data being stored has already been validated through validateBlock() before being appended to P2P stores.

⚡ Performance Considerations

  • The Append operations are synchronous and will block the syncing goroutine. If P2P stores have high latency, this could slow down syncing
  • Consider whether these operations could be done asynchronously or batched for better throughput
  • The current implementation is safe but may impact sync performance under high load

🧪 Test Coverage

Good:

  • All existing tests updated to use proper mocks
  • Mocks properly implement the goheader.Store interface

Could be better:

  • No explicit test for the new Append functionality
  • No test for error scenarios (what happens when Append fails?)
  • Consider adding a dedicated test case that verifies headers and data are actually appended to P2P stores

Suggested Test Case

func TestProcessHeightEvent_AppendsToP2PStores(t *testing.T) {
    // Setup with explicit mock expectations
    mockP2PHeaderStore := &mocks.MockStore[*types.SignedHeader]{}
    mockP2PDataStore := &mocks.MockStore[*types.Data]{}
    
    // Create test event at an example height (createTestEvent is a hypothetical helper)
    height := uint64(1)
    event := createTestEvent(t, height)
    
    // Expect Append to be called exactly once with the right data
    mockP2PHeaderStore.On("Append", mock.Anything, event.Header).Return(nil).Once()
    mockP2PDataStore.On("Append", mock.Anything, event.Data).Return(nil).Once()
    
    // ... rest of test
    
    mockP2PHeaderStore.AssertExpectations(t)
    mockP2PDataStore.AssertExpectations(t)
}

📋 Summary

This is a solid bug fix that addresses a real issue with P2P data availability. The implementation is clean and well-tested. The main recommendations are:

  1. Consider moving P2P store appends after all critical operations to avoid inconsistency
  2. Use explicit mock expectations (.Once()) instead of .Maybe() for better test validation
  3. Add a dedicated test case that verifies the fix works correctly
  4. Consider async/batched operations if P2P store latency becomes an issue

Overall: Approve with minor suggestions

🤖 Generated with Claude Code

claude bot (Contributor) commented Oct 6, 2025

Pull Request Review - PR #2736

Overview

This PR fixes a bug where the syncing component wasn't saving headers and data to the P2P (goheader) stores after successfully syncing blocks, which prevented the syncing node from serving that data to peers requesting it by height.


Positive Aspects

1. Correct Fix Location

The fix is placed in processHeightEvent() at block/internal/syncing/syncer.go:410-420, which is the right location. This function processes incoming DA height events and appends them to the p2p stores, making them available to peers requesting data.

2. Follows Established Pattern

The implementation mirrors how the executor component broadcasts data via WriteToStoreAndBroadcast() (see block/internal/executing/executor.go:394-395). The syncer correctly appends to stores after validation, maintaining consistency with the broader codebase architecture.

3. Proper Error Handling

  • Errors are logged with context
  • Early returns prevent further processing on failure
  • Follows the project's error handling patterns

4. Test Coverage Updated

All existing tests were updated to provide mock P2P stores, ensuring tests continue to pass and preventing regressions.


⚠️ Issues & Concerns

1. Critical: Placement May Miss Blocks (MEDIUM SEVERITY)

Location: block/internal/syncing/syncer.go:410-420

The current placement in processHeightEvent() means data is only appended to P2P stores when:

  • The event is the exact next sequential block (height == currentHeight + 1)
  • The block hasn't been processed yet

Problem: Events that are stored as pending (line 401) or events that fail validation (line 410-412) won't get appended to P2P stores even after they're eventually processed.

Impact: When processPendingEvents() (line 588-620) later retrieves and successfully syncs these pending blocks via trySyncNextBlock(), they bypass the append logic entirely.

Evidence:

// Line 400-403: Events stored as pending
if height != currentHeight+1 {
    s.cache.SetPendingEvent(height, event)
    return  // ← P2P stores never updated for these
}

// Line 407-414: On failure, event is re-stored as pending
if err := s.trySyncNextBlock(event); err != nil {
    if !errors.Is(err, errInvalidBlock) {
        s.cache.SetPendingEvent(height, event)  // ← Will be processed later
    }
    return
}

Recommended Fix:
Move the append logic to trySyncNextBlock() after line 473 (after successful block save and state update). This ensures ALL successfully synced blocks are added to P2P stores, regardless of how they were retrieved.

// In trySyncNextBlock(), after line 473:
// Append to P2P stores for peer availability
if err := s.headerStore.Append(s.ctx, header); err != nil {
    s.logger.Error().Err(err).Msg("failed to append header to p2p store")
    // Consider if this should be a hard error
}
if err := s.dataStore.Append(s.ctx, data); err != nil {
    s.logger.Error().Err(err).Msg("failed to append data to p2p store")
}

2. Error Handling Inconsistency (LOW SEVERITY)

Current behavior: P2P store append failures are logged but don't prevent sync completion.

Questions to consider:

  • Should P2P store failures be critical errors?
  • If a node can't serve data to peers, is that acceptable for continued operation?
  • Consider the tradeoff: strict error handling could cause sync failures, but lenient handling might create "selfish" nodes that don't contribute to the network

Current implementation is probably acceptable given that:

  • Block syncing is more critical than peer serving
  • The executor also doesn't fail on broadcast errors (line 396-399)
  • This maintains consistency with the existing pattern

3. Test Coverage Gaps (LOW SEVERITY)

Missing test scenarios:

  1. P2P store append failures: No test verifies behavior when Append() returns an error
  2. Pending event flow: No test ensures pending events eventually get appended to P2P stores
  3. Integration: No test verifies peers can actually retrieve the stored data

Suggested additions:

func TestProcessHeightEvent_P2PStoreAppendFailure(t *testing.T) {
    // Test that sync continues even if P2P store append fails
    mockHeaderStore.On("Append", mock.Anything, mock.Anything).Return(errors.New("store error"))
    // Verify: block still syncs successfully, error is logged
}

func TestPendingEventsAppendedToP2P(t *testing.T) {
    // Test that out-of-order events eventually get appended
    // Send height 3 first (stored as pending)
    // Send height 2 (should sync and append)
    // Verify both heights 2 and 3 are in P2P stores
}

4. Interface Change Consideration (INFO)

File: pkg/sync/sync_service.go:120

The return type change from *goheaderstore.Store[H] to header.Store[H] is good practice (returning interfaces vs concrete types), but ensure:

  • All callers can handle the interface
  • No type assertions are broken elsewhere
  • This doesn't affect downstream consumers

📊 Code Quality Assessment

Category | Rating | Notes
Correctness | ⚠️ Medium | Main logic correct, but placement may miss edge cases
Style | ✅ Good | Follows project conventions
Testing | ⚠️ Medium | Basic coverage present, missing edge cases
Documentation | ✅ Good | Clear PR description, code is self-documenting
Performance | ✅ Good | Minimal overhead, proper error handling
Security | ✅ Good | No security concerns identified

🔧 Recommendations

Priority 1 (Should fix before merge):

  1. Move append logic to trySyncNextBlock() to ensure all synced blocks are added to P2P stores

Priority 2 (Consider for follow-up):

  1. Add test coverage for P2P store append failures
  2. Add test for pending events being appended to P2P stores
  3. Document the intended behavior when P2P operations fail

Priority 3 (Nice to have):

  1. Add integration test verifying peers can retrieve synced data
  2. Consider metrics/logging for P2P store operations to monitor effectiveness (a minimal metric sketch follows)
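
A minimal sketch of such a metric, assuming Prometheus via client_golang; the metric name and label are illustrative, not part of the PR:

import "github.com/prometheus/client_golang/prometheus"

// p2pStoreAppendFailures counts failed appends to the goheader (p2p) stores,
// labelled by which store failed.
var p2pStoreAppendFailures = prometheus.NewCounterVec(
    prometheus.CounterOpts{
        Name: "syncer_p2p_store_append_failures_total",
        Help: "Number of failed appends to the p2p header/data stores.",
    },
    []string{"store"},
)

func init() {
    prometheus.MustRegister(p2pStoreAppendFailures)
}

// On a failed append in the syncer, one would then record, e.g.:
//   p2pStoreAppendFailures.WithLabelValues("header").Inc()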

📝 Summary

This PR addresses a real bug and the approach is generally sound. However, the placement of the append logic in processHeightEvent() may miss blocks that are processed via the pending events queue or retry logic. Moving the append calls to trySyncNextBlock() would ensure complete coverage of all successfully synced blocks.

The code quality is good, follows project conventions, and includes proper error handling. Test coverage is adequate but could be enhanced with edge case testing.

Recommendation: Request changes to move the append logic to trySyncNextBlock() before merging.


🤖 Generated with Claude Code

claude bot (Contributor) commented Oct 6, 2025

Code Review: PR #2736 - fix(block/syncing): save data to p2p stores

Overview

This PR fixes a bug where the syncing component was not persisting headers and data to the P2P (goheader) stores after successfully syncing blocks, which prevented peers from serving historical heights.


Strengths

  1. Correct Problem Identification: The PR addresses a real issue - headers and data need to be available to P2P peers for synchronization.

  2. Proper Concurrent Execution: Using errgroup.WithContext for parallel store operations is appropriate and follows Go best practices.

  3. Comprehensive Test Updates: All test cases were updated to use proper mocks instead of nil values, improving test quality.

  4. Additional sync_service.go improvement: The WithTrustingPeriod option addition appears to be a related configuration improvement.


⚠️ Issues & Concerns

1. Critical: Error Handling Weakness (High Priority)

Location: syncer.go:412-418

g, ctx := errgroup.WithContext(s.ctx)
g.Go(func() error { return s.headerStore.Append(ctx, event.Header) })
g.Go(func() error { return s.dataStore.Append(ctx, event.Data) })
if err := g.Wait(); err != nil {
    s.logger.Error().Err(err).Msg("failed to append event header and/or data to p2p store")
}

Problem: Errors are only logged but not acted upon. If the P2P store append fails, the block is still considered "processed" and won't be retried.

Impact: This creates a silent failure mode where:

  • The local node processes blocks successfully
  • P2P stores remain incomplete
  • Peers requesting data from this node will fail
  • The issue may go unnoticed until peers report problems

Recommendation: Consider one of:

  1. Return the error to trigger retry logic (store as pending event)
  2. Add metrics to track P2P store append failures
  3. Implement a background reconciliation process
  4. At minimum, add more detailed logging (which specific store failed, height, etc.)

2. Performance: Placement of Store Operations

Location: syncer.go:412-418

Concern: The P2P store append happens AFTER trySyncNextBlock completes successfully. This means:

  • P2P stores are updated synchronously on the critical path
  • Any latency in P2P stores directly impacts block processing throughput
  • If P2P stores are slow, it could create a bottleneck

Recommendation: Consider making this asynchronous:

// Fire and forget with error logging
go func(h *types.SignedHeader, d *types.Data) {
    g, ctx := errgroup.WithContext(context.Background())
    g.Go(func() error { return s.headerStore.Append(ctx, h) })
    g.Go(func() error { return s.dataStore.Append(ctx, d) })
    if err := g.Wait(); err != nil {
        s.logger.Error().Err(err).
            Uint64("height", h.Header.Height()).
            Msg("failed to append to p2p stores")
        // TODO: implement retry or reconciliation
    }
}(event.Header, event.Data)

3. Testing: Missing Verification

Location: All test files

Issue: The tests set up mocks with .Maybe(), but don't verify that Append was actually called with the correct data.

Example from syncer_test.go:151-153:

mockP2PHeaderStore.On("Append", mock.Anything, mock.Anything).Return(nil).Maybe()

Recommendation: Add assertions to verify the P2P stores are called:

mockP2PHeaderStore.On("Append", mock.Anything, mock.MatchedBy(func(h *types.SignedHeader) bool {
    return h.Header.Height() == expectedHeight
})).Return(nil).Once()

// After test execution
mockP2PHeaderStore.AssertExpectations(t)

4. Missing: Nil Check Safety

Location: syncer.go:414-415

Concern: The code assumes s.headerStore and s.dataStore are non-nil. The benchmark test passes nil for these stores.

Current state: This will panic if stores are nil.

Recommendation: Add nil checks:

if s.headerStore != nil && s.dataStore != nil {
    g, ctx := errgroup.WithContext(s.ctx)
    g.Go(func() error { return s.headerStore.Append(ctx, event.Header) })
    g.Go(func() error { return s.dataStore.Append(ctx, event.Data) })
    if err := g.Wait(); err != nil {
        s.logger.Error().Err(err).Msg("failed to append event header and/or data to p2p store")
    }
}

OR ensure stores are always non-nil (preferred approach: fail early in the constructor; a minimal sketch follows).
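
A minimal sketch of the fail-early option; the constructor name and signature are hypothetical, since the real constructor takes more dependencies:

// Sketch: validate the P2P stores when the syncer is constructed so the sync
// path never has to nil-check them. newSyncerWithStores is hypothetical.
func newSyncerWithStores(
    headerStore goheader.Store[*types.SignedHeader],
    dataStore goheader.Store[*types.Data],
) (*Syncer, error) {
    if headerStore == nil || dataStore == nil {
        return nil, errors.New("syncer: p2p header and data stores must not be nil")
    }
    return &Syncer{headerStore: headerStore, dataStore: dataStore}, nil
}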


5. Code Quality: Interface Return Type Change

Location: pkg/sync/sync_service.go:120

-func (syncService *SyncService[H]) Store() *goheaderstore.Store[H] {
+func (syncService *SyncService[H]) Store() header.Store[H] {

Question: This changes from concrete type to interface. While generally good practice:

  • Is this a breaking change for consumers?
  • Is this change necessary for this fix?
  • Should this be in a separate PR?

🔍 Additional Observations

  1. Context Handling: Using errgroup.WithContext(s.ctx) is correct, but ensure cancellation propagates properly.

  2. Trusting Period Addition: The WithTrustingPeriod(ninetyNineYears) addition in sync_service.go:423 seems unrelated to the main fix - consider documenting why this was added.

  3. Benchmark Test: The benchmark test still uses nil stores (lines 111-112), which could cause panics with this change if the code path is exercised.


📋 Recommendations Summary

Must Fix (Blocking):

  1. Add nil checks for P2P stores OR ensure they're always initialized
  2. Update benchmark test to use mocks instead of nil

Should Fix (High Priority):

  1. Improve error handling - don't silently ignore P2P store failures
  2. Add proper test assertions to verify Append is called

Consider (Medium Priority):

  1. Make P2P store updates asynchronous to avoid blocking sync path
  2. Add metrics for P2P store operation failures
  3. Document the trusting period change
  4. Consider splitting the interface return type change into separate PR

🎯 Test Coverage

The PR updates tests appropriately, but actual verification of the fix behavior is missing. Consider adding an integration test that:

  1. Syncs blocks through a node
  2. Verifies P2P stores contain the expected headers/data
  3. Tests a peer can retrieve the data

Overall Assessment

The PR correctly identifies and addresses the root cause. The implementation is reasonable but has concerning gaps in error handling and could potentially introduce performance regressions. With the suggested improvements, particularly around error handling and nil safety, this would be production-ready.

Recommendation: Request changes for nil safety and error handling before merge.
