Skip to content

Conversation

Wondertan
Copy link
Member

@Wondertan Wondertan commented Mar 31, 2025

Adds Tail method to Syncer that can initialize a node with a height estimated header or using the configured height(or hash) as the new Tail. Besides, it can gracefully handle reconfiguration of height/hash upon restarts.

This change deliberately leaves some tech debt behind. The new Tail method has quite a lot of branchiness and could be teared apart/rewritten reducing nesting and improving readability. Some more specific debts are marked with a TODO. All of this is not "clean" and simply achieves the goal. The future backward sync will anyway require us to refactor this logic, so there is minimum value in cleaning things now.

Nvm, I end up refactoring it for the sake of reviewers sanity and code maintainability. There are still TODOs left for things that are literally blocked on bsync or else, but everything else was uniformly integrated into the existing code.

Depends on #283 and #285

Closes #258 and #259

@Wondertan Wondertan self-assigned this Mar 31, 2025
@Wondertan Wondertan changed the base branch from main to pruning May 6, 2025 21:35
@Wondertan Wondertan marked this pull request as ready for review May 6, 2025 21:51
@Wondertan Wondertan changed the title feat(sync): subjective tail feat(sync): Tail May 8, 2025
@Wondertan Wondertan marked this pull request as draft May 11, 2025 01:25
@Wondertan
Copy link
Member Author

Converting to draft until dependencies are merged

@Wondertan Wondertan changed the title feat(sync): Tail feat(sync): maintain tail May 11, 2025
@codecov-commenter
Copy link

codecov-commenter commented May 13, 2025

Codecov Report

Attention: Patch coverage is 66.66667% with 66 lines in your changes missing coverage. Please review.

Please upload report for BASE (pruning@9fce763). Learn more about missing BASE report.

Files with missing lines Patch % Lines
sync/sync_tail.go 69.56% 26 Missing and 9 partials ⚠️
sync/sync_head.go 51.02% 23 Missing and 1 partial ⚠️
sync/sync_store.go 72.22% 3 Missing and 2 partials ⚠️
headertest/store.go 0.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             pruning     #265   +/-   ##
==========================================
  Coverage           ?   54.52%           
==========================================
  Files              ?       39           
  Lines              ?     4169           
  Branches           ?        0           
==========================================
  Hits               ?     2273           
  Misses             ?     1733           
  Partials           ?      163           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Wondertan Wondertan marked this pull request as ready for review May 13, 2025 21:50
Copy link
Member

@renaynay renaynay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to review this as thoroughly as possible. I left a few comments - some of them are just for me and my understanding. Feel free to resolve or correct my understanding.

Open questions:

  • why do we allow SyncFromHeight? -- IMO providing two options for one purpose is not necessary (FOR NOW) (where SyncFromHash would likely be the more used option; and restricting optionality to just hash would help clean up unnecessary additional code toggling between these options. (e.g. SyncFromHash has higher priority than SyncFromHeight.) We've historically relied on TrustedHash instead of something like "TrustedHeight" (bc how can we trust height if we only pass a # and no hash so we can check that header we receive == header we query for). If we allow SyncFromHeight, we would also want to ask user to specify the hash they believe to be the header hash for the header at the given height, no? Otherwise we are JTMB?

  • Can we rename head_sync file in syncer at some point? Totally missed this new addition and it’ll be so confusing 😂 (separate but relevant)

Comment on lines 155 to 161
func (s *Syncer[H]) isTailHashOutdated(h H) (header.Hash, bool) {
wantHash := s.Params.SyncFromHash
outdated := wantHash != nil && (h.IsZero() || !bytes.Equal(wantHash, h.Hash()))
return wantHash, outdated
}

func (s *Syncer[H]) isTailHeightOutdated(h H) (uint64, bool) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these two methods are confusing bc they not only check for outdated - they also return outdated == true if header is nil. Maybe just methods need renaming

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a little refactoring for them going on. I had the same concern

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

head H,
blockTime, trustingPeriod time.Duration,
) (height uint64) {
headersToRetain := uint64(trustingPeriod / blockTime) //nolint:gosec
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can be extracted into var, no? retentionWindow

afaics can't change trustingPeriod or blockTime in runtime

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow what do you mean by extracting here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't matter just a nit

case from.Height() > to.Height():
log.Infof("move tail down from %d to %d, syncing the diff...", from.Height(), to.Height())

// TODO(@Wondertan): This works but it assumes this code is only run before syncing routine starts.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah afaict we would never "grow" tail in runtime -- this would only happen on a restart for a user who wants to specify "lower" tail height

and this would be hit before a new subjHead would be set which is what would trigger a sync job in the forward direction.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not actionable

}
} else if tailHash == nil && tailHeight == 0 {
if tail.IsZero() {
// no previously known Tail available - estimate solely on Head
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's add a debug log for this case as well pls that includes estimated height

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's debugged after refactoring in a different place

if err != nil {
return err
}
// gets the latest head and kicks off syncing if necessary
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why remove

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

relic, will get it back

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

returned

Comment on lines +74 to +76
case errors.Is(err, header.ErrEmptyStore):
log.Info("empty store, initializing...")
s.metrics.subjectiveInitialization(s.ctx)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay so this will make it so that we do not have to manually initialise the store via nodebuilder in celestia-node in order to kick off header syncing - syncer will do it for us here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's was the purpose behind doing #243

defer s.incomingMu.Unlock()

err := s.verify(ctx, head)
// TODO(@Wondertan): We need to ensure Tail is available before verification for subj init case and this is fine here
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we perform a sanity check here on the incoming head somehow to ensure we do not get exorbitant height that could trigger a cut of the tail too significantly? (meaning one that is shorter than the pruning window)

// However, Syncer has yet to be refactored to not assume those invariants and until then
// this method is a shim that allows using store with old assumptions.
// To be reworked by bsync.
if headers[0].Height() >= head.Height() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we extract into separate func here ensureAppendAdjacency

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, we could

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't see much value from doing this tbh, so ignoring

@Wondertan
Copy link
Member Author

Can we rename head_sync file in syncer at some point? Totally missed this new addition and it’ll be so confusing 😂 (separate but relevant)

Totally agree and had the same thought. There is a simple solution for that. Rename sync_*.go files to syncer_*.go to match the name of the component

@Wondertan
Copy link
Member Author

why do we allow SyncFromHeight? -- IMO providing two options for one purpose is not necessary (FOR NOW) (where SyncFromHash would likely be the more used option; and restricting optionality to just hash would help clean up unnecessary additional code toggling between these options. (e.g. SyncFromHash has higher priority than SyncFromHeight.) We've historically relied on TrustedHash instead of something like "TrustedHeight" (bc how can we trust height if we only pass a # and no hash so we can check that header we receive == header we query for). If we allow SyncFromHeight, we would also want to ask user to specify the hash they believe to be the header hash for the header at the given height, no? Otherwise we are JTMB?

Height based sync was requested by @cmwaters and I generally agree that it is needed/simpler for users. There is ofc security aspect to it where syncing from a historical height is not LRA resistant. However, this is true currently for both hash and height based sync. The difference in trust model is that the header you start with is less trusted with hash, but beyond that, both are similarly insecure. Once bsync comes both of them will become secure, so it make sense for now to introduce an option that's easier for users and keep the old one for backwards compatibility.

Copy link
Member

@renaynay renaynay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR LGTM - your choice re nits

Copy link
Member

@renaynay renaynay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so on start-up, we no longer need to ensure that tail was set in store and we can do that after processing the first incoming header, is that correct? meaning incoming header will become new head and the append will make that it's tail and then subjectiveTail will be hit and if hash is set, it will override that tail and fetch the hash + trigger sync up to head?

@Wondertan
Copy link
Member Author

during subjective init, tail is guaranteed to be written first, even if incoming head was there earlier. Other then that its correct

@Wondertan Wondertan merged commit 57411f0 into pruning May 21, 2025
2 checks passed
@Wondertan Wondertan deleted the sync-from-tail branch May 21, 2025 17:13
Wondertan added a commit that referenced this pull request May 28, 2025
Adds `Tail` method to `Syncer` that can initialize a node with a height
estimated header or using the configured height(or hash) as the new
`Tail`. Besides, it can gracefully handle reconfiguration of height/hash
upon restarts.

~This change deliberately leaves some tech debt behind. The new Tail
method has quite a lot of branchiness and could be teared
apart/rewritten reducing nesting and improving readability. Some more
specific debts are marked with a TODO. All of this is not "clean" and
simply achieves the goal. The future backward sync will anyway require
us to refactor this logic, so there is minimum value in cleaning things
now.~

Nvm, I end up refactoring it for the sake of reviewers sanity and code
maintainability. There are still TODOs left for things that are
literally blocked on bsync or else, but everything else was uniformly
integrated into the existing code.

Depends on #283 and #285

Closes #258 and #259
This was referenced May 28, 2025
Wondertan added a commit that referenced this pull request May 29, 2025
Adds `Tail` method to `Syncer` that can initialize a node with a height
estimated header or using the configured height(or hash) as the new
`Tail`. Besides, it can gracefully handle reconfiguration of height/hash
upon restarts.

~This change deliberately leaves some tech debt behind. The new Tail
method has quite a lot of branchiness and could be teared
apart/rewritten reducing nesting and improving readability. Some more
specific debts are marked with a TODO. All of this is not "clean" and
simply achieves the goal. The future backward sync will anyway require
us to refactor this logic, so there is minimum value in cleaning things
now.~

Nvm, I end up refactoring it for the sake of reviewers sanity and code
maintainability. There are still TODOs left for things that are
literally blocked on bsync or else, but everything else was uniformly
integrated into the existing code.

Depends on #283 and #285

Closes #258 and #259
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sync: Subjective Tail
4 participants