-
Notifications
You must be signed in to change notification settings - Fork 26
feat(sync): maintain tail #265
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
bc832cf
to
44a7fd9
Compare
Converting to draft until dependencies are merged |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## pruning #265 +/- ##
==========================================
Coverage ? 54.52%
==========================================
Files ? 39
Lines ? 4169
Branches ? 0
==========================================
Hits ? 2273
Misses ? 1733
Partials ? 163 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to review this as thoroughly as possible. I left a few comments - some of them are just for me and my understanding. Feel free to resolve or correct my understanding.
Open questions:
-
why do we allow SyncFromHeight? -- IMO providing two options for one purpose is not necessary (FOR NOW) (where
SyncFromHash
would likely be the more used option; and restricting optionality to just hash would help clean up unnecessary additional code toggling between these options. (e.g.SyncFromHash
has higher priority thanSyncFromHeight
.) We've historically relied onTrustedHash
instead of something like "TrustedHeight
" (bc how can we trust height if we only pass a # and no hash so we can check that header we receive == header we query for). If we allowSyncFromHeight
, we would also want to ask user to specify the hash they believe to be the header hash for the header at the given height, no? Otherwise we are JTMB? -
Can we rename
head_sync
file in syncer at some point? Totally missed this new addition and it’ll be so confusing 😂 (separate but relevant)
sync/sync_tail.go
Outdated
func (s *Syncer[H]) isTailHashOutdated(h H) (header.Hash, bool) { | ||
wantHash := s.Params.SyncFromHash | ||
outdated := wantHash != nil && (h.IsZero() || !bytes.Equal(wantHash, h.Hash())) | ||
return wantHash, outdated | ||
} | ||
|
||
func (s *Syncer[H]) isTailHeightOutdated(h H) (uint64, bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these two methods are confusing bc they not only check for outdated - they also return outdated == true if header is nil. Maybe just methods need renaming
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a little refactoring for them going on. I had the same concern
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check how it looked like in the first iteration https://github.com/celestiaorg/go-header/pull/265/files/70f65dea032db43d3a499e9d5a995b076906240f
💀
sync/sync_tail.go
Outdated
head H, | ||
blockTime, trustingPeriod time.Duration, | ||
) (height uint64) { | ||
headersToRetain := uint64(trustingPeriod / blockTime) //nolint:gosec |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can be extracted into var, no? retentionWindow
afaics can't change trustingPeriod or blockTime in runtime
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't follow what do you mean by extracting here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't matter just a nit
sync/sync_tail.go
Outdated
case from.Height() > to.Height(): | ||
log.Infof("move tail down from %d to %d, syncing the diff...", from.Height(), to.Height()) | ||
|
||
// TODO(@Wondertan): This works but it assumes this code is only run before syncing routine starts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah afaict we would never "grow" tail in runtime -- this would only happen on a restart for a user who wants to specify "lower" tail height
and this would be hit before a new subjHead would be set which is what would trigger a sync job in the forward direction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not actionable
sync/sync_tail.go
Outdated
} | ||
} else if tailHash == nil && tailHeight == 0 { | ||
if tail.IsZero() { | ||
// no previously known Tail available - estimate solely on Head |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's add a debug log for this case as well pls that includes estimated height
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's debugged after refactoring in a different place
if err != nil { | ||
return err | ||
} | ||
// gets the latest head and kicks off syncing if necessary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why remove
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
relic, will get it back
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
returned
case errors.Is(err, header.ErrEmptyStore): | ||
log.Info("empty store, initializing...") | ||
s.metrics.subjectiveInitialization(s.ctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay so this will make it so that we do not have to manually initialise the store via nodebuilder in celestia-node in order to kick off header syncing - syncer will do it for us here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's was the purpose behind doing #243
sync/sync_head.go
Outdated
defer s.incomingMu.Unlock() | ||
|
||
err := s.verify(ctx, head) | ||
// TODO(@Wondertan): We need to ensure Tail is available before verification for subj init case and this is fine here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we perform a sanity check here on the incoming head somehow to ensure we do not get exorbitant height that could trigger a cut of the tail too significantly? (meaning one that is shorter than the pruning window)
// However, Syncer has yet to be refactored to not assume those invariants and until then | ||
// this method is a shim that allows using store with old assumptions. | ||
// To be reworked by bsync. | ||
if headers[0].Height() >= head.Height() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we extract into separate func here ensureAppendAdjacency
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, we could
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't see much value from doing this tbh, so ignoring
Totally agree and had the same thought. There is a simple solution for that. Rename |
Height based sync was requested by @cmwaters and I generally agree that it is needed/simpler for users. There is ofc security aspect to it where syncing from a historical height is not LRA resistant. However, this is true currently for both hash and height based sync. The difference in trust model is that the header you start with is less trusted with hash, but beyond that, both are similarly insecure. Once bsync comes both of them will become secure, so it make sense for now to introduce an option that's easier for users and keep the old one for backwards compatibility. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR LGTM - your choice re nits
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so on start-up, we no longer need to ensure that tail was set in store and we can do that after processing the first incoming header, is that correct? meaning incoming header will become new head and the append will make that it's tail and then subjectiveTail will be hit and if hash is set, it will override that tail and fetch the hash + trigger sync up to head?
during subjective init, tail is guaranteed to be written first, even if incoming head was there earlier. Other then that its correct |
Adds `Tail` method to `Syncer` that can initialize a node with a height estimated header or using the configured height(or hash) as the new `Tail`. Besides, it can gracefully handle reconfiguration of height/hash upon restarts. ~This change deliberately leaves some tech debt behind. The new Tail method has quite a lot of branchiness and could be teared apart/rewritten reducing nesting and improving readability. Some more specific debts are marked with a TODO. All of this is not "clean" and simply achieves the goal. The future backward sync will anyway require us to refactor this logic, so there is minimum value in cleaning things now.~ Nvm, I end up refactoring it for the sake of reviewers sanity and code maintainability. There are still TODOs left for things that are literally blocked on bsync or else, but everything else was uniformly integrated into the existing code. Depends on #283 and #285 Closes #258 and #259
Adds `Tail` method to `Syncer` that can initialize a node with a height estimated header or using the configured height(or hash) as the new `Tail`. Besides, it can gracefully handle reconfiguration of height/hash upon restarts. ~This change deliberately leaves some tech debt behind. The new Tail method has quite a lot of branchiness and could be teared apart/rewritten reducing nesting and improving readability. Some more specific debts are marked with a TODO. All of this is not "clean" and simply achieves the goal. The future backward sync will anyway require us to refactor this logic, so there is minimum value in cleaning things now.~ Nvm, I end up refactoring it for the sake of reviewers sanity and code maintainability. There are still TODOs left for things that are literally blocked on bsync or else, but everything else was uniformly integrated into the existing code. Depends on #283 and #285 Closes #258 and #259
Adds
Tail
method toSyncer
that can initialize a node with a height estimated header or using the configured height(or hash) as the newTail
. Besides, it can gracefully handle reconfiguration of height/hash upon restarts.This change deliberately leaves some tech debt behind. The new Tail method has quite a lot of branchiness and could be teared apart/rewritten reducing nesting and improving readability. Some more specific debts are marked with a TODO. All of this is not "clean" and simply achieves the goal. The future backward sync will anyway require us to refactor this logic, so there is minimum value in cleaning things now.Nvm, I end up refactoring it for the sake of reviewers sanity and code maintainability. There are still TODOs left for things that are literally blocked on bsync or else, but everything else was uniformly integrated into the existing code.
Depends on #283 and #285
Closes #258 and #259