-
Notifications
You must be signed in to change notification settings - Fork 903
Fix NTP: implement RFC 5905 offset calculation and verify time before warning #2137
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| @@ -1,7 +1,9 @@ | ||||||||||
| package engine | ||||||||||
|
|
||||||||||
| import ( | ||||||||||
| "context" | ||||||||||
| "encoding/binary" | ||||||||||
| "errors" | ||||||||||
| "fmt" | ||||||||||
| "net" | ||||||||||
| "sync/atomic" | ||||||||||
|
|
@@ -11,6 +13,90 @@ import ( | |||||||||
| "github.com/thrasher-corp/gocryptotrader/log" | ||||||||||
| ) | ||||||||||
|
|
||||||||||
| const ( | ||||||||||
| ntpEpochOffset = 2208988800 | ||||||||||
| ntpDialTimeout = 5 * time.Second | ||||||||||
| ntpReadWriteTimeout = 5 * time.Second | ||||||||||
|
Comment on lines
+18
to
+19
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. suggestion: You can combine these as ntpDefaultTimeout |
||||||||||
| ) | ||||||||||
|
|
||||||||||
| // errNoValidNTPServer is returned when no valid NTP server could be reached | ||||||||||
| var errNoValidNTPServer = errors.New("no valid NTP server could be reached") | ||||||||||
|
|
||||||||||
| // CheckNTPOffset performs a one-time NTP check and returns the time offset. | ||||||||||
| // This can be called before the NTP manager is started to verify time sync. | ||||||||||
| // It uses the RFC 5905 formula: offset = ((T2-T1) + (T3-T4)) / 2 | ||||||||||
| func CheckNTPOffset(ctx context.Context, pools []string) (time.Duration, error) { | ||||||||||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This needs a dedicated test |
||||||||||
| if len(pools) == 0 { | ||||||||||
| return 0, errors.New("no NTP pools configured") | ||||||||||
| } | ||||||||||
|
|
||||||||||
| dialer := &net.Dialer{ | ||||||||||
| Timeout: ntpDialTimeout, | ||||||||||
| } | ||||||||||
|
|
||||||||||
|
Comment on lines
+33
to
+36
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
Squishy squashy |
||||||||||
| for i := range pools { | ||||||||||
| conn, err := dialer.DialContext(ctx, "udp", pools[i]) | ||||||||||
| if err != nil { | ||||||||||
| log.Warnf(log.TimeMgr, "NTP check: Unable to connect to %v, attempting next", pools[i]) | ||||||||||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||
| continue | ||||||||||
| } | ||||||||||
|
|
||||||||||
| if err = conn.SetDeadline(time.Now().Add(ntpReadWriteTimeout)); err != nil { | ||||||||||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||
| log.Warnf(log.TimeMgr, "NTP check: Unable to set deadline on %v. Error %s. Attempting next\n", pools[i], err) | ||||||||||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Several
Suggested change
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||
| if err = conn.Close(); err != nil { | ||||||||||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||||||
| log.Errorln(log.TimeMgr, err) | ||||||||||
| } | ||||||||||
| continue | ||||||||||
| } | ||||||||||
|
|
||||||||||
| // T1: Record time before sending request (origin timestamp) | ||||||||||
| t1 := time.Now() | ||||||||||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. change variable names so we can remove commentary. |
||||||||||
|
|
||||||||||
| req := &ntpPacket{Settings: 0x1B} | ||||||||||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This doesn't set TxTime? |
||||||||||
| if err = binary.Write(conn, binary.BigEndian, req); err != nil { | ||||||||||
| log.Warnf(log.TimeMgr, "NTP check: Unable to write to %v. Error %s. Attempting next\n", pools[i], err) | ||||||||||
| if err = conn.Close(); err != nil { | ||||||||||
| log.Errorln(log.TimeMgr, err) | ||||||||||
| } | ||||||||||
| continue | ||||||||||
| } | ||||||||||
|
|
||||||||||
| rsp := &ntpPacket{} | ||||||||||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This might need some packet validation to make sure its correct. Stratum not zero, the mode is correct, and the TxTime txFractions are not zero. |
||||||||||
| if err = binary.Read(conn, binary.BigEndian, rsp); err != nil { | ||||||||||
| log.Warnf(log.TimeMgr, "NTP check: Unable to read from %v. Error: %s. Attempting next\n", pools[i], err) | ||||||||||
| if err = conn.Close(); err != nil { | ||||||||||
| log.Errorln(log.TimeMgr, err) | ||||||||||
| } | ||||||||||
| continue | ||||||||||
| } | ||||||||||
|
|
||||||||||
| // T4: Record time after receiving response (Destination timestamp) | ||||||||||
| t4 := time.Now() | ||||||||||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. destinationTs? |
||||||||||
|
|
||||||||||
| if err = conn.Close(); err != nil { | ||||||||||
| log.Errorln(log.TimeMgr, err) | ||||||||||
| } | ||||||||||
|
Comment on lines
+76
to
+78
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. suggestion: You can always have a sub func then just defer this close which saves on the error pathways, then in the loop just have an error check and continue and/or return value, might be useful for |
||||||||||
|
|
||||||||||
| // T2: Server receive timestamp (when server received our request) | ||||||||||
| t2 := ntpTimestampToTime(rsp.RxTimeSec, rsp.RxTimeFrac) | ||||||||||
| // T3: Server transmit timestamp (when server sent our response) | ||||||||||
| t3 := ntpTimestampToTime(rsp.TxTimeSec, rsp.TxTimeFrac) | ||||||||||
|
|
||||||||||
| // RFC 5905 offset calculation: ((T2-T1) + (T3-T4)) / 2 | ||||||||||
| // This formula cancels out the network round-trip time | ||||||||||
| offset := (t2.Sub(t1) + t3.Sub(t4)) / 2 | ||||||||||
| return offset, nil | ||||||||||
| } | ||||||||||
| return 0, errNoValidNTPServer | ||||||||||
| } | ||||||||||
|
Comment on lines
+28
to
+91
|
||||||||||
|
|
||||||||||
| // ntpTimestampToTime converts timestamp (seconds and fractional) to time.Time | ||||||||||
| func ntpTimestampToTime(seconds, fractional uint32) time.Time { | ||||||||||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This needs a dedicated test |
||||||||||
| unixSeconds := int64(seconds) - ntpEpochOffset | ||||||||||
| nanos := (int64(fractional) * 1.e9) >> 32 | ||||||||||
| return time.Unix(unixSeconds, nanos) | ||||||||||
| } | ||||||||||
|
Comment on lines
+94
to
+98
|
||||||||||
|
|
||||||||||
| // setupNTPManager creates a new NTP manager | ||||||||||
| func setupNTPManager(cfg *config.NTPClientConfig, loggingEnabled bool) (*ntpManager, error) { | ||||||||||
| if cfg == nil { | ||||||||||
|
|
@@ -53,7 +139,7 @@ func (m *ntpManager) Start() error { | |||||||||
| // the default retry limits before giving up | ||||||||||
| check: | ||||||||||
| for i := range m.retryLimit { | ||||||||||
| err := m.processTime() | ||||||||||
| err := m.processTime(context.Background()) | ||||||||||
| switch err { | ||||||||||
| case nil: | ||||||||||
| break check | ||||||||||
|
|
@@ -107,7 +193,7 @@ func (m *ntpManager) run() { | |||||||||
| case <-m.shutdown: | ||||||||||
| return | ||||||||||
| case <-t.C: | ||||||||||
| err := m.processTime() | ||||||||||
| err := m.processTime(context.Background()) | ||||||||||
| if err != nil { | ||||||||||
| log.Errorln(log.TimeMgr, err) | ||||||||||
| } | ||||||||||
|
|
@@ -123,83 +209,92 @@ func (m *ntpManager) FetchNTPTime() (time.Time, error) { | |||||||||
| if atomic.LoadInt32(&m.started) == 0 { | ||||||||||
| return time.Time{}, fmt.Errorf("NTP manager %w", ErrSubSystemNotStarted) | ||||||||||
| } | ||||||||||
| return m.checkTimeInPools(), nil | ||||||||||
| offset, err := m.getTimeOffset(context.Background()) | ||||||||||
| if err != nil { | ||||||||||
| return time.Time{}, err | ||||||||||
| } | ||||||||||
| return time.Now().Add(offset), nil | ||||||||||
| } | ||||||||||
|
|
||||||||||
| // processTime determines the difference between system time and NTP time | ||||||||||
| // to discover discrepancies | ||||||||||
| func (m *ntpManager) processTime() error { | ||||||||||
| // processTime determines the difference between system time and NTP time to discover discrepancies | ||||||||||
| func (m *ntpManager) processTime(ctx context.Context) error { | ||||||||||
| if atomic.LoadInt32(&m.started) == 0 { | ||||||||||
| return fmt.Errorf("NTP manager %w", ErrSubSystemNotStarted) | ||||||||||
| } | ||||||||||
| NTPTime, err := m.FetchNTPTime() | ||||||||||
| offset, err := m.getTimeOffset(ctx) | ||||||||||
| if err != nil { | ||||||||||
| return err | ||||||||||
| } | ||||||||||
| currentTime := time.Now() | ||||||||||
| diff := NTPTime.Sub(currentTime) | ||||||||||
| configNTPTime := m.allowedDifference | ||||||||||
| negDiff := m.allowedNegativeDifference | ||||||||||
| configNTPNegativeTime := -negDiff | ||||||||||
| if diff > configNTPTime || diff < configNTPNegativeTime { | ||||||||||
| log.Warnf(log.TimeMgr, "NTP manager: Time out of sync (NTP): %v | (time.Now()): %v | (Difference): %v | (Allowed): +%v / %v\n", | ||||||||||
| NTPTime, | ||||||||||
| currentTime, | ||||||||||
| diff, | ||||||||||
| configNTPTime, | ||||||||||
| configNTPNegativeTime) | ||||||||||
| if offset > configNTPTime || offset < configNTPNegativeTime { | ||||||||||
| log.Warnf(log.TimeMgr, "NTP manager: Time out of sync (Offset): %v | (Allowed) +%v / %v\n", offset, configNTPTime, configNTPNegativeTime) | ||||||||||
| } | ||||||||||
| return nil | ||||||||||
| } | ||||||||||
|
|
||||||||||
| // checkTimeInPools returns local based on ntp servers provided timestamp | ||||||||||
| // if no server can be reached will return local time in UTC() | ||||||||||
| func (m *ntpManager) checkTimeInPools() time.Time { | ||||||||||
| // getTimeOffset queries NTP servers and returns the calculated time offset | ||||||||||
| // using the RFC5905 formula: offset = ((T2-T1) + (T3-T4)) / 2 | ||||||||||
| // This properly accounts for network round-trip time | ||||||||||
| func (m *ntpManager) getTimeOffset(ctx context.Context) (time.Duration, error) { | ||||||||||
| dialer := &net.Dialer{ | ||||||||||
| Timeout: ntpDialTimeout, | ||||||||||
| } | ||||||||||
|
|
||||||||||
| for i := range m.pools { | ||||||||||
| con, err := net.DialTimeout("udp", m.pools[i], 5*time.Second) //nolint:noctx // TODO: #2006 Use (*net.Dialer).DialContext with (*net.Dialer).Timeout | ||||||||||
| conn, err := dialer.DialContext(ctx, "udp", m.pools[i]) | ||||||||||
| if err != nil { | ||||||||||
| log.Warnf(log.TimeMgr, "Unable to connect to hosts %v attempting next", m.pools[i]) | ||||||||||
| log.Warnf(log.TimeMgr, "Unable to connect to hosts %v attempting to next", m.pools[i]) | ||||||||||
|
||||||||||
| log.Warnf(log.TimeMgr, "Unable to connect to hosts %v attempting to next", m.pools[i]) | |
| log.Warnf(log.TimeMgr, "Unable to connect to hosts %v. Attempting next\n", m.pools[i]) |
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inconsistent error message formatting: "attempting to next" should be "attempting next" to match the pattern used in CheckNTPOffset function.
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inconsistent error message formatting: "Attempting to next" should be "Attempting next" to match the pattern used in CheckNTPOffset function.
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inconsistent error message formatting: "Attempting to next" should be "Attempting next" to match the pattern used in CheckNTPOffset function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in comment: "T4L" should be "T4:"
| // T4L Record time after receiving response (Destination timestamp) | |
| // T4: Record time after receiving response (Destination timestamp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is significant code duplication between getTimeOffset and the new CheckNTPOffset function. The logic is almost identical. To improve maintainability and reduce redundancy, you could refactor getTimeOffset to call CheckNTPOffset. This would centralize the NTP offset calculation logic. Note that this would change some log messages to be consistent with CheckNTPOffset.
func (m *ntpManager) getTimeOffset(ctx context.Context) (time.Duration, error) {
return CheckNTPOffset(ctx, m.pools)
}
Copilot
AI
Dec 19, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is significant code duplication between CheckNTPOffset (lines 28-91) and getTimeOffset (lines 240-300). Both functions perform essentially the same NTP query logic with only minor differences in logging messages. Consider extracting the common NTP query logic into a shared helper function to improve maintainability and reduce duplication.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential nil pointer dereference: If CheckNTPConfig() fails to set the default values for AllowedDifference or AllowedNegativeDifference (which could happen in unusual circumstances), dereferencing these pointers on lines 345-346 would cause a panic. Consider adding nil checks before dereferencing, or ensure CheckNTPConfig is called earlier in the config initialization flow to guarantee these values are set.