Conversation
gossip/handler.go
Outdated
| useless = true | ||
| // Some clients have compatible caps and thus pass discovery checks and seep into the | ||
| // protocol handler. We should ban these clients immediately. |
gossip/handler.go
Outdated
| txChanSize = 4096 | ||
| // percentage of useless peer nodes to allow | ||
| uselessPeerPercentage = 20 // 20% |
Nit: Why don't we just use a factor, e.g. 0.2, instead of having to calculate the percentage each time?
gossip/handler.go
Outdated
| // A useless peer is the one which does not support protocols opera/63 & fsnap/1. | ||
| useless := !eligibleForSnap(p.Peer) | ||
| if !p.Peer.Info().Network.Trusted && useless && h.peers.UselessNum() >= (h.maxPeers*(uselessPeerPercentage/100)) { |
Question: I am not yet familiar with this useless stuff, but why do we even allow a percentage of useless peers at all? Why don't we just disconnect them all?
Well, the peer is useless in the context of sync, i.e. it doesn't support fsnap/1 and opera/63.
But old peers supporting opera/62 should still be allowed to participate.
Ah, so I assume useless has then already checked that the peer is an opera/62 peer, not just any peer. That would make sense.
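One more thing worth double-checking in the condition quoted above: assuming uselessPeerPercentage and 100 are both integer constants, Go evaluates uselessPeerPercentage/100 with integer division, so 20/100 becomes 0 and the threshold is always 0. A minimal sketch of the difference follows; the helper names and the maxPeers value of 50 are hypothetical and only serve the illustration.

```go
package main

import "fmt"

// uselessPeerPercentage mirrors the constant from the diff (20 means 20%).
const uselessPeerPercentage = 20

// thresholdTruncated has the same shape as the expression in the diff.
// The constant sub-expression 20/100 is integer division and evaluates to 0.
func thresholdTruncated(maxPeers int) int {
	return maxPeers * (uselessPeerPercentage / 100)
}

// thresholdScaled multiplies first and divides last, so the percentage survives.
func thresholdScaled(maxPeers int) int {
	return maxPeers * uselessPeerPercentage / 100
}

func main() {
	maxPeers := 50 // assumed value, for illustration only
	fmt.Println(thresholdTruncated(maxPeers)) // 0
	fmt.Println(thresholdScaled(maxPeers))    // 10
}
```

Multiplying before dividing keeps everything in integers; a float factor such as 0.2 would also work but needs explicit conversions around maxPeers.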
gossip/handler.go
Outdated
| return err | ||
| // progress and application | ||
| progressWatchDogTimer := time.NewTimer(noProgressTime) | ||
| applicationWatchDogTimer := time.NewTimer(noAppMessageTime) |
Aren't we recreating the timers on each iteration of the for loop here? If so, the Resets later on have no effect. It looks to me like we either have to create the timers outside the for loop and then Reset them as you do now, or recreate them in each iteration and just break where we currently Reset, although that would leave a lot of timers to be garbage-collected. Or am I missing something?
Oops... the timer should be outside the loop.
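For reference, the "create once, Reset on activity" pattern could look roughly like the sketch below. It is not the handler's actual loop: the watch function, the msgs channel and errNoProgress are hypothetical names, and a single watchdog stands in for both timers.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNoProgress is a hypothetical sentinel used only in this sketch.
var errNoProgress = errors.New("peer not progressing")

// watch returns nil once msgs is closed, or errNoProgress if no message
// arrives within timeout of the previous one.
func watch(msgs <-chan string, timeout time.Duration) error {
	timer := time.NewTimer(timeout) // created once, outside the loop
	defer timer.Stop()
	for {
		select {
		case _, ok := <-msgs:
			if !ok {
				return nil
			}
			// Stop and drain before Reset so a stale tick cannot fire later.
			// Older Go versions need this; it is harmless on newer ones.
			if !timer.Stop() {
				select {
				case <-timer.C:
				default:
				}
			}
			timer.Reset(timeout)
		case <-timer.C:
			return errNoProgress
		}
	}
}

func main() {
	msgs := make(chan string, 2)
	msgs <- "progress"
	msgs <- "progress"
	close(msgs)
	fmt.Println(watch(msgs, time.Second)) // <nil>

	idle := make(chan string) // nothing is ever sent on this channel
	fmt.Println(watch(idle, 10*time.Millisecond)) // peer not progressing
}
```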
| err := h.handleMsg(p) | ||
| if err != nil { | ||
| p.Log().Debug("Message handling failed", "err", err) | ||
| if strings.Contains(err.Error(), errorToString[ErrPeerNotProgressing]) { |
Can we use errors.Is here instead of comparing strings?
We can use errors.Is() only to compare errors. But in this place, the error is defined as a string.
If we want to change it, we should define all the errors as errors.New().
Yes, agreed. If there are more such string-based errors instead of errors.New()-based ones (which I believe would be better), then addressing them should go into a separate PR. So it's up to you whether you want to do anything about it in this PR.
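As a rough idea of what that follow-up could look like, here is a sketch of a sentinel error checked with errors.Is. The names handleMsg and shouldBan are placeholders for this example and do not reflect the real signatures in gossip/handler.go.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrPeerNotProgressing is declared here as a sentinel error value.
// In the PR it is an error code mapped to a string, so this is an assumption.
var ErrPeerNotProgressing = errors.New("peer not progressing")

// handleMsg is a placeholder that wraps the sentinel with extra context via %w.
func handleMsg(stalled bool) error {
	if stalled {
		return fmt.Errorf("epoch unchanged: %w", ErrPeerNotProgressing)
	}
	return nil
}

// shouldBan matches on error identity instead of on the message text.
func shouldBan(err error) bool {
	return errors.Is(err, ErrPeerNotProgressing)
}

func main() {
	fmt.Println(shouldBan(handleMsg(true)))  // true
	fmt.Println(shouldBan(handleMsg(false))) // false
}
```

Unlike strings.Contains, this keeps working if the message wording changes and does not match unrelated errors that happen to contain the same text.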
gossip/handler.go
Outdated
| p.SetProgress(progress) | ||
| // If peer has not progressed for noProgressTime minutes, then disconnect the peer. | ||
| if !p.IsPeerProgressing() { | ||
| return errResp(ErrPeerNotProgressing, "%v: %v %v", "epoch is not progressing for ", noProgressTime, "minutes") |
Nit: As noProgressTime is a duration, this would print "epoch is not progressing for 3m0s minutes", I think
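To illustrate the nit: a time.Duration already prints with its unit, so appending "minutes" doubles it up. The sketch below assumes noProgressTime is 3 minutes (matching the "3m0s" above) and uses plain fmt.Sprintf rather than errResp; the function names are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// noProgressTime is assumed to be 3 minutes for this example.
const noProgressTime = 3 * time.Minute

// redundantUnit appends a unit to a value that already carries one.
func redundantUnit(d time.Duration) string {
	return fmt.Sprintf("epoch is not progressing for %v minutes", d)
}

// plainDuration lets the Duration format itself.
func plainDuration(d time.Duration) string {
	return fmt.Sprintf("epoch is not progressing for %v", d)
}

func main() {
	fmt.Println(redundantUnit(noProgressTime)) // epoch is not progressing for 3m0s minutes
	fmt.Println(plainDuration(noProgressTime)) // epoch is not progressing for 3m0s
}
```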
| return errResp(ErrInvalidMsgCode, "%v", msg.Code) | ||
| } | ||
| if msg.Code != ProgressMsg { |
I am not yet familiar with all message codes, but is ProgressMsg the only message which signals that there is progress?
| func (p *peer) setPeerAsProgressing(x PeerProgress) { | ||
| p.progress = x | ||
| p.progressTime = time.Now() |
Any specific reason why p.appMessageTime is locked, but p.progressTime isn't?
It's locked in SetProgress() where setPeerAsProgressing() is called.
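The arrangement described here, an exported method that takes the lock and an unexported helper that assumes it is already held, can be sketched as follows. The peer type is a stripped-down stand-in: the field names, the uint64 epoch and the Epoch getter are assumptions for this example.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// peer is a simplified, hypothetical stand-in for the real peer type.
type peer struct {
	mu           sync.RWMutex
	epoch        uint64
	progressTime time.Time
}

// SetProgress is the exported entry point: it takes the lock once and
// delegates to the unexported helper.
func (p *peer) SetProgress(epoch uint64) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.setPeerAsProgressing(epoch)
}

// setPeerAsProgressing must only be called with p.mu held. It does not lock
// itself, because sync.RWMutex is not reentrant and locking again would deadlock.
func (p *peer) setPeerAsProgressing(epoch uint64) {
	p.epoch = epoch
	p.progressTime = time.Now()
}

// Epoch reads the field under the read lock.
func (p *peer) Epoch() uint64 {
	p.mu.RLock()
	defer p.mu.RUnlock()
	return p.epoch
}

func main() {
	p := &peer{}
	p.SetProgress(7)
	fmt.Println(p.Epoch()) // 7
}
```

A short comment on the helper stating that the caller must hold the lock would make this contract visible at the point where the fields are written.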
gossip/peer_test.go
Outdated
| newPeer := getPeer() | ||
| ep1 := PeerProgress{Epoch: 1} | ||
| newPeer.SetProgress(ep1) | ||
| time.Sleep(2 * time.Second) //set the threshold to 2 second |
All these Sleeps accumulate to 9 seconds, making test runs 9 seconds slower as I understand it. Isn't there a different way to test this? Do we actually even need to sleep?
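One common way to avoid sleeping in such tests is to inject the clock. The sketch below assumes the peer could take a now function; the peer type, its fields and the progressingAfter helper are hypothetical and not part of the PR.

```go
package main

import (
	"fmt"
	"time"
)

// peer is a hypothetical stand-in whose clock can be replaced in tests.
type peer struct {
	now          func() time.Time
	progressTime time.Time
}

// markProgress records the current (possibly fake) time.
func (p *peer) markProgress() { p.progressTime = p.now() }

// isProgressing reports whether the last progress is more recent than limit.
func (p *peer) isProgressing(limit time.Duration) bool {
	return p.now().Sub(p.progressTime) < limit
}

// progressingAfter marks progress, advances a fake clock by elapsed without
// sleeping, and reports whether the peer still counts as progressing.
func progressingAfter(elapsed, limit time.Duration) bool {
	current := time.Unix(0, 0)
	p := &peer{now: func() time.Time { return current }}
	p.markProgress()
	current = current.Add(elapsed)
	return p.isProgressing(limit)
}

func main() {
	fmt.Println(progressingAfter(1*time.Second, 2*time.Second)) // true
	fmt.Println(progressingAfter(3*time.Second, 2*time.Second)) // false
}
```

In production code the now field would simply be set to time.Now, so the test runs instantly while the real behaviour is unchanged.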
holisticode
left a comment
I am not sure if I should already be the only person approving, but I want to signal that this looks good to me now (at least).
This PR adds checks to identify and ban peers that pass the P2P handshake and are accepted into the application protocol but have other application-level issues.
These checks have shown that peers that are valid and working honestly get priority.
Depends on Fantom-foundation/go-ethereum#44