There are potentially two sources of stales. One is something wrong with the HDs; the other is the farmer (full node) being overwhelmed by network traffic / peer handling. Since you compared your proof timings and found no differences, that points to an overwhelmed farmer. By "overwhelmed" I don't mean a slow CPU or a slow drive holding the blockchain, but rather poorly written synchronization between the peer and db processes that goes out of whack.
Basically, in v1.2.8 the network protocol / db synchronization was partially broken. The patches added in v1.2.11 didn't address the core problems.
As your CPUs are on the rather weak side, I would drop your peer count down to 10 peers (config.yaml -> full_node / target_peer_count: 10). That will reduce both the network load and the blockchain db r/w. I would hope that with that change your stales will go down to below 0.1%. I would not go (much) below 10 peers, as the fewer peers you have, the more likely your node is to end up with "slow" peers. There is no magic behind either 10 or 80; 10 just reduces the node's resource requirements while still keeping the node afloat and useful to the network.
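For anyone who prefers to script the change rather than edit the file by hand, here is a minimal sketch of the `target_peer_count` edit suggested above. It works on the config text with the standard library only, so comments and ordering in the file are left alone. The function name `set_target_peer_count` and the sample config are made up for illustration, and it assumes the key sits on its own indented line under a top-level `full_node:` section; check your own config.yaml before relying on it.

```python
import re


def set_target_peer_count(config_text: str, new_count: int) -> str:
    """Return config_text with target_peer_count under full_node set to new_count.

    Hypothetical helper: a plain line-by-line edit, not a YAML parser.
    """
    out = []
    section = None  # name of the top-level key we are currently inside
    for line in config_text.splitlines(keepends=True):
        top = re.match(r"^([A-Za-z_][\w-]*):", line)
        if top:
            # An unindented "name:" line starts a new top-level section.
            section = top.group(1)
        elif section == "full_node":
            # Rewrite only the number; keep the indentation and key as they are.
            line = re.sub(
                r"^(\s+target_peer_count:\s*)\d+",
                rf"\g<1>{new_count}",
                line,
            )
        out.append(line)
    return "".join(out)


# Made-up sample that mimics the layout of a config.yaml (not a real config).
sample = (
    "farmer:\n"
    "  port: 8447\n"
    "full_node:\n"
    "  port: 8444\n"
    "  target_peer_count: 80\n"
)
updated = set_target_peer_count(sample, 10)
print(updated)
```

To apply it, read your config.yaml into a string, pass it through the function, write the result back (keep a backup copy first), and restart the node so the new peer count takes effect.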
Excellent, glad it's something tangible; I was losing my mind a bit on this
one. I may try switching the roles of the two servers. The CPU on the
secondary node is better, which may solve this problem without sacrificing
peer count. Will try a few things and respond. Thanks.
(In reply to Jacek-ghub's comment of Tue, Jan 4, 2022, 1:23 AM.)
I noticed that I get intermittent stales on every version after 1.2.7, and I am puzzled by what the issue might be. I have compared the time it takes for a proof to be found, and there is no difference in the average. I know that at some point I will not be able to continue using 1.2.7, so I am hoping to find a solution before then.
It became a problem mainly during the dust storms. 1.2.7 behaves terribly during these events, and I read somewhere that the latest version handles them better. I installed 1.2.11, and indeed it worked better during the storm event. But when the event was over, I noticed that I was still getting a few stales: I was operating at approximately 75% good to 25% stale. I left it for a few hours, but this didn't change. So I switched back to 1.2.7 as a test, and the stales went away. I then installed every later version except 1.2.9, and they all had intermittent stales, so I am assuming this is something that started in 1.2.8. But I see no errors or anything else in my debug logs that would point to the issue.
Here are the systems I am running. I have 2 harvesters. I will refer to the one running the blockchain as the farmer and to the second as the harvester. Both harvesters produce stales when 1.2.x, x > 7, is installed. The chia version currently running is 1.2.7.
Farmer:
Harvester:
Looking for help to debug this issue and hopefully solve it. Thanks.
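To put the numbers in this thread side by side, here is a small sketch that turns good/stale counts into a stale percentage and compares it with the "below 0.1%" figure hoped for in the reply. The function name and the counts are illustrative only: 75 and 25 stand in for the roughly 75% good to 25% stale split described above and are not real log data.

```python
def stale_percent(good: int, stale: int) -> float:
    """Percentage of results that were stale, out of good + stale."""
    total = good + stale
    if total == 0:
        raise ValueError("no results counted")
    return 100.0 * stale / total


# Target taken from the reply in this thread ("below 0.1%").
TARGET_PERCENT = 0.1

# Illustrative counts matching the ~75% good / 25% stale split reported above.
observed = stale_percent(good=75, stale=25)
print(f"{observed:.1f}% stale (hoped-for level: below {TARGET_PERCENT}%)")
print("above target" if observed >= TARGET_PERCENT else "within target")
```

Running the same calculation on counts taken before and after a config change (for example, lowering the peer count) gives a like-for-like way to see whether the change helped.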