Replies: 7 comments
-
I continue to get this error message. Do I need to do something to correct it, or not worry about it?
2021-09-09T19:21:59.376 full_node full_node_server : ERROR Exception: Error short batch syncing, could not fetch block at height 836993 <class 'ValueError'>, closing connection {'host': '98.249.137.155', 'port': 16664}. Traceback (most recent call last):
-
I'm getting this error message now after updating to the most recent version (1.2.6).
-
Every now and then an error happens: the peer you requested data from may have crashed, lost its internet connection, or just plain closed down Chia. You will see this on your side. As long as it is a "short sync" error, your node is relatively close to the peak height of the chain, and it will try to get the block from another peer...
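To make that retry behaviour concrete, here is a minimal, self-contained Python sketch (not Chia's actual code; the Peer class, the second peer's address, and the block contents are made up for illustration) of a short sync that asks one peer for the missing block and falls back to the next peer if that request fails:

```python
from typing import List, Optional


class Peer:
    """A stand-in for a network peer; real peers answer over the wire."""

    def __init__(self, host: str, has_block: bool) -> None:
        self.host = host
        self.has_block = has_block

    def fetch_block(self, height: int) -> Optional[bytes]:
        # Simulate a peer that either still has the block or has gone away.
        return f"block-{height}".encode() if self.has_block else None


def short_sync(height: int, peers: List[Peer]) -> bytes:
    """Try each connected peer in turn until one supplies the missing block."""
    for peer in peers:
        block = peer.fetch_block(height)
        if block is not None:
            return block
        # This peer crashed, lost internet, or shut down chia: try the next one.
    raise RuntimeError(f"no peer could supply the block at height {height}")


print(short_sync(836993, [Peer("98.249.137.155", False), Peer("10.0.0.2", True)]))
```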
-
None of the conditions you listed is actually an error (peer crashed, ...), as the requesting node is happily doing what it should be doing. Add to that the fact that the node still has other peers to ask for the data, and it becomes even less of an error: just a warning while going through the data-request loop. Once the loop is exhausted, it may be an error with the local setup (connection down, ...), and at that point a different error should be kicked off stating that the node is potentially isolated. The fact that we see a Traceback output there means that the code doesn't handle that case at all, and barfs instead of smoothly moving on to the next peer. Maybe that ERROR should be reclassified to something no higher than INFO level, as it is clearly only debugging info for the engineering/QA teams and has no value for a node owner in the field. It just pollutes the logs and makes end users less sensitive to the warnings/errors they do see.
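As an illustration of the logging policy argued for here (my own sketch, not a patch against the Chia code base; the function name and the peer list are invented), a per-peer failure inside the request loop could be recorded at INFO while the loop moves on, with ERROR reserved for the case where the whole peer list is exhausted and the problem is likely local:

```python
import logging
from typing import Dict, Optional

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("full_node")


def fetch_from_peers(height: int, peers: Dict[str, bool]) -> Optional[bytes]:
    for host, reachable in peers.items():
        if reachable:
            return b"block"
        # Expected, well-known condition: this peer is gone. Log it and move on.
        log.info("peer %s could not supply block %d, trying the next peer", host, height)
    # Every peer failed: this likely points at a local problem (ISP, router, OS, ...).
    log.error("all peers exhausted, could not fetch block %d; node may be isolated", height)
    return None


fetch_from_peers(836993, {"98.249.137.155": False, "188.187.42.11": False})
```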
-
@Jacek-ghub I'm not entirely sure how you mean it isn't an error?
That's an example of a sync attempt not succeeding, although in a different flavor than the one DDMurr reported above (this is an INFO-level log from 1.2.8). As you can see, it does end up with ERROR entries from the full_node.
-
"every now and then, an error happens, it can be the peer that you have requested data from crashed, lost internet, or just plain closed down chia" Agreed that errors happen no questions about that, but not every condition is an error. The fact that "the peer that you have requested data from crashed" is that peer problem/error, not really mine, as my node is still doing fine, and have other peers to work with. P2P protocol is based on flaky peers, as such one peer disappearing is just a part of a standard procedure, and you move on trying another peer. Only once you have exhausted the list of peers and still have no data you were looking for (i.e., from the local node point of view, all peers have irrecoverable issues), this becomes a mine error/problem, as it suggest that the local node has issues (ISP, router, Ethernet, OS, ...), and an error is needed to mitigate those issues. Going with your notion that somehow it is still an error. What the local node can do in such case? What the user can do to react to such error? There is no way to contact such remote node as it is more or less hidden who is behind that connection (as you said such peer could be just shut down, crashed, lost connection)? There is nothing that the local node can do, but as P2P specifies, it should quietly move to the next peer, and just request the same data. I would also check why there is that Traceback there. Why that code spits out exception. If that code would be handling the errors via catching those exceptions, then as above, it should be quiet, as it is a well know condition, so no need to issue errors/warnings/... This is apparently not the case, and suggests that the offending code basically barfs there. One more take on this error is that such condition could be considered as error if:
Looks like here we have #1 case, as the system is running smoothly. It is also clearly not #2 condition, as we are told that this is not an issue at all. If you check what code assertiveness means, is you log issues that you don't like (but you never call them errors), and are out of your control, as it helps to focus on the code parts that potentially lead to such condition (and you are off the hook for such problems). If on the other hand you just spit messages restating that the code is not properly handling some issues, we still don't know where the source of those errors are, as the main focus is on the offending code. That also implies that such case could be handled, to at least mitigate the problem, and therefore no need for such errors. Also, saying that this is an error, because it is in the log is a bit circular, isn't it? Looking at that section, I really don't understand whey the fourth line is an error. There is no error condition there, the code just decided to ban that peer, and is moving on to the next one. Shouldn't that be an INFO level message? The same with the sixth line, it is just INFO not-syncing condition, and the code moves to "long sync done" state, where you could inform that either it was not successful, and the next node will be tried. Of if that was the end of the loop, that should say that the loop got exhausted, and no data was received, and that would warrant ERROR condition, as most likely the problem is local. One more thing. The code that you have shown is a clean code. Just logged messages. On the other hand, the OP code has Trackbacks, as such that code just barfed there, and whichever part caught that exception is trying to recover, and is also spitting that error without the understanding of what condition was really the issue (absolutely, this is an error on this level, but not on the one that barfed). You can see how many levels that code that caught that exception is away for the offending one. |
Beta Was this translation helpful? Give feedback.
-
@DDMurr, the newest version of chia has fixes in place to resolve this issue. Please download it from here: https://www.chia.net/downloads/ Since this issue has been resolved with the latest version(s) of chia, we will be closing this ticket, but if we have closed it in error, do not hesitate to reach out to us again with any follow-up questions or comments, or if the issue persists after an update.
The best place to reach our support team is on Discord (https://discord.gg/chia) or by reopening this ticket.
-
2021-09-01T20:13:04.497 full_node full_node_server : ERROR Exception: Error short batch syncing, could not fetch block at height 800024 <class 'ValueError'>, closing connection {'host': '188.187.42.11', 'port': 16664}. Traceback (most recent call last):
File "chia\server\server.py", line 563, in api_call
File "asyncio\tasks.py", line 442, in wait_for
File "chia\server\server.py", line 560, in wrapped_coroutine
File "chia\server\server.py", line 553, in wrapped_coroutine
File "chia\full_node\full_node_api.py", line 106, in new_peak
File "chia\full_node\full_node.py", line 411, in new_peak
File "chia\full_node\full_node.py", line 247, in short_sync_batch
ValueError: Error short batch syncing, could not fetch block at height 800024
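For readers wondering where that ValueError comes from, here is a simplified reconstruction (my own assumptions based only on the traceback above, not the actual chia source): a helper deep in the call chain raises ValueError when the requested block cannot be fetched, and a generic wrapper several frames up catches it, logs the ERROR line with the full traceback, and closes the peer connection.

```python
import logging
import traceback

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(name)s : %(levelname)s %(message)s")
log = logging.getLogger("full_node_server")


def short_sync_batch(height: int) -> None:
    block = None  # pretend the peer returned nothing for this height
    if block is None:
        raise ValueError(f"Error short batch syncing, could not fetch block at height {height}")


def api_call(height: int, peer: dict) -> None:
    try:
        short_sync_batch(height)
    except Exception as e:
        # The outermost wrapper only knows "something went wrong": it logs the
        # message, the exception type, the peer it is disconnecting, and the
        # full traceback, which is what shows up in the node owner's log.
        log.error("Exception: %s %s, closing connection %s. Traceback: %s",
                  e, type(e), peer, traceback.format_exc())


api_call(800024, {"host": "188.187.42.11", "port": 16664})
```

Running this prints a single ERROR line very similar to the one quoted above, which is consistent with the point made earlier in the thread: the handler that produces the log entry sits several call levels away from the code that actually hit the problem.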