|
| 1 | +## Fork detection script, for extra peace of mind |
| 2 | + |
| 3 | +Here we have `detector.py`, which is a relatively crude way of detecting a permanent fork (split) |
| 4 | +in the network if it happens. |
| 5 | + |
| 6 | +The script basically runs the full sync in a loop, checking the node's log output for certain errors |
| 7 | +and comparing its mainchain block ids with those obtained from the API server.\ |
| 8 | +If anything suspicious is detected during the full sync, the script will save the node's data |
| 9 | +directory and log file.\ |
| 10 | +Either way, the script will temporarily ban some of the peers that participated in the sync |
| 11 | +(to give the next iteration a chance to use different peers and to reduce the strain on |
| 12 | +the network) and start the full sync all over again, reusing the peerdb from the previous iteration. |
| 13 | + |
| 14 | +The node is always run with checkpoints disabled, so that it has a chance to find older forks too. |
| 15 | + |
| 16 | +The structure of the script's working directory (specified via the command line): |
| 17 | +- `current_attempt` - this corresponds to the current sync attempt (iteration). |
| 18 | +- `saved_attempts` - this contains subdirectories corresponding to attempts that |
| 19 | + are considered suspicious; each subdirectory's name is the datetime of the moment |
| 20 | + when the attempt was finished. |
| 21 | +- `saved_peer_dbs` - this is where peer dbs from previous attempts are stored; the script |
| 22 | + only needs the one from the latest attempt, but, just in case, the penultimate one is |
| 23 | + also stored. |
| 24 | +- `log.txt` - this is the log of the script itself. |
| 25 | + |
| 26 | +Each attempt's directory has the following structure: |
| 27 | +- `flags` - this directory contains flag files (which are usually zero-length) indicating |
| 28 | + that certain problems were found during the sync. It is what determines whether the attempt's |
| 29 | + directory will be saved in the end (i.e. if the directory is non-empty, the attempt will be saved). |
| 30 | +- `node_data` - this is the node's data directory of this attempt. |
| 31 | +- `node_log.txt` - the node's log. |
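
The rule described above (an attempt is saved only if its `flags` directory is non-empty) can be sketched roughly as follows. This is a minimal illustration, not code taken from `detector.py`: the function names `attempt_is_suspicious` and `save_attempt_if_flagged` and the timestamp format used for the subdirectory name are assumptions.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional


def attempt_is_suspicious(attempt_dir: Path) -> bool:
    """Return True if the attempt's `flags` directory has at least one entry."""
    flags_dir = attempt_dir / "flags"
    return flags_dir.is_dir() and any(flags_dir.iterdir())


def save_attempt_if_flagged(attempt_dir: Path, saved_attempts_dir: Path) -> Optional[Path]:
    """Move a flagged attempt under `saved_attempts_dir`.

    Returns the new location, or None if the attempt had no flags.
    """
    if not attempt_is_suspicious(attempt_dir):
        return None
    # Hypothetical naming scheme; the real script's datetime format may differ.
    name = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
    target = saved_attempts_dir / name
    saved_attempts_dir.mkdir(parents=True, exist_ok=True)
    shutil.move(str(attempt_dir), str(target))
    return target
```

Because only the presence of a flag file matters in this sketch, the files themselves can stay zero-length, as noted above.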
| 32 | + |
| 33 | +Some notes: |
| 34 | +* Currently the script requires Python 3.13 to run, though we may lift this requirement later. |
| 35 | +* When it detects an issue, the script can send an email via the local SMTP server |
| 36 | + (if you're on Linux, search for a Postfix SMTP tutorial to set one up). |
| 37 | +* Even if the script finds a problem (e.g. a checkpoint mismatch), you're still likely |
| 38 | + to end up on the correct chain. To download the actual fork for further investigation, |
| 39 | + you can initiate a separate full sync while using the node's option `--custom-checkpoints-csv-file` |
| 40 | + to override the correct checkpoints with the wrong ones. |
| 41 | +* Once the fork has been downloaded, you'll want to examine the contents of its chainstate db. |
| 42 | + Currently we have the `chainstate-db-dumper` tool that can dump certain info about blocks |
| 43 | + to a CSV file (the most interesting part of it being the ids of pools that continue producing |
| 44 | + blocks on that fork). |
| 45 | +* Once the fork has been investigated you can "permanently" ban the peers that have been sending it |
| 46 | + to you, to prevent it from being reported again and again. To do so, add their IP |
| 47 | + addresses to `permabanned_peers.txt` (one address per line, '#' starts a comment) in the script's |
| 48 | + working directory (it doesn't exist by default, so you'll have to create it first). |
| 49 | + Note that the file is checked on every iteration, so you can update it while the script is already |
| 50 | + running and the changes will take effect when the next iteration starts. |
| 51 | +* The script is likely to fail if a networking error occurs, e.g. if it can't query the API server. |
| 52 | + So, run it in a loop in a shell script (with some delay after each run, to prevent it from spamming |
| 53 | + you with warning emails). |
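
As an illustration of the `permabanned_peers.txt` format described in the notes above (one address per line, '#' starts a comment), here is a minimal parsing sketch. It is not the script's actual implementation: the function name `load_permabanned_peers` is hypothetical, and treating '#' as a comment marker anywhere on a line (not only at its start) is an assumption.

```python
import ipaddress
from pathlib import Path


def load_permabanned_peers(work_dir: Path) -> list[str]:
    """Read `permabanned_peers.txt` from the working directory, if present."""
    path = work_dir / "permabanned_peers.txt"
    if not path.exists():
        # The file is optional and does not exist by default.
        return []
    peers = []
    for line in path.read_text(encoding="utf-8").splitlines():
        # Assumption: everything from '#' to the end of the line is a comment.
        entry = line.split("#", 1)[0].strip()
        if not entry:
            continue
        # Validate and normalize; raises ValueError on a malformed address.
        peers.append(str(ipaddress.ip_address(entry)))
    return peers
```

Calling a function like this at the start of every iteration would match the behavior described above, where edits to the file are picked up when the next iteration starts.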