Commit 8543bec
Merge pull request #1968 from mintlayer/fork_detection_script
Fork detection script
2 parents: fc9cd2c + 4221eb0

File tree: 38 files changed, +1161 -20 lines

build-tools/block-data-plots/collect_data.py
Lines changed: 3 additions & 1 deletion

@@ -1,4 +1,5 @@
 import argparse
+import os
 import subprocess
 from pathlib import Path

@@ -14,6 +15,7 @@

 def collect_data(args):
     if args.output_file is None:
+        os.makedirs(DEFAULT_OUTPUT_DIR, exist_ok=True)
         output_file = DEFAULT_OUTPUT_DIR.joinpath(
             DEFAULT_OUTPUT_FILE_NAME_FMT.format(chain_type=args.chain_type)
         )
@@ -26,7 +28,7 @@ def collect_data(args):
         "--output-file", output_file,
         "--mainchain-only=true",
         "--fields=height,timestamp,target",
-        "--from_height=0"
+        "--from-height=0"
     ]

     if args.node_data_dir is not None:
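The `os.makedirs(DEFAULT_OUTPUT_DIR, exist_ok=True)` call added by this diff is idempotent, which is why it can safely run on every invocation. A minimal standalone sketch (the `plots/data` path is purely illustrative, not from the repo):

```python
import os
import tempfile

# exist_ok=True makes makedirs idempotent: the whole directory tree is
# created on the first call, and later calls are no-ops instead of
# raising FileExistsError.
tmp_root = tempfile.mkdtemp()
output_dir = os.path.join(tmp_root, "plots", "data")
os.makedirs(output_dir, exist_ok=True)  # creates plots/ and plots/data/
os.makedirs(output_dir, exist_ok=True)  # safe to call again
```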
Lines changed: 53 additions & 0 deletions

## Fork detection script, for extra peace of mind

Here we have `detector.py`, a relatively crude way of detecting a permanent fork (split)
in the network, should one happen.

The script runs a full sync in a loop, checking the node's log output for certain errors
and comparing its mainchain block ids with those obtained from the API server.\
If anything suspicious is detected during the full sync, the script saves the node's data
directory and log file.\
In any case, the script temporarily bans some of the peers that participated in the sync
(so that the next iteration has a chance of getting different ones, and to reduce the strain on
the network) and starts the full sync all over again, reusing the peer db from the previous iteration.

The node is always run with checkpoints disabled, so that it has a chance to find older forks too.
The structure of the script's working directory (specified via the command line):
- `current_attempt` - this corresponds to the current sync attempt (iteration).
- `saved_attempts` - this contains subdirectories corresponding to attempts that
  are considered suspicious; each subdirectory's name is the datetime of the moment
  when the attempt was finished.
- `saved_peer_dbs` - this is where peer dbs from previous attempts are stored; the script
  only needs the one from the latest attempt, but, just in case, the penultimate one is
  also stored.
- `log.txt` - this is the log of the script itself.
26+
Each attempt's directory has the following structure:
27+
- `flags` - this directory contains flag files (which are usually zero-length) indicating
28+
that certain problems were found during the sync. It is what determines whether the attempt's
29+
directory will be saved in the end (i.e. if the directory is non-empty, the attempt will be saved).
30+
- `node_data` - this is the node's data directory of this attempt.
31+
- `node_log.txt` - the node's log.
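The "save iff the flags directory is non-empty" rule can be sketched as a small predicate; this is a hypothetical illustration (including the `checkpoint_mismatch` flag name), not the actual `detector.py` code:

```python
from pathlib import Path
import tempfile

def attempt_is_suspicious(attempt_dir: Path) -> bool:
    # The attempt is saved iff its "flags" directory contains at least
    # one flag file (the files themselves are typically zero-length).
    flags_dir = attempt_dir / "flags"
    return flags_dir.is_dir() and any(flags_dir.iterdir())

# Demo: an attempt starts clean and becomes suspicious once flagged.
attempt = Path(tempfile.mkdtemp())
(attempt / "flags").mkdir()
clean = attempt_is_suspicious(attempt)               # no flag files yet
(attempt / "flags" / "checkpoint_mismatch").touch()  # zero-length flag file
flagged = attempt_is_suspicious(attempt)
```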

Some notes:
* Currently the script requires Python 3.13 to run, though we may lift this requirement later.
* The script can send an email via the local SMTP server when it detects an issue
  (if you're on Linux, google for an SMTP Postfix tutorial to set one up).
* Even if the script finds a problem (e.g. a checkpoint mismatch), you're still likely
  to end up on the correct chain. To download the actual fork for further investigation,
  you can initiate a separate full sync using the node's `--custom-checkpoints-csv-file` option
  to override the correct checkpoints with the wrong ones.
* Once the fork has been downloaded, you'll want to examine the contents of its chainstate db.
  Currently we have the `chainstate-db-dumper` tool, which can dump certain info about blocks
  to a CSV file (the most interesting part being the ids of pools that continue producing
  blocks on that fork).
* Once the fork has been investigated, you can "permanently" ban the peers that have been sending it
  to you, to prevent it from being reported again and again. To do so, add their ip
  addresses to `permabanned_peers.txt` (one address per line, '#' starts a comment) in the script's
  working directory (the file doesn't exist by default, so you'll have to create it first).
  Note that the file is re-read on every iteration, so you can update it while the script is
  running and the changes will take effect when the next iteration starts.
* The script is likely to fail if a networking error occurs, e.g. if it can't query the API server.
  So, run it in a loop in a shell script (with some delay after each run, to prevent it from spamming
  you with warning emails).
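The `permabanned_peers.txt` format described above (one address per line, `'#'` starts a comment, the file may be absent) could be parsed like this; a hypothetical sketch, not the actual `detector.py` parser:

```python
import os
import tempfile

def read_permabanned_peers(path):
    # Hypothetical parser: one address per line, '#' starts a comment,
    # blank lines ignored; a missing file means "no permabanned peers".
    try:
        with open(path) as f:
            stripped = (line.split("#", 1)[0].strip() for line in f)
            return [addr for addr in stripped if addr]
    except FileNotFoundError:
        return []  # the file doesn't exist by default

# Demo with a throwaway file (the addresses are made up).
demo = os.path.join(tempfile.mkdtemp(), "permabanned_peers.txt")
with open(demo, "w") as f:
    f.write("# known fork sources\n1.2.3.4\n5.6.7.8  # trailing comment\n\n")
peers = read_permabanned_peers(demo)
```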
