[version3] Deadlock on node closing #417

@fpelliccioni

(Posted here at Eric's suggestion.)

Hello.
I am experiencing a deadlock when I try to shut down the node (bn) with Ctrl-C or a signal (SIGINT, SIGTERM).

To build the bn executable I ran:

wget https://raw.githubusercontent.com/libbitcoin/libbitcoin-node/version3/install.sh
chmod +x install.sh
sudo ./install.sh --prefix=/opt/libbitcoin --build-boost --disable-shared
# Boost changed to 1.64 because 1.57 fails to compile (multiprecision)

Using the following configuration file:

# cat /home/ubuntu/fer/deadlock/data/deadlock-btc.cfg 

[log]
archive_directory = /home/ubuntu/fer/deadlock/data/log/archive-node-btc-testnet
debug_file = /home/ubuntu/fer/deadlock/data/log/node-btc-testnet-debug.log
error_file = /home/ubuntu/fer/deadlock/data/log/node-btc-testnet-error.log
rotation_size = 100000000
minimum_free_space = 0
verbose = true

[network]
protocol_maximum = 70013
protocol_minimum = 31402
identifier = 118034699
outbound_connections = 50
inbound_connections = 50
inbound_port = 18333
channel_expiration_minutes = 30
hosts_file = /home/ubuntu/fer/deadlock/data/hosts-btc-testnet.cache
seed = testnet-seed.bitcoin.jonasschnelli.ch:18333
seed = seed.tbtc.petertodd.org:18333
seed = testnet-seed.bluematt.me:18333
seed = testnet-seed.bitcoin.schildbach.de:18333
seed = testnet-seed.voskuil.org:18333
self = 0.0.0.0:0

[database]
directory = /home/ubuntu/fer/deadlock/data/database/btc-testnet
flush_writes = false

[blockchain]

# cores = 4

# Testnet-btc
checkpoint = 000000000933ea01ad0ee984209779baaec3ced90fa3f408719526f8d77f4943:0
checkpoint = 00000000009e2958c15ff9290d571bf9459e93b19765c6801ddeccadbb160a1e:100000
checkpoint = 0000000000287bffd321963ef05feab753ebe274e1d78b2fd4e2bfe9ad3aa6f2:200000
checkpoint = 000000000000226f7618566e70a2b5e020e29579b46743f05348427239bf41a1:300000
checkpoint = 000000000598cbbb1e79057b79eef828c495d4fc31050e6b179c57d07d00367c:400000
checkpoint = 000000000001a7c0aaa2630fbb2c0e476aafffc60f82177375b2aaa22209f606:500000
checkpoint = 000000000000624f06c69d3a9fe8d25e0a9030569128d63ad1b704bbb3059a16:600000
checkpoint = 000000000000406178b12a4dea3b27e13b3c4fe4510994fd667d7c1e6a3f4dc1:700000
checkpoint = 0000000000209b091d6519187be7c2ee205293f25f9f503f90027e25abf8b503:800000
checkpoint = 0000000000356f8d8924556e765b7a94aaebc6b5c8685dcfa2b1ee8b41acd89b:900000
checkpoint = 0000000000478e259a3eda2fafbeeb0106626f946347955e99278fe6cc848414:1000000
checkpoint = 00000000001aa0b431dc7f8fa75179b8440bdb671db5ca79e1087faff00c19d8:1050000
checkpoint = 00000000001c2fb9880485b1f3d7b0ffa9fabdfd0cf16e29b122bb6275c73db0:1100000

[fork]
easy_blocks = true
# retarget = true
bip16 = true
bip30 = true
bip34 = true
bip66 = true
bip65 = true
bip90 = true

[node]
relay_transactions=true

... and I have the following Python script to run the node multiple times:

# cat multiple_launchs.py 

import os
import signal
import time
from random import randint
from subprocess import CalledProcessError, Popen, check_output

def get_pid(name):
    # Return the PIDs of all running processes called `name` (empty list if none).
    # A list is built explicitly so that len() also works on Python 3,
    # where map() returns an iterator.
    try:
        return [int(pid) for pid in check_output(["pidof", name]).split()]
    except CalledProcessError:
        return []

def process_is_running(name):
    return len(get_pid(name)) > 0

def launch():
    sout = open('sout-fer.log', 'w')
    serr = open('serr-fer.log', 'w')
    return (sout, serr, Popen(["./bn", "-c", "/home/ubuntu/fer/deadlock/data/deadlock-btc.cfg"], stdout=sout, stderr=serr))

def finalize(p):
    pid = p.pid
    os.kill(pid, signal.SIGINT)

def finalize_and_wait(p):
    finalize(p)
    return p.wait()


def finalize_wait_measure(p):
    finalize(p)
    start_time = time.time()
    p.wait()
    elapsed_time = time.time() - start_time
    return elapsed_time


def wait_for_termination(name):
    while process_is_running(name):
        time.sleep(10)

def main():

    stats = []

    for i in range(0, 100):
        sout, serr, p = launch()
        seconds = randint(10, 120)
        print("******************************* Sleeping for %s seconds..." % (seconds))
        time.sleep(seconds)
        print("******************************* SLEEP FINALIZED *******************************************")


        elapsed_time = finalize_wait_measure(p)
        sout.close()
        serr.close()
        print("******************************* Closing procedure elapsed time: %s seconds..." % (elapsed_time))

        stats.append([i, seconds, elapsed_time])

    print("******************************* STATS *******************************************")
    print(stats)

main()

Initialize the node's blockchain, then run the script:

./bn -i -c /home/ubuntu/fer/deadlock/data/deadlock-btc.cfg
python multiple_launchs.py

(The Python script has to be in the same directory as the bn executable.)
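
Because finalize_wait_measure calls p.wait() without a limit, a deadlocked bn also hangs the test loop. Below is a minimal sketch of a bounded wait that flags the hang instead. It assumes Python 3.3+ (where Popen.wait accepts a timeout) and a POSIX system; the name stop_with_timeout and the 60-second default are hypothetical and not part of the original script.

```python
import signal
import subprocess
import time


def stop_with_timeout(proc, timeout_seconds=60.0):
    """Send SIGINT to proc and wait at most timeout_seconds for it to exit.

    Returns (elapsed_seconds, hung). hung is True when the process did not
    exit in time, which is how a shutdown deadlock would show up here.
    """
    start_time = time.time()
    proc.send_signal(signal.SIGINT)
    try:
        proc.wait(timeout=timeout_seconds)
        hung = False
    except subprocess.TimeoutExpired:
        hung = True
        # Assumption: force-kill so the next iteration can start cleanly.
        proc.kill()
        proc.wait()
    return time.time() - start_time, hung
```

In the loop above, `elapsed_time, hung = stop_with_timeout(p)` could replace the call to finalize_wait_measure, with `hung` appended to the stats row. If thread backtraces are needed, a debugger can be attached to the hung process before it is killed.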

Thanks and regards,
Fernando.
