Clique deadlock resolver script

Abram Symons edited this page Oct 9, 2021 · 4 revisions

Clique, which geth implements as its PoA algorithm, can deadlock when forks of the same length occur between nodes, as described here. The deadlock resolver script helps resolve these deadlocks by rewinding some of the nodes by n/2 + 1 blocks, where n is the number of signers in the network.

This script

  • first checks whether the chain has sealed no new blocks for 30 seconds
  • then waits a random time between 0 and 30 seconds so that not all nodes rewind at the same time
  • stops the miner
  • if the chain is still deadlocked after the sleep, rewinds the chain by setting the head to blockNumber - (n/2 + 1), where n is the number of signers
  • starts the miner to ensure that the node starts syncing correctly again after the rewind
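
The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of deadlock_resolver.py. It assumes a hypothetical `rpc(method, params)` callable that sends a JSON-RPC request to the local geth node and returns the result, that the node exposes the `eth`, `miner`, `clique` and `debug` APIs, and that n/2 means integer division. The names `rewind_target`, `block_number` and `resolve_once` are made up for this sketch.

```python
import random
import time

STALL_SECONDS = 30  # stall window and maximum random delay, from the steps above


def rewind_target(head, signer_count):
    """Block number to rewind to: head - (n/2 + 1), never below zero."""
    return max(0, head - (signer_count // 2 + 1))


def block_number(rpc):
    """Current head block number as an int (geth returns a hex string)."""
    return int(rpc("eth_blockNumber", []), 16)


def resolve_once(rpc, wait=time.sleep, jitter=random.uniform):
    """Run one detection pass; return the rewind target, or None if no rewind."""
    head = block_number(rpc)
    wait(STALL_SECONDS)
    if block_number(rpc) != head:
        return None  # the chain is still sealing blocks

    # Random delay so that not every node rewinds at the same moment.
    wait(jitter(0, STALL_SECONDS))

    rpc("miner_stop", [])
    target = None
    if block_number(rpc) == head:  # still deadlocked after the sleep
        signers = rpc("clique_getSigners", [])
        target = rewind_target(head, len(signers))
        rpc("debug_setHead", [hex(target)])
    # Restart the miner so the node starts syncing again after the rewind.
    rpc("miner_start", [])
    return target
```

With 5 signers and a head stuck at block 100, this sketch would set the head back to block 97. Passing `wait` and `jitter` as parameters keeps the logic easy to exercise without real delays.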

Reproduce deadlock conditions

To reproduce deadlock conditions and test this script, you can increase the chance of a deadlock by:

  • decreasing wiggleTime to milliseconds here to ensure all signers seal new blocks instantly
  • setting the block time to 1 second in the genesis file of the network
  • turning off one of the nodes, because when all nodes are up there is almost no chance of deadlock, as blocks get signed in turn. More nodes can also be turned off to test different conditions, but deadlock becomes impossible when only n/2 + 1 signers are up, as only one node can sign each block in that condition.

With such a configuration you should see roughly one deadlock per hour.
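
As an illustration of the block time change, the helper below sets the Clique period in a genesis document that has already been loaded as a Python dict. It is only a sketch: it assumes the block time lives under `config.clique.period`, as in standard geth Clique genesis files, and the function name `set_block_time` is made up for this example.

```python
import json


def set_block_time(genesis, period_seconds=1):
    """Return a copy of a genesis dict with config.clique.period updated."""
    # Deep copy through a JSON round-trip so the caller's dict is untouched.
    updated = json.loads(json.dumps(genesis))
    clique = updated.setdefault("config", {}).setdefault("clique", {})
    clique["period"] = period_seconds
    return updated
```

The result can be written back with `json.dump` before initializing the test network's nodes from the modified genesis file.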

Run as a service

  • Get the Python script
    deadlock_resolver.py
  • Create the service file
    Create a systemd service file at /etc/systemd/system/deadlock-resolver.service with the following content.
     [Unit]
     Description=IDChain deadlock resolver
     After=network.target
     StartLimitIntervalSec=0
     [Service]
     Type=simple
     Restart=always
     RestartSec=1
     User=root
    
     # Update the paths
     ExecStart=/path/to/python3 -u /path/to/deadlock_resolver.py
     [Install]
     WantedBy=multi-user.target
    
  • Enable and start the newly added service
     $ sudo systemctl daemon-reload
     $ sudo systemctl enable deadlock-resolver.service
     $ sudo systemctl start deadlock-resolver.service
    