https://whois.toolforge.org/ has been unstable for weeks. It goes down a couple of times a day, although it usually gets restarted after a period of unavailability.
Here is what appears to be the relevant part of the error log (errors.txt):
2021-08-13 10:44:43: (server.c.1863) connection closed - keep-alive timeout: 249
2021-08-13 10:44:43: (server.c.1863) connection closed - keep-alive timeout: 243
2021-08-13 10:44:43: (server.c.1863) connection closed - keep-alive timeout: 241
2021-08-13 10:44:43: (server.c.1863) connection closed - keep-alive timeout: 245
2021-08-13 10:44:43: (server.c.1863) connection closed - keep-alive timeout: 255
2021-08-13 10:44:43: (server.c.1863) connection closed - keep-alive timeout: 229
2021-08-13 10:44:43: (server.c.1863) connection closed - keep-alive timeout: 213
2021-08-13 10:44:43: (server.c.1863) connection closed - keep-alive timeout: 237
2021-08-13 10:44:43: (server.c.1863) connection closed - keep-alive timeout: 251
2021-08-13 10:44:43: (server.c.1863) connection closed - keep-alive timeout: 197
2021-08-13 10:44:43: (server.c.1863) connection closed - keep-alive timeout: 195
2021-08-13 10:44:43: (server.c.1863) connection closed - keep-alive timeout: 157
2021-08-13 10:44:43: (server.c.1863) connection closed - keep-alive timeout: 165
2021-08-13 10:44:43: (server.c.1863) connection closed - keep-alive timeout: 173
2021-08-13 10:44:43: (mod_cgi.c.1049) CGI pid 8284 died with signal 9
2021-08-13 10:44:44: (mod_cgi.c.1049) CGI pid 8308 died with signal 9
2021-08-13 10:44:44: (mod_cgi.c.1049) CGI pid 8312 died with signal 9
2021-08-13 10:44:44: (mod_cgi.c.1049) CGI pid 8322 died with signal 9
2021-08-13 10:44:44: (mod_cgi.c.1049) CGI pid 8282 died with signal 9
2021-08-13 10:44:44: (mod_cgi.c.1049) CGI pid 8297 died with signal 9
2021-08-13 10:44:44: (mod_cgi.c.1049) CGI pid 8315 died with signal 9
2021-08-13 10:44:45: (mod_cgi.c.1049) CGI pid 8283 died with signal 9
2021-08-13 10:44:46: (gw_backend.c.335) child signalled: 9
2021-08-13 10:44:47: (gw_backend.c.335) child signalled: 9
2021-08-13 10:44:47: (gw_backend.c.476) unlink /var/run/lighttpd/php.socket.whois-1 after connect failed: Connection refused
2021-08-13 10:44:48: (mod_cgi.c.1049) CGI pid 8301 died with signal 9
2021-08-13 10:44:48: (mod_cgi.c.1049) CGI pid 8291 died with signal 9
2021-08-13 10:44:48: (gw_backend.c.476) unlink /var/run/lighttpd/php.socket.whois-0 after connect failed: Connection refused
2021-08-13 10:44:49: (server.c.1863) connection closed - keep-alive timeout: 211
2021-08-13 10:44:49: (server.c.1863) connection closed - keep-alive timeout: 203
2021-08-13 10:44:49: (server.c.1863) connection closed - keep-alive timeout: 155
2021-08-13 10:44:49: (mod_cgi.c.1049) CGI pid 8268 died with signal 9
2021-08-13 10:44:49: (mod_cgi.c.1049) CGI pid 8298 died with signal 9
2021-08-13 10:44:49: (mod_cgi.c.1049) CGI pid 8307 died with signal 9
2021-08-13 10:44:49: (mod_cgi.c.1049) CGI pid 8265 died with signal 9
2021-08-13 10:44:50: (server.c.1863) connection closed - keep-alive timeout: 181
2021-08-13 10:44:50: (server.c.1863) connection closed - keep-alive timeout: 231
2021-08-13 10:44:50: (server.c.1863) connection closed - keep-alive timeout: 217
2021-08-13 10:44:50: (server.c.1863) connection closed - keep-alive timeout: 153
2021-08-13 10:44:51: (gw_backend.c.335) child signalled: 9
2021-08-13 10:44:51: (gw_backend.c.476) unlink /var/run/lighttpd/php.socket.whois-1 after connect failed: Connection refused
2021-08-13 10:44:51: (gw_backend.c.335) child signalled: 9
2021-08-13 10:44:51: (gw_backend.c.476) unlink /var/run/lighttpd/php.socket.whois-0 after connect failed: Connection refused
(The linked file contains a larger portion from the log file.)
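For anyone who wants a quick overview of the longer excerpt, here is a minimal sketch that tallies the log lines by the source file that emitted them and collects the PIDs of CGI processes that died with signal 9 (SIGKILL). The function name `tally` and the regular expressions are my own assumptions based only on the line format shown above. The script does not tell us who sent the signal; it only summarises what the log already says.

```python
import re
from collections import Counter

# Matches lines in the format seen above, for example:
# 2021-08-13 10:44:43: (mod_cgi.c.1049) CGI pid 8284 died with signal 9
# Groups: timestamp, source file (without the line number), message.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}): \((\S+?)\.\d+\) (.*)$"
)
KILLED_RE = re.compile(r"CGI pid (\d+) died with signal 9")


def tally(lines):
    """Count log lines per source file and list CGI PIDs killed by signal 9."""
    by_source = Counter()
    killed_pids = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that is not in the expected format
        _timestamp, source, message = match.groups()
        by_source[source] += 1
        killed = KILLED_RE.search(message)
        if killed:
            killed_pids.append(int(killed.group(1)))
    return by_source, killed_pids


if __name__ == "__main__":
    # Three lines copied from the excerpt above; in practice, pass an open
    # file object for errors.txt instead of this list.
    sample = [
        "2021-08-13 10:44:43: (server.c.1863) connection closed - keep-alive timeout: 173",
        "2021-08-13 10:44:43: (mod_cgi.c.1049) CGI pid 8284 died with signal 9",
        "2021-08-13 10:44:46: (gw_backend.c.335) child signalled: 9",
    ]
    counts, pids = tally(sample)
    print(dict(counts))
    print(pids)
```

Running it over the full errors.txt should show how many CGI processes are killed in each burst, which might help narrow down whether the outages line up with spikes in traffic.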
Here is my tentative plan. Coincidentally, I have been preparing a Flask version of the tool, inspired by @wiki-ST47's fork, and it's ready to be tested (#20). If it handles automated traffic to the JSON endpoint well (and assuming that traffic is the cause), the new version could be the solution.
I can't tell from the logs above what is actually happening, so the replacement might or might not solve it. If there is an identifiable cause, I can work on that (and work on the switch independently, perhaps later).