
Revise atexit uwsgi integration #2305

@xrmx


In uwsgi 2.0.27 the graceful reload code changed and the atexit callbacks are now run. On uwsgi we use its own atexit handler instead of the Python atexit handler we use elsewhere.

We register our close function in this atexit handler in order to signal the APM server that we are done sending data.
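
For readers unfamiliar with the two mechanisms, here is a minimal sketch of that kind of registration. It is not the agent's actual code: `register_close` is a hypothetical helper name, and the only uwsgi API it relies on is the `uwsgi.atexit` attribute, which is available only inside a uwsgi worker.

```python
import atexit
import importlib


def register_close(close):
    """Register `close` to run at process exit and return the mechanism used.

    Hypothetical helper for illustration only.
    """
    try:
        # The `uwsgi` module can only be imported inside a uwsgi worker.
        uwsgi = importlib.import_module("uwsgi")
    except ImportError:
        # Not running under uwsgi: use the Python atexit handler.
        atexit.register(close)
        return "python"

    # Keep any handler the application had already installed.
    previous = getattr(uwsgi, "atexit", None)

    def chained():
        if callable(previous):
            previous()
        close()

    # uwsgi calls this attribute when the worker exits.
    uwsgi.atexit = chained
    return "uwsgi"
```

Outside uwsgi the helper falls back to `atexit.register`, which matches the "Python atexit handler we use elsewhere" case.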

On uwsgi with chain reload this leads to errors, probably caused by a race condition with the OpenSSL atexit code, and we get this error:

Fri May 30 09:25:51 2025 - *** /tmp/uwsgi.reload has been touched... chain reload !!! ***
Fri May 30 09:25:51 2025 - chain next victim is worker 1
Gracefully killing worker 1 (pid: 1220461)...
Failed to submit message: "Unable to reach APM Server: HTTPSConnectionPool(host='sdsad-ddf831.apm.us-east-1.aws.elastic.cloud', port=443): Max retries exceeded with url: /intake/v2/events (Caused by SSLError(SSLError(0, 'unknown error (_ssl.c:3036)'))) (url: https://sdsad-ddf831.apm.us-east-1.aws.elastic.cloud:443/intake/v2/events)"
worker 1 killed successfully (pid: 1220461)
Respawned uWSGI worker 1 (new pid: 1220518)
Fri May 30 09:25:52 2025 - chain is still waiting for worker 1...
WSGI app 0 (mountpoint='') ready in 0 seconds on interpreter 0x7e17060b0e30 pid: 1220518 (default app)
Fri May 30 09:25:55 2025 - chain reloading complete

Using the Python atexit handler instead gives a less scary message:

Fri May 30 09:20:50 2025 - *** /tmp/uwsgi.reload has been touched... chain reload !!! ***
Fri May 30 09:20:50 2025 - chain next victim is worker 1
Gracefully killing worker 1 (pid: 1219067)...
Failed to submit message: 'Connection to APM Server timed out (url: https://sdsad-ddf831.apm.us-east-1.aws.elastic.cloud:443/intake/v2/events, timeout: 5 seconds)'
worker 1 killed successfully (pid: 1219067)
Respawned uWSGI worker 1 (new pid: 1219141)
Fri May 30 09:20:54 2025 - chain is still waiting for worker 1...
WSGI app 0 (mountpoint='') ready in 1 seconds on interpreter 0x7cbea2eb0e30 pid: 1219141 (default app)

We should investigate whether we can detect this and switch to the proper atexit implementation or, if we can't, maybe add a configuration option.
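
One possible shape for that detection, purely as a sketch to discuss: decide based on the uwsgi version, with an explicit override standing in for the configuration option. The function name and the `override` parameter are hypothetical. It assumes the caller passes something like `uwsgi.version_info` as a tuple whose first three entries are integers. Whether 2.0.27 is the right cutoff, and whether the Python handler is the right choice above it, is exactly what still needs investigating.

```python
def use_python_atexit(version_info, override=None):
    """Return True if the Python atexit handler should be used under uwsgi.

    Hypothetical policy for illustration only.
    `version_info` is assumed to start with three integers (major, minor, patch).
    `override` stands in for a possible configuration option: True or False
    forces the choice, None means "decide from the version".
    """
    if override is not None:
        return bool(override)
    # Assumption: the changed graceful reload behaviour starts at 2.0.27.
    return tuple(version_info[:3]) >= (2, 0, 27)
```

With this policy, `use_python_atexit((2, 0, 26))` keeps the uwsgi handler, while `use_python_atexit((2, 0, 27))` switches to the Python one unless the override says otherwise.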

uwsgi config to reproduce:

[uwsgi]
module = djangouwsgi.wsgi
http = :8000
enable-threads = true
chain-reload = true
touch-chain-reload = /tmp/uwsgi.reload
pidfile = /tmp/uwsgi.pid
processes = 2
threads = 2
master = true
vacuum = true
