
Conversation


@pauljeannot pauljeannot commented Oct 4, 2024

Hi,

This PR addresses the issue reported here: #3050.

I have reverted the log level back to warning, as it was previously.

Thanks!

@drichardson

Can this get merged in @benoitc? This error message causes a lot of error log false positive noise during deployments where SIGTERMs are common.

@obi-jerome

Hi!
I would love to see this PR merged too @benoitc.
Please please. ^_^


panterz commented Apr 9, 2025

Any updates on this? Thanks

@shetty-777

Please merge this...
Is that what should be done? In gunicorn 23.0.0, arbiter.py logs this at error level, not warning.

Contributor

pajod commented Apr 10, 2025

For reasons already pointed out in the related issue thread I believe this should not be merged. An active worker being signaled really is an error condition and should be reported as such.

  • Where this generally happens, it is caused by explicit administrative action. Gunicorn can and should be shut down properly, by signalling the master, without triggering an error.
  • In typical scenarios, fixing the shutdown procedure is easy by switching out the overly broad kill command, e.g.:
    • kill -TERM $(cat /var/run/gunicorn.pid) works without error
    • pkill --oldest -TERM "gunicorn" works without error
    • systemctl stop gunicorn.service works without error, provided $MAINPID is properly tracked (I have made suggestions on how this can be made more reliable in #3285)
    • docker kill also has a mechanism for targeting the "main process inside the container"
  • Where this can be triggered in rare race conditions, it should be diagnosed and fixed instead of being made less prominent in logs.

Gunicorn documentation also can and should be improved on how to get this right. But the Gunicorn arbiter should keep reporting with the appropriate log level when a worker runs into a fatal problem.
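To illustrate why targeting the right process matters, here is a small shell sketch (plain `sh`, no gunicorn involved) of the difference between a process that traps SIGTERM and one that does not:

```shell
# Toy demonstration (plain sh, not gunicorn): a process that traps SIGTERM
# exits cleanly, while one that does not is killed by the signal and its
# parent sees exit status 128+15 = 143.
sh -c 'trap "exit 0" TERM; sleep 30 & wait' &   # "worker" with a handler
trapped=$!
sh -c 'sleep 30' &                              # "worker" without a handler
untrapped=$!
sleep 1                                         # let both start and install the trap

kill -TERM "$trapped" "$untrapped"
wait "$trapped";   echo "trapped exit:   $?"    # prints: trapped exit:   0
wait "$untrapped"; echo "untrapped exit: $?"    # prints: untrapped exit: 143
```

This is the same distinction the arbiter reacts to: a worker that handles the signal exits cleanly, while one terminated by an uncaught signal is reported as such.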

@shetty-777

> For reasons already pointed out in the related issue thread I believe this should not be merged. An active worker being signaled really is an error condition and should be reported as such. […]

Thank you for being responsive and replying. Firstly, I am a complete beginner just using gunicorn to serve my little blog site, so I don't really know what's going on here. It was working and now it isn't.

  1. The suggestions you made in #3285 (Retain argv + fds on re-exec, fix USR2 under systemd by notifying new PID) seem extremely helpful, and I hope they get implemented.
  2. So fixing this problem without downgrading the log level is not something I, the user, should be doing?
  3. If everything works without issues when the exception is ignored, can it really be "Fatal"?

I just forked gunicorn and changed error to warning in arbiter.py, and my problem is solved. I know it is a dirty fix, but it works.
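For what it's worth, the same effect can usually be had without maintaining a fork, by demoting that one message from a gunicorn config file via a logging filter. This is a hedged sketch: the message substring `"was sent SIG"` and the `server.log.error_log` attribute are assumptions about gunicorn's internals, so verify them against your version before relying on it.

```python
# gunicorn.conf.py -- sketch of demoting the arbiter's "Worker ... was sent
# SIGTERM!" error to a warning, without forking gunicorn.
# ASSUMPTIONS to verify against your gunicorn version: the message contains
# "was sent SIG", and the gunicorn logger exposes an `error_log` attribute.
import logging

class DemoteWorkerSignalMessage(logging.Filter):
    """Relabel the arbiter's 'worker was sent a signal' error as a warning."""
    def filter(self, record):
        if record.levelno == logging.ERROR and "was sent SIG" in record.getMessage():
            record.levelno = logging.WARNING
            record.levelname = logging.getLevelName(logging.WARNING)
        return True  # never drop the record, only relabel it

def on_starting(server):
    # gunicorn calls this server hook once, in the master process.
    server.log.error_log.addFilter(DemoteWorkerSignalMessage())
```

Unlike a fork, this keeps every other error at its original level and survives gunicorn upgrades (as long as the message text does not change).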


ryuichi1208 commented Oct 24, 2025

@pajod

> Where this generally happens, it is caused by explicit administrative action. Gunicorn can and should be shutdown properly - by signalling the master, without triggering an error.

Isn't it possible that what's happening now is that after sending SIGTERM to the master, the SIGTERM is propagated to the child process, resulting in an error? Is it possible to determine whether sending SIGTERM directly to the child process is a problem?

Contributor

pajod commented Oct 24, 2025

@ryuichi1208

> Isn't it possible that what's happening now is that after sending SIGTERM to the master, the SIGTERM is propagated to the child process, resulting in an error?

You are thinking of just "receiving SIGTERM", but that is not what reaches the `WIFSIGNALED(waitpid(..))` branch. That branch is triggered when the master unexpectedly learns that the child process received a signal and failed to catch it. Which is not what should happen (and, generally, not what does happen) on an ordinary shutdown.

Is it possible to determine whether sending SIGTERM directly to the child process is a problem?

For the master process, that distinction is easy. The answer is yes. It is a problem, by default, if a live worker is terminated by a signal it could and should have handled.
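The `WIFSIGNALED(waitpid(..))` distinction described above can be sketched in a few lines of Python (a toy demonstration of the OS mechanism, not gunicorn's arbiter code):

```python
# Toy demonstration (not gunicorn's arbiter code): a child that catches
# SIGTERM exits normally, while a child that does not is reported by
# waitpid() as killed by the signal (WIFSIGNALED is true).
import os
import signal
import time

def status_after_sigterm(install_handler):
    pid = os.fork()
    if pid == 0:  # child
        if install_handler:
            # Catch SIGTERM and turn it into a clean exit, as a worker should.
            signal.signal(signal.SIGTERM, lambda signum, frame: os._exit(0))
        time.sleep(30)  # wait to be signalled
        os._exit(0)
    time.sleep(0.5)  # parent: give the child time to install its handler
    os.kill(pid, signal.SIGTERM)
    _, status = os.waitpid(pid, 0)
    return status

print(os.WIFSIGNALED(status_after_sigterm(True)))   # False: child exited cleanly
print(os.WIFSIGNALED(status_after_sigterm(False)))  # True: uncaught signal killed it
```

Only the second case, where the child never handled the signal, takes the branch the arbiter reports as an error.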
