Conversation

lforst commented Dec 23, 2025

Adds an option to delay shutdown when receiving SIGTERM. This is useful in certain cloud environments (AWS ECS, Kubernetes) where the load balancer needs time to deregister the service before the process terminates, reducing the risk of stray connections being rejected by PostgREST while load-balancing updates propagate.

Config: server-shutdown-wait-period (in seconds, default is 0)
Env: PGRST_SERVER_SHUTDOWN_WAIT_PERIOD
Only affects SIGTERM; SIGINT still terminates immediately.
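
For example, set it in the config file or via the environment (a sketch; the 30-second value is illustrative):

```
# postgrest.conf
server-shutdown-wait-period = 30   # seconds to wait after SIGTERM before shutting down

# or equivalently via environment variable:
#   PGRST_SERVER_SHUTDOWN_WAIT_PERIOD=30
```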

There seems to be some prior art around this:

steve-chavez (Member) commented Dec 23, 2025

@lforst Looks reasonable to add 👍. This article helped me to understand the problem.

To fix the CI failures, you can run `nix-shell --run postgrest-lint` and `nix-shell --run postgrest-style` from inside the PostgREST directory. For the commit-style failure, make sure all commits have a prefix (in this case, doing a squash would solve it).

@steve-chavez (Member)

We would also need to add the config to the docs here:

server-port
-----------

=============== =================================
**Type**        Int
**Default**     3000
**Reloadable**  N
**Environment** PGRST_SERVER_PORT
**In-Database** `n/a`
=============== =================================

The TCP port to bind the web server. Use ``0`` to automatically assign a port.

.. _server-trace-header:

server-trace-header
-------------------

=============== =================================
**Type**        String
**Default**     `n/a`
**Reloadable**  Y
**Environment** PGRST_SERVER_TRACE_HEADER
**In-Database** pgrst.server_trace_header
=============== =================================

The header name used to trace HTTP requests. See :ref:`trace_header`.
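
A docs entry for the new option could mirror the entries above. A sketch only; the **Reloadable** and **In-Database** values are my assumptions, and the final wording is up to the PR:

```rst
.. _server-shutdown-wait-period:

server-shutdown-wait-period
---------------------------

=============== ====================================
**Type**        Int
**Default**     0
**Reloadable**  N
**Environment** PGRST_SERVER_SHUTDOWN_WAIT_PERIOD
**In-Database** `n/a`
=============== ====================================

Number of seconds to wait after receiving SIGTERM before starting shutdown.
Only affects SIGTERM; SIGINT still terminates immediately.
```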

@wolfgangwalther (Member)

This article helped me to understand the problem.

My understanding of k8s services, ingresses and nginx-ingress is different and I believe the article is wrong about it. The nginx configuration does not contain the actual different endpoints, it uses the DNS name of the service, which is immediately switched over by kubernetes.

I don't buy the argument yet for why this is the right place to fix something that seems to need a fix at a different level.

Also note that the article itself says:

It's worth noting that we first hit this problem over 3 years ago now, so my understanding and information may be a little out of date. In particular, this section in the documentation implies that this should no longer be a problem! [...]

On a fundamental level, any tool for rolling deployment should be able to verify that the new pod is up and running, that the new pod is successfully routed to and only then start shutting down the old pod. Fixing this in the app is really the wrong place.

lforst (Author) commented Dec 27, 2025

On a fundamental level, any tool for rolling deployment should be able to verify that the new pod is up and running, that the new pod is successfully routed to and only then start shutting down the old pod. Fixing this in the app is really the wrong place.

@wolfgangwalther For what it's worth, I wholeheartedly agree with you. In our setup, we are using AWS ECS and ALB. We explored two potential solutions to the problem at hand: we either add an upstream option (which would be this PR), or we add a wacky script/Docker command override to our setup that completely traps the SIGTERM or at least delays it. AWS doesn't provide a better way of configuring the signal. Both solutions are far from optimal. It almost seems like AWS assumes that applications should continue functioning like normal after receiving a SIGTERM (which makes absolutely no sense to me whatsoever).

In a perfect world, I don't think this option would be necessary. In the face of reality, this option is likely very useful for anybody using PostgREST at large scale with AWS ECS. I am fine with any outcome regarding this PR.

@lforst force-pushed the lforst-sigterm-delay branch from a489a1d to 74d0d43 on December 29, 2025 at 02:38
@steve-chavez (Member)

In the face of reality, this option is likely very useful for anybody using PostgREST at large scale with AWS ECS.

@wolfgangwalther Do you think we should merge given that this is a common use case for AWS ECS?

Also I'm wondering if this would make more sense if we introduce a "Deployment" section to the docs (we've been asked about this before) and add AWS ECS there and showcase this feature.

@wolfgangwalther (Member)

There seems to be some prior art around this:

Looking at the three provided links, it's clear that this would only solve half the problem: even if we delay termination, we will still, I believe, immediately exit, cancelling any ongoing requests (not 100% sure whether that's actually true). We don't have a concept of a graceful shutdown. But to achieve zero-downtime rolling deployments, we also can't drop these requests.

I think the proper thing to do is:

  • Make PostgREST handle different signals differently, aka implement "hard shutdown" and "graceful shutdown".
  • Fix Kubernetes to allow configuration of a delay between "removal of endpoint from services" and "shutting down the pod". I'd be surprised if there wasn't already an upstream issue for it.
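
For context on the Kubernetes side of the second point: the usual workaround today is a ``preStop`` hook that sleeps before SIGTERM is delivered, giving endpoint removal time to propagate. A sketch of a Pod-spec fragment (names and values are illustrative, not from this thread):

```yaml
# Delay SIGTERM so the endpoint can be removed from the Service first.
spec:
  terminationGracePeriodSeconds: 40   # must exceed the preStop sleep
  containers:
    - name: postgrest
      image: postgrest/postgrest
      lifecycle:
        preStop:
          exec:
            # Kubernetes runs this command and waits for it to finish
            # before sending SIGTERM to the container.
            command: ["sh", "-c", "sleep 30"]
```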

@steve-chavez (Member)

I believe, immediately exit - and cancel any ongoing requests (not 100% sure whether that's actually true?)

Currently when we receive a SIGTERM we don't cancel ongoing requests, we only stop accepting new ones.

Instead of delaying SIGTERM for an unknown X time, perhaps we should have a new SIGTERM mode that keeps accepting new requests and quits once Y seconds have passed without any new requests being received? That sounds like it could be the default mode too.
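
That idle-based drain could look roughly like this. A Python sketch of the logic only; `IdleShutdown` and its method names are hypothetical, not PostgREST internals:

```python
import threading
import time


class IdleShutdown:
    """Sketch of the proposed SIGTERM mode: after SIGTERM arrives, keep
    accepting requests and exit only once no request has been seen for
    `idle_timeout` seconds."""

    def __init__(self, idle_timeout):
        self.idle_timeout = idle_timeout
        self.draining = False            # set when SIGTERM arrives
        self.last_request = time.monotonic()
        self._lock = threading.Lock()

    def on_sigterm(self, *_args):
        # Don't stop serving; just start the idle countdown.
        self.draining = True

    def on_request(self):
        # Every incoming request resets the idle countdown.
        with self._lock:
            self.last_request = time.monotonic()

    def should_exit(self, now=None):
        # Exit only when draining AND idle for at least idle_timeout seconds.
        if not self.draining:
            return False
        now = time.monotonic() if now is None else now
        with self._lock:
            return now - self.last_request >= self.idle_timeout
```

The open question with either approach is picking Y: too short and a brief lull in traffic kills in-flight routing updates; too long and deployments stall.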
