Shutting down Mercure server takes indefinitely  #986

@bobvandevijver

I was looking into a very slow shutdown of the Mercure instance I am running on an Ubuntu server, using a systemd unit.

Unit file
[Unit]
Description=Mercure server
After=network.target

[Service]
Type=simple
EnvironmentFile=/opt/application/mercure/env
ExecStart=/opt/application/mercure/mercure run
Restart=always
RestartSec=5
RuntimeMaxSec=7d
WorkingDirectory=/opt/application/mercure
User=www-data

[Install]
WantedBy=multi-user.target

Logs
Nov 21 20:15:40 linux645 mercure[1197]: {"level":"info","ts":1732216540.629172,"msg":"shutting down apps, then terminating","signal":"SIGTERM"}
Nov 21 20:15:40 linux645 mercure[1197]: {"level":"warn","ts":1732216540.6292357,"msg":"exiting; byeee!! 👋","signal":"SIGTERM"}
Nov 21 20:15:40 linux645 mercure[1197]: {"level":"info","ts":1732216540.6292503,"logger":"http","msg":"servers shutting down with eternal grace period"}
Nov 21 20:15:40 linux645 systemd[1]: Stopping mercure.service - Mercure server...
Nov 21 20:16:19 linux645 mercure[1197]: {"level":"info","ts":1732216579.8925092,"logger":"http.handlers.mercure","msg":"Subscriber disconnected","subscriber":{"id":"urn:uuid:adce74f5-1f27-4912-91b6-83cd4442f0a>
Nov 21 20:16:19 linux645 mercure[1197]: {"level":"info","ts":1732216579.89257,"logger":"http.log.access.log0","msg":"handled request","request":{"remote_ip":"127.0.0.1","remote_port":"55950","client_ip":"127.0>
Nov 21 20:16:31 linux645 mercure[1197]: {"level":"info","ts":1732216591.1713917,"logger":"http.handlers.mercure","msg":"Subscriber disconnected","subscriber":{"id":"urn:uuid:5db2a139-2a36-4f4d-a7e9-e4b1ed7d6fb>
Nov 21 20:16:31 linux645 mercure[1197]: {"level":"info","ts":1732216591.17145,"logger":"http.log.access.log0","msg":"handled request","request":{"remote_ip":"127.0.0.1","remote_port":"37544","client_ip":"127.0>
Nov 21 20:17:10 linux645 systemd[1]: mercure.service: State 'stop-sigterm' timed out. Killing.
Nov 21 20:17:10 linux645 systemd[1]: mercure.service: Killing process 1197 (mercure) with signal SIGKILL.
Nov 21 20:17:10 linux645 systemd[1]: mercure.service: Failed to kill control group /system.slice/mercure.service, ignoring: Invalid argument
Nov 21 20:17:10 linux645 systemd[1]: mercure.service: Main process exited, code=killed, status=9/KILL
Nov 21 20:17:10 linux645 systemd[1]: mercure.service: Failed with result 'timeout'.

From the logs it became clear that the default shutdown action (SIGTERM), while received perfectly fine by Caddy/Mercure, did nothing except deny new connections. After 90 seconds, systemd gave up and used SIGKILL to get rid of the process.
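
For context, the 90 seconds match systemd's default stop timeout (`DefaultTimeoutStopSec`), which can be overridden per unit with `TimeoutStopSec=`. A minimal sketch of such an override, assuming a drop-in file for the unit above; the 10-second value is only an illustration, and this merely shortens the wait before SIGKILL rather than making the shutdown clean:

```ini
# /etc/systemd/system/mercure.service.d/override.conf (assumed drop-in path)
[Service]
# Assumed value: wait 10 seconds after SIGTERM before escalating to SIGKILL
TimeoutStopSec=10
```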

One line caught my attention:

servers shutting down with eternal grace period

This is described in the Caddy manual under the grace_period option: it allows clients to complete their requests before the server is actually shut down. However, with SSE and other long-running connections this does not really make sense, as those clients will keep the connection open.

I have now added an additional env variable to my configuration:

GLOBAL_OPTIONS=grace_period 1s
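
As an illustration of what this variable does: assuming the bundled Caddyfile expands `GLOBAL_OPTIONS` inside its global options block (worth verifying against the file shipped with your release), the setting above should be roughly equivalent to writing the option there directly:

```
{
	# Wait at most 1 second for open connections before closing them on shutdown
	grace_period 1s
}
```

The `1s` value is simply the one used above; a longer duration gives regular requests more time to finish while still bounding how long SSE subscribers can hold up the shutdown.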

With that, Mercure shuts down within about a second, although with some errors in the log:

Logs with 1s `grace_period`
Nov 21 20:41:43 linux645 mercure[46014]: {"level":"info","ts":1732218103.2779043,"msg":"shutting down apps, then terminating","signal":"SIGTERM"}
Nov 21 20:41:43 linux645 mercure[46014]: {"level":"warn","ts":1732218103.2779415,"msg":"exiting; byeee!! 👋","signal":"SIGTERM"}
Nov 21 20:41:43 linux645 mercure[46014]: {"level":"info","ts":1732218103.277979,"logger":"http","msg":"servers shutting down; grace period initiated","duration":1}
Nov 21 20:41:43 linux645 systemd[1]: Stopping nl-mercure.service - Mercure server...
Nov 21 20:41:44 linux645 mercure[46014]: {"level":"error","ts":1732218104.278682,"logger":"http","msg":"server shutdown","error":"context deadline exceeded","addresses":[":3000"]}
Nov 21 20:41:44 linux645 mercure[46014]: {"level":"warn","ts":1732218104.2788424,"logger":"http.handlers.mercure","msg":"Failed to remove subscriber on shutdown","subscriber":{},"error":"hub: read/write on closed Transport"}
Nov 21 20:41:44 linux645 mercure[46014]: {"level":"info","ts":1732218104.2788796,"logger":"http.handlers.mercure","msg":"LocalSubscriber disconnected","subscriber":{}}
Nov 21 20:41:44 linux645 mercure[46014]: {"level":"warn","ts":1732218104.2788115,"logger":"http.handlers.mercure","msg":"Failed to remove subscriber on shutdown","subscriber":{},"error":"hub: read/write on closed Transport"}
Nov 21 20:41:44 linux645 mercure[46014]: {"level":"info","ts":1732218104.27891,"logger":"http.handlers.mercure","msg":"LocalSubscriber disconnected","subscriber":{}}
Nov 21 20:41:44 linux645 mercure[46014]: {"level":"info","ts":1732218104.27891,"logger":"http.log.access.log0","msg":"handled request","request":{"remote_ip":"127.0.0.1","remote_port":"46232","client_ip":"127.0.0.1","proto":"HTTP/1.1","method":"GET","host":"localhost:3000","uri":"/.well-known/mercure?lastEventID=urn%3Auuid%3A74003c94-cdc2-4fdc-8948-d4db509702ff&topic=%2A","headers":{REDACTED}},"bytes_read":0,"user_id":"","duration":38.964192544,"size":407,"status":200,"resp_headers":{REDACTED}}
Nov 21 20:41:44 linux645 mercure[46014]: {"level":"info","ts":1732218104.2789543,"logger":"http.log.access.log0","msg":"handled request","request":{"remote_ip":"127.0.0.1","remote_port":"53440","client_ip":"127.0.0.1","proto":"HTTP/1.1","method":"GET","host":"localhost:3000","uri":"/.well-known/mercure?lastEventID=urn%3Auuid%3Add133490-ed99-4d32-968a-40ba36492738&topic=%2A","headers":{REDACTED},"bytes_read":0,"user_id":"","duration":41.440676106,"size":4,"status":200,"resp_headers":{REDACTED}}
Nov 21 20:41:44 linux645 mercure[46014]: {"level":"info","ts":1732218104.279367,"logger":"admin","msg":"stopped previous server","address":"localhost:2019"}
Nov 21 20:41:44 linux645 mercure[46014]: {"level":"info","ts":1732218104.2793803,"msg":"shutdown complete","signal":"SIGTERM","exit_code":0}
Nov 21 20:41:44 linux645 systemd[1]: mercure.service: Deactivated successfully.
Nov 21 20:41:44 linux645 systemd[1]: Stopped mercure.service - Mercure server.

As I did not find any note about this in the documentation, I am wondering what the best course of action is. Is this something that should only be added to the documentation, or is this maybe even something that we can configure by default in the Caddyfile that is bundled with the tar.gz releases?
