Switch to gunicorn's gthread worker type (#242)

edmorley · web-flow · commit a538e3c0c0ad · 2024-11-21T00:57:10.000Z
gunicorn's default worker type is the `sync` worker, which is more suited to CPU/network-bandwidth bound workloads. As such this switches to the thread-based `gthread` worker for improved request throughput performance for blocking I/O workloads - given that it's common for Heroku apps to make blocking requests to external APIs/datastores etc.

The threads count of 5 is somewhat arbitrary, but seemed to work well enough with rough benchmarking against a few different dyno sizes (where the process count will vary according to `WEB_CONCURRENCY`) and simulated workload types.

The `preload_app` option has also been enabled, which makes gunicorn load the app before the worker processes are forked, to reduce memory usage and boot times.

See:
https://docs.gunicorn.org/en/stable/design.html#server-model
https://docs.gunicorn.org/en/stable/settings.html#worker-class

GUS-W-17236632.
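
As a rough illustration of why a thread pool helps with blocking I/O, the sketch below compares handling five simulated blocking calls one at a time against handling them in a pool of five threads. It is not gunicorn's implementation: `time.sleep` stands in for a blocking request to an external API, and the names `fake_blocking_request`, `run_sequential`, and `run_threaded` are hypothetical helpers invented for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_blocking_request(delay: float = 0.05) -> float:
    """Hypothetical stand-in for a blocking call to an external API or datastore."""
    # Sleeping releases the GIL, much like waiting on a real socket would.
    time.sleep(delay)
    return delay


def run_sequential(count: int) -> float:
    """Handle `count` simulated requests one after another; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(count):
        fake_blocking_request()
    return time.perf_counter() - start


def run_threaded(count: int, threads: int = 5) -> float:
    """Handle `count` simulated requests in a pool of `threads` threads."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Consume the iterator so every task has finished before timing stops.
        list(pool.map(lambda _: fake_blocking_request(), range(count)))
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"sequential: {run_sequential(5):.3f}s")
    print(f"threaded:   {run_threaded(5):.3f}s")
```

With blocking waits like these, the threaded run should finish in roughly the time of a single request, since the waits overlap. CPU-bound work would not see the same benefit, which matches the commit message's point about which workloads suit which worker type.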
diff --git a/gunicorn.conf.py b/gunicorn.conf.py
@@ -14,3 +14,37 @@
 # Bind to the IPv6 interface instead of the gunicorn default of IPv4, so the app works in IPv6-only
 # environments. IPv4 connections will still work so long as `IPV6_V6ONLY` hasn't been enabled.
 bind = [f"[::]:{_port}"]
+
+# The default `sync` worker is more suited to CPU/network-bandwidth bound workloads, so we
+# instead use the thread based worker type for improved support of blocking I/O workloads:
+# https://docs.gunicorn.org/en/stable/design.html#server-model
+#
+# If you need to further improve the performance of blocking I/O workloads, you may want to
+# try the `gevent` worker type, though you will need to disable `preload_app`, enable DB
+# connection pooling, and be aware that gevent's monkey patching can break some packages.
+#
+# Note: When changing the number of dynos/workers/threads you will want to make sure you
+# do not exceed the maximum number of connections to external services such as DBs:
+# https://devcenter.heroku.com/articles/python-concurrency-and-database-connections
+worker_class = "gthread"
+
+# gunicorn will start this many worker processes. The Python buildpack automatically sets a
+# default for WEB_CONCURRENCY at dyno boot, based on the number of CPUs and available RAM:
+# https://devcenter.heroku.com/articles/python-concurrency
+workers = os.environ.get("WEB_CONCURRENCY", 1)
+
+# Each `gthread` worker process will use a pool of this many threads.
+threads = 5
+
+# Load the app before the worker processes are forked, to reduce memory usage and boot times.
+preload_app = True
+
+# Workers silent for more than this many seconds are killed and restarted.
+# Note: This only affects the maximum request time when using the `sync` worker.
+# For all other worker types it acts only as a worker heartbeat timeout.
+timeout = 20
+
+# After receiving a restart signal, workers have this much time to finish serving requests.
+# This should be set to a value less than the 30 second Heroku dyno shutdown timeout:
+# https://devcenter.heroku.com/articles/dyno-shutdown-behavior
+graceful_timeout = 20
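
The note in the config about not exceeding the maximum number of connections to external services can be checked with some quick arithmetic. The sketch below is a minimal example under stated assumptions: it assumes each thread holds at most one connection at a time (connection pools or ORMs may behave differently), and the helper names `resolve_workers`, `max_connections`, and `fits_connection_limit`, as well as the limit of 20, are hypothetical and not part of gunicorn or Heroku.

```python
import os


def resolve_workers(environ=None, default=1):
    """Read WEB_CONCURRENCY the way the config does, converting it to an int."""
    env = os.environ if environ is None else environ
    # Environment variables are strings, so convert explicitly for arithmetic.
    return int(env.get("WEB_CONCURRENCY", default))


def max_connections(dynos, workers, threads, connections_per_thread=1):
    """Upper bound on simultaneous connections across all dynos.

    Assumption: every thread may hold `connections_per_thread` connections at once.
    """
    return dynos * workers * threads * connections_per_thread


def fits_connection_limit(dynos, workers, threads, limit):
    """Return True if the worst-case connection count stays within `limit`."""
    return max_connections(dynos, workers, threads) <= limit


if __name__ == "__main__":
    workers = resolve_workers()
    threads = 5  # Matches the `threads` value in the config above.
    total = max_connections(dynos=2, workers=workers, threads=threads)
    # The limit of 20 is a made-up example value, not a real plan limit.
    print(total, fits_connection_limit(2, workers, threads, limit=20))
```

For example, two dynos with `WEB_CONCURRENCY=4` and 5 threads each could open up to 40 connections under this assumption, which would exceed a hypothetical limit of 20.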