Can Hatchet Scale With Postgres, or Is Redis Needed for High Workloads? #2899
I came across this tweet where a team initially used Postgres as their job queue but hit scaling issues as their volume increased. They wrote:
I'm interested in building something similar, and I'm considering using Hatchet for scheduling and orchestration. Can Hatchet scale to high workloads while still using Postgres as its backend, or will I run into the same scaling issues described above? Should I plan to use Redis (or another dedicated queue system) from day one instead of starting with Postgres?
Hi @khanakia, the linked tweet is unfortunately sparse on details, so I'm afraid I can't tell you whether you'll run into the same issues.

I'm a little skeptical of the issue described, though. In this type of file-processing system, each job passing through the queue typically corresponds to at least one database write, and usually more (for example, updating the status of a file in the database). The queue writes should be among the most lightweight writes in your system, so the overhead of the job queue shouldn't exceed the overhead of your normal application writes, unless the system was spiking under load and needed to ingest a large burst of jobs while processing them at a constant rate.

The main question I would ask is: do you need a durable queue, or do you need a message broker? A durable queue has nice features like letting you view the execution history of previous tasks, while a message broker is often built for pure throughput. In addition, the durability story of Redis-backed queues is lacking (many setups don't have AOF enabled by default), and if you need any sort of complex orchestration features, those are non-trivial to build yourself on top of Redis.
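For context on why the queue writes stay cheap: Postgres-backed queues generally dequeue work with `FOR UPDATE SKIP LOCKED`, so claiming a job is a single indexed row update and concurrent workers never block on each other's rows. A minimal sketch of that pattern (the table and column names here are illustrative, not Hatchet's actual schema):

```sql
-- Illustrative schema; a real queue (including Hatchet's) is more involved.
CREATE TABLE jobs (
    id         bigserial PRIMARY KEY,
    payload    jsonb NOT NULL,
    status     text  NOT NULL DEFAULT 'queued',
    created_at timestamptz NOT NULL DEFAULT now()
);

-- Each worker claims one job atomically.
-- SKIP LOCKED skips rows another worker has already locked,
-- so workers contend on no shared rows.
UPDATE jobs
SET    status = 'running'
WHERE  id = (
    SELECT id
    FROM   jobs
    WHERE  status = 'queued'
    ORDER  BY created_at
    LIMIT  1
    FOR UPDATE SKIP LOCKED
)
RETURNING id, payload;
```

Marking the job completed afterwards is one more single-row UPDATE, which is why the queue's overhead ends up comparable to the status writes your application is doing anyway.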
Note that Hatchet uses Postgres as the durable queue, and optionally uses RabbitMQ as a message broker for passing messages between Hatchet components. A Hatchet setup with a good database (16 vCPU and not IOPS-constrained) and RabbitMQ as the broker can scale to thousands of tasks/second, and we've scaled quite a bit further on Hatchet Cloud. If you are running Hatchet in Postgres-only mode and using features like streaming with large payloads, you might run into some issues.