
Commit 559a9e0

Add configurable HTTP health-check server
1 parent bd6b377 commit 559a9e0

5 files changed: +218 -0 lines changed

README.md

Lines changed: 24 additions & 0 deletions
@@ -28,6 +28,7 @@ Solid Queue can be used with SQL databases such as MySQL, PostgreSQL, or SQLite,
 - [Failed jobs and retries](#failed-jobs-and-retries)
 - [Error reporting on jobs](#error-reporting-on-jobs)
 - [Puma plugin](#puma-plugin)
+- [Health-check HTTP server](#health-check-http-server)
 - [Jobs and transactional integrity](#jobs-and-transactional-integrity)
 - [Recurring tasks](#recurring-tasks)
 - [Inspiration](#inspiration)
@@ -603,6 +604,29 @@ that you set in production only. This is what Rails 8's default Puma config look
 
 **Note**: phased restarts are not supported currently because the plugin requires [app preloading](https://github.com/puma/puma?tab=readme-ov-file#cluster-mode) to work.
 
+## Health-check HTTP server
+
+Solid Queue can start a tiny HTTP server to respond to basic health checks in the same process. This is useful for container orchestrators (e.g. Kubernetes) and external monitoring.
+
+- Endpoints:
+  - `/` and `/health`: returns `200 OK` with body `OK`
+  - Any other path: returns `404 Not Found`
+- Disabled by default. When enabled, defaults are:
+  - host: `ENV["SOLID_QUEUE_HTTP_HOST"]` or `"0.0.0.0"`
+  - port: `ENV["SOLID_QUEUE_HTTP_PORT"]` or `9393`
+
+Enable and configure via `config.solid_queue`:
+
+```ruby
+# config/initializers/solid_queue.rb or config/application.rb
+Rails.application.configure do
+  config.solid_queue.health_server_enabled = true
+  # Optional overrides (defaults already read the env vars above)
+  # config.solid_queue.health_server_host = "0.0.0.0"
+  # config.solid_queue.health_server_port = 9393
+end
+```
+
 ## Jobs and transactional integrity
 :warning: Having your jobs in the same ACID-compliant database as your application data enables a powerful yet sharp tool: taking advantage of transactional integrity to ensure some action in your app is not committed unless your job is also committed and vice versa, and ensuring that your job won't be enqueued until the transaction within which you're enqueuing it is committed. This can be very powerful and useful, but it can also backfire if you base some of your logic on this behaviour, and in the future, you move to another active job backend, or if you simply move Solid Queue to its own database, and suddenly the behaviour changes under you. Because this can be quite tricky and many people shouldn't need to worry about it, by default Solid Queue is configured in a different database as the main app.
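
Review note: the README section describes the endpoint contract (`200 OK` with body `OK` on `/` and `/health`, `404` elsewhere). A client-side probe against that contract might look like the sketch below. The helper name `solid_queue_healthy?` and its keyword arguments are hypothetical and not part of this commit; only the default port `9393` and the `/health` path come from the README text.

```ruby
require "net/http"

# Hypothetical probe helper (not part of Solid Queue). Returns true only when
# the health endpoint answers 200 with the body "OK" described in the README.
def solid_queue_healthy?(host: "127.0.0.1", port: 9393, path: "/health", timeout: 1)
  response = Net::HTTP.start(host, port, open_timeout: timeout, read_timeout: timeout) do |http|
    http.get(path)
  end
  response.code == "200" && response.body == "OK"
rescue SystemCallError, SocketError, Timeout::Error, EOFError
  # Connection refused, DNS failure, timeout, or a dropped connection all
  # count as "not healthy" for the purposes of a probe.
  false
end
```

A script like this could back an exec-style probe; an orchestrator's built-in HTTP probe pointed at `/health` would serve the same purpose without any extra code.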

lib/solid_queue.rb

Lines changed: 4 additions & 0 deletions
@@ -41,6 +41,10 @@ module SolidQueue
   mattr_accessor :clear_finished_jobs_after, default: 1.day
   mattr_accessor :default_concurrency_control_period, default: 3.minutes
 
+  mattr_accessor :health_server_enabled, default: false
+  mattr_accessor :health_server_host, default: ENV.fetch("SOLID_QUEUE_HTTP_HOST", "0.0.0.0")
+  mattr_accessor :health_server_port, default: (ENV["SOLID_QUEUE_HTTP_PORT"] || "9393").to_i
+
   delegate :on_start, :on_stop, :on_exit, to: Supervisor
 
   [ Dispatcher, Scheduler, Worker ].each do |process|
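
Review note: the port default is computed as `(ENV["SOLID_QUEUE_HTTP_PORT"] || "9393").to_i`. `String#to_i` returns `0` for an empty or non-numeric string, and port `0` asks the OS for an ephemeral port (the test's `available_port` relies on exactly that). A mistyped or empty variable would therefore bind to a random port rather than fail. A stricter parse could be sketched as follows; `parse_health_port` is a hypothetical helper, not something this commit defines.

```ruby
# Hypothetical helper (not part of Solid Queue): parse a port from an
# environment value, falling back to a default only when the value is unset
# or blank, and rejecting anything that is not a valid TCP port.
def parse_health_port(raw, default: 9393)
  return default if raw.nil? || raw.strip.empty?

  port = Integer(raw, 10) # raises ArgumentError for input such as "abc"
  raise ArgumentError, "port out of range: #{port}" unless (1..65_535).cover?(port)

  port
end
```

Whether failing fast is preferable to the lenient `to_i` behaviour is a design decision for the maintainers; the sketch only shows what the strict variant would involve.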

lib/solid_queue/engine.rb

Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -37,5 +37,22 @@ class Engine < ::Rails::Engine
3737
include ActiveJob::ConcurrencyControls
3838
end
3939
end
40+
41+
initializer "solid_queue.health_server" do
42+
ActiveSupport.on_load(:solid_queue) do
43+
if SolidQueue.health_server_enabled
44+
server = SolidQueue::HealthServer.new(
45+
host: SolidQueue.health_server_host,
46+
port: SolidQueue.health_server_port,
47+
logger: SolidQueue.logger
48+
)
49+
50+
# Start with supervisor lifecycle so it runs in the main SolidQueue process
51+
SolidQueue.on_start { server.start }
52+
SolidQueue.on_stop { server.stop }
53+
SolidQueue.on_exit { server.stop }
54+
end
55+
end
56+
end
4057
end
4158
end

lib/solid_queue/health_server.rb

Lines changed: 101 additions & 0 deletions
@@ -0,0 +1,101 @@
+# frozen_string_literal: true
+
+require "socket"
+require "logger"
+
+module SolidQueue
+  class HealthServer
+    def initialize(host:, port:, logger: nil)
+      @host = host
+      @port = port
+      @logger = logger || default_logger
+      @server = nil
+      @thread = nil
+    end
+
+    def start
+      return if running?
+
+      @thread = Thread.new do
+        begin
+          @server = TCPServer.new(@host, @port)
+          log_info("listening on #{@host}:#{@port}")
+
+          loop do
+            socket = @server.accept
+            begin
+              request_line = socket.gets
+              path = request_line&.split(" ")&.at(1) || "/"
+
+              if path == "/" || path == "/health"
+                body = "OK"
+                status_line = "HTTP/1.1 200 OK"
+              else
+                body = "Not Found"
+                status_line = "HTTP/1.1 404 Not Found"
+              end
+
+              headers = [
+                "Content-Type: text/plain",
+                "Content-Length: #{body.bytesize}",
+                "Connection: close"
+              ].join("\r\n")
+
+              socket.write("#{status_line}\r\n#{headers}\r\n\r\n#{body}")
+            ensure
+              begin
+                socket.close
+              rescue StandardError
+              end
+            end
+          end
+        rescue => e
+          log_error("failed: #{e.class}: #{e.message}")
+        ensure
+          begin
+            @server&.close
+          rescue StandardError
+          end
+        end
+      end
+    end
+
+    def stop
+      return unless running?
+
+      begin
+        @server&.close
+      rescue StandardError
+      end
+
+      if @thread&.alive?
+        @thread.kill
+        @thread.join(1)
+      end
+
+      @server = nil
+      @thread = nil
+    end
+
+    def running?
+      @thread&.alive?
+    end
+
+    private
+
+    def default_logger
+      logger = Logger.new($stdout)
+      logger.level = Logger::INFO
+      logger.progname = "SolidQueueHTTP"
+      logger
+    end
+
+    def log_info(message)
+      @logger&.info(message)
+    end
+
+    def log_error(message)
+      @logger&.error(message)
+    end
+  end
+end
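
Review note: the accept loop in `HealthServer#start` serves one connection at a time and calls `socket.gets` with no deadline, so a client that connects and never sends a request line would hold up later probes until it disconnects. One possible mitigation is to read the request line under a deadline. The sketch below is an illustration only; `read_request_line` and its `timeout:` and `max_bytes:` parameters are hypothetical names, not part of this commit.

```ruby
require "socket"

# Hypothetical helper (not part of Solid Queue): read the first line of a
# request, giving up after `timeout` seconds or `max_bytes` bytes.
# Returns the line (including its terminator) or nil if none arrived in time.
def read_request_line(socket, timeout: 2.0, max_bytes: 8192)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  buffer = String.new

  until buffer.include?("\n")
    remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
    return nil if remaining <= 0 || buffer.bytesize >= max_bytes
    # Wait until the socket is readable or the remaining time runs out.
    return nil unless IO.select([socket], nil, nil, remaining)

    chunk = socket.read_nonblock(1024, exception: false)
    return nil if chunk.nil?           # peer closed before sending a full line
    next if chunk == :wait_readable    # spurious wakeup, wait again

    buffer << chunk
  end

  buffer.lines.first
end
```

The path could then be derived as in the commit, for example `read_request_line(socket)&.split(" ")&.at(1) || "/"`, with a `nil` line treated as a client to drop rather than answer.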

test/unit/health_server_test.rb

Lines changed: 72 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,72 @@
1+
# frozen_string_literal: true
2+
3+
require "test_helper"
4+
require "net/http"
5+
require "socket"
6+
7+
class HealthServerTest < ActiveSupport::TestCase
8+
def setup
9+
@host = "127.0.0.1"
10+
@port = available_port(@host)
11+
@server = SolidQueue::HealthServer.new(host: @host, port: @port, logger: Logger.new(IO::NULL))
12+
@server.start
13+
wait_for_server
14+
end
15+
16+
def teardown
17+
@server.stop if defined?(@server)
18+
end
19+
20+
def test_health_endpoint_returns_ok
21+
response = http_get("/health")
22+
assert_equal "200", response.code
23+
assert_equal "OK", response.body
24+
end
25+
26+
def test_root_endpoint_returns_ok
27+
response = http_get("/")
28+
assert_equal "200", response.code
29+
assert_equal "OK", response.body
30+
end
31+
32+
def test_unknown_path_returns_not_found
33+
response = http_get("/unknown")
34+
assert_equal "404", response.code
35+
assert_equal "Not Found", response.body
36+
end
37+
38+
def test_stop_stops_server
39+
assert @server.running?, "server should be running before stop"
40+
@server.stop
41+
refute @server.running?, "server should not be running after stop"
42+
ensure
43+
# Avoid double-stop in teardown if we stopped here
44+
@server = SolidQueue::HealthServer.new(host: @host, port: @port, logger: Logger.new(IO::NULL))
45+
end
46+
47+
private
48+
def http_get(path)
49+
Net::HTTP.start(@host, @port) do |http|
50+
http.get(path)
51+
end
52+
end
53+
54+
def wait_for_server
55+
# Try to connect for up to 1 second
56+
deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + 1.0
57+
begin
58+
Net::HTTP.start(@host, @port) { |http| http.head("/") }
59+
rescue Errno::ECONNREFUSED, Errno::EHOSTUNREACH
60+
raise if Process.clock_gettime(Process::CLOCK_MONOTONIC) > deadline
61+
sleep 0.05
62+
retry
63+
end
64+
end
65+
66+
def available_port(host)
67+
tcp = TCPServer.new(host, 0)
68+
port = tcp.addr[1]
69+
tcp.close
70+
port
71+
end
72+
end
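
Review note: `wait_for_server` polls by issuing a full HTTP `HEAD` request and retrying on connection errors until a one-second deadline. If only "is the port accepting connections yet?" matters, a plain TCP connect would be enough. The sketch below shows that variant; `wait_for_port` is a hypothetical helper name and it returns a boolean instead of re-raising, which is a deliberate difference from the test's helper.

```ruby
require "socket"

# Hypothetical helper (not part of this commit): poll until a TCP connect to
# host:port succeeds or `timeout` seconds elapse. Returns true or false.
def wait_for_port(host, port, timeout: 1.0, interval: 0.05)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout

  loop do
    begin
      # The block form closes the socket automatically once the block returns.
      Socket.tcp(host, port, connect_timeout: timeout) { }
      return true
    rescue SystemCallError
      # Covers ECONNREFUSED, EHOSTUNREACH, ETIMEDOUT and similar errors.
      return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      sleep interval
    end
  end
end
```

In a test this might be used as `assert wait_for_port(@host, @port)`, which keeps the readiness check independent of the HTTP handling under test.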
