Commit 177afd4

Add configurable HTTP health-check server
1 parent bd6b377 commit 177afd4

6 files changed: +267 -0 lines changed


README.md

Lines changed: 27 additions & 0 deletions
@@ -28,6 +28,7 @@ Solid Queue can be used with SQL databases such as MySQL, PostgreSQL, or SQLite,
 - [Failed jobs and retries](#failed-jobs-and-retries)
 - [Error reporting on jobs](#error-reporting-on-jobs)
 - [Puma plugin](#puma-plugin)
+- [Health-check HTTP server](#health-check-http-server)
 - [Jobs and transactional integrity](#jobs-and-transactional-integrity)
 - [Recurring tasks](#recurring-tasks)
 - [Inspiration](#inspiration)
@@ -603,6 +604,32 @@ that you set in production only. This is what Rails 8's default Puma config look
 
 **Note**: phased restarts are not supported currently because the plugin requires [app preloading](https://github.com/puma/puma?tab=readme-ov-file#cluster-mode) to work.
 
+## Health-check HTTP server
+
+Solid Queue can start a tiny HTTP server to respond to basic health checks in the same process. This is useful for container orchestrators (e.g. Kubernetes) and external monitoring.
+
+- Endpoints:
+  - `/` and `/health`: returns `200 OK` with body `OK`
+  - Any other path: returns `404 Not Found`
+- Disabled by default. When enabled, defaults are:
+  - host: `ENV["SOLID_QUEUE_HTTP_HOST"]` or `"0.0.0.0"`
+  - port: `ENV["SOLID_QUEUE_HTTP_PORT"]` or `9393`
+
+Enable and configure via `config.solid_queue`:
+
+```ruby
+# config/initializers/solid_queue.rb or config/application.rb
+Rails.application.configure do
+  config.solid_queue.health_server_enabled = true
+  # Optional overrides (defaults already read the env vars above)
+  # config.solid_queue.health_server_host = "0.0.0.0"
+  # config.solid_queue.health_server_port = 9393
+end
+```
+
+Note:
+- When the Puma plugin is active (`plugin :solid_queue` in `puma.rb`), Solid Queue will skip starting the health server even if `health_server_enabled` is set. A warning is logged instead. This prevents running multiple embedded servers in the same process tree.
+
 ## Jobs and transactional integrity
 :warning: Having your jobs in the same ACID-compliant database as your application data enables a powerful yet sharp tool: taking advantage of transactional integrity to ensure some action in your app is not committed unless your job is also committed and vice versa, and ensuring that your job won't be enqueued until the transaction within which you're enqueuing it is committed. This can be very powerful and useful, but it can also backfire if you base some of your logic on this behaviour, and in the future, you move to another active job backend, or if you simply move Solid Queue to its own database, and suddenly the behaviour changes under you. Because this can be quite tricky and many people shouldn't need to worry about it, by default Solid Queue is configured in a different database as the main app.
 
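
The endpoints documented in the README hunk above can be exercised from a small client, for example as part of an external monitoring script. The following is a minimal sketch using only the Ruby standard library; `health_check_ok?` is a hypothetical helper name and is not part of Solid Queue, and the host and port are assumed to be whatever the health server was configured with.

```ruby
require "net/http"
require "socket"

# Hypothetical helper (not part of Solid Queue): returns true when the
# endpoint answers with status 200 within the timeout, and false on a
# non-200 status or on any connection-level failure.
def health_check_ok?(host, port, path: "/health", timeout: 1)
  response = Net::HTTP.start(host, port, open_timeout: timeout, read_timeout: timeout) do |http|
    http.get(path)
  end
  response.code == "200"
rescue SystemCallError, Net::OpenTimeout, Net::ReadTimeout, EOFError, SocketError
  # Connection refused, reset, timed out, or closed early: treat as unhealthy.
  false
end
```

With the defaults from this commit, a call might look like `health_check_ok?("127.0.0.1", 9393)`. Treating every connection error as "unhealthy" is a design choice for this sketch; a real monitor may want to distinguish timeouts from refusals.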

lib/puma/plugin/solid_queue.rb

Lines changed: 2 additions & 0 deletions
@@ -13,6 +13,7 @@ def start(launcher)
 
     if Gem::Version.new(Puma::Const::VERSION) < Gem::Version.new("7")
       launcher.events.on_booted do
+        SolidQueue.puma_plugin = true
         @solid_queue_pid = fork do
           Thread.new { monitor_puma }
           SolidQueue::Supervisor.start
@@ -23,6 +24,7 @@ def start(launcher)
       launcher.events.on_restart { stop_solid_queue }
     else
       launcher.events.after_booted do
+        SolidQueue.puma_plugin = true
         @solid_queue_pid = fork do
           Thread.new { monitor_puma }
           SolidQueue::Supervisor.start
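
In both branches the flag is set immediately before `fork`, so the forked supervisor process starts with a copy of the parent's state, including `puma_plugin = true`. The sketch below illustrates only that general property of `fork`; `DemoQueue` is a hypothetical stand-in for the `SolidQueue` module, and the pipe exists purely to report what the child saw. It assumes a platform where `fork` is available.

```ruby
# Hypothetical stand-in for the module-level flag added in this commit.
module DemoQueue
  class << self
    attr_accessor :puma_plugin
  end
  self.puma_plugin = false
end

# Forks a child, has it report the flag value it sees through a pipe,
# and returns that value as a string.
def flag_seen_in_child
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.write(DemoQueue.puma_plugin.to_s)
    writer.close
    exit!(0) # skip at_exit handlers in the child
  end
  writer.close
  value = reader.read
  reader.close
  Process.wait(pid)
  value
end

DemoQueue.puma_plugin = true # set in the parent, before forking
puts flag_seen_in_child
```

The reverse does not hold: a value assigned inside the child after the fork is not visible to the parent, which is why the assignment has to precede `fork`.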

lib/solid_queue.rb

Lines changed: 27 additions & 0 deletions
@@ -41,6 +41,33 @@ module SolidQueue
   mattr_accessor :clear_finished_jobs_after, default: 1.day
   mattr_accessor :default_concurrency_control_period, default: 3.minutes
 
+  mattr_accessor :health_server_enabled, default: false
+  mattr_accessor :health_server_host, default: ENV.fetch("SOLID_QUEUE_HTTP_HOST", "0.0.0.0")
+  mattr_accessor :health_server_port, default: (ENV["SOLID_QUEUE_HTTP_PORT"] || "9393").to_i
+
+  mattr_accessor :puma_plugin, default: false
+
+  def start_health_server
+    return unless health_server_enabled
+
+    if puma_plugin
+      logger.warn("SolidQueue health server is enabled but Puma plugin is active; skipping starting health server to avoid duplicate servers") if logger
+      return nil
+    end
+
+    server = SolidQueue::HealthServer.new(
+      host: health_server_host,
+      port: health_server_port,
+      logger: logger
+    )
+
+    on_start { server.start }
+    on_stop { server.stop }
+    on_exit { server.stop }
+
+    server
+  end
+
   delegate :on_start, :on_stop, :on_exit, to: Supervisor
 
   [ Dispatcher, Scheduler, Worker ].each do |process|
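
The two default expressions above are evaluated once, when `lib/solid_queue.rb` is loaded, so the environment variables must be set before that point. The sketch below mirrors the same two expressions against a plain Hash to show how they resolve; `health_server_defaults` is a hypothetical helper written for illustration, not something the commit defines.

```ruby
# Hypothetical helper mirroring the default expressions in the diff.
# `env` is any Hash-like object, so the logic can be tried without touching ENV.
def health_server_defaults(env)
  {
    host: env.fetch("SOLID_QUEUE_HTTP_HOST", "0.0.0.0"),
    port: (env["SOLID_QUEUE_HTTP_PORT"] || "9393").to_i
  }
end

# No variables set: falls back to 0.0.0.0 and 9393.
p health_server_defaults({})

# Port given as a string: converted with String#to_i, so it becomes 8080.
p health_server_defaults({ "SOLID_QUEUE_HTTP_PORT" => "8080" })

# Non-numeric or empty values become 0, because String#to_i returns 0
# when the string has no leading digits.
p health_server_defaults({ "SOLID_QUEUE_HTTP_PORT" => "not-a-number" })
p health_server_defaults({ "SOLID_QUEUE_HTTP_PORT" => "" })
```

A port of `0` asks the operating system for an ephemeral port when passed to `TCPServer.new`, so a mistyped variable would likely make the server listen on an unpredictable port rather than fail. Validating the value with `Integer(...)` would be one way to surface that, though the commit does not do so.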

lib/solid_queue/engine.rb

Lines changed: 6 additions & 0 deletions
@@ -37,5 +37,11 @@ class Engine < ::Rails::Engine
         include ActiveJob::ConcurrencyControls
       end
     end
+
+    initializer "solid_queue.health_server" do
+      ActiveSupport.on_load(:solid_queue) do
+        SolidQueue.start_health_server
+      end
+    end
   end
 end

lib/solid_queue/health_server.rb

Lines changed: 101 additions & 0 deletions
@@ -0,0 +1,101 @@
+# frozen_string_literal: true
+
+require "socket"
+require "logger"
+
+module SolidQueue
+  class HealthServer
+    def initialize(host:, port:, logger: nil)
+      @host = host
+      @port = port
+      @logger = logger || default_logger
+      @server = nil
+      @thread = nil
+    end
+
+    def start
+      return if running?
+
+      @thread = Thread.new do
+        begin
+          @server = TCPServer.new(@host, @port)
+          log_info("listening on #{@host}:#{@port}")
+
+          loop do
+            socket = @server.accept
+            begin
+              request_line = socket.gets
+              path = request_line&.split(" ")&.at(1) || "/"
+
+              if path == "/" || path == "/health"
+                body = "OK"
+                status_line = "HTTP/1.1 200 OK"
+              else
+                body = "Not Found"
+                status_line = "HTTP/1.1 404 Not Found"
+              end
+
+              headers = [
+                "Content-Type: text/plain",
+                "Content-Length: #{body.bytesize}",
+                "Connection: close"
+              ].join("\r\n")
+
+              socket.write("#{status_line}\r\n#{headers}\r\n\r\n#{body}")
+            ensure
+              begin
+                socket.close
+              rescue StandardError
+              end
+            end
+          end
+        rescue => e
+          log_error("failed: #{e.class}: #{e.message}")
+        ensure
+          begin
+            @server&.close
+          rescue StandardError
+          end
+        end
+      end
+    end
+
+    def stop
+      return unless running?
+
+      begin
+        @server&.close
+      rescue StandardError
+      end
+
+      if @thread&.alive?
+        @thread.kill
+        @thread.join(1)
+      end
+
+      @server = nil
+      @thread = nil
+    end
+
+    def running?
+      @thread&.alive?
+    end
+
+    private
+
+    def default_logger
+      logger = Logger.new($stdout)
+      logger.level = Logger::INFO
+      logger.progname = "SolidQueueHTTP"
+      logger
+    end
+
+    def log_info(message)
+      @logger&.info(message)
+    end
+
+    def log_error(message)
+      @logger&.error(message)
+    end
+  end
+end
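
The routing in `HealthServer#start` rests on a single expression that takes the second whitespace-separated token of the request line as the path. The sketch below isolates that expression to show which requests end up as `200` and which as `404`; `request_path` and `status_for` are hypothetical helper names introduced here for illustration and do not exist in the commit.

```ruby
# Same expression as in the diff: second token of the request line,
# falling back to "/" when the line is nil or has fewer than two tokens.
def request_path(request_line)
  request_line&.split(" ")&.at(1) || "/"
end

# Same comparison as in the diff: only an exact "/" or "/health" is healthy.
def status_for(path)
  (path == "/" || path == "/health") ? 200 : 404
end

[
  "GET /health HTTP/1.1\r\n",            # exact match
  "GET /health?verbose=1 HTTP/1.1\r\n",  # query string is part of the token
  "GET /health/ HTTP/1.1\r\n",           # trailing slash is a different path
  "POST / HTTP/1.1\r\n",                 # the HTTP method is never inspected
  nil                                    # client closed before sending a line
].each do |line|
  path = request_path(line)
  puts "#{line.inspect} -> #{path} -> #{status_for(path)}"
end
```

Two consequences follow directly from the code: a probe configured with a query string or trailing slash receives `404`, and a connection that sends nothing is answered as if it had requested `/`.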

test/unit/health_server_test.rb

Lines changed: 104 additions & 0 deletions
@@ -0,0 +1,104 @@
+# frozen_string_literal: true
+
+require "test_helper"
+require "net/http"
+require "socket"
+require "stringio"
+
+class HealthServerTest < ActiveSupport::TestCase
+  def setup
+    @host = "127.0.0.1"
+    @port = available_port(@host)
+    @server = SolidQueue::HealthServer.new(host: @host, port: @port, logger: Logger.new(IO::NULL))
+    @server.start
+    wait_for_server
+  end
+
+  def teardown
+    @server.stop if defined?(@server)
+  end
+
+  def test_health_endpoint_returns_ok
+    response = http_get("/health")
+    assert_equal "200", response.code
+    assert_equal "OK", response.body
+  end
+
+  def test_root_endpoint_returns_ok
+    response = http_get("/")
+    assert_equal "200", response.code
+    assert_equal "OK", response.body
+  end
+
+  def test_unknown_path_returns_not_found
+    response = http_get("/unknown")
+    assert_equal "404", response.code
+    assert_equal "Not Found", response.body
+  end
+
+  def test_stop_stops_server
+    assert @server.running?, "server should be running before stop"
+    @server.stop
+    assert_not @server.running?, "server should not be running after stop"
+  ensure
+    # Avoid double-stop in teardown if we stopped here
+    @server = SolidQueue::HealthServer.new(host: @host, port: @port, logger: Logger.new(IO::NULL))
+  end
+
+  def test_engine_skips_starting_health_server_when_puma_plugin_is_active
+    SolidQueue.health_server_enabled = true
+    SolidQueue.puma_plugin = true
+
+    server = SolidQueue.start_health_server
+    assert_nil server
+  ensure
+    SolidQueue.health_server_enabled = false
+    SolidQueue.puma_plugin = false
+  end
+
+  def test_logs_warning_when_skipped_under_puma_plugin
+    SolidQueue.health_server_enabled = true
+    SolidQueue.puma_plugin = true
+
+    original_logger = SolidQueue.logger
+    io = StringIO.new
+    SolidQueue.logger = Logger.new(io)
+
+    server = SolidQueue.start_health_server
+    assert_nil server
+
+    io.rewind
+    output = io.read
+    assert_includes output, "SolidQueue health server is enabled but Puma plugin is active; skipping starting health server to avoid duplicate servers"
+  ensure
+    SolidQueue.logger = original_logger if defined?(original_logger)
+    SolidQueue.health_server_enabled = false
+    SolidQueue.puma_plugin = false
+  end
+
+  private
+    def http_get(path)
+      Net::HTTP.start(@host, @port) do |http|
+        http.get(path)
+      end
+    end
+
+    def wait_for_server
+      # Try to connect for up to 1 second
+      deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + 1.0
+      begin
+        Net::HTTP.start(@host, @port) { |http| http.head("/") }
+      rescue Errno::ECONNREFUSED, Errno::EHOSTUNREACH
+        raise if Process.clock_gettime(Process::CLOCK_MONOTONIC) > deadline
+        sleep 0.05
+        retry
+      end
+    end
+
+    def available_port(host)
+      tcp = TCPServer.new(host, 0)
+      port = tcp.addr[1]
+      tcp.close
+      port
+    end
+end
