@kpumuk kpumuk commented Nov 19, 2025

This pull request enables the TCP_NODELAY flag on accepted sockets in the Ruby ServerSocket transport.

The change brings the Ruby server transport in line with the servers in other languages, for example C++.
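For readers unfamiliar with the option, here is a minimal sketch of what setting TCP_NODELAY on an accepted socket looks like using only Ruby's standard `socket` library. It is not the code from this patch: the loopback address, the ephemeral port, and the variable names are assumptions chosen for illustration.

```ruby
require 'socket'

# Listen on an ephemeral loopback port so the sketch does not collide with a real service.
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]

# Connect a client so that accept returns immediately.
client = TCPSocket.new('127.0.0.1', port)
accepted = server.accept

# Disable Nagle's algorithm on the accepted socket: small writes are sent
# right away instead of being held back to coalesce with later data.
accepted.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_NODELAY, 1)

# Read the option back to confirm it took effect.
nodelay_enabled = accepted.getsockopt(Socket::IPPROTO_TCP, Socket::TCP_NODELAY).bool
puts "TCP_NODELAY enabled: #{nodelay_enabled}"

client.close
accepted.close
server.close
```

The option is set per connection, which is why it has to be applied to each socket returned by `accept` rather than only to the listening socket.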

Benchmarks

Changing socket flags is an inherently risky operation, but there is a good reason for it here: it speeds up the TLS server by up to 13x for some workloads.

Note

Currently the benchmark script has several breaking issues:

  • It does not work on modern Ruby versions because of the use of Fixnum in BaseProtocol (write_type and read_type)
  • It uses NonblockingServer, which seems to be incompatible with SSLServerSocket
  • It does not actually support TLS at the moment. I have updated it to support TLS and will submit that as a separate patch (here)
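As background for the first item: Fixnum and Bignum were unified into Integer in Ruby 2.4, and the old constants were removed in Ruby 3.2, so any code that references Fixnum raises a NameError there. A rough sketch of the usual replacement follows; `integer_type_id?` is a hypothetical helper made up for this illustration and is not part of the Thrift library.

```ruby
# Hypothetical helper, not part of the Thrift library: checks that a value
# is an integer without referring to the removed Fixnum constant.
def integer_type_id?(value)
  value.is_a?(Integer)
end

puts integer_type_id?(8)      # small integers are plain Integer
puts integer_type_id?(2**80)  # so are values that used to be Bignum
puts integer_type_id?('8')    # strings are not integers
```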

Given all this, I am using ThreadPoolServer for benchmarking, with a script that uses TLS with elliptic-curve cryptography for best performance. I am testing with the number of processes set to 4, because the Ruby GIL limits how much a single multi-threaded server can process concurrently.
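The updated benchmark script is not shown here, so the following is only a sketch of how an elliptic-curve key and a self-signed certificate could be generated with Ruby's standard `openssl` library. The curve (`prime256v1`), the subject name, the serial number, and the one-hour validity are all assumptions for illustration, not values taken from the script.

```ruby
require 'openssl'

# Generate an EC private key on the NIST P-256 curve (named prime256v1 in OpenSSL).
key = OpenSSL::PKey::EC.generate('prime256v1')

# Build a short-lived self-signed certificate for localhost.
name = OpenSSL::X509::Name.parse('/CN=localhost')
cert = OpenSSL::X509::Certificate.new
cert.version = 2                  # value 2 means X.509 v3
cert.serial = 1
cert.subject = name
cert.issuer = name                # self-signed: issuer equals subject
cert.public_key = key             # only the public half is embedded
cert.not_before = Time.now
cert.not_after = Time.now + 3600  # valid for one hour
cert.sign(key, OpenSSL::Digest.new('SHA256'))

# Check that the signature verifies against the key that produced it.
puts cert.verify(key)
```

EC keys are commonly chosen for benchmarks like this because their handshakes are cheaper than RSA handshakes at comparable security levels, which keeps key exchange from dominating the measurement.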

Baseline - Non TLS Server

Before / After (the first run is before the change, the second is after):

$ export THRIFT_SERVER=Thrift::ThreadPoolServer
$ export THRIFT_NUM_CLIENTS=100
$ export THRIFT_NUM_PROCESSES=4
$ ruby benchmark/benchmark.rb
Starting server...
Spawning benchmark processes...
Collecting output...
Translating output...
Analyzing output...

Server class: Thrift::ThreadPoolServer
Server interpreter: ruby
Client interpreter: ruby
Socket class: Thrift::Socket
Number of processes: 4
Clients per process: 100
Calls per client: 50
Using fastthread: no

Connection failures: 0
Connection errors: 0
Average time per call: 0.0003 seconds
Average time per client (50 calls): 0.0167 seconds
Total time for all calls: 6.3130 seconds
Real time for benchmarking: 1.8681 seconds
Shortest call time: 0.0001 seconds
Longest call time: 0.0045 seconds
Shortest client time (50 calls): 0.0071 seconds
Longest client time (50 calls): 0.0211 seconds


$ export THRIFT_SERVER=Thrift::ThreadPoolServer
$ export THRIFT_NUM_CLIENTS=100
$ export THRIFT_NUM_PROCESSES=4
$ ruby benchmark/benchmark.rb
Starting server...
Spawning benchmark processes...
Collecting output...
Translating output...
Analyzing output...

Server class: Thrift::ThreadPoolServer
Server interpreter: ruby
Client interpreter: ruby
Socket class: Thrift::Socket
Number of processes: 4
Clients per process: 100
Calls per client: 50
Using fastthread: no

Connection failures: 0
Connection errors: 0
Average time per call: 0.0003 seconds
Average time per client (50 calls): 0.0164 seconds
Total time for all calls: 6.2003 seconds
Real time for benchmarking: 1.8400 seconds
Shortest call time: 0.0001 seconds
Longest call time: 0.0036 seconds
Shortest client time (50 calls): 0.0072 seconds
Longest client time (50 calls): 0.0197 seconds

No visible performance difference.

TLS server with multiple requests per connection

Before / After (the first run is before the change, the second is after):

$ export THRIFT_TLS=true
$ export THRIFT_SERVER=Thrift::ThreadPoolServer
$ export THRIFT_NUM_CLIENTS=100
$ export THRIFT_NUM_PROCESSES=4
$ ruby benchmark/benchmark.rb
Generating TLS certificate and key...
Starting server...
Spawning benchmark processes...
Collecting output...
Translating output...
Analyzing output...

Server class: Thrift::ThreadPoolServer
Server interpreter: ruby
Client interpreter: ruby
Socket class: Thrift::Socket
Number of processes: 4
Clients per process: 100
Calls per client: 50
Using fastthread: no

Connection failures: 0
Connection errors: 0
Average time per call: 0.0012 seconds
Average time per client (50 calls): 0.0631 seconds
Total time for all calls: 24.2655 seconds
Real time for benchmarking: 6.6075 seconds
Shortest call time: 0.0001 seconds
Longest call time: 0.0602 seconds
Shortest client time (50 calls): 0.0500 seconds
Longest client time (50 calls): 0.0806 seconds


$ export THRIFT_TLS=true
$ export THRIFT_SERVER=Thrift::ThreadPoolServer
$ export THRIFT_NUM_CLIENTS=100
$ export THRIFT_NUM_PROCESSES=4
$ ruby benchmark/benchmark.rb
Generating TLS certificate and key...
Starting server...
Spawning benchmark processes...
Collecting output...
Translating output...
Analyzing output...

Server class: Thrift::ThreadPoolServer
Server interpreter: ruby
Client interpreter: ruby
Socket class: Thrift::Socket
Number of processes: 4
Clients per process: 100
Calls per client: 50
Using fastthread: no

Connection failures: 0
Connection errors: 0
Average time per call: 0.0004 seconds
Average time per client (50 calls): 0.0217 seconds
Total time for all calls: 7.5610 seconds
Real time for benchmarking: 2.4709 seconds
Shortest call time: 0.0001 seconds
Longest call time: 0.0049 seconds
Shortest client time (50 calls): 0.0095 seconds
Longest client time (50 calls): 0.0302 seconds

This is roughly a 3x performance improvement: the total time for all calls drops from 24.27 seconds to 7.56 seconds.
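The ratio can be checked directly from the two TLS runs above. A small sketch of the arithmetic, with the figures copied from the benchmark output (the hash keys are names made up for this example):

```ruby
# Figures copied from the two TLS runs above (50 calls per client).
before = { total_call_time: 24.2655, real_time: 6.6075 }
after  = { total_call_time: 7.5610,  real_time: 2.4709 }

# Speedup is the "before" time divided by the "after" time for each metric.
before.each_key do |metric|
  speedup = before[metric] / after[metric]
  puts format('%-16s %.1fx', metric, speedup)
end
```

Summed call time improves by a little over 3x, while wall-clock time for the whole benchmark improves somewhat less, since it also includes process startup and other fixed costs.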

TLS server with a single request per connection

Before / After (the first run is before the change, the second is after):

$ export THRIFT_TLS=true
$ export THRIFT_SERVER=Thrift::ThreadPoolServer
$ export THRIFT_NUM_CLIENTS=1000 THRIFT_NUM_PROCESSES=4
$ export THRIFT_NUM_CALLS=1
$ ruby benchmark/benchmark.rb
Generating TLS certificate and key...
Starting server...
Spawning benchmark processes...
Collecting output...
Translating output...
Analyzing output...

Server class: Thrift::ThreadPoolServer
Server interpreter: ruby
Client interpreter: ruby
Socket class: Thrift::Socket
Number of processes: 4
Clients per process: 1000
Calls per client: 1
Using fastthread: no

Connection failures: 0
Connection errors: 0
Average time per call: 0.0511 seconds
Average time per client (1 calls): 0.0529 seconds
Total time for all calls: 204.2465 seconds
Real time for benchmarking: 54.2907 seconds
Shortest call time: 0.0407 seconds
Longest call time: 0.0635 seconds
Shortest client time (1 calls): 0.0428 seconds
Longest client time (1 calls): 0.0651 seconds


$ export THRIFT_TLS=true
$ export THRIFT_SERVER=Thrift::ThreadPoolServer
$ export THRIFT_NUM_CLIENTS=1000
$ export THRIFT_NUM_PROCESSES=4
$ export THRIFT_NUM_CALLS=1
$ ruby benchmark/benchmark.rb
Generating TLS certificate and key...
Starting server...
Spawning benchmark processes...
Collecting output...
Translating output...
Analyzing output...

Server class: Thrift::ThreadPoolServer
Server interpreter: ruby
Client interpreter: ruby
Socket class: Thrift::Socket
Number of processes: 4
Clients per process: 1000
Calls per client: 1
Using fastthread: no

Connection failures: 0
Connection errors: 0
Average time per call: 0.0009 seconds
Average time per client (1 calls): 0.0033 seconds
Total time for all calls: 3.7327 seconds
Real time for benchmarking: 4.4765 seconds
Shortest call time: 0.0001 seconds
Longest call time: 0.0051 seconds
Shortest client time (1 calls): 0.0016 seconds
Longest client time (1 calls): 0.0079 seconds

The difference is 13x for the total time, and 68x when measured as the CPU time required to send those messages across all CPUs.

  • Did you create an Apache Jira ticket? THRIFT-5904
  • If a ticket exists: Does your pull request title follow the pattern "THRIFT-NNNN: describe my issue"?
  • Did you squash your changes to a single commit? (not required, but preferred)
  • Did you do your best to avoid breaking changes? If one was needed, did you label the Jira ticket with "Breaking-Change"?
  • If your change does not involve any code, include [skip ci] anywhere in the commit message to free up build resources.

@Jens-G Jens-G added the ruby label Nov 19, 2025
@Jens-G Jens-G changed the title ruby: Set TCP_NODELAY on accepted sockets Set TCP_NODELAY on accepted sockets Nov 19, 2025
@kpumuk kpumuk changed the title Set TCP_NODELAY on accepted sockets THRIFT-5904: Set TCP_NODELAY on accepted sockets Nov 20, 2025
@Jens-G Jens-G merged commit 10d5a65 into apache:master Nov 21, 2025
30 checks passed
@kpumuk kpumuk deleted the nodelay branch December 1, 2025 21:46