
Conversation

@franz1981
Owner

This pull request introduces a new, comprehensive benchmarking module for evaluating handoff strategies between Netty event loops and virtual threads. It provides an HTTP server (HandoffHttpServer) that demonstrates the handoff logic, a mock backend server, configuration via environment variables, and a benchmarking script. The module is packaged as a Maven project with all dependencies and includes POJOs for JSON serialization/deserialization.

Benchmarking infrastructure and documentation

  • Added a detailed README.md describing the benchmarking module, usage instructions, environment variable configuration, example runs, output files, and manual server invocation.
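
As a minimal sketch of what env-var configuration along these lines could look like: SERVER_POLLER_MODE and SERVER_FJ_PARALLELISM appear in the commit titles later in this thread, but the defaults and parsing below are illustrative assumptions, not the module's actual code.

```java
// Hypothetical configuration holder; only the two variable names come
// from the commits, the defaults here are illustrative assumptions.
final class BenchmarkConfig {
    // e.g. "epoll" vs "nio" transport selection
    static final String POLLER_MODE =
            System.getenv().getOrDefault("SERVER_POLLER_MODE", "nio");

    // Cap for the built-in virtual-thread scheduler's ForkJoinPool;
    // <= 0 means "keep the JDK default".
    static final int FJ_PARALLELISM = Integer.parseInt(
            System.getenv().getOrDefault("SERVER_FJ_PARALLELISM", "0"));

    static void applyFjParallelism() {
        if (FJ_PARALLELISM > 0) {
            // This property is read once, before the first virtual
            // thread starts, so it must be set early in main().
            System.setProperty("jdk.virtualThreadScheduler.parallelism",
                    Integer.toString(FJ_PARALLELISM));
        }
    }
}
```

The jdk.virtualThreadScheduler.parallelism system property is the standard way to cap the built-in scheduler's parallelism, which matches the "FJ parallelism capped" tuning in the results below.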

Maven project setup

  • Added benchmark-runner/pom.xml to define the new Maven module, including dependencies for Netty, Jackson, Apache HttpClient, and test libraries. Configured the Maven Shade Plugin to produce a fat JAR with the correct main class.

Server implementation

  • Implemented HandoffHttpServer.java, an HTTP server that receives requests on Netty event loops, hands off processing to virtual threads (with optional custom scheduler), makes blocking HTTP calls to a mock backend, parses JSON responses, and writes results back to clients. Includes argument parsing and usage help.
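
For readers skimming the thread, the core handoff pattern might look roughly like the sketch below. This is illustrative only, assuming a virtual-thread-per-task executor and a placeholder callBackendAndParse() helper; it is not the actual HandoffHttpServer code, and the custom-scheduler variant benchmarked here would plug in where the executor is created.

```java
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.SimpleChannelInboundHandler;
import io.netty.handler.codec.http.*;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Receives requests on the Netty event loop and hands the blocking
// work off to a virtual thread, so the event loop is never blocked.
final class HandoffHandler extends SimpleChannelInboundHandler<FullHttpRequest> {

    // One virtual thread per request; a custom scheduler would be
    // wired in here instead of the JDK's built-in one.
    private static final ExecutorService VIRTUAL_THREADS =
            Executors.newVirtualThreadPerTaskExecutor();

    @Override
    protected void channelRead0(ChannelHandlerContext ctx, FullHttpRequest req) {
        // Note: req is not touched inside the lambda; if it were, it
        // would need retain(), since SimpleChannelInboundHandler
        // releases it once this method returns.
        VIRTUAL_THREADS.execute(() -> {
            // Blocking HTTP call to the mock backend plus JSON parsing,
            // both fine on a virtual thread (illustrative helper).
            String body = callBackendAndParse();
            FullHttpResponse resp = new DefaultFullHttpResponse(
                    HttpVersion.HTTP_1_1, HttpResponseStatus.OK,
                    ctx.alloc().buffer()
                       .writeBytes(body.getBytes(StandardCharsets.UTF_8)));
            resp.headers().set(HttpHeaderNames.CONTENT_TYPE, "text/plain");
            resp.headers().setInt(HttpHeaderNames.CONTENT_LENGTH,
                    resp.content().readableBytes());
            // Safe from a non-event-loop thread: Netty re-schedules the
            // write onto the channel's event loop.
            ctx.writeAndFlush(resp);
        });
    }

    private static String callBackendAndParse() {
        // Placeholder for the blocking call + Jackson parsing described above.
        return "ok";
    }
}
```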

Data model for JSON handling

  • Added Fruit.java and FruitsResponse.java POJOs to represent the mock server's JSON response structure, using Jackson annotations for robust serialization/deserialization.
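
The summary above doesn't show the actual fields, but shapes along the following lines would match the description. The field names are illustrative assumptions, and the types are sketched as records for brevity even though the PR describes them as POJOs.

```java
import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import java.util.List;

// Illustrative shape of the mock backend's JSON payload; the real
// fields in Fruit.java / FruitsResponse.java may differ.
@JsonIgnoreProperties(ignoreUnknown = true)
record Fruit(String name, int quantity) { }

@JsonIgnoreProperties(ignoreUnknown = true)
record FruitsResponse(List<Fruit> fruits) { }
```

Ignoring unknown properties is the usual way to keep deserialization robust if the backend payload grows new fields; deserializing is then a one-liner like new ObjectMapper().readValue(json, FruitsResponse.class).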

Fixes #64 (Write a benchmark to compare against the default scheduler).

Commits:

  • …add SERVER_POLLER_MODE and SERVER_FJ_PARALLELISM support
  • …zation in HandoffHttpServer and MockHttpServer
@franz1981
Owner Author

The results so far are great:

  • custom scheduler with EPOLL: Requests/sec: 65844.80
  • built-in scheduler, untuned (FJ parallelism not set): Requests/sec: 49578.43
  • built-in scheduler, tuned (FJ parallelism capped + Netty event loops capped): Requests/sec: 55756.87

@franz1981
Owner Author

franz1981 commented Jan 16, 2026

@He-Pin And it's not finished yet...
I've decided to run another test at 40K tps, which looks sustainable by both the built-in and the custom scheduler, and...

  • custom scheduler during steady state: avg 1.3 cores
  • built-in scheduler, untuned, during steady state: avg 1.8 cores

Latencies are fluctuating because the custom scheduler's CPU usage is so much lower that cores go idle much more often, but p99 is still halved.

@He-Pin

He-Pin commented Jan 17, 2026

Would it be better to run on a bare metal box?

@franz1981
Owner Author

franz1981 commented Jan 17, 2026

@He-Pin for CPU-bound scenarios like this, I would expect quite similar behaviour tbh: the cost of syscalls could be lower (though NAPI polling can still kick in), plus additional context switches due to IRQ handling.

If it's related to the latencies: my "local" box at home is a Threadripper heavily tuned for low latency (turbo boost disabled, fixed frequencies set, etc.).
