19 commits:

- `d6fe489` Adding latency benchmark (franz1981, Jan 13, 2026)
- `8ff7b3e` Variant with Netty timers + wait all requests to completes before shu… (franz1981, Jan 13, 2026)
- `1030b48` Enhance SchedulerHandoffBenchmark with IO_URING support and RTT tracking (franz1981, Jan 14, 2026)
- `a3d838c` Refactor SchedulerHandoffBenchmark to improve event loop thread handl… (franz1981, Jan 14, 2026)
- `58891c5` Refactor SchedulerHandoffBenchmark to use RequestData for request and… (franz1981, Jan 14, 2026)
- `bdee7cc` Update SchedulerHandoffBenchmark to adjust output time unit and add r… (franz1981, Jan 15, 2026)
- `6fdab26` Implement end 2 end benchmark + scripts (franz1981, Jan 15, 2026)
- `fd88fe7` Update run-benchmark.sh with default CPU affinities and JVM options, … (franz1981, Jan 15, 2026)
- `ab29edb` Update run-benchmark.sh to use wrk and wrk2 from hyperfoil catalog in… (franz1981, Jan 15, 2026)
- `3539b40` Switch to Apache HttpClient5 for blocking HTTP calls in HandoffHttpSe… (franz1981, Jan 15, 2026)
- `92707b6` Use BasicHttpClientConnectionManager for per-request HttpClient in Ha… (franz1981, Jan 15, 2026)
- `e176ce4` Remove bossGroup and use single workerGroup for Netty server initiali… (franz1981, Jan 15, 2026)
- `e916236` Add --threads option to asprof commands and update pidstat to remove … (franz1981, Jan 15, 2026)
- `9db6653` Fix concurrent usage of HttpGet and serialize access to the http client (franz1981, Jan 16, 2026)
- `5b7c5b0` Enable the fj scheduler test to set the FJ parallelism (franz1981, Jan 16, 2026)
- `53a76b8` improve the JMH benchmark to allow unbounded concurrency (not great TBH) (franz1981, Jan 16, 2026)
- `2f40284` optimize yield to happen only if required (franz1981, Jan 16, 2026)
- `cd874dc` Change back default server thread count and force wrk 2 to always pri… (franz1981, Jan 16, 2026)
- `e4a0718` add perf stat to the mix (franz1981, Jan 16, 2026)
`benchmark-runner/README.md` (new file, 229 additions)
# Benchmark Runner

A comprehensive benchmarking module for testing handoff strategies between Netty event loops and virtual threads.

## Overview

This module provides:

1. **HandoffHttpServer** - An HTTP server that:
   - Receives requests on Netty event loops
   - Hands off to virtual threads (with a configurable scheduler)
   - Makes blocking HTTP calls to a mock backend using Apache HttpClient 5
   - Parses JSON with Jackson
   - Returns the response via the event loop

2. **run-benchmark.sh** - A complete benchmarking script with:
   - Mock server management
   - CPU affinity control (taskset)
   - Warmup phase (no profiling)
   - Load generation via jbang wrk/wrk2
   - Async-profiler integration
   - pidstat monitoring
   - Duration configuration with validation
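
The duration handling in the list above can be sketched as follows; `parse_duration` is a hypothetical helper for illustration, assuming the `10s`/`30s` form used by the defaults:

```shell
# Hypothetical helper (not from the script): turn "10s"/"2m" style
# durations into seconds and reject anything else.
parse_duration() {
  case "$1" in
    *s) echo "${1%s}" ;;
    *m) echo "$(( ${1%m} * 60 ))" ;;
    *)  echo "invalid duration: $1" >&2; return 1 ;;
  esac
}

WARMUP_DURATION="${WARMUP_DURATION:-10s}"
TOTAL_DURATION="${TOTAL_DURATION:-30s}"
warmup=$(parse_duration "$WARMUP_DURATION") || exit 1
total=$(parse_duration "$TOTAL_DURATION") || exit 1

# The measured (and optionally profiled) window is whatever remains
# after the warmup, so the total must be strictly larger.
if [ "$total" -le "$warmup" ]; then
  echo "TOTAL_DURATION must exceed WARMUP_DURATION" >&2
  exit 1
fi
echo "measured window: $(( total - warmup ))s"
```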

## Quick Start

```bash
# Set JAVA_HOME to your Java 27 build
export JAVA_HOME=/path/to/jdk

# Build the module (includes both MockHttpServer and HandoffHttpServer)
mvn package -pl benchmark-runner -am -DskipTests

# Run a basic benchmark
cd benchmark-runner/scripts
./run-benchmark.sh
```

## Build Details

Everything is packaged in a single JAR: `benchmark-runner/target/benchmark-runner.jar`

| Class | Description |
|-------|-------------|
| `io.netty.loom.benchmark.runner.MockHttpServer` | Backend mock server (think time + JSON response) |
| `io.netty.loom.benchmark.runner.HandoffHttpServer` | Server under test (handoff logic) |

The `run-benchmark.sh` script will automatically build the JAR if missing.

## Configuration

All configuration is via environment variables:

### Mock Server
| Variable | Default | Description |
|----------|---------|-------------|
| `MOCK_PORT` | 8080 | Mock server port |
| `MOCK_THINK_TIME_MS` | 1 | Simulated processing delay (ms) |
| `MOCK_THREADS` | 1 | Number of Netty threads |
| `MOCK_TASKSET` | | CPU affinity (e.g., "0-1") |

### Handoff Server
| Variable | Default | Description |
|----------|---------|-------------|
| `SERVER_PORT` | 8081 | Server port |
| `SERVER_THREADS` | 1 | Number of event loop threads |
| `SERVER_USE_CUSTOM_SCHEDULER` | false | Use custom Netty scheduler |
| `SERVER_IO` | epoll | I/O type: epoll or nio |
| `SERVER_TASKSET` | | CPU affinity (e.g., "2-5") |
| `SERVER_JVM_ARGS` | | Additional JVM arguments |

### Load Generator
| Variable | Default | Description |
|----------|---------|-------------|
| `LOAD_GEN_CONNECTIONS` | 100 | Number of connections |
| `LOAD_GEN_THREADS` | 2 | Number of threads |
| `LOAD_GEN_RATE` | | Target request rate in req/s for wrk2 (empty = max throughput with wrk) |
| `LOAD_GEN_TASKSET` | | CPU affinity (e.g., "6-7") |
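
The rate knob is what selects the load generator: a sketch of the choice, assuming wrk for open-loop runs and wrk2 (with its `-R` requests/sec flag) when a rate is set. `load_gen_cmd` is an illustrative helper, not part of the script, which drives both tools via jbang:

```shell
# Build the load-generator command line from the env knobs.
# An empty LOAD_GEN_RATE means open-loop max throughput via wrk;
# a fixed rate switches to wrk2 and its -R (requests/sec) flag.
load_gen_cmd() {
  local threads="${LOAD_GEN_THREADS:-2}"
  local conns="${LOAD_GEN_CONNECTIONS:-100}"
  if [ -z "${LOAD_GEN_RATE:-}" ]; then
    echo "wrk -t$threads -c$conns"
  else
    echo "wrk2 -t$threads -c$conns -R$LOAD_GEN_RATE"
  fi
}

load_gen_cmd                          # open-loop run via wrk
( LOAD_GEN_RATE=10000; load_gen_cmd ) # closed-loop run via wrk2
```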

### Timing
| Variable | Default | Description |
|----------|---------|-------------|
| `WARMUP_DURATION` | 10s | Warmup duration (no profiling) |
| `TOTAL_DURATION` | 30s | Total test duration (includes warmup) |

### Profiling
| Variable | Default | Description |
|----------|---------|-------------|
| `ENABLE_PROFILER` | false | Enable async-profiler |
| `ASYNC_PROFILER_PATH` | | Path to async-profiler installation |
| `PROFILER_EVENT` | cpu | Profiler event type |
| `PROFILER_OUTPUT` | profile.html | Output filename |
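
Attaching the profiler after warmup could look like the sketch below. The `-e`, `-d`, `-f`, and `--threads` flags belong to async-profiler's `asprof` launcher; `SERVER_PID` and the 20 s window are placeholder assumptions, not values from the script:

```shell
# Illustrative: attach async-profiler's asprof launcher to the server
# for the measured window only (-d), per thread (--threads), writing
# an HTML flamegraph (-f). The PID and duration are placeholders.
PROFILER_EVENT="${PROFILER_EVENT:-cpu}"
PROFILER_OUTPUT="${PROFILER_OUTPUT:-profile.html}"
SERVER_PID="${SERVER_PID:-12345}"   # hypothetical server PID
asprof_cmd="asprof -e $PROFILER_EVENT --threads -d 20 -f $PROFILER_OUTPUT $SERVER_PID"
echo "$asprof_cmd"
```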

### pidstat
| Variable | Default | Description |
|----------|---------|-------------|
| `ENABLE_PIDSTAT` | false | Enable pidstat collection |
| `PIDSTAT_INTERVAL` | 1 | Collection interval (seconds) |
| `PIDSTAT_OUTPUT` | pidstat.log | Output filename |
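
The collection boils down to sampling the server process at the configured interval; a minimal sketch, with `SERVER_PID` as a placeholder and pidstat's `-t` flag supplying the per-thread rows:

```shell
# Illustrative: sample per-thread (-t) CPU usage of the server process
# every PIDSTAT_INTERVAL seconds, redirecting into the log file.
# SERVER_PID is a placeholder, not a variable exported by the script.
PIDSTAT_INTERVAL="${PIDSTAT_INTERVAL:-1}"
PIDSTAT_OUTPUT="${PIDSTAT_OUTPUT:-pidstat.log}"
SERVER_PID="${SERVER_PID:-12345}"   # hypothetical server PID
pidstat_cmd="pidstat -t -p $SERVER_PID $PIDSTAT_INTERVAL"
echo "$pidstat_cmd > $PIDSTAT_OUTPUT"
```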

## Example Runs

### Basic comparison: custom vs default scheduler

```bash
# With custom scheduler
JAVA_HOME=/path/to/jdk \
SERVER_USE_CUSTOM_SCHEDULER=true \
./run-benchmark.sh

# With default scheduler
JAVA_HOME=/path/to/jdk \
SERVER_USE_CUSTOM_SCHEDULER=false \
./run-benchmark.sh
```

### With CPU pinning

```bash
JAVA_HOME=/path/to/jdk \
MOCK_TASKSET="0" \
SERVER_TASKSET="1-4" \
LOAD_GEN_TASKSET="5-7" \
SERVER_THREADS=4 \
SERVER_USE_CUSTOM_SCHEDULER=true \
./run-benchmark.sh
```

### With profiling

```bash
JAVA_HOME=/path/to/jdk \
ENABLE_PROFILER=true \
ASYNC_PROFILER_PATH=/path/to/async-profiler \
PROFILER_EVENT=cpu \
SERVER_USE_CUSTOM_SCHEDULER=true \
WARMUP_DURATION=15s \
TOTAL_DURATION=45s \
./run-benchmark.sh
```

### Rate-limited test with wrk2

```bash
JAVA_HOME=/path/to/jdk \
LOAD_GEN_RATE=10000 \
LOAD_GEN_CONNECTIONS=200 \
TOTAL_DURATION=60s \
WARMUP_DURATION=15s \
./run-benchmark.sh
```

### With pidstat monitoring

```bash
JAVA_HOME=/path/to/jdk \
ENABLE_PIDSTAT=true \
PIDSTAT_INTERVAL=1 \
./run-benchmark.sh
```

## Output

Results are saved to `./benchmark-results/` (configurable via `OUTPUT_DIR`):

- `wrk-results.txt` - Load generator output with throughput/latency
- `profile.html` - Flamegraph (if profiling enabled)
- `pidstat.log` - Thread-level CPU usage (if pidstat enabled)
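
The headline numbers can then be pulled out of `wrk-results.txt` with standard tools; a sketch assuming wrk's usual summary format (the sample figures below are made up for illustration):

```shell
# Sample of wrk's summary format, as it would land in wrk-results.txt:
cat > /tmp/wrk-results.txt <<'EOF'
    Latency     1.23ms    0.45ms   9.87ms   91.00%
Requests/sec:  98765.43
EOF
# awk extracts the throughput line and the average-latency column.
awk '/^Requests\/sec/ { print "throughput:", $2 }' /tmp/wrk-results.txt
awk '/Latency/        { print "avg latency:", $2 }' /tmp/wrk-results.txt
```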

## Architecture

```
┌─────────────────┐     ┌──────────────────────────┐     ┌─────────────────┐
│    wrk/wrk2     │────▶│   HandoffHttpServer      │────▶│ MockHttpServer  │
│   (load gen)    │     │                          │     │                 │
└─────────────────┘     │ 1. Receive on EL         │     │  Think time +   │
                        │ 2. Handoff to VThread    │     │  JSON response  │
                        │ 3. Blocking HTTP call    │     │                 │
                        │ 4. Parse JSON (Jackson)  │     └─────────────────┘
                        │ 5. Write back on EL      │
                        └──────────────────────────┘
```

## Running Manually

### Mock Server

```bash
java -cp benchmark-runner/target/benchmark-runner.jar \
  io.netty.loom.benchmark.runner.MockHttpServer \
  8080 1 1   # port, thinkTimeMs, threads
```

### Handoff Server (with custom scheduler)

```bash
java \
  --add-opens=java.base/java.lang=ALL-UNNAMED \
  -XX:+UnlockExperimentalVMOptions \
  -XX:-DoJVMTIVirtualThreadTransitions \
  -Djdk.trackAllThreads=false \
  -Djdk.virtualThreadScheduler.implClass=io.netty.loom.NettyScheduler \
  -Djdk.pollerMode=3 \
  -cp benchmark-runner/target/benchmark-runner.jar \
  io.netty.loom.benchmark.runner.HandoffHttpServer \
  --port 8081 \
  --mock-url http://localhost:8080/fruits \
  --threads 2 \
  --use-custom-scheduler true \
  --io epoll
```

### Handoff Server (with default scheduler)

```bash
java \
  --add-opens=java.base/java.lang=ALL-UNNAMED \
  -XX:+UnlockExperimentalVMOptions \
  -XX:-DoJVMTIVirtualThreadTransitions \
  -Djdk.trackAllThreads=false \
  -cp benchmark-runner/target/benchmark-runner.jar \
  io.netty.loom.benchmark.runner.HandoffHttpServer \
  --port 8081 \
  --mock-url http://localhost:8080/fruits \
  --threads 2 \
  --use-custom-scheduler false \
  --io epoll
```

`benchmark-runner/pom.xml` (new file, 99 additions)
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xmlns="http://maven.apache.org/POM/4.0.0"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <parent>
    <groupId>io.netty.loom</groupId>
    <artifactId>netty-virtualthread-parent</artifactId>
    <version>1.0-SNAPSHOT</version>
  </parent>

  <artifactId>benchmark-runner</artifactId>
  <packaging>jar</packaging>

  <dependencies>
    <dependency>
      <groupId>io.netty.loom</groupId>
      <artifactId>netty-virtualthread-core</artifactId>
      <version>${project.version}</version>
    </dependency>
    <dependency>
      <groupId>io.netty</groupId>
      <artifactId>netty-all</artifactId>
    </dependency>
    <!-- Jackson for JSON processing -->
    <dependency>
      <groupId>com.fasterxml.jackson.core</groupId>
      <artifactId>jackson-databind</artifactId>
      <version>2.17.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.httpcomponents.client5</groupId>
      <artifactId>httpclient5</artifactId>
      <version>5.6</version>
      <scope>compile</scope>
    </dependency>
    <!-- Test dependencies -->
    <dependency>
      <groupId>org.junit.jupiter</groupId>
      <artifactId>junit-jupiter-api</artifactId>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.junit.jupiter</groupId>
      <artifactId>junit-jupiter-params</artifactId>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>io.rest-assured</groupId>
      <artifactId>rest-assured</artifactId>
      <version>5.4.0</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.awaitility</groupId>
      <artifactId>awaitility</artifactId>
      <version>4.2.0</version>
      <scope>test</scope>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>3.6.1</version>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
            <configuration>
              <finalName>benchmark-runner</finalName>
              <createDependencyReducedPom>false</createDependencyReducedPom>
              <transformers>
                <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                  <mainClass>io.netty.loom.benchmark.runner.HandoffHttpServer</mainClass>
                </transformer>
              </transformers>
            </configuration>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.14.1</version>
        <configuration>
          <source>${java.version}</source>
          <target>${java.version}</target>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>
