Commit ff5128b

Merge pull request #2183 from iJobsYuYing/main
add Java Performance Analysis - FlameGraph
2 parents ea51ef3 + 794cc1d commit ff5128b

11 files changed

+239
-1
lines changed

Lines changed: 100 additions & 0 deletions
@@ -0,0 +1,100 @@
1+
---
2+
title: Set Up Tomcat Benchmark Environment
3+
weight: 2
4+
5+
### FIXED, DO NOT MODIFY
6+
layout: learningpathall
7+
---
8+
9+
10+
## Before You Begin
11+
- Java applications can be analyzed with many methods and tools. Call-stack flame graphs are a common entry-level approach, so generating them is a basic skill.
12+
- Several tools can generate Java flame graphs, including async-profiler, a Java agent, jstack, and JFR (Java Flight Recorder).
13+
- This Learning Path introduces two easy-to-use methods: async-profiler and a Java agent.
14+
15+
16+
## Set Up Benchmark Server - Tomcat
17+
- [Apache Tomcat](https://tomcat.apache.org/) is an open-source Java Servlet container that enables running Java web applications, handling HTTP requests and serving dynamic content.
18+
- As a core component in Java web development, Apache Tomcat supports Servlet, JSP, and WebSocket technologies, providing a lightweight runtime environment for web apps.
19+
20+
1. Tomcat runs on the JVM, so install the Java Development Kit (JDK) first.
21+
```bash
22+
sudo apt update
23+
sudo apt install -y openjdk-21-jdk
24+
```
25+
26+
2. Install Tomcat by either [building it from source](https://github.com/apache/tomcat) or downloading the pre-built package from [the official website](https://tomcat.apache.org/whichversion.html). If version 11.0.9 is no longer available on the download mirror, older releases are kept in the Apache archive.
27+
```bash
28+
wget -c https://dlcdn.apache.org/tomcat/tomcat-11/v11.0.9/bin/apache-tomcat-11.0.9.tar.gz
29+
tar xzf apache-tomcat-11.0.9.tar.gz
30+
```
31+
32+
3. By default, the built-in Tomcat examples accept connections only from localhost. To access them through an intranet or external IP address, edit the examples' `context.xml` before starting the server.
33+
```bash
34+
vim apache-tomcat-11.0.9/webapps/examples/META-INF/context.xml
35+
# change <Valve className="org.apache.catalina.valves.RemoteAddrValve" allow="127\.\d+\.\d+\.\d+|::1|0:0:0:0:0:0:0:1" />
36+
# to
37+
# <Valve className="org.apache.catalina.valves.RemoteAddrValve" allow=".*" />
38+
39+
# now you can start Tomcat Server
40+
./apache-tomcat-11.0.9/bin/startup.sh
41+
```
42+
43+
4. If you can open `http://${tomcat_ip}:8080/examples` in a browser, the server is ready and you can proceed to the benchmarking step.
44+
45+
![Tomcat home page#center](./_images/lp-tomcat-homepage.png "Tomcat-HomePage")
46+
47+
![Tomcat examples page#center](./_images/lp-tomcat-examples.png "Tomcat-Examples")
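
The manual edit in step 3 can also be scripted. The sketch below is one possible way to do it and is not part of the original steps: `allow_all_addresses` is a hypothetical helper name, and it assumes the file contains a single `allow="..."` attribute (the one on `RemoteAddrValve`). Because `allow=".*"` admits every client address, treat it as a test-environment setting only.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original steps): open the examples
# webapp to all client addresses by rewriting the allow="..." attribute.
# Assumption: the file has only one allow="..." attribute.
allow_all_addresses() {
  local context_file="$1"
  # Keep a backup so the default localhost-only rule can be restored.
  cp "$context_file" "$context_file.bak"
  # Replace whatever pattern sits inside allow="..." with .*
  sed -E 's/allow="[^"]*"/allow=".*"/' "$context_file.bak" > "$context_file"
}
```

For example, `allow_all_addresses apache-tomcat-11.0.9/webapps/examples/META-INF/context.xml` rewrites the file in place and leaves the original next to it as `context.xml.bak`.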
48+
49+
## Set Up Benchmark Client - [wrk2](https://github.com/giltene/wrk2)
50+
- wrk2 is a high-performance HTTP benchmarking tool specialized in generating constant throughput loads and measuring latency percentiles for web services.
51+
- As an enhanced version of wrk, wrk2 provides accurate latency statistics under controlled request rates, ideal for performance testing of HTTP servers.
52+
53+
1. Before building wrk2, install the required build tools and libraries.
54+
```bash
55+
sudo apt-get update
56+
sudo apt-get install -y build-essential libssl-dev git zlib1g-dev
57+
```
58+
59+
2. Now you can clone and build it from source.
60+
```bash
61+
git clone https://github.com/giltene/wrk2.git
62+
cd wrk2
63+
make
64+
# move the executable to somewhere in your PATH
65+
sudo cp wrk /usr/local/bin
66+
```
67+
68+
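3. Finally, run the benchmark against Tomcat with wrk2. The options request 32 connections (`-c32`) across 16 threads (`-t16`), a constant rate of 50,000 requests per second (`-R50000`), and a 60-second run (`-d60`).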
69+
```bash
70+
wrk -c32 -t16 -R50000 -d60 http://${tomcat_ip}:8080/examples/servlets/servlet/HelloWorldExample
71+
```
72+
Example output from wrk2:
73+
```console
74+
Running 1m test @ http://172.26.203.139:8080/examples/servlets/servlet/HelloWorldExample
75+
16 threads and 32 connections
76+
Thread calibration: mean lat.: 0.986ms, rate sampling interval: 10ms
77+
Thread calibration: mean lat.: 0.984ms, rate sampling interval: 10ms
78+
Thread calibration: mean lat.: 0.999ms, rate sampling interval: 10ms
79+
Thread calibration: mean lat.: 0.994ms, rate sampling interval: 10ms
80+
Thread calibration: mean lat.: 0.983ms, rate sampling interval: 10ms
81+
Thread calibration: mean lat.: 0.989ms, rate sampling interval: 10ms
82+
Thread calibration: mean lat.: 0.991ms, rate sampling interval: 10ms
83+
Thread calibration: mean lat.: 0.993ms, rate sampling interval: 10ms
84+
Thread calibration: mean lat.: 0.985ms, rate sampling interval: 10ms
85+
Thread calibration: mean lat.: 0.990ms, rate sampling interval: 10ms
86+
Thread calibration: mean lat.: 0.987ms, rate sampling interval: 10ms
87+
Thread calibration: mean lat.: 0.990ms, rate sampling interval: 10ms
88+
Thread calibration: mean lat.: 0.984ms, rate sampling interval: 10ms
89+
Thread calibration: mean lat.: 0.991ms, rate sampling interval: 10ms
90+
Thread calibration: mean lat.: 0.978ms, rate sampling interval: 10ms
91+
Thread calibration: mean lat.: 0.976ms, rate sampling interval: 10ms
92+
Thread Stats Avg Stdev Max +/- Stdev
93+
Latency 1.00ms 454.90us 5.09ms 63.98%
94+
Req/Sec 3.31k 241.68 4.89k 63.83%
95+
2999817 requests in 1.00m, 1.56GB read
96+
Requests/sec: 49997.08
97+
Transfer/sec: 26.57MB
98+
```
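
To confirm that a run sustained the requested rate, the `Requests/sec` line can be compared with the `-R` target. The following is a minimal sketch, not part of the original steps: `achieved_rate` and `rate_reached` are hypothetical helper names, and the default 99% threshold is an arbitrary assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not from the original steps).

# Print the "Requests/sec" value from wrk2 output read on stdin.
achieved_rate() {
  awk '/Requests\/sec:/ {print $2}'
}

# Succeed if the achieved rate is at least <percent>% of the target rate.
# The default threshold of 99 is an assumption; pick what suits your test.
rate_reached() {
  local achieved="$1" target="$2" percent="${3:-99}"
  awk -v a="$achieved" -v t="$target" -v p="$percent" \
    'BEGIN { exit !(a * 100 >= t * p) }'
}
```

A possible use is to save the output with `wrk ... | tee wrk.log` and then run `rate_reached "$(achieved_rate < wrk.log)" 50000`. If the check fails, the server or the load generator could not keep up with the target rate, and the latency figures should be read with that in mind.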
99+
100+
Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
1+
---
2+
title: Java FlameGraph - Async-profiler
3+
weight: 3
4+
5+
### FIXED, DO NOT MODIFY
6+
layout: learningpathall
7+
---
8+
9+
## Java Flame Graph Generation via [async-profiler](https://github.com/async-profiler/async-profiler) (Recommended)
10+
- async-profiler is a low-overhead sampling profiler for JVM applications, capable of capturing CPU, allocation, and lock events to generate actionable performance insights.
11+
- A lightweight tool for Java performance analysis, async-profiler produces flame graphs and detailed stack traces with minimal runtime impact, suitable for production environments.
12+
13+
Deploy async-profiler on the same machine as Tomcat, because it attaches to the local JVM process.
14+
1. Download async-profiler 4.0 and extract it
15+
```bash
16+
wget -c https://github.com/async-profiler/async-profiler/releases/download/v4.0/async-profiler-4.0-linux-arm64.tar.gz
17+
tar xzf async-profiler-4.0-linux-arm64.tar.gz
18+
```
19+
20+
2. Run async-profiler to profile the Tomcat instance under benchmarking
21+
```bash
22+
cd async-profiler-4.0-linux-arm64/bin
23+
./asprof -d 10 -f profile.html $(jps | awk '/Bootstrap/ {print $1}')
24+
# or
25+
./asprof -d 10 -f profile.html ${tomcat_process_id}
26+
```
27+
28+
3. Open profile.html in a browser to analyze the profiling result
29+
30+
![Java flame graph generated by async-profiler#center](_images/lp-flamegraph-async.png "Java Flame Graph via async-profiler")
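
The `jps | awk` lookup in step 2 matches any process whose name contains `Bootstrap` and passes several PIDs to `asprof` if more than one matches. A stricter lookup, shown here only as a sketch, parses `jps -l` output and insists on exactly one Tomcat JVM. `tomcat_pid_from_jps` is a hypothetical helper name; it assumes Tomcat was started through `startup.sh`, so that its main class is `org.apache.catalina.startup.Bootstrap`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original steps).
# Read `jps -l` output on stdin and print the PID of the Tomcat JVM.
# Fails unless exactly one org.apache.catalina.startup.Bootstrap
# process is listed.
tomcat_pid_from_jps() {
  awk '$2 == "org.apache.catalina.startup.Bootstrap" { pid = $1; n++ }
       END { if (n == 1) print pid; else exit 1 }'
}
```

It could be used as `pid=$(jps -l | tomcat_pid_from_jps) && ./asprof -d 10 -f profile.html "$pid"`, which skips profiling when no Tomcat process, or more than one, is found.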
Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
1+
---
2+
title: Java FlameGraph - Java Agent
3+
weight: 4
4+
5+
6+
### FIXED, DO NOT MODIFY
7+
layout: learningpathall
8+
---
9+
10+
## Java Flame Graph Generation via Java agent and perf
11+
To profile a Java application with perf and resolve Java symbols correctly, you must load the `libperf-jvmti.so` agent when launching the JVM.
12+
- libperf-jvmti.so is a JVM TI agent library enabling perf to resolve Java symbols, facilitating accurate profiling of Java applications.
13+
- A specialized shared library, libperf-jvmti.so bridges perf and the JVM, enabling proper translation of memory addresses to Java method names during profiling.
14+
15+
1. Locate `libperf-jvmti.so`, add it to the Java options, and restart Tomcat
16+
```bash
17+
vi apache-tomcat-11.0.9/bin/catalina.sh
18+
# add the line below; adjust the linux-tools path to match your installed version
# JAVA_OPTS="$JAVA_OPTS -agentpath:/usr/lib/linux-tools-6.8.0-63/libperf-jvmti.so -XX:+PreserveFramePointer"
19+
cd apache-tomcat-11.0.9/bin
20+
./shutdown.sh
21+
./startup.sh
22+
```
23+
24+
2. Profile Tomcat with perf while the wrk2 load is running (restart wrk if necessary)
25+
```bash
26+
sudo perf record -g -k1 -p $(jps | awk '/Bootstrap/ {print $1}') -- sleep 10
27+
```
28+
29+
3. Convert the collected perf.data file into a Java flame graph using FlameGraph
30+
```bash
31+
git clone https://github.com/brendangregg/FlameGraph.git
32+
export PATH=$PATH:$(pwd)/FlameGraph
33+
sudo perf inject -j -i perf.data | perf script | stackcollapse-perf.pl | flamegraph.pl > profile.svg
34+
```
35+
36+
4. Open profile.svg in a browser to analyze the profiling result
37+
38+
![Java flame graph generated with the Java agent and perf#center](_images/lp-flamegraph-agent.png "Java Flame Graph via Java agent and perf")
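
Step 1 says to find `libperf-jvmti.so` but does not show how, and the directory name changes with the installed linux-tools version. The sketch below is one possible approach, not the method used above. It searches for the library and writes the option into `bin/setenv.sh`, which `catalina.sh` reads at startup if the file exists, so `catalina.sh` itself stays unmodified. `find_jvmti_agent` and `write_setenv` are hypothetical helper names, and `write_setenv` overwrites any existing `setenv.sh`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not from the original steps).

# Print the first libperf-jvmti.so found under the given directory
# (defaults to /usr/lib, where the path shown in step 1 lives).
find_jvmti_agent() {
  find "${1:-/usr/lib}" -name 'libperf-jvmti.so' 2>/dev/null | head -n 1
}

# Write a setenv.sh into Tomcat's bin directory that loads the agent.
# Note: this overwrites an existing setenv.sh.
write_setenv() {
  local tomcat_bin="$1" agent="$2"
  printf 'JAVA_OPTS="$JAVA_OPTS -agentpath:%s -XX:+PreserveFramePointer"\n' \
    "$agent" > "$tomcat_bin/setenv.sh"
}
```

For example, `write_setenv apache-tomcat-11.0.9/bin "$(find_jvmti_agent)"` followed by `./shutdown.sh` and `./startup.sh` restarts Tomcat with the agent loaded. Check first that `find_jvmti_agent` prints a path, since an empty result means the library is not installed.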
Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
1+
---
2+
title: Java Performance Analysis - FlameGraph
3+
4+
draft: true
5+
cascade:
6+
draft: true
7+
8+
minutes_to_complete: 30
9+
10+
who_is_this_for: This is an introductory guide for developers who want to analyze the performance of Java applications on the Arm Neoverse platform using flame graphs.
11+
12+
learning_objectives:
13+
- Set up a Tomcat benchmark environment
14+
- Generate flame graphs for Java applications using async-profiler
15+
- Generate flame graphs for Java applications using a Java agent and perf
16+
17+
prerequisites:
18+
- Basic familiarity with Java applications
19+
- Basic familiarity with flame graphs
20+
- Basic familiarity with Tomcat and wrk2
21+
22+
author: Ying Yu, Martin Ma
23+
24+
### Tags
25+
skilllevels: Introductory
26+
subjects: Java Performance Analysis
27+
armips:
28+
- Neoverse
29+
30+
tools_software_languages:
31+
- OpenJDK-21
32+
- Tomcat
33+
- Async-profiler
34+
- FlameGraph
35+
- wrk2
36+
operatingsystems:
37+
- Ubuntu 24
38+
39+
40+
further_reading:
41+
- resource:
42+
title: PLACEHOLDER MANUAL
43+
link: PLACEHOLDER MANUAL LINK
44+
type: documentation
45+
- resource:
46+
title: PLACEHOLDER BLOG
47+
link: PLACEHOLDER BLOG LINK
48+
type: blog
49+
- resource:
50+
title: PLACEHOLDER GENERAL WEBSITE
51+
link: PLACEHOLDER GENERAL WEBSITE LINK
52+
type: website
53+
54+
55+
56+
### FIXED, DO NOT MODIFY
57+
# ================================================================================
58+
weight: 1 # _index.md always has weight of 1 to order correctly
59+
layout: "learningpathall" # All files under learning paths have this same wrapper
60+
learning_path_main_page: "yes" # This should be surfaced when looking for related content. Only set for _index.md of learning path content.
61+
---
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
1+
---
2+
# ================================================================================
3+
# FIXED, DO NOT MODIFY THIS FILE
4+
# ================================================================================
5+
weight: 21 # Set to always be larger than the content in this path to be at the end of the navigation.
6+
title: "Next Steps" # Always the same, html page title.
7+
layout: "learningpathall" # All files under learning paths have this same wrapper for Hugo processing.
8+
---

data/stats_current_test_info.yml

Lines changed: 1 addition & 0 deletions
@@ -196,3 +196,4 @@ sw_categories:
196196
zlib:
197197
readable_title: Learn how to build and use Cloudflare zlib on Arm servers
198198
tests_and_status: []
199+
