Commit 5f3d121

About perf (#45)
* introduce wamr jit perf support
* a blog to introduce profiling with perf in wamr
1 parent e7375a7 commit 5f3d121

2 files changed: +16087 -0 lines changed

Lines changed: 122 additions & 0 deletions
@@ -0,0 +1,122 @@
---
title: "Profile Wasm applications with perf in WAMR JIT"
description: "WAMR JIT supports linux perf"
excerpt: ""
date: 2023-12-22T13:14:15+08:00
lastmod: 2023-12-22T13:14:15+08:00
draft: false
weight: 50
images: []
categories: ["profiling", "tool"]
tags: ["profiling"]
contributors: ["lum1n0us"]
pinned: false
homepage: false
---

Profiling a Wasm application can provide valuable insights into its performance. In this blog post, we'll explore how to use [linux perf](https://perf.wiki.kernel.org/index.php/Main_Page) to analyze Wasm applications running on WAMR with JIT compilation.

Linux perf is a versatile performance analysis tool that helps developers understand and optimize the behavior of their applications. It provides detailed information about various aspects of program execution, including CPU usage, memory access patterns, and function call traces.

## With perf report

Let's dive into profiling a Wasm application using WAMR (AOT and JIT) and linux perf.

1. Before profiling, recompile WAMR with the LLVM JIT and AOT compilation options enabled:

```bash
$ cmake -S . -B build -DWAMR_BUILD_JIT=1 -DWAMR_BUILD_AOT=1
```

2. Use perf to profile the application

```bash
# perf.data.raw is the perf output. It records the call stack of every sample event,
# but it can't yet translate jitted function addresses into jitted function names.
$ perf record -k mono -g --output=perf.data.raw -- iwasm --perf-profile <.wasm or .aot>
```

2.1 Merge jitted symbol information

*This step is needed only if iwasm runs in JIT mode. AOT doesn't need it.*

```bash
# read the jit-xxx.dump file generated by WAMR and merge the jitted symbol information
$ perf inject --jit --input=perf.data.raw --output=perf.data
```
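
The JIT-only rule above can be folded into a small wrapper. The following is a minimal sketch, not part of WAMR: the function names `needs_jit_inject` and `post_record_cmd` are hypothetical, and it assumes that a `.aot` input runs in AOT mode while any other input runs under the JIT. For AOT it simply copies _perf.data.raw_ so that later steps can keep using _perf.data_; that is a convenience assumed here, not something the workflow above prescribes.

```shell
#!/bin/sh
# Hypothetical helper: decide whether `perf inject --jit` is needed.
# Assumption: a .aot file runs in AOT mode; any other input runs under the JIT.
needs_jit_inject() {
    case "$1" in
        *.aot) return 1 ;;  # AOT: perf.data.raw can be used as is
        *)     return 0 ;;  # JIT: merge symbols from the jit-xxx.dump file
    esac
}

# Print the command that would produce perf.data for the given input file.
post_record_cmd() {
    if needs_jit_inject "$1"; then
        echo "perf inject --jit --input=perf.data.raw --output=perf.data"
    else
        # Assumed convenience for AOT: reuse the raw recording unchanged.
        echo "cp perf.data.raw perf.data"
    fi
}
```

For example, `post_record_cmd app.wasm` prints the `perf inject` command, while `post_record_cmd app.aot` prints the plain copy.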

Use `perf report` to review _perf.data_, which contains the performance counter profile recorded by `perf record`.

```
    76.07%     0.00%  iwasm  libc.so.6  [.] __libc_start_call_main
            |
            ---__libc_start_call_main
               main
               |
                --68.33%--app_instance_main
                          wasm_application_execute_main
                          execute_main
                          wasm_runtime_call_wasm
                          wasm_call_function
                          call_wasm_with_hw_bound_check
                          wasm_interp_call_wasm
                          llvm_jit_call_func_bytecode
                          wasm_runtime_invoke_native
                          push_args_end
                          aot_func#1
                          aot_func#32
                          |
                           --68.33%--aot_func#5
                                     |
                                      --68.32%--aot_func#4
                                                |
                                                 --68.19%--aot_func#2

    68.33%     0.00%  iwasm  iwasm      [.] app_instance_main
            |
            ---app_instance_main
               wasm_application_execute_main
               execute_main
               wasm_runtime_call_wasm
               wasm_call_function
               call_wasm_with_hw_bound_check
               wasm_interp_call_wasm
               llvm_jit_call_func_bytecode
               wasm_runtime_invoke_native
               push_args_end
               aot_func#1
               aot_func#32
               |
                --68.33%--aot_func#5
                          |
                           --68.32%--aot_func#4
                                     |
                                      --68.19%--aot_func#2
```
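
When the report is long, it can help to reduce it to its top-level entries. The sketch below is illustrative only: the function name `report_entries` is hypothetical, and it assumes a text report with the column layout shown above (two percentages, then command, shared object, and symbol) and symbol names without spaces.

```shell
#!/bin/sh
# Print "first% second% symbol" for each top-level entry of a text report.
# Call-graph lines ("|", "--68.33%--...") are skipped because their first
# two fields are not both plain percentages.
report_entries() {
    awk '$1 ~ /^[0-9.]+%$/ && $2 ~ /^[0-9.]+%$/ { print $1, $2, $NF }'
}
```

It reads the report from standard input, so a saved text report can be piped or redirected into it.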

## With Flamegraph

[Flame graphs](https://github.com/brendangregg/FlameGraph) are a visualization of the sampled call stacks of a program. They give a clear overview of where CPU time is spent during program execution.

The steps below work on the _perf.data_ generated in the previous section. Download [FlameGraph](https://github.com/brendangregg/FlameGraph) first.

```bash
$ perf script -i perf.data > out.perf
# fold stacks
$ ./FlameGraph/stackcollapse-perf.pl out.perf > out.folded
# render a flame graph
$ ./FlameGraph/flamegraph.pl out.folded > perf.svg
```
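
Each line of _out.folded_ is a semicolon-separated call stack followed by a sample count, so the file is easy to inspect before rendering. As a rough sketch (the function name `top_leaf_frames` is hypothetical and not part of FlameGraph), the following sums the samples per leaf frame, that is, the function that was executing when each sample was taken:

```shell
#!/bin/sh
# Sum sample counts per leaf frame (the last frame of each folded stack).
# Input: folded stacks on stdin, one "frame;frame;...;leaf COUNT" per line.
# Output: "TOTAL leaf" lines, largest total first.
top_leaf_frames() {
    awk '{
        count = $NF                       # last field is the sample count
        stack = $0
        sub(/[ \t]+[0-9]+$/, "", stack)   # drop the trailing count
        n = split(stack, frames, ";")
        total[frames[n]] += count
    }
    END { for (f in total) print total[f], f }' | sort -k1,1nr -k2
}
```

A typical use would be `top_leaf_frames < out.folded | head`, which gives a quick textual view of the hottest functions.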

Jitted functions all have names of the form _aot_func#NUMBER_, which makes the flame graph hard to read. The script _trans_wasm_func_name.py_ translates them back.

```bash
# translate jitted function names into their original wasm function names
$ python trans_wasm_func_name.py --wabt_home <wabt installation> --folded out.folded <wasm>
# render a flame graph
$ ./FlameGraph/flamegraph.pl out.folded.translated > perf.svg
```
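
To illustrate the renaming idea only, here is a minimal sketch that rewrites _aot_func#NUMBER_ frames from a hand-written mapping file. It is not how _trans_wasm_func_name.py_ works internally: the function name `translate_folded` and the mapping file format (one `NUMBER name` pair per line) are assumptions made for this example.

```shell
#!/bin/sh
# Illustrative only: rename aot_func#N frames in folded stacks.
# $1 is a mapping file with one "N name" pair per line (assumed format).
# Folded stacks are read from stdin and written to stdout.
translate_folded() {
    awk -v map="$1" '
    BEGIN {
        # Load the mapping: "N name" becomes name["aot_func#N"].
        while ((getline line < map) > 0) {
            split(line, kv, " ")
            name["aot_func#" kv[1]] = kv[2]
        }
        close(map)
    }
    {
        count = $NF
        stack = $0
        sub(/[ \t]+[0-9]+$/, "", stack)   # drop the trailing count
        n = split(stack, frames, ";")
        out = ""
        for (i = 1; i <= n; i++) {
            f = frames[i]
            if (f in name) f = name[f]    # whole-frame match only
            out = out (i > 1 ? ";" : "") f
        }
        print out, count
    }'
}
```

Matching whole frames rather than substrings matters here: with a substring replace, a mapping for _aot_func#3_ would also corrupt _aot_func#32_. Frames without a mapping entry are left unchanged.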
![example flamegraph](./perf.svg)

**For more details, please refer to the [WAMR performance tuning doc](https://github.com/bytecodealliance/wasm-micro-runtime/blob/main/doc/perf_tune.md#7-use-linux-perf)**
