Commit 32e9bed

author
Ariel Ben-Yehuda
committed
doc: add design docs, document running without the agent
1 parent 7756e10 commit 32e9bed

File tree

5 files changed

+296
-3
lines changed

.github/workflows/build.yml

Lines changed: 13 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -38,12 +38,17 @@ jobs:
3838
- uses: Swatinem/rust-cache@v2
3939
- name: Build
4040
shell: bash
41-
run: cargo build --all-features --verbose --example simple
42-
- name: Upload artifact for testing
41+
run: cargo build --all-features --verbose --example simple --example pollcatch-without-agent
42+
- name: Upload example-simple for testing
4343
uses: actions/upload-artifact@v4
4444
with:
4545
name: example-simple
4646
path: ./target/debug/examples/simple
47+
- name: Upload example-pollcatch-without-agent for testing
48+
uses: actions/upload-artifact@v4
49+
with:
50+
name: example-pollcatch-without-agent
51+
path: ./target/debug/examples/pollcatch-without-agent
4752
build-decoder:
4853
name: Build Decoder
4954
runs-on: ubuntu-latest
@@ -86,11 +91,16 @@ jobs:
8691
with:
8792
name: example-simple
8893
path: ./tests
94+
- name: Download example-pollcatch-without-agent
95+
uses: actions/download-artifact@v4
96+
with:
97+
name: example-pollcatch-without-agent
98+
path: ./tests
8999
- name: Download async-profiler
90100
shell: bash
91101
working-directory: tests
92102
run: wget https://github.com/async-profiler/async-profiler/releases/download/v4.1/async-profiler-4.1-linux-x64.tar.gz -O async-profiler.tar.gz && tar xvf async-profiler.tar.gz && mv -vf async-profiler-*/lib/libasyncProfiler.so .
93103
- name: Run integration test
94104
shell: bash
95105
working-directory: tests
96-
run: chmod +x simple pollcatch-decoder && LD_LIBRARY_PATH=$PWD ./integration.sh && LD_LIBRARY_PATH=$PWD ./separate_runtime_integration.sh
106+
run: ls -l && chmod +x simple pollcatch-without-agent pollcatch-decoder && LD_LIBRARY_PATH=$PWD ./integration.sh && LD_LIBRARY_PATH=$PWD ./separate_runtime_integration.sh && LD_LIBRARY_PATH=$PWD ./test_pollcatch_without_agent.sh

DESIGN.md

Lines changed: 219 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,219 @@
1+
# Design Document: async-profiler Rust Agent
2+
3+
## Overview
4+
5+
The async-profiler Rust agent is an in-process profiling library that integrates with [async-profiler](https://github.com/async-profiler/async-profiler) to collect performance data and upload it to various backends. The agent is designed to run continuously in production environments with minimal overhead.
6+
7+
For a more how-to-focused guide on running the profiler in various contexts, read the README.
8+
9+
This guide is based on an AI-driven summary, but it includes many comments from the development team.
10+
11+
## Architecture
12+
13+
The async-profiler agent runs inside a Rust process and profiles it using [async-profiler].
14+
15+
Currently, the agent only supports loading `libasyncProfiler.so` dynamically
16+
via [libloading], but in future versions it might also be possible to statically or plain-dynamically
17+
link against it.
18+
19+
[async-profiler]: https://github.com/async-profiler/async-profiler
20+
[libloading]: https://crates.io/crates/libloading
21+
22+
## Code Architecture
23+
24+
The crate follows a modular architecture with clear separation of concerns:
25+
26+
```
27+
async-profiler-agent/
28+
├── src/
29+
│ ├── lib.rs # Public API and documentation
30+
│ ├── profiler.rs # Core profiler orchestration
31+
│ ├── asprof/ # async-profiler FFI bindings
32+
│ ├── metadata/ # Host and report metadata
33+
│ ├── pollcatch/ # Tokio poll time tracking
34+
│ └── reporter/ # Data upload backends
35+
├── examples/ # Sample applications
36+
├── decoder/ # JFR analysis tool
37+
└── tests/ # Integration tests
38+
```
39+
40+
## Core Modules
41+
42+
### 1. Profiler (`profiler`)
43+
44+
**Purpose**: Central orchestration of profiling lifecycle and data collection.
45+
46+
**Key Components**:
47+
- `Profiler` & `ProfilerBuilder`: Main entry point for starting profiling
48+
- `ProfilerOptions`: Profiling behavior configuration
49+
- `RunningProfiler`: Handle for controlling active profiler
50+
- `ProfilerEngine` trait: used to allow mocking async-profiler (the C library) during tests
51+
52+
#### Profiler lifecycle management
53+
54+
As of version 4.1, async-profiler does not have a mode where it can run continuously
55+
with bounded memory usage and periodically collect samples.
56+
57+
Therefore, every [`reporting_interval`] seconds, the async-profiler agent restarts async-profiler by sending a `stop` command (which flushes the JFR file) followed by a `start` command.
58+
59+
This is managed by `Profiler` (see the [`profiler_tick`] function).
60+
61+
This is a supported async-profiler operation mode.
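
The stop-then-start cycle can be sketched with a mock engine. This is only an illustration: the `Engine` trait, `MockEngine`, `tick`, and `run_ticks` are hypothetical names, not the crate's `ProfilerEngine` trait or its `profiler_tick` function, and the mock's error on stopping an idle profiler is an assumption.

```rust
/// Hypothetical stand-in for the profiler engine; not the crate's real trait.
trait Engine {
    fn start(&mut self) -> Result<(), String>;
    fn stop(&mut self) -> Result<(), String>;
}

/// Mock that records the commands it receives, in order.
#[derive(Default)]
struct MockEngine {
    running: bool,
    log: Vec<&'static str>,
}

impl Engine for MockEngine {
    fn start(&mut self) -> Result<(), String> {
        // Starting while already running is an error, as with async-profiler.
        if self.running {
            return Err("profiler already running".to_string());
        }
        self.running = true;
        self.log.push("start");
        Ok(())
    }

    fn stop(&mut self) -> Result<(), String> {
        // Assumption of this mock: stopping an idle profiler is also an error.
        if !self.running {
            return Err("profiler not running".to_string());
        }
        self.running = false;
        self.log.push("stop");
        Ok(())
    }
}

/// One tick: stop the profiler (flushing the JFR), then start it again.
fn tick(engine: &mut impl Engine) -> Result<(), String> {
    engine.stop()?;
    // The finished JFR file would be read and reported here.
    engine.start()
}

/// Starts the mock engine, runs `ticks` restart cycles, and returns the command log.
fn run_ticks(ticks: usize) -> Result<Vec<&'static str>, String> {
    let mut engine = MockEngine::default();
    engine.start()?;
    for _ in 0..ticks {
        // A real loop would wait for the reporting interval before each tick.
        tick(&mut engine)?;
    }
    Ok(engine.log)
}

fn main() {
    println!("{:?}", run_ticks(2));
}
```

Keeping the engine behind a trait is what lets a test observe the command order without loading the native library.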
62+
63+
[`reporting_interval`]: https://docs.rs/async-profiler-agent/0.1/async_profiler_agent/profiler/struct.ProfilerBuilder.html#method.with_reporting_interval
64+
[`profiler_tick`]: https://github.com/async-profiler/rust-agent/blob/506718fff274b49cf2eb03305a4f9547b61720e3/src/profiler.rs#L1083
65+
66+
#### Agent lifecycle management
67+
68+
The async-profiler agent can be stopped and started at run-time.
69+
70+
Trying to start an async-profiler session when async-profiler is already running leads to an error from
71+
async-profiler, so to restart the profiler (possibly with a different configuration), the profiler must be
72+
stopped before it is started again.
73+
74+
When stopped, the async-profiler agent stops async-profiler, flushes the last profile to the reporter, and then signals
75+
that it has finished. After that signal, it is possible to start a different instance of the async-profiler
76+
agent on the same process.
77+
78+
#### Profiler configuration
79+
80+
async-profiler is configured via [`ProfilerOptions`] and [`ProfilerOptionsBuilder`]. You
81+
should read these docs along with the [async-profiler options docs] for more details.
82+
83+
[`ProfilerOptions`]: https://docs.rs/async-profiler-agent/0.1/async_profiler_agent/profiler/struct.ProfilerOptions.html
84+
[`ProfilerOptionsBuilder`]: https://docs.rs/async-profiler-agent/0.1/async_profiler_agent/profiler/struct.ProfilerOptionsBuilder.html
85+
[async-profiler options docs]: https://github.com/async-profiler/async-profiler/blob/v4.0/docs/ProfilerOptions.md
86+
87+
#### JFR file rotation
88+
89+
async-profiler expects to be writing the current JFR to a "fresh" file path. To that
90+
effect, the agent creates 2 unnamed temporary files via `JfrFile`, and gives
91+
async-profiler alternating paths of the form `/proc/self/fd/<N>` to write the
92+
JFRs into.
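
A minimal, Linux-only sketch of this rotation follows. It assumes the files are made unnamed by creating and immediately unlinking them; the real `JfrFile` may do this differently. `Rotation`, `unnamed_temp_file`, `proc_path`, and `fake_profiler_write` are hypothetical names used for illustration.

```rust
use std::fs::{self, File, OpenOptions};
use std::io::{self, Read, Seek, SeekFrom, Write};
use std::os::fd::AsRawFd;

/// Creates a file and immediately unlinks it, leaving an open but unnamed file.
/// (Assumption: the real `JfrFile` may create its files differently.)
fn unnamed_temp_file(tag: &str) -> io::Result<File> {
    let name = format!("jfr-sketch-{}-{tag}", std::process::id());
    let path = std::env::temp_dir().join(name);
    let file = OpenOptions::new()
        .read(true)
        .write(true)
        .create_new(true)
        .open(&path)?;
    fs::remove_file(&path)?;
    Ok(file)
}

/// Path through which code in the same process can reopen the unnamed file.
fn proc_path(file: &File) -> String {
    format!("/proc/self/fd/{}", file.as_raw_fd())
}

struct Rotation {
    files: [File; 2],
    active: usize,
}

impl Rotation {
    fn new() -> io::Result<Self> {
        Ok(Self {
            files: [unnamed_temp_file("a")?, unnamed_temp_file("b")?],
            active: 0,
        })
    }

    /// Path to hand to the profiler for the current interval.
    fn active_path(&self) -> String {
        proc_path(&self.files[self.active])
    }

    /// Returns the contents of the file that was just written, empties it so
    /// it is "fresh" for its next use, and switches to the other file.
    fn rotate(&mut self) -> io::Result<Vec<u8>> {
        let finished = &mut self.files[self.active];
        let mut data = Vec::new();
        finished.seek(SeekFrom::Start(0))?;
        finished.read_to_end(&mut data)?;
        finished.set_len(0)?;
        self.active = 1 - self.active;
        Ok(data)
    }
}

/// Stand-in for the profiler writing a recording to the path it was given.
fn fake_profiler_write(path: &str, bytes: &[u8]) -> io::Result<()> {
    OpenOptions::new()
        .write(true)
        .truncate(true)
        .open(path)?
        .write_all(bytes)
}

fn main() -> io::Result<()> {
    let mut rotation = Rotation::new()?;
    let first = rotation.active_path();
    fake_profiler_write(&first, b"recording 1")?;
    let data = rotation.rotate()?;
    println!("read {} bytes via {first}; next path is {}", data.len(), rotation.active_path());
    Ok(())
}
```

Because the files have no name on disk, they disappear when the process exits, with no cleanup step needed.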
93+
94+
### 2. async-profiler FFI (`asprof`)
95+
96+
**Purpose**: Safe Rust bindings to the native async-profiler library.
97+
98+
**Key Components**:
99+
- `AsProf`: Safe interface to async-profiler
100+
- `raw`: Low-level FFI declarations
101+
102+
**Responsibilities**:
103+
- Dynamic loading of `libasyncProfiler.so` using [libloading]
104+
- Safe, Rust-native wrappers around C API calls
105+
106+
[libloading]: https://crates.io/crates/libloading
107+
108+
### 3. Metadata (`metadata/`)
109+
110+
**Purpose**: Host identification and report context information.
111+
112+
**Key Components**:
113+
- `AgentMetadata`: Host identification (EC2, Fargate, or generic)
114+
- `aws`: AWS-specific metadata autodetection via IMDS
115+
116+
The metadata is sent to the [`Reporter`] implementation, and can be used to
117+
identify the host that generated a particular profiling report. In the local reporter,
118+
it is ignored. In the S3 reporter, it is attached to the zip uploaded
119+
to S3 as `metadata.json`.
120+
121+
### 4. Reporters (`reporter/`)
122+
123+
**Purpose**: Pluggable backends for uploading profiling data.
124+
125+
**Key Components**:
126+
- [`Reporter`] trait: Common interface for all backends
127+
- [`LocalReporter`]: Filesystem output for development/testing
128+
- [`S3Reporter`]: AWS S3 upload with metadata
129+
- [`MultiReporter`]: Composition of multiple reporters
130+
131+
[`Reporter`]: https://docs.rs/async-profiler-agent/0.1/async_profiler_agent/reporter/trait.Reporter.html
132+
[`LocalReporter`]: https://docs.rs/async-profiler-agent/0.1/async_profiler_agent/reporter/local/struct.LocalReporter.html
133+
[`S3Reporter`]: https://docs.rs/async-profiler-agent/0.1/async_profiler_agent/reporter/s3/struct.S3Reporter.html
134+
[`MultiReporter`]: https://docs.rs/async-profiler-agent/0.1/async_profiler_agent/reporter/multi/struct.MultiReporter.html
135+
136+
The reporter trait is as follows:
137+
138+
```rust
139+
#[async_trait]
140+
pub trait Reporter: fmt::Debug {
141+
async fn report(
142+
&self,
143+
jfr: Vec<u8>,
144+
metadata: &ReportMetadata,
145+
) -> Result<(), Box<dyn std::error::Error + Send>>;
146+
}
147+
```
148+
149+
Customers whose needs are not met by the built-in reporters can write their
150+
own reporters.
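
To illustrate the shape of a custom reporter without pulling in `async_trait` or the crate itself, here is a simplified synchronous analog. `SyncReporter`, `MemoryReporter`, `FailingReporter`, and `report_and_continue` are hypothetical names, and a plain `&str` stands in for `ReportMetadata`. A real reporter would implement the async `Reporter` trait shown above instead.

```rust
use std::error::Error;
use std::fmt;
use std::sync::Mutex;

/// Simplified synchronous analog of the `Reporter` trait (hypothetical).
trait SyncReporter: fmt::Debug {
    fn report(&self, jfr: Vec<u8>, host: &str) -> Result<(), Box<dyn Error + Send>>;
}

/// Custom reporter that keeps reports in memory, for example for tests.
#[derive(Debug, Default)]
struct MemoryReporter {
    reports: Mutex<Vec<(String, Vec<u8>)>>,
}

impl SyncReporter for MemoryReporter {
    fn report(&self, jfr: Vec<u8>, host: &str) -> Result<(), Box<dyn Error + Send>> {
        self.reports.lock().unwrap().push((host.to_string(), jfr));
        Ok(())
    }
}

/// Error type used by the failing reporter below.
#[derive(Debug)]
struct UploadError;

impl fmt::Display for UploadError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "upload failed")
    }
}

impl Error for UploadError {}

/// Reporter that always fails, to exercise the error path.
#[derive(Debug)]
struct FailingReporter;

impl SyncReporter for FailingReporter {
    fn report(&self, _jfr: Vec<u8>, _host: &str) -> Result<(), Box<dyn Error + Send>> {
        Err(Box::new(UploadError))
    }
}

/// Reports one profile; a failure is logged rather than propagated, so the
/// caller can keep profiling. Returns whether the report succeeded.
fn report_and_continue(reporter: &dyn SyncReporter, jfr: Vec<u8>, host: &str) -> bool {
    match reporter.report(jfr, host) {
        Ok(()) => true,
        Err(e) => {
            eprintln!("failed to report profile: {e}");
            false
        }
    }
}

fn main() {
    let memory = MemoryReporter::default();
    report_and_continue(&memory, vec![1, 2, 3], "host-a");
    report_and_continue(&FailingReporter, vec![4], "host-a");
    println!("stored reports: {}", memory.reports.lock().unwrap().len());
}
```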
151+
152+
### 5. PollCatch (`pollcatch/`)
153+
154+
**Purpose**: Tokio-specific instrumentation for detecting long poll times.
155+
156+
**Key Components**:
157+
- `before_poll_hook()`: Pre-poll timestamp capture
158+
- `after_poll_hook()`: Post-poll analysis and reporting
159+
- `tsc.rs`: CPU timestamp counter utilities
160+
161+
**Responsibilities**:
162+
- Minimal-overhead poll time tracking
163+
- Integration with Tokio's task hooks
164+
- JFR event emission for long polls
165+
- CPU timestamp correlation with profiler samples
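
The general idea of a before/after hook pair can be sketched as follows. This is not the crate's implementation: it uses `std::time::Instant` and a fixed threshold instead of the CPU timestamp counter and profiler-sample correlation, and `before_poll`, `after_poll`, and `POLL_START` are hypothetical names.

```rust
use std::cell::Cell;
use std::time::{Duration, Instant};

thread_local! {
    /// Start time of the poll currently running on this thread, if any.
    static POLL_START: Cell<Option<Instant>> = const { Cell::new(None) };
}

/// Hypothetical pre-poll hook: records when the poll began on this thread.
fn before_poll() {
    POLL_START.with(|start| start.set(Some(Instant::now())));
}

/// Hypothetical post-poll hook: returns the poll duration if it reached
/// `threshold`, or `None` for a short poll or a missing `before_poll` call.
fn after_poll(threshold: Duration) -> Option<Duration> {
    let started = POLL_START.with(|start| start.take())?;
    let elapsed = started.elapsed();
    (elapsed >= threshold).then_some(elapsed)
}

fn main() {
    before_poll();
    // Simulate a poll that blocks the thread for too long.
    std::thread::sleep(Duration::from_millis(20));
    match after_poll(Duration::from_millis(10)) {
        Some(elapsed) => println!("long poll: {elapsed:?}"),
        None => println!("poll was short"),
    }
}
```

Storing the start time in a thread-local keeps the hooks free of locks, which matters because they run around every poll.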
166+
167+
## Data Flow
168+
169+
1. **Initialization**: Profiler loads `libasyncProfiler.so` and initializes
170+
2. **Session Start**: Creates temporary JFR files and starts async-profiler
171+
3. **Continuous Profiling**: async-profiler collects samples to active JFR file
172+
4. **Periodic Reporting**:
173+
- Stop profiler and rotate JFR files
174+
- Read completed JFR data
175+
- Package with metadata
176+
- Upload via configured reporters
177+
- Restart profiler with new JFR file
178+
5. **Shutdown**: Stop profiler and perform final report
179+
180+
## Key Design Decisions
181+
182+
### Dual JFR File Strategy
183+
Uses two temporary files to enable continuous profiling during report uploads. While one file receives new samples, the other is being processed and uploaded.
184+
185+
### Builder Pattern Configuration
186+
Provides type-safe, ergonomic configuration with sensible defaults while supporting advanced customization.
187+
188+
### Trait-Based Reporters
189+
Enables pluggable upload destinations without coupling core profiling logic to specific backends.
190+
191+
### Optional AWS Integration
192+
AWS-specific features are behind feature flags, allowing use in non-AWS environments without unnecessary dependencies.
193+
194+
### Thread Safety
195+
Designed for multi-threaded environments with careful synchronization around profiler state and file operations.
196+
197+
## Feature Flags
198+
199+
- `s3`: Full S3 reporter with default AWS SDK features
200+
- `s3-no-defaults`: S3 reporter without default features (for custom TLS)
201+
- `aws-metadata`: AWS metadata detection with default features
202+
- `aws-metadata-no-defaults`: AWS metadata without default features
203+
- `__unstable-fargate-cpu-count`: Experimental Fargate CPU metrics
204+
205+
## Error Handling
206+
207+
The design emphasizes resilience:
208+
- Reporter errors don't stop profiling
209+
- Profiler errors are logged but allow graceful degradation
210+
- Resource cleanup is guaranteed via RAII patterns
211+
- Temporary file management prevents resource leaks
212+
213+
## Performance Considerations
214+
215+
- Minimal overhead during normal operation
216+
- JFR file I/O is asynchronous and non-blocking
217+
- PollCatch hooks are optimized for the common case (no sample)
218+
- Memory allocation is minimized in hot paths
219+
- Background reporting doesn't interfere with application performance

README.md

Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -123,6 +123,33 @@ If you can't use `tokio_unstable`, it is possible to wrap your tasks by instrume
123123
runs the risk of forgetting to instrument the task that is actually causing the high latency,
124124
and therefore it is strongly recommended to use `on_before_task_poll`/`on_after_task_poll`.
125125

126+
#### Using pollcatch without the agent
127+
128+
The recommended way of using async-profiler-agent is to run the agent it provides. However, if your
129+
application is already integrated with some other mechanism that runs `async-profiler`, note that the
130+
`on_before_task_poll`/`on_after_task_poll` hooks just call the async-profiler [JFR Event API]. They can be used
131+
even if async-profiler is run via a mechanism other than the async-profiler Rust agent (for example, a
132+
Java-native async-profiler integration), though currently, the results from the JFR Event API are only exposed in
133+
async-profiler's JFR-format output mode.
134+
135+
See `test_pollcatch_without_agent.sh` for an example that uses pollcatch with just async-profiler's
136+
`LD_PRELOAD` mode.
137+
138+
In that case, the only requirement is that the pollcatch hooks refer to the same `libasyncProfiler.so` that is
139+
being used as a profiler, since the JFR Event API is based on global variables that must match. async-profiler-agent
140+
uses [libloading] which uses [dlopen(3)] (currently passing [`RTLD_LOCAL | RTLD_LAZY`][libloadingflags]), which
141+
performs [deduplication based on inode]. Therefore, if your system only has a single `libasyncProfiler.so`
142+
on the search path, it will be shared and pollcatch will work.
143+
144+
The async-profiler-agent crate currently does not expose the JFR Event API to users, due to stability
145+
reasons. As a user, using `libloading` to open `libasyncProfiler.so` and calling the API yourself
146+
will work, but if you have a use case for the JFR Event API, consider opening an issue.
147+
148+
[deduplication based on inode]: https://stackoverflow.com/questions/45954861/how-to-circumvent-dlopen-caching/45955035#45955035
149+
[JFR Event API]: https://github.com/async-profiler/async-profiler/blob/master/src/asprof.h#L99
150+
[libloading]: https://crates.io/crates/libloading
151+
[libloadingflags]: https://docs.rs/libloading/latest/libloading/os/unix/struct.Library.html#method.new
152+
126153
### Not enabling the AWS SDK / Reqwest default features
127154

128155
The `aws-metadata-no-defaults` and `s3-no-defaults` feature flags do not enable feature flags for the AWS SDK and `reqwest`.
Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,18 @@
1+
extern crate async_profiler_agent;
2+
use std::time::{Duration, Instant};
3+
4+
// Simple test without a Tokio runtime, to just have an integration
5+
// test of the pollcatch hooks on async-profiler without involving
6+
// Tokio
7+
8+
fn main() {
9+
let start = Instant::now();
10+
while start.elapsed() < Duration::from_secs(1) {
11+
async_profiler_agent::pollcatch::before_poll_hook();
12+
let mid = Instant::now();
13+
while mid.elapsed() < Duration::from_millis(10) {
14+
// spin, there will be a profiler sample here
15+
}
16+
async_profiler_agent::pollcatch::after_poll_hook();
17+
}
18+
}
Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,19 @@
1+
#!/bin/bash
2+
3+
# This test needs the following resources:
4+
# 1. LD_LIBRARY_PATH set to an async-profiler with user JFR support
5+
# 2. executable `./pollcatch-decoder` from `cd decoder && cargo build`
6+
# 3. executable `./pollcatch-without-agent` from `cargo build --example pollcatch-without-agent`
7+
8+
set -exuo pipefail
9+
10+
dir="pollcatch-without-agent-jfr"
11+
12+
mkdir -p $dir
13+
rm -f $dir/*.jfr
14+
15+
# Test that the pollcatch functions work fine with async-profiler in non-agent mode (test LD_PRELOAD mode)
16+
LD_PRELOAD=libasyncProfiler.so ASPROF_COMMAND=start,event=cpu,jfr,file=$dir/output.jfr ./pollcatch-without-agent
17+
./pollcatch-decoder longpolls --include-non-pollcatch $dir/output.jfr > $dir/output.txt
18+
cat $dir/output.txt
19+
grep -q 'poll of' $dir/output.txt
