Skip to content

Conversation

@JonasKunz
Copy link
Contributor

Description:

Upgrades async-profiler used by the inferred-spans extension to 4.0. Supersedes #1840.
Async-profiler 4.0 introduced a new batching mode as the default for wall clock profiling (PR). This mode yields great efficiency improvements:

Previously, if a thread was blocked for e.g. a second, the profiler would periodically wake it up and collect the same stacktrace every time. With batching, the profiler got smarter: It still wakes the thread up, but then detects that the CPU-time of the thread hasn't changed, so it can omit fetching the same stacktrace again.

Unfortunately I don't think we can make use of this improvement yet, see async-profiler/async-profiler#1277. However, eventually we'll definitely add it, as I'm expecting great efficiency improvements.
Based on the outcome of the discussion in the linked issue, I'll open a issue in this repo to make sure we don't forget about it.

This feature has also changed how the samples are stored in the JFR files, causing the inferred spans extension to not find them anymore. To fix this for now, I've disabled the batching functionality, causing async-profiler to operate in the "old" wall clock profiling mode.

Testing:

Quick manual test plus our unit test coverage is already decent.

@JonasKunz JonasKunz requested a review from a team May 2, 2025 07:29
String createStartCommand() {
StringBuilder startCommand =
new StringBuilder("start,jfr,clock=m,event=wall,cstack=n,interval=")
new StringBuilder("start,jfr,clock=m,event=wall,nobatch,cstack=n,interval=")
Copy link
Contributor

@steverao steverao May 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In 4.0, I found there is an attractive feature in stack walking async-profiler/async-profiler#1073, it can avoid using AsyncGetCallTrace in JDK that it may causes all kind of crash problems. For detail, can refer to async-profiler/async-profiler#795 Is there any plan to switch to that stack walking mode?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is best to by default stick with the default stack walking of async-profiler: Like the maintainer stated, async-profiler will eventually switch to the custom stack walking implementation by default, and then we'll switch our default to.

However, I don't see a reason why we shouldn't allow the stack walker to be configured on the inferred-spans extension. Feel free to open a PR adding such a config option if you think it would be useful.

@trask trask added this pull request to the merge queue May 6, 2025
Merged via the queue into open-telemetry:main with commit 4ae092c May 6, 2025
17 of 18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants