
ruby: Handle Ruby JIT PC with JIT frame type#1102

Open
dalehamel wants to merge 12 commits into open-telemetry:main from Shopify:ruby-jit-upstream

Conversation

@dalehamel
Contributor

@dalehamel dalehamel commented Jan 21, 2026

What

Detect JIT frames produced by Ruby's YJIT / ZJIT so that the Ruby unwinder keeps working when a JIT is in use.

Screenshot 2025-11-06 at 11 10 10 AM

Why

Fixes #937

YJIT substantially improves performance, but it currently breaks the Ruby unwinder: the unwinder never runs, because the PC never lands in the interpreter's code. Avoiding the interpreter is the whole point of a JIT, after all.

How

We use the SynchronizeMappings hook to detect the ruby JIT address range, and if the PC is in this range we will push a dummy frame to indicate that the leaf is some JIT code.

In the future, we can support the linux jit interface to symbolize these, but for now we just push a dummy frame.

Since Ruby's jit works by just running native code replacing the ISEQ of the leaf CME, we can just switch to unwinding the ruby stack once we detect a JIT frame.

However, since we cannot be guaranteed base pointers are available (and even if they are, we don't seem to be able to further unwind correctly with the native unwinder), we can't switch back to native unwinding once we have detected a JIT frame on the stack. This means that if JIT is enabled, we don't get the "interleaved" native and ruby stacks anymore, but we do still get the native frames on the edge of the stack which are probably the most interesting:

Screenshot 2025-11-06 at 11 15 32 AM

In a follow-up PR, I'll add more thorough support for continuing to unwind when frame pointers are enabled, and better support for ZJIT, which is a method-based JIT.
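The range check described under "How" can be sketched roughly as follows. This is a minimal illustration only: `jitRegion`, `frameKind`, and `classifyLeaf` are hypothetical names, and the real check runs in the eBPF unwinder rather than in Go. The half-open interval follows the convention mentioned in this PR's commit messages.

```go
package main

import "fmt"

// jitRegion is a hypothetical holder for the JIT address range that
// SynchronizeMappings detects. It is not a type from the profiler.
type jitRegion struct {
	start, end uint64 // half-open interval [start, end)
}

// contains reports whether pc falls inside the JIT region.
func (r jitRegion) contains(pc uint64) bool {
	return pc >= r.start && pc < r.end
}

// frameKind is an illustrative label for the kind of leaf frame to push.
type frameKind string

const (
	frameNative frameKind = "native"
	frameJIT    frameKind = "jit"
)

// classifyLeaf decides whether the leaf PC should be reported as a dummy
// JIT frame (after which Ruby stack unwinding would take over) or left
// to the native unwinder.
func classifyLeaf(r jitRegion, pc uint64) frameKind {
	if r.contains(pc) {
		return frameJIT
	}
	return frameNative
}

func main() {
	r := jitRegion{start: 0x7f0000100000, end: 0x7f0000104000}
	fmt.Println(classifyLeaf(r, 0x7f0000100000)) // first byte of the region
	fmt.Println(classifyLeaf(r, 0x7f0000104000)) // one past the last byte
}
```

The end address is deliberately excluded, so a PC equal to `end` is treated as native code.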

@dalehamel
Contributor Author

Opening this one as a draft; it should probably have coredump tests as well. While it is independent from #1101, it should probably land after it, as it is a bit lower priority. If the approach looks good, I'll add some coredump tests.

@github-actions

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Feb 24, 2026
@dalehamel
Contributor Author

I managed to add better JIT unwinding support for when frame pointers are present. I'm thinking I'll probably submit both together; this PR is more stub stuff.

@github-actions github-actions bot removed the Stale label Mar 2, 2026
@dalehamel dalehamel force-pushed the ruby-jit-upstream branch from 7120077 to e4d6d9a Compare March 3, 2026 17:30
Contributor

@korniltsev-grafanista korniltsev-grafanista left a comment

I've just tested the change locally and it worked well on 3.4.9 and 3.3.10.
I left some comments, but the general idea works and unlocks profiling with JIT enabled.

@dalehamel
Contributor Author

> I've just tested the change locally and it worked well on 3.4.9 and 3.3.10 I left some comments, but the general idea works and unlocks the profiling with jit enabled

Hey @korniltsev-grafanista, I have an updated version of this branch (Shopify#26), but I haven't updated this PR yet as I'm waiting for two other PRs to land first (#1226 and #1227).

Thanks for your comments. Once the other PRs land, upstreaming the JIT support is my next priority.

For now I can say that this branch is at least tested in production and works quite well for our cases so far, but it can certainly use some review and hardening.

@korniltsev-grafanista
Contributor

korniltsev-grafanista commented Mar 18, 2026

I took a look at Shopify#26 and it looks good to me. I would still prefer to handle the whole JIT area once and forget about it, rather than updating maps every time something is recompiled, but it should work anyway.

dalehamel added a commit to Shopify/opentelemetry-ebpf-profiler that referenced this pull request Apr 7, 2026
- Restore findJITRegion() from base (open-telemetry#1102) which was incorrectly removed
  during rebase - PR #26 now reuses the shared function instead of inline code
- Remove redundant inline first-pass JIT detection (now handled by findJITRegion)
- Detach cleanup is inherited from the updated base (open-telemetry#1102)
@dalehamel dalehamel force-pushed the ruby-jit-upstream branch from 94930a6 to 5831ed4 Compare April 8, 2026 14:31
@dalehamel
Contributor Author

@florianl I added new coredump tests for Ruby with YJIT enabled. Could you please add the modules to the store?

@dalehamel
Contributor Author

@korniltsev-grafanista please take another look. I've added unit tests for the memory region detection and addressed your comments. I've also added coredump tests.

I'm planning to submit a second PR once this lands, which will add proper frame pointer detection to keep the native state in sync; that would be needed for unwinding ZJIT correctly. For now that's lower priority: this PR allows the Ruby unwinder to work at all with either JIT enabled.

@dalehamel dalehamel force-pushed the ruby-jit-upstream branch from 5831ed4 to e151da4 Compare April 8, 2026 14:42
@dalehamel dalehamel marked this pull request as ready for review April 8, 2026 14:42
@dalehamel dalehamel requested review from a team as code owners April 8, 2026 14:42
return 0, 0, false
}

func (r *rubyInstance) SynchronizeMappings(ebpf interpreter.EbpfHandler,
Contributor Author

@dalehamel dalehamel Apr 8, 2026

This was cargo-culted from the node interpreter

florianl added a commit to florianl/opentelemetry-ebpf-profiler that referenced this pull request Apr 8, 2026
While trying to upload coredump tests for open-telemetry#1102, I ran into a bug.
When the S3 API returned exactly batchSize items, the code continued the loop without checking whether there were more pages. If the ContinuationToken was nil (indicating the end of the results), passing it to the next ListObjectsV2 call restarted the listing from the beginning, creating an infinite loop.

Signed-off-by: Florian Lehner <florian.lehner@elastic.co>
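The pagination bug described in this commit message can be illustrated with a small stand-alone sketch. It does not use the AWS SDK: `page`, `listFunc`, `fakeLister`, and `listAll` are hypothetical names that only mimic the shape of a token-based listing API, where a nil token both starts a listing and signals that there are no more pages.

```go
package main

import (
	"fmt"
	"strconv"
)

// page mimics one response of a token-paginated listing API.
type page struct {
	keys []string
	next *string // nil means there are no more pages
}

// listFunc mimics a listing call; a nil token starts from the beginning.
type listFunc func(token *string) page

// fakeLister serves keys in batches, encoding the next offset as the token.
func fakeLister(keys []string, batchSize int) listFunc {
	return func(token *string) page {
		start := 0
		if token != nil {
			start, _ = strconv.Atoi(*token)
		}
		end := start + batchSize
		if end > len(keys) {
			end = len(keys)
		}
		p := page{keys: keys[start:end]}
		if end < len(keys) {
			t := strconv.Itoa(end)
			p.next = &t
		}
		return p
	}
}

// listAll stops as soon as the token is nil, regardless of how many items
// the last page held. Deciding based on len(p.keys) == batchSize instead
// would loop forever when the final page is exactly full.
func listAll(list listFunc) []string {
	var all []string
	var token *string
	for {
		p := list(token)
		all = append(all, p.keys...)
		if p.next == nil {
			return all
		}
		token = p.next
	}
}

func main() {
	keys := []string{"a", "b", "c", "d"}
	fmt.Println(listAll(fakeLister(keys, 2))) // final page is exactly full
}
```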
"Range#each+0 in <cfunc>:0",
"block in <main>+0 in /Users/dalehamel/src/github.com/Shopify/otel-ebpf-pr1102/tools/coredump/testsources/ruby/loop.rb:33",
"Kernel#loop+0 in <internal:kernel>:168",
"<main>+0 in /Users/dalehamel/src/github.com/Shopify/otel-ebpf-pr1102/tools/coredump/testsources/ruby/loop.rb:32"
Contributor Author

@dalehamel dalehamel Apr 8, 2026

Note that because we lack the ability to continue unwinding when there are no frame pointers, the 'base frame' appears as the base Ruby frame, not the actual C / native frame that is the entrypoint to the Ruby interpreter process. This is something that my follow-up PR will address. For now, it is expected behaviour that once JIT is detected, we don't resume native unwinding.

And even with the follow-up, it isn't guaranteed that we can: Ruby needs to be running with a jit-perf mode that enables frame pointers. So this is generally an expected and (I think) acceptable tradeoff for being able to profile with JIT enabled.

maxSize atomic.Uint32

// mappings is indexed by the Mapping to its generation
mappings map[process.RawMapping]*uint32
Member

I'm a bit worried about unlimited growth of this map. Is there some guidance on the number of entries we expect here?

Contributor Author

added a comment

dalehamel and others added 11 commits April 9, 2026 09:40
… cleanup

- Extract JIT region detection into standalone findJITRegion() function
  that handles both prctl-labeled mappings (spanning full reserved area
  including holes) and heuristic fallback (first anonymous executable mapping)
- Add comprehensive unit tests for JIT region detection: no mappings,
  file-backed only, labeled single/multiple, heuristic fallback, precedence
- Fix Detach() to clean up PidInterpreterMapping prefixes (previously only
  deleted proc data, leaking eBPF map entries)
- Remove dead code: m.Vaddr < jitMapping.Vaddr branch that could never be
  true since /proc/pid/maps is sorted by address
- Fix JIT region check to use >= for lower bound (half-open interval [start, end))
  matching V8 and HotSpot conventions
- Reuse map lookup value to avoid double hashing of RawMapping struct key
- Wrap CalculatePrefixList error for better debugging
- Log stale prefix deletion errors instead of silently discarding
- Move prctl(PR_SET_VMA) named anonymous mapping support from process/process.go
  (was in PR #36 but is needed here for findJITRegion labeled detection)
- Remove stale 'assume all anonymous' comment
- Rebuild BPF blobs for >= fix
- Add 'Register LPM prefixes' comment to SynchronizeMappings loop
- Extract in_jit bool variable from inline if condition for clarity
- Improve JIT detection comments (read_ruby_frame and walk_ruby_stack)
- Fix whitespace: add blank line after FRAMES_PER_WALK_RUBY_STACK define,
  normalize MAX_EP_CHECKS spacing
- Rebuild BPF blobs
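The detection strategy listed in this commit message (labeled mappings first, heuristic fallback otherwise) might look roughly like the sketch below. It is an assumption-laden illustration, not the PR's findJITRegion(): the `mapping` type, `isLabeledJIT`, the `"[anon:Ruby:"` prefix match, and `findJITRegionSketch` are all hypothetical, loosely modeled on the behaviour described here.

```go
package main

import (
	"fmt"
	"strings"
)

// mapping is a simplified, hypothetical stand-in for the profiler's
// process.RawMapping; only the fields this sketch needs are modeled.
type mapping struct {
	vaddr, length uint64
	path          string // empty for unnamed anonymous mappings
	executable    bool
}

// isLabeledJIT assumes Ruby's prctl label starts with "[anon:Ruby:", based
// on the single example quoted in this PR. The real rule may differ.
func isLabeledJIT(m mapping) bool {
	return strings.HasPrefix(m.path, "[anon:Ruby:")
}

// findJITRegionSketch returns the JIT address range as [start, end).
func findJITRegionSketch(maps []mapping) (start, end uint64, found bool) {
	var lStart, lEnd, hStart, hEnd uint64
	var labeled, heuristic bool
	for _, m := range maps {
		mEnd := m.vaddr + m.length
		switch {
		case isLabeledJIT(m):
			// Labeled mappings widen the region so that it spans the
			// whole reserved area, including holes between them.
			if !labeled || m.vaddr < lStart {
				lStart = m.vaddr
			}
			if !labeled || mEnd > lEnd {
				lEnd = mEnd
			}
			labeled = true
		case !heuristic && m.executable && m.path == "":
			// Heuristic fallback: remember only the first match.
			hStart, hEnd, heuristic = m.vaddr, mEnd, true
		}
	}
	if labeled {
		return lStart, lEnd, true // labels take precedence
	}
	if heuristic {
		return hStart, hEnd, true
	}
	return 0, 0, false
}

func main() {
	label := "[anon:Ruby:rb_yjit_reserve_addr_space]"
	maps := []mapping{
		{vaddr: 0x1000, length: 0x1000, executable: true},
		{vaddr: 0x8000, length: 0x2000, path: label, executable: true},
		{vaddr: 0xc000, length: 0x4000, path: label},
	}
	start, end, found := findJITRegionSketch(maps)
	fmt.Printf("%#x-%#x found=%v\n", start, end, found)
}
```

In the example, the labeled mappings win over the earlier unnamed executable mapping, and the returned range covers the gap between the two labeled entries.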
Ruby 3.4.7 with --yjit on arm64. Verifies that:
- JIT code region is detected from anonymous executable mappings
- A JIT frame is pushed for the YJIT-compiled code
- Ruby VM frames (is_prime, sum_of_primes, loop) are properly unwound
  after the JIT frame

On main, this coredump fails with native_no_pid_page_mapping because
the profiler doesn't know how to handle the anonymous JIT mapping.

Moduledata tar for CI upload: ~/coredump-uploads/ruby-yjit-moduledata.tar
Ruby 3.4.7 with --yjit on x86_64. Same verification as arm64 test:
- JIT code region detected from anonymous executable mappings
- JIT frame pushed for YJIT-compiled code
- Full Ruby VM stack unwound (is_prime, sum_of_primes, loop)

Moduledata tar: ~/coredump-uploads/ruby-yjit-amd64-moduledata.tar
On systems with CONFIG_ANON_VMA_NAME=y, Ruby labels its JIT memory
region via prctl(PR_SET_VMA), giving it a path like
'[anon:Ruby:rb_yjit_reserve_addr_space]'. Without this fix, IsFileBacked()
returns true for these mappings (non-empty Path), causing IsAnonymous()
to return false. This prevents the LPM prefix registration loop in
SynchronizeMappings from registering JIT executable pages for eBPF
dispatch.

Add IsPrctlNamed() helper and exclude prctl-named mappings from
IsFileBacked(), so they are correctly classified as anonymous.
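The classification change described above can be sketched as follows. The `rawMapping` type here is a simplified, hypothetical stand-in for the profiler's process.RawMapping, and the method bodies are assumptions: in particular, treating every other non-empty path as file-backed is a simplification that the real code may refine.

```go
package main

import (
	"fmt"
	"strings"
)

// rawMapping is a simplified, hypothetical stand-in for process.RawMapping.
type rawMapping struct {
	Path string
}

// IsPrctlNamed reports whether the mapping is anonymous memory labeled via
// prctl(PR_SET_VMA). The kernel renders such labels as "[anon:<name>]".
func (m rawMapping) IsPrctlNamed() bool {
	return strings.HasPrefix(m.Path, "[anon:")
}

// IsFileBacked excludes prctl-named mappings, so a non-empty Path alone no
// longer implies a backing file. This is a simplification for illustration.
func (m rawMapping) IsFileBacked() bool {
	return m.Path != "" && !m.IsPrctlNamed()
}

// IsAnonymous is assumed here to be the complement of IsFileBacked.
func (m rawMapping) IsAnonymous() bool {
	return !m.IsFileBacked()
}

func main() {
	for _, m := range []rawMapping{
		{Path: ""},
		{Path: "[anon:Ruby:rb_yjit_reserve_addr_space]"},
		{Path: "/usr/lib/libruby.so"},
	} {
		fmt.Printf("%q anonymous=%v\n", m.Path, m.IsAnonymous())
	}
}
```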
Co-authored-by: Florian Lehner <florianl@users.noreply.github.com>
Adapt SynchronizeMappings to work with uint32 values instead of *uint32
pointers: since generations are no longer shared via pointer, both mappings
and their prefixes must be updated independently each cycle. Use an isNew
flag to only call UpdatePidInterpreterMapping for newly discovered mappings.
@dalehamel dalehamel force-pushed the ruby-jit-upstream branch from 0c685ac to edbf402 Compare April 9, 2026 13:40
@dalehamel
Contributor Author

Thanks for the review and feedback, @florianl. I applied your suggestions in one commit (per the new guidance, to ensure you get credit) and fixed them up in a second commit to fix the build and address the remaining comments.

Address florianl's concern about unlimited growth: entries are pruned
each SynchronizeMappings call via generation tracking, and the map size
is bounded by the number of executable anonymous mappings per process.

for _, prefix := range prefixes {
if isNew {
if err := ebpf.UpdatePidInterpreterMapping(pid, prefix,
Contributor

My only tiny suggestion remains the same: consider taking advantage of the fact that we know the full size of the YJIT area (48/128/x MiB). We can just find the first mapping and assume the whole area belongs to the Ruby interpreter, without figuring out which subset of it has already been occupied or garbage collected by the JIT. This would simplify the Go code: there would be fewer map updates/deletions every time something is recompiled, and the maps would be smaller.

#1102 (comment)

// Heuristic fallback: first anonymous executable mapping by address.
// Mappings from /proc/pid/maps are sorted by address, so the first
// match is the lowest address.
if !heuristicFound && m.IsExecutable() && m.IsAnonymous() {
Contributor

Why does the label detection extend the start/end of the JIT area, while the heuristic sets it to the first mapping found?

Contributor

There's also a bit of a disconnect with the interpreter mappings: we call UpdatePidInterpreterMapping for all mappings, but only set the JIT start/end for the first one. Is this intentional?

wantFound: true,
},
{
name: "heuristic fallback - first anonymous executable mapping",
Contributor

Taking only the first mapping may not be correct.

I did not test it, but it looks like it would not catch at least this case:

7f17d99b9000-7f17d9b18000 r-xp 00000000 00:00 0
7f17d9b18000-7f17d9c31000 r-xp 00000000 00:00 0
7f17d9c31000-7f17e19b9000 ---p 00000000 00:00 0

Both of these mappings belong to the JIT area, and there may be many more cases like this. There may be more mappings later. Furthermore, once the full JIT area is occupied, some of the pages may be garbage collected, so there may be holes in it.
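To make the case above concrete, here is a minimal sketch that parses /proc/pid/maps-style lines and coalesces adjacent anonymous executable mappings into one region. This is not what the PR does; `region` and `mergeExecAnon` are hypothetical names, and the sketch assumes unnamed anonymous mappings have exactly five fields (no path column).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// region is a half-open address range [start, end).
type region struct {
	start, end uint64
}

// mergeExecAnon keeps unnamed executable mappings and merges those that
// directly follow each other in the address space.
func mergeExecAnon(lines []string) ([]region, error) {
	var out []region
	for _, line := range lines {
		f := strings.Fields(line)
		if len(f) < 5 {
			return nil, fmt.Errorf("malformed maps line: %q", line)
		}
		// Skip mappings with a path column and non-executable mappings.
		if len(f) > 5 || !strings.Contains(f[1], "x") {
			continue
		}
		lo, hi, ok := strings.Cut(f[0], "-")
		if !ok {
			return nil, fmt.Errorf("malformed address range: %q", f[0])
		}
		start, err := strconv.ParseUint(lo, 16, 64)
		if err != nil {
			return nil, err
		}
		end, err := strconv.ParseUint(hi, 16, 64)
		if err != nil {
			return nil, err
		}
		if n := len(out); n > 0 && out[n-1].end == start {
			out[n-1].end = end // contiguous with the previous region
		} else {
			out = append(out, region{start, end})
		}
	}
	return out, nil
}

func main() {
	lines := []string{
		"7f17d99b9000-7f17d9b18000 r-xp 00000000 00:00 0",
		"7f17d9b18000-7f17d9c31000 r-xp 00000000 00:00 0",
		"7f17d9c31000-7f17e19b9000 ---p 00000000 00:00 0",
	}
	regions, err := mergeExecAnon(lines)
	if err != nil {
		panic(err)
	}
	for _, r := range regions {
		fmt.Printf("%x-%x\n", r.start, r.end)
	}
}
```

With the three lines quoted above, the two r-xp mappings collapse into a single region, while the reserved ---p tail is skipped. Covering holes left by garbage-collected pages would need extra logic, such as tracking the reserved range as a whole.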

Contributor

I don't understand why we reject the second mapping in the heuristic fallback - first anonymous executable mapping testcase.

				execAnon(0x7f0000100000, 0x4000),
				execAnon(0x7f0000200000, 0x8000),


Development

Successfully merging this pull request may close these issues.

[feat][Ruby] Support for ruby JIT frames

3 participants