
feat: native signal handler strategy on Android #4676

Open
jpnurmi wants to merge 9 commits into main from feat/android-signal-handler-strategy

jpnurmi (Collaborator) commented Oct 27, 2025

Experimental opt-in Android-only solution for:

According to our newly introduced Android integration tests, CHAIN_AT_START works on android-arm64 and android-x64 in both Release and Debug configurations, but we'd like to validate the fix further in real-world conditions. To that end, we're making it opt-in initially so customers can try it on more devices, platforms, and configurations before considering it as the new default.

github-actions bot (Contributor) commented Oct 27, 2025

Messages
📖 Do not forget to update Sentry-docs with your feature once the pull request gets approved.

Generated by 🚫 dangerJS against 47248ba

codecov bot commented Oct 27, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.99%. Comparing base (3e2eae9) to head (02f59c1).
⚠️ Report is 21 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4676      +/-   ##
==========================================
- Coverage   74.00%   73.99%   -0.02%     
==========================================
  Files         499      499              
  Lines       18066    18067       +1     
  Branches     3518     3519       +1     
==========================================
- Hits        13370    13368       -2     
- Misses       3837     3839       +2     
- Partials      859      860       +1     


jpnurmi (Collaborator, Author) commented Nov 4, 2025

@jamescrosswell This is marked as a draft because it's still waiting for dependencies, but we could already kick off the review process if you have time. :)

Do you have a preferred approach for exposing this as an opt-in API? The current proposal mirrors the sentry-native enum.

#if ANDROID
    options.Native.SignalHandlerStrategy = Sentry.Android.SignalHandlerStrategy.ChainAtStart;
#endif

With only two values, a simple boolean flag could also work for the time being, but I'm unsure if potential Android Tombstone support could change things in the future.

Base automatically changed from version6 to main on November 14, 2025 02:30
jpnurmi force-pushed the feat/android-signal-handler-strategy branch from af57caa to 4486918 on November 20, 2025 10:31
jpnurmi force-pushed the feat/android-signal-handler-strategy branch 2 times, most recently from 7a489eb to 543b3e3, on January 2, 2026 14:57
jpnurmi (Collaborator, Author) commented Jan 26, 2026

Update: I'll get back to this after #4750 to make sure this works with .NET 10

github-actions bot (Contributor) commented Feb 7, 2026

Semver Impact of This PR

None (no version bump detected)

📋 Changelog Preview

This is how your changes will appear in the changelog.
Entries from this PR are highlighted with a left border (blockquote style).


Features ✨

  • feat: native signal handler strategy on Android by jpnurmi in #4676

Fixes 🐛

  • fix: CaptureFeedback now supports multiple attachments correctly by bitsandfoxes in #5077

Dependencies ⬆️

Deps

  • chore(deps): update Native SDK to v0.13.4 by github-actions in #5081
  • chore(deps): update Java SDK to v8.37.1 by github-actions in #5071
  • chore(deps): update CLI to v3.3.4 by github-actions in #5068
  • chore(deps): update Java SDK to v8.37.0 by github-actions in #5069
  • chore(deps): update Cocoa SDK to v9.8.0 by github-actions in #5044
  • chore(deps): update Java SDK to v8.36.0 by github-actions in #5036
  • chore(deps): update epitaph to 0.1.1 by github-actions in #5036

Other

  • ci: fix workflows that always fail for fork PRs by jamescrosswell in #5065

🤖 This preview updates automatically when you update the PR.

jpnurmi (Collaborator, Author) commented Feb 9, 2026

While the signal handler strategy fixes the redundant SIGSEGV vs. NullReferenceException issue, the native exception test case breaks with .NET 10 on Android X86_64. 😢 All tests pass locally in an ARM64 Android emulator.

jpnurmi (Collaborator, Author) commented Feb 10, 2026

Claude confidently claims the issue was fixed by:

The fix was backported to release/10.0. We can give it a try later with the next .NET 10 release.

Investigation: .NET 10 x86_64 native crash test failure

Traced the root cause via strace on the Android emulator.

What happens

When sentry-native defers to Mono's signal handler (CHAIN_AT_START), Mono's mono_handle_native_crash walks the managed stack. On .NET 10, this walk uses the MONO_UNWIND_SIGNAL_SAFE flag (added in d34ef7e), which causes AOT method JIT info to be loaded in async-safe mode with ji->async = TRUE. The stack walk callback print_stack_frame_signal_safe was not updated to check !ji->async before calling mono_jit_info_get_method(), which hits:

  • Assertion at jit-info.c:918, condition !ji->async not met

The assertion triggers abort(), and since mono_handle_native_crash already reset SIGABRT to SIG_DFL, the process is killed immediately — before control returns to sentry-native.
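
The last step can be reproduced in isolation. The following is a minimal C sketch, not code from Mono or sentry-native: it assumes a POSIX system, and `reporter_on_abort` and `abort_after_reset` are hypothetical names. A child process installs a SIGABRT handler, resets the disposition to SIG_DFL, and calls abort(); the parent then observes that the child was terminated by SIGABRT and that the handler never ran.

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a crash reporter's SIGABRT handler. */
static void reporter_on_abort(int signum)
{
    (void)signum;
    _exit(42); /* only reachable while this handler is still installed */
}

/*
 * Forks a child that installs a SIGABRT handler, resets SIGABRT to SIG_DFL,
 * and then calls abort(). Stores the signal that terminated the child in
 * *term_signal (0 if the child exited normally). Returns 0 on success,
 * -1 if fork() or waitpid() failed.
 */
int abort_after_reset(int *term_signal)
{
    assert(term_signal != NULL);

    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        signal(SIGABRT, reporter_on_abort); /* reporter installs its handler */
        signal(SIGABRT, SIG_DFL);           /* disposition is reset to default */
        abort();                            /* default action: terminate */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    *term_signal = WIFSIGNALED(status) ? WTERMSIG(status) : 0;
    return 0;
}
```

Calling `abort_after_reset(&sig)` should leave `sig` equal to SIGABRT rather than reporting exit code 42, which matches the observation that control never returns to the chained handler.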

Fix

Already merged upstream:

jpnurmi force-pushed the feat/android-signal-handler-strategy branch from b64b284 to bda8f1b on February 11, 2026 06:00
jpnurmi (Collaborator, Author) commented Feb 11, 2026

SDK 10.0.103 ships with runtime 10.0.3, which was built 12 days before the fix was merged. The fix is in the release/10.0 branch but hasn't shipped in a servicing release yet. It will likely be in runtime 10.0.4.

  • Runtime pack: Microsoft.NETCore.App.Runtime.Mono.android-x64 10.0.3
  • Built from: commit c2435c3e0f46de784341ac3ed62863ce77e117b4 in the dotnet VMR
  • Package dates: January 25, 2026
  • Fix merged: February 6, 2026 (commit cdc557af23d)

Confirmed with strace — the crash sequence is identical to before:

07:17:43.198063  at <unknown> <0xffffffff>
07:17:43.198083  at Sentry.SentrySdk:NativeCrash <0x000d8>
07:17:43.198099  at Sentry.SentrySdk:CauseCrash <0x00a0a>
07:17:43.198145  * Assertion at jit-info.c:918, condition `!ji->async' not met
07:17:43.198800  FORTIFY: pthread_mutex_lock called on a destroyed mutex

jpnurmi (Collaborator, Author) commented Feb 11, 2026

I confirmed with a locally built libmonosgen-2.0.so that dotnet/runtime#123346 fixes the issue.

I used the 10.0.3 baseline of dotnet/runtime, cherry-picked the fix, built (./build.sh mono -os android -arch x64 -c Release), and copied over /usr/share/dotnet/packs/Microsoft.NETCore.App.Runtime.Mono.android-x64/10.0.3/runtimes/android-x64/native/libmonosgen-2.0.so.

02-11 07:58:07.845  5437  5437 I DOTNET  :   Debug: Triggering a deliberate exception because SentrySdk.CauseCrash(CrashType.Native) was called
02-11 07:58:07.845  5437  5437 D app_process64:   Debug: Triggering a deliberate exception because SentrySdk.CauseCrash(CrashType.Native) was called
02-11 07:58:07.846  5437  5437 I sentry-native: entering signal handler
02-11 07:58:07.846  5437  5437 D sentry-native: defer to runtime signal handler at start
02-11 07:58:07.846  5437  5437 F libc    : Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0 in tid 5437 (egrationtestapp), pid 5437 (egrationtestapp)
02-11 07:58:08.031  5475  5475 F DEBUG   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
02-11 07:58:08.031  5475  5475 F DEBUG   : Build fingerprint: 'google/sdk_gphone64_x86_64/emu64xa:16/BE2A.250530.026.F3/13894323:userdebug/dev-keys'
02-11 07:58:08.031  5475  5475 F DEBUG   : Revision: '0'
02-11 07:58:08.031  5475  5475 F DEBUG   : ABI: 'x86_64'
02-11 07:58:08.031  5475  5475 F DEBUG   : Timestamp: 2026-02-11 07:58:07.902802417+0100
02-11 07:58:08.031  5475  5475 F DEBUG   : Process uptime: 2s
02-11 07:58:08.031  5475  5475 F DEBUG   : Cmdline: io.sentry.dotnet.maui.device.integrationtestapp
02-11 07:58:08.031  5475  5475 F DEBUG   : pid: 5437, tid: 5437, name: egrationtestapp  >>> io.sentry.dotnet.maui.device.integrationtestapp <<<
02-11 07:58:08.031  5475  5475 F DEBUG   : uid: 10246
02-11 07:58:08.031  5475  5475 F DEBUG   : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0000000000000000
02-11 07:58:08.031  5475  5475 F DEBUG   : Cause: null pointer dereference
02-11 07:58:08.031  5475  5475 F DEBUG   :     rax 0000000000000000  rbx 000079c99f938298  rcx 000079ca67c9d010  rdx 0000000000000200
02-11 07:58:08.031  5475  5475 F DEBUG   :     r8  0000000000000002  r9  00007ffdbd520910  r10 0000000000000005  r11 000079c918eb6eb0
02-11 07:58:08.031  5475  5475 F DEBUG   :     r12 000079c9f8856a88  r13 000079ca67c9d020  r14 000079c909c8a5b0  r15 000079cba7cb20d0
02-11 07:58:08.031  5475  5475 F DEBUG   :     rdi 000079cba7cb20d0  rsi 0000000000000006
02-11 07:58:08.031  5475  5475 F DEBUG   :     rbp 000079c99f938298  rsp 00007ffdbd5208d8  rip 000079c909c8a5be
02-11 07:58:08.031  5475  5475 F DEBUG   : 2 total frames
02-11 07:58:08.031  5475  5475 F DEBUG   : backtrace:
02-11 07:58:08.031  5475  5475 F DEBUG   :       #00 pc 00000000000005be  /data/app/~~RnBcgBqVyUx2MEIc7xRDiA==/io.sentry.dotnet.maui.device.integrationtestapp-hriVX4xDjbd5AhYJnurtoA==/lib/x86_64/libsentrysupplemental.so (crash+14) (BuildId: 852925ef6da49aeb59a096644bd533ff2fce55dc)
02-11 07:58:08.031  5475  5475 F DEBUG   :       #01 pc 00000000000e47a8  <anonymous:405af000>
02-11 07:58:08.044  5437  5437 D sentry-native: return from runtime signal handler, we handle the signal
02-11 07:58:08.047   716   740 I ActivityManager: Showing crash dialog for package io.sentry.dotnet.maui.device.integrationtestapp u0
02-11 07:58:08.053  5437  5437 D sentry-native: captured backtrace from ucontext with 2 frames
02-11 07:58:08.053  5437  5437 D sentry-native: captured backtrace with 2 frames
02-11 07:58:08.053  5437  5437 D sentry-native: merging global scope into event
02-11 07:58:08.053  5437  5437 D sentry-native: trying to read modules from /proc/self/maps
02-11 07:58:08.077  5437  5437 D sentry-native: read 445 modules from /proc/self/maps
02-11 07:58:08.077  5437  5437 D sentry-native: sending envelope
02-11 07:58:08.078  5437  5437 I sentry-native: crash has been captured

jamescrosswell (Collaborator) commented

> I used the 10.0.3 baseline of dotnet/runtime, cherry-picked the fix, built (./build.sh mono -os android -arch x64 -c Release), and copied over /usr/share/dotnet/packs/Microsoft.NETCore.App.Runtime.Mono.android-x64/10.0.3/runtimes/android-x64/native/libmonosgen-2.0.so.

Nice! So once 10.0.3 is released we could put together a fix. Would our SDK users also need to be building with 10.0.3, or is it sufficient for us to be building the Sentry SDK for .NET with it?

jpnurmi (Collaborator, Author) commented Feb 13, 2026

Applying the fix when building Sentry alone is not sufficient. The fix must be present in the .NET runtime, i.e. the Mono runtime library (libmonosgen-2.0.so), that is packaged inside the application APK.

Unfortunately, the fix was not included in version 10.0.3, which was released a few days ago. We'll have to wait for 10.0.4, the next monthly servicing update.

jamescrosswell (Collaborator) commented

Looks like 10.0.4 is finally out and even 10.0.5, which we have a PR to bump to here:

That PR needs some debugging - @jpnurmi would you have the bandwidth to look at it or should @Flash0ver or I pick it up?

jpnurmi (Collaborator, Author) commented Mar 16, 2026

Another issue has come up. 😓 While testing against 10.0.4, I discovered that the refactored inproc backend introduced in sentry-native 0.13 broke the CHAIN_AT_START signal chaining strategy. A fix is in progress:

To prevent similar regressions going forward, we're also adding a .NET Android integration test to the Native SDK:

jamescrosswell (Collaborator) commented Mar 17, 2026

> Another issue has come up. 😓 While testing against 10.0.4, I discovered that the refactored inproc backend introduced in sentry-native 0.13 broke the CHAIN_AT_START signal chaining strategy.

Hm... what does your gut tell you about the robustness of the solution we're putting in place here? Is it something that's likely to work for long periods without us needing to pay attention to it, or is it possibly fragile and going to be a time sink?

I mean, pretty much everything about MAUI seems fragile to some extent... is this going to be worse though?

jpnurmi (Collaborator, Author) commented Mar 18, 2026

As AI would put it, "You're absolutely right to question that!"

Chaining signal handlers as we do with the "chain-at-start" strategy is not an officially supported use case and could potentially break in any future version of .NET or Android.

Furthermore, there's an endless combination of versions and targets/devices/emulators/architectures to cover, and we have quite big gaps in testing as we currently can't even test on ARM64 (#4660). If we still want to release this, it's best to start with an opt-in to gather feedback.

jamescrosswell (Collaborator) commented

> As AI would put it, "You're absolutely right to question that!"

🤣

> If we still want to release this, it's best to start with an opt-in to gather feedback.

That sounds like a good approach... maybe we leave it opt-in for a decent amount of time to see if it can survive major version bumps in the various moving parts.

jpnurmi force-pushed the feat/android-signal-handler-strategy branch from 47248ba to 448380a on March 26, 2026 11:19
jpnurmi (Collaborator, Author) commented Mar 26, 2026

Temporarily rebased on top of #5069. Android jobs are green. Finally! 🎉

#5069 needs to be merged, and then this will be ready to go. Do the changes look otherwise good, btw?

jpnurmi force-pushed the feat/android-signal-handler-strategy branch from 448380a to 13e3e29 on March 27, 2026 06:37
jpnurmi removed the Blocked label on Mar 27, 2026
jpnurmi marked this pull request as ready for review on March 27, 2026 07:17
jamescrosswell (Collaborator) left a comment

Thanks @jpnurmi - the other dependencies are now merged so I think this is almost ready.

The changes themselves look fine functionally (pretty trivial: just wrapping an Android SDK option). I made a few suggestions that will hopefully head off confusion for our SDK users, though.

…T runtimes

Add a runtime version check that falls back to Default on .NET 10.0.0–10.0.3,
which crash with ChainAtStart due to dotnet/runtime#123346. Also fix swapped
runtime/SDK version numbers in XML docs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
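
The version check described in this commit message can be sketched as follows. This is an illustration in C, not the PR's actual code (the SDK itself is .NET): the enum, the function names, and the warning text are hypothetical, and the affected range 10.0.0 through 10.0.3 is taken from the commit message. Treating an unparseable version string as unaffected is an assumption of the sketch. A later commit in this PR settles on warning and falling back to Default rather than throwing, which is what the sketch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical names mirroring the two strategies mentioned in the thread. */
typedef enum { STRATEGY_DEFAULT, STRATEGY_CHAIN_AT_START } strategy_t;

/* Returns true for runtime versions 10.0.0 through 10.0.3 (inclusive). */
bool runtime_is_affected(const char *version)
{
    assert(version != NULL);

    int major = 0, minor = 0, patch = 0;
    if (sscanf(version, "%d.%d.%d", &major, &minor, &patch) != 3) {
        return false; /* assumption: unparseable versions count as unaffected */
    }
    return major == 10 && minor == 0 && patch >= 0 && patch <= 3;
}

/* Falls back to the default strategy, with a warning, on affected runtimes. */
strategy_t effective_strategy(strategy_t requested, const char *runtime_version)
{
    if (requested == STRATEGY_CHAIN_AT_START
        && runtime_is_affected(runtime_version)) {
        fprintf(stderr,
            "warning: chain-at-start is not supported on runtime %s, "
            "falling back to the default strategy\n",
            runtime_version);
        return STRATEGY_DEFAULT;
    }
    return requested;
}
```

For example, `effective_strategy(STRATEGY_CHAIN_AT_START, "10.0.3")` returns `STRATEGY_DEFAULT`, while the same request on "10.0.4" is passed through unchanged.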
jpnurmi force-pushed the feat/android-signal-handler-strategy branch from 97c35d2 to 2a04f8f on March 30, 2026 20:01
jpnurmi and others added 4 commits March 30, 2026 22:30
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…falling back

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
jpnurmi and others added 2 commits March 31, 2026 07:07
…tead of falling back"

This reverts commit 9698020. Use warning + fallback to Default instead
of throwing during initialization.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
jamescrosswell (Collaborator) left a comment

LGTM assuming the iOS failure is just flaky tests.

Massive thank you for this @jpnurmi ❤️
