@m4gr3d
Contributor

@m4gr3d m4gr3d commented Dec 19, 2025

This addresses the ANR (Application Not Responding) reports from the Google Play and Horizon stores.

Diving into the report from the Google Play store shows that the ANRs tend to occur on application exit, when the main thread is waiting for the render thread to clean up and shut down the engine.

"main" tid=1 Waiting
  at java.lang.Object.wait (Native method)
  at java.lang.Object.wait (Object.java:405)
  at java.lang.Object.wait (Object.java:543)
  at org.godotengine.godot.gl.GLSurfaceView$GLThread.requestExitAndWait (GLSurfaceView.java:1791)
  at org.godotengine.godot.gl.GLSurfaceView.requestRenderThreadExitAndWait (GLSurfaceView.java:604)
  at org.godotengine.godot.GodotGLRenderView.onActivityDestroyed (GodotGLRenderView.java:134)
  at org.godotengine.godot.Godot.onDestroy (Godot.kt:739)
  at org.godotengine.godot.Godot.destroyAndKillProcess$lambda$25 (Godot.kt:1072)
  at org.godotengine.godot.Godot.$r8$lambda$0Jotn_vpQ3wRkZHjwwa4O3QZtLs (unavailable)
  at org.godotengine.godot.Godot$$ExternalSyntheticLambda5.run (D8$$SyntheticClass)
  at android.os.Handler.handleCallback (Handler.java:938)
  at android.os.Handler.dispatchMessage (Handler.java:99)
  at android.os.Looper.loopOnce (Looper.java:201)
  at android.os.Looper.loop (Looper.java:288)
  at android.app.ActivityThread.main (ActivityThread.java:7941)
  at java.lang.reflect.Method.invoke (Native method)
  at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run (RuntimeInit.java:553)
  at com.android.internal.os.ZygoteInit.main (ZygoteInit.java:1003)

Under certain conditions, however, the render thread itself ends up blocked, waiting on internal threads to complete their cleanup. For example, in the logs below, the render thread is blocked in the destruction of the IP object, which is in turn blocked on resolver->thread.wait_to_finish().

"GLThread 1367" tid=15 Native
  #00  pc 0x000000000004c01c  /apex/com.android.runtime/lib64/bionic/libc.so (syscall+28)
  #01  pc 0x00000000000b1cc8  /apex/com.android.runtime/lib64/bionic/libc.so (pthread_join+244)
  #02  pc 0x00000000000dc040  /data/app/~~uaMD97Az4XPyvygrw4DQJw==/org.godotengine.editor.v4-k33iNo7R0XF9MVzrt8iUhA==/split_config.arm64_v8a.apk (std::__ndk1::thread::join()+191) (BuildId: 725db191d65be59dc18ddb3238f9e749442844a4)
  #03  pc 0x000000000813b5ec  /data/app/~~uaMD97Az4XPyvygrw4DQJw==/org.godotengine.editor.v4-k33iNo7R0XF9MVzrt8iUhA==/split_config.arm64_v8a.apk (Thread::wait_to_finish()+86) (BuildId: 149dd1732e747a99)
  #04  pc 0x0000000008295a60  /data/app/~~uaMD97Az4XPyvygrw4DQJw==/org.godotengine.editor.v4-k33iNo7R0XF9MVzrt8iUhA==/split_config.arm64_v8a.apk (IP::~IP()+350) (BuildId: 149dd1732e747a99)
  #05  pc 0x000000000807d424  /data/app/~~uaMD97Az4XPyvygrw4DQJw==/org.godotengine.editor.v4-k33iNo7R0XF9MVzrt8iUhA==/split_config.arm64_v8a.apk (unregister_core_types()+139) (BuildId: 149dd1732e747a99)
  #06  pc 0x00000000039f8784  /data/app/~~uaMD97Az4XPyvygrw4DQJw==/org.godotengine.editor.v4-k33iNo7R0XF9MVzrt8iUhA==/split_config.arm64_v8a.apk (Main::cleanup(bool)+5121) (BuildId: 149dd1732e747a99)
  #07  pc 0x000000000399b54c  /data/app/~~uaMD97Az4XPyvygrw4DQJw==/org.godotengine.editor.v4-k33iNo7R0XF9MVzrt8iUhA==/split_config.arm64_v8a.apk (_terminate(_JNIEnv*, bool)+109) (BuildId: 149dd1732e747a99)
  #08  pc 0x000000000399bbf0  /data/app/~~uaMD97Az4XPyvygrw4DQJw==/org.godotengine.editor.v4-k33iNo7R0XF9MVzrt8iUhA==/split_config.arm64_v8a.apk (Java_org_godotengine_godot_GodotLib_step+313) (BuildId: 149dd1732e747a99)
  at org.godotengine.godot.GodotLib.step (Native method)
  at org.godotengine.godot.gl.GodotRenderer.onDrawFrame (GodotRenderer.java:61)
  at org.godotengine.godot.gl.GLSurfaceView$GLThread.guardedRun (GLSurfaceView.java:1592)
  at org.godotengine.godot.gl.GLSurfaceView$GLThread.run (GLSurfaceView.java:1294)

This PR fixes the issue by adding a timeout for how long the main thread should wait for the render thread. When the timeout expires, the main thread gives up waiting on the render thread and force-kills the process.
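
The timed wait described above can be sketched roughly as follows. This is a minimal C++ illustration of the pattern only; the actual change lives in the Android Java/Kotlin layer, and `ExitSignal`, `notify_finished()`, `wait_finished()` and `should_force_kill()` are hypothetical names that do not come from the PR.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical helper: one thread signals "cleanup finished", another waits
// for that signal with an upper bound on how long it is allowed to block.
class ExitSignal {
public:
    // Called by the render thread once engine cleanup has completed.
    void notify_finished() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            finished_ = true;
        }
        cv_.notify_all();
    }

    // Called by the main thread. Returns true if cleanup finished in time,
    // false if the timeout expired first.
    bool wait_finished(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        return cv_.wait_for(lock, timeout, [this] { return finished_; });
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool finished_ = false;
};

// Hypothetical shutdown decision: true means the caller should stop waiting
// and force-kill the process because the render thread did not finish in time.
bool should_force_kill(ExitSignal &signal, std::chrono::milliseconds timeout) {
    return !signal.wait_finished(timeout);
}
```

With this shape the main thread never blocks longer than the chosen timeout, whether or not the render thread is deadlocked.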

@m4gr3d m4gr3d added this to the 4.6 milestone Dec 19, 2025
@m4gr3d m4gr3d requested a review from a team as a code owner December 19, 2025 17:50
@m4gr3d m4gr3d added the bug, cherrypick:4.5 and platform:android labels Dec 19, 2025
@clayjohn
Member

Is killing a thread not a big deal?

I have no issue if Android is fine with you just killing stuff off when necessary. From a rendering perspective, the only potential issue I foresee is that the pipeline cache is saved on exit and killing the thread early may either cause that to be corrupted, or just not happen at all. Which would create a similar issue to #111319

Alternatively, is there a way for us to poll for the completion of tasks at shutdown instead of waiting? That way perhaps the main thread could at least avoid the ANR even if shutdown takes a little bit of time

@m4gr3d
Contributor Author

m4gr3d commented Dec 19, 2025

> I have no issue if Android is fine with you just killing stuff off when necessary. From a rendering perspective, the only potential issue I foresee is that the pipeline cache is saved on exit and killing the thread early may either cause that to be corrupted, or just not happen at all. Which would create a similar issue to #111319

Is the pipeline cache saved as part of Main::cleanup()? I'm wondering if we could explicitly trigger it (would also address #111319) so we remove the concern about preventing it from running or corrupting it when we kill the process.

> Alternatively, is there a way for us to poll for the completion of tasks at shutdown instead of waiting? That way perhaps the main thread could at least avoid the ANR even if shutdown takes a little bit of time

On app exit, the Android runtime takes the app through the DESTROY state by calling onDestroy(), and considers the app done and ready to be disposed of as soon as it returns from the onDestroy() callback.
So at this point in time (1), we don't have a mechanism to asynchronously poll the engine shutdown status, given we're no longer around once we return from onDestroy(). That is why the current logic has onDestroy() block on the engine shutdown cleanup.
An ANR is triggered after 5 seconds, so in most scenarios this works as expected, given the engine shuts down in much less than 5 seconds. In the scenarios highlighted by the crash logs, though, the engine is taking longer than 5 seconds to shut down, by which point I'd expect it to be in an unrecoverable state.

Re (1): for 4.7 I'm exploring a background service, leveraging GodotService, to which the host activity can hand over the engine so that it is disposed of properly. A background service would also make it possible to run the engine in the background (e.g., media players). That is not an improvement that can be cherry-picked into previous versions of the engine, though.

@clayjohn
Member

clayjohn commented Dec 19, 2025

> > I have no issue if Android is fine with you just killing stuff off when necessary. From a rendering perspective, the only potential issue I foresee is that the pipeline cache is saved on exit and killing the thread early may either cause that to be corrupted, or just not happen at all. Which would create a similar issue to #111319

> Is the pipeline cache saved as part of Main::cleanup()? I'm wondering if we could explicitly trigger it (would also address #111319) so we remove the concern about preventing it from running or corrupting it when we kill the process.

Looks like it, it's triggered here:

if (rendering_device) {
	memdelete(rendering_device);
}

That code is reached from finalize_display(), which is called in Main::cleanup().

In any case, I think being able to explicitly trigger it is important for solving #111319
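
One way to picture an explicitly triggered save is an idempotent method that can be called before shutdown starts, with the destructor kept only as a fallback. This is a rough sketch under that assumption; `CachedDevice`, `add_pipeline()` and `save_cache()` are hypothetical stand-ins and are not RenderingDevice API.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a device that owns a pipeline cache.
class CachedDevice {
public:
    // Records a new cache entry and marks the cache as needing a save.
    void add_pipeline(std::string key) {
        pending_.push_back(std::move(key));
        dirty_ = true;
    }

    // Explicit, idempotent save. Returns true only if this call wrote data,
    // so calling it again with no new entries is a cheap no-op.
    bool save_cache() {
        if (!dirty_) {
            return false;
        }
        saved_ = pending_; // stand-in for serializing the cache to disk
        dirty_ = false;
        return true;
    }

    // Entries written by the most recent save.
    const std::vector<std::string> &saved() const { return saved_; }

    // The destructor remains a fallback for callers that never saved.
    ~CachedDevice() { save_cache(); }

private:
    std::vector<std::string> pending_;
    std::vector<std::string> saved_;
    bool dirty_ = false;
};
```

If the save has already run by the time the process is force-killed, skipping the destructor no longer risks losing the cache.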

> An ANR is triggered after 5 seconds, so in most scenarios this works as expected, given the engine shuts down in much less than 5 seconds. In the scenarios highlighted by the crash logs, though, the engine is taking longer than 5 seconds to shut down, by which point I'd expect it to be in an unrecoverable state.

Maybe we are getting deadlocks on exit?

> Under certain conditions, however, the render thread itself ends up blocked, waiting on internal threads to complete their cleanup. For example, in the logs below, the render thread is blocked in the destruction of the IP object, which is in turn blocked on resolver->thread.wait_to_finish().

This makes me think that the root of the issue is a deadlock in IP. I wonder if there is a way for us to force it to kill the thread instead of waiting at close.
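
Standard C++ offers no portable way to kill a `std::thread`, so the usual alternative is to make the worker's wait interruptible, which lets the join at destruction return promptly. The sketch below shows that general pattern only. `Worker` is a hypothetical class, and it is not a description of how IP's resolver thread is implemented.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical background worker that cooperates with shutdown.
class Worker {
public:
    Worker() : thread_([this] { run(); }) {}

    // Requests exit, wakes the thread, then joins it. The join returns
    // promptly only while the thread is parked in the wait inside run();
    // a thread stuck in a blocking call that ignores the flag would
    // still hang here.
    ~Worker() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            exit_requested_ = true;
        }
        cv_.notify_one();
        thread_.join();
    }

    // Queues one unit of work for the background thread.
    void post() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++pending_;
        }
        cv_.notify_one();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        while (true) {
            // Wake up for new work or for an exit request.
            cv_.wait(lock, [this] { return exit_requested_ || pending_ > 0; });
            if (exit_requested_) {
                return;
            }
            --pending_; // stand-in for doing the actual work
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    bool exit_requested_ = false;
    int pending_ = 0;
    std::thread thread_; // declared last so it starts after the other members exist
};
```

The limitation noted in the destructor comment is the relevant one for this discussion: a cooperative flag helps only if the thread actually gets back to checking it.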

Have you been able to reproduce this issue locally at all?

@m4gr3d
Contributor Author

m4gr3d commented Dec 30, 2025

> Maybe we are getting deadlocks on exit?

Yes, we are getting deadlocks on exit. I haven't been able to reproduce the issue locally, but the store logs show it occurring on both Quest and Android devices, across a wide range of models.

> This makes me think that the root of the issue is a deadlock in IP. I wonder if there is a way for us to force it to kill the thread instead of waiting at close.

A deadlock in the IP destructor is the most prevalent source of the ANRs (and worth addressing), but it's not the only one. So this approach of killing the process ensures that we proactively terminate ourselves before the system steps in and reports an ANR.

Other (less common) sources include:

  • deadlock when saving the pipeline cache
  #00  pc 0x00000000000896ac  /apex/com.android.runtime/lib64/bionic/libc.so (__memcpy+300)
  #01  pc 0x00000000000617f0  /vendor/lib64/hw/vulkan.mt6765.so
  #02  pc 0x000000000006379c  /vendor/lib64/hw/vulkan.mt6765.so
  #03  pc 0x0000000004b3c498  /data/app/~~FviKmczML5QY76VlG7cQxA==/org.godotengine.editor.v4-S04BF21HyaZR7dbmX5iRLg==/split_config.arm64_v8a.apk (RenderingDeviceDriverVulkan::pipeline_cache_serialize()+4472) (BuildId: 149dd1732e747a99)
  #04  pc 0x0000000007a341f0  /data/app/~~FviKmczML5QY76VlG7cQxA==/org.godotengine.editor.v4-S04BF21HyaZR7dbmX5iRLg==/split_config.arm64_v8a.apk (RenderingDevice::_save_pipeline_cache(void*)+6960) (BuildId: 149dd1732e747a99)
  #05  pc 0x0000000007a20c9c  /data/app/~~FviKmczML5QY76VlG7cQxA==/org.godotengine.editor.v4-S04BF21HyaZR7dbmX5iRLg==/split_config.arm64_v8a.apk (RenderingDevice::_update_pipeline_cache(bool)+6950) (BuildId: 149dd1732e747a99)
  #06  pc 0x0000000007a358ac  /data/app/~~FviKmczML5QY76VlG7cQxA==/org.godotengine.editor.v4-S04BF21HyaZR7dbmX5iRLg==/split_config.arm64_v8a.apk (RenderingDevice::finalize()+7242) (BuildId: 149dd1732e747a99)
  #07  pc 0x0000000007a790a0  /data/app/~~FviKmczML5QY76VlG7cQxA==/org.godotengine.editor.v4-S04BF21HyaZR7dbmX5iRLg==/split_config.arm64_v8a.apk (RenderingDevice::~RenderingDevice()+8101) (BuildId: 149dd1732e747a99)
  #08  pc 0x00000000039bb414  /data/app/~~FviKmczML5QY76VlG7cQxA==/org.godotengine.editor.v4-S04BF21HyaZR7dbmX5iRLg==/split_config.arm64_v8a.apk (DisplayServerAndroid::~DisplayServerAndroid()+139) (BuildId: 149dd1732e747a99)
  #09  pc 0x00000000039f84ac  /data/app/~~FviKmczML5QY76VlG7cQxA==/org.godotengine.editor.v4-S04BF21HyaZR7dbmX5iRLg==/split_config.arm64_v8a.apk (Main::cleanup(bool)+139) (BuildId: 149dd1732e747a99)
  #10  pc 0x000000000399b54c  /data/app/~~FviKmczML5QY76VlG7cQxA==/org.godotengine.editor.v4-S04BF21HyaZR7dbmX5iRLg==/split_config.arm64_v8a.apk (_terminate(_JNIEnv*, bool)+109) (BuildId: 149dd1732e747a99)
  • deadlock when destroying the Window
  #00  pc 0x000000000008b430  /apex/com.android.runtime/lib64/bionic/libc.so (je_tcache_bin_flush_small+880)
  #01  pc 0x0000000000057b5c  /apex/com.android.runtime/lib64/bionic/libc.so (je_free+684)
  #02  pc 0x0000000007d57d3c  /data/app/~~iHElsM6XYde8gDm0KFnDuw==/org.godotengine.editor.v4-xlH7DU7iTBQ1rEoFM5A8jw==/split_config.arm64_v8a.apk (RendererRD::LightStorage::shadow_atlas_set_size(RID, int, bool)+265) (BuildId: 149dd1732e747a99)
  #03  pc 0x0000000007d578f8  /data/app/~~iHElsM6XYde8gDm0KFnDuw==/org.godotengine.editor.v4-xlH7DU7iTBQ1rEoFM5A8jw==/split_config.arm64_v8a.apk (RendererRD::LightStorage::shadow_atlas_free(RID)+2113) (BuildId: 149dd1732e747a99)
  #04  pc 0x0000000007b5a9b8  /data/app/~~iHElsM6XYde8gDm0KFnDuw==/org.godotengine.editor.v4-xlH7DU7iTBQ1rEoFM5A8jw==/split_config.arm64_v8a.apk (RendererViewport::free(RID)+1605) (BuildId: 149dd1732e747a99)
  #05  pc 0x0000000007b0f8d0  /data/app/~~iHElsM6XYde8gDm0KFnDuw==/org.godotengine.editor.v4-xlH7DU7iTBQ1rEoFM5A8jw==/split_config.arm64_v8a.apk (RenderingServerDefault::free(RID)+54) (BuildId: 149dd1732e747a99)
  #06  pc 0x00000000061a12f0  /data/app/~~iHElsM6XYde8gDm0KFnDuw==/org.godotengine.editor.v4-xlH7DU7iTBQ1rEoFM5A8jw==/split_config.arm64_v8a.apk (Viewport::~Viewport()+5396) (BuildId: 149dd1732e747a99)
  #07  pc 0x00000000061f48f8  /data/app/~~iHElsM6XYde8gDm0KFnDuw==/org.godotengine.editor.v4-xlH7DU7iTBQ1rEoFM5A8jw==/split_config.arm64_v8a.apk (Window::~Window()+3506) (BuildId: 149dd1732e747a99)
  • deadlock caused by the RemoteDebugger (presumably, trying to quit the app while debugging)
  #00  pc 0x00000000000f5ac7  /apex/com.android.runtime/lib64/bionic/libc.so (nanosleep+7)
  #01  pc 0x0000000004c10759  /data/app/~~yChvFkCcCNHZaKp-GP7PVw==/org.godotengine.editor.v4-mFCAkM6fEbKBdSDpVw39mg==/split_config.x86_64.apk (OS_Unix::delay_usec(unsigned int) const+382) (BuildId: 59619ec116082fd6)
  #02  pc 0x000000000893fc15  /data/app/~~yChvFkCcCNHZaKp-GP7PVw==/org.godotengine.editor.v4-mFCAkM6fEbKBdSDpVw39mg==/split_config.x86_64.apk (RemoteDebugger::debug(bool, bool)+609) (BuildId: 59619ec116082fd6)
  #03  pc 0x0000000008946d6c  /data/app/~~yChvFkCcCNHZaKp-GP7PVw==/org.godotengine.editor.v4-mFCAkM6fEbKBdSDpVw39mg==/split_config.x86_64.apk (ScriptDebugger::debug(ScriptLanguage*, bool, bool)+90) (BuildId: 59619ec116082fd6)
  #04  pc 0x0000000003e27bb9  /data/app/~~yChvFkCcCNHZaKp-GP7PVw==/org.godotengine.editor.v4-mFCAkM6fEbKBdSDpVw39mg==/split_config.x86_64.apk (GDScriptLanguage::debug_break(String const&, bool)+282) (BuildId: 59619ec116082fd6)
  #05  pc 0x0000000003e0e1a8  /data/app/~~yChvFkCcCNHZaKp-GP7PVw==/org.godotengine.editor.v4-mFCAkM6fEbKBdSDpVw39mg==/split_config.x86_64.apk (GDScriptFunction::call(GDScriptInstance*, Variant const**, int, Callable::CallError&, GDScriptFunction::CallState*)+3949) (BuildId: 59619ec116082fd6)
  #06  pc 0x0000000003de49fc  /data/app/~~yChvFkCcCNHZaKp-GP7PVw==/org.godotengine.editor.v4-mFCAkM6fEbKBdSDpVw39mg==/split_config.x86_64.apk (GDScriptInstance::callp(StringName const&, Variant const**, int, Callable::CallError&)+2065) (BuildId: 59619ec116082fd6)
  #07  pc 0x00000000063906e4  /data/app/~~yChvFkCcCNHZaKp-GP7PVw==/org.godotengine.editor.v4-mFCAkM6fEbKBdSDpVw39mg==/split_config.x86_64.apk (Node::_notification(int)+399) (BuildId: 59619ec116082fd6)

@akien-mga akien-mga changed the title from "Fix ANRs when shutting down the engine" to "Android: Fix ANRs when shutting down the engine due to the GL render thread" Jan 6, 2026
Member

@akien-mga akien-mga left a comment

It seems a bit heavy handed to have to force terminate the app within less than a second to avoid Android considering a bit of thread join delay a deadlock... but that's what we have to work with, and I agree a force termination is better than an ANR report.

We should definitely have issues opened for the most common occurrences of this ANR, especially if we're going to stop getting these reports and won't know that users' logcats in the wild are spammed with warnings of early termination.

@akien-mga akien-mga requested a review from a team January 6, 2026 19:35
@m4gr3d m4gr3d changed the title from "Android: Fix ANRs when shutting down the engine due to the GL render thread" to "Android: Fix ANRs when shutting down the engine due to the render thread" Jan 7, 2026
@m4gr3d
Contributor Author

m4gr3d commented Jan 7, 2026

> It seems a bit heavy handed to have to force terminate the app within less than a second to avoid Android considering a bit of thread join delay a deadlock... but that's what we have to work with, and I agree a force termination is better than an ANR report.

The default ANR timeout is 5 seconds, and it's triggered when the main UI thread is blocked for that long, so the cause can be either a deadlock or just an operation that takes longer than 5 seconds while blocking the main UI thread.
We could increase the current force kill timeout so long as it's less than 5 seconds; the current value is me being overly cautious.
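
The constraint on the force-kill timeout can be written down as a small clamp. This is only a sketch: the 5 second threshold comes from the discussion above, while `kSafetyMargin` and `clamp_kill_timeout()` are hypothetical and not values or functions from the PR.

```cpp
#include <algorithm>
#include <chrono>

// Default ANR threshold mentioned in this discussion.
constexpr std::chrono::milliseconds kAnrThreshold{5000};
// Hypothetical headroom so the process is killed before the system reacts.
constexpr std::chrono::milliseconds kSafetyMargin{500};

// Keeps a requested force-kill timeout within [0, threshold - margin].
std::chrono::milliseconds clamp_kill_timeout(std::chrono::milliseconds requested) {
    const std::chrono::milliseconds lower{0};
    const std::chrono::milliseconds upper = kAnrThreshold - kSafetyMargin;
    return std::max(lower, std::min(requested, upper));
}
```

Any value the clamp returns stays strictly below the ANR threshold, which is the only hard requirement stated here.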

When Main::cleanup() terminates properly without deadlocking, the default behavior is also to force-kill the app process, since only a single Godot instance can run per process. Otherwise the engine would not initialize properly when the app is started again, since Android tends to reuse app processes when they are still around. So the new force-kill timeout behavior is more of the same, except that Main::cleanup() does not get to complete since it's deadlocked.

Also, the issue occurs for both Vulkan and OpenGL; the OpenGL reports are just more prevalent.

> We should definitely have issues opened for the most common occurrences of this ANR, especially if we're going to stop getting these reports and won't know that users' logcats in the wild are spammed with warnings of early termination.

I'll open corresponding issues in the morning.

@akien-mga akien-mga merged commit 4595e5f into godotengine:master Jan 8, 2026
20 checks passed
@akien-mga
Member

Thanks!

@akien-mga
Member

Cherry-picked for 4.5.2.

@akien-mga akien-mga removed the cherrypick:4.5 label Jan 8, 2026
@m4gr3d m4gr3d deleted the fix_anr_on_exit branch January 8, 2026 22:40