Use Android shared library loader for JNI libraries #10376
[[gnu::flatten]]
static void init (JNIEnv *env, jclass systemClass)
{
    jni_env = env;
    systemKlass = systemClass;
    System_loadLibrary = env->GetStaticMethodID (systemClass, "loadLibrary", "(Ljava/lang/String;)V");
    if (System_loadLibrary == nullptr) [[unlikely]] {
        Helpers::abort_application ("Failed to look up the Java System.loadLibrary method.");
    }
}
For my understanding, when would you decide to move the implementation to a `dso-loader.cc` file instead of putting all the code in the `.hh` file?
I would move it if the code was complex and/or involved a lot of dependencies (include files). I prefer to inline as much as possible by default, to squeeze out every ounce of performance at run time.
Also, this particular code is not called from many places, so the size increase in the resulting binary isn't bad at all.
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Unfortunately, due to bad Android design, loading the shared library on a thread other than the main one fails:
The problem here is that the class loader being used ( I'm afraid we'll have no option but to preload all the JNI shared libraries at application startup, thus wasting potentially a lot of precious startup time.
No matter what I try, the call posted to the main thread still doesn't use the right class loader and, thus, `System.loadLibrary` cannot find the shared lib:

```
08-13 12:16:10.657 12472 12507 D monodroid-assembly: Trying to load loading shared JNI library /data/user/0/Mono.Android.NET_Tests/files/.__override__/arm64-v8a/libSystem.Security.Cryptography.Native.Android.so with System.loadLibrary
08-13 12:16:10.657 12472 12507 D monodroid-assembly: Running DSO loader on thread 12507, dispatching to main thread
08-13 12:16:10.657 12472 12472 D monodroid-assembly: Looper CB called on thread 12472. Will attempt to load DSO 'System.Security.Cryptography.Native.Android'
08-13 12:16:10.657 12472 12472 D monodroid-assembly: Undecorated library name: System.Security.Cryptography.Native.Android
08-13 12:16:10.659 12472 12472 D nativeloader: Load libSystem.Security.Cryptography.Native.Android.so using system ns (caller=/system/framework/framework.jar!classes3.dex): dlopen failed: library "libSystem.Security.Cryptography.Native.Android.so" not found
08-13 12:16:10.659 12472 12472 D monodroid-assembly: System.loadLibrary threw a Java exception. Will attempt to log it.
08-13 12:16:10.661 12472 12472 W System.err: java.lang.UnsatisfiedLinkError: dlopen failed: library "libSystem.Security.Cryptography.Native.Android.so" not found
```

Time to think about something else :( Thank you Android for 3 days wasted
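For reference, the "Undecorated library name" step visible in the log exists because `System.loadLibrary` expects a bare library name, without the directory, the `lib` prefix or the `.so` suffix. A minimal sketch of that transformation follows; `undecorate_library_name` is a hypothetical helper written for illustration and is not taken from the PR.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper (not the PR's code): turn a path such as
// ".../arm64-v8a/libFoo.Bar.so" into the bare name "Foo.Bar".
std::string undecorate_library_name (std::string_view path)
{
    // Drop any directory components.
    if (auto slash = path.find_last_of ('/'); slash != std::string_view::npos) {
        path.remove_prefix (slash + 1);
    }

    // Drop the "lib" prefix, if present.
    constexpr std::string_view prefix { "lib" };
    if (path.substr (0, prefix.size ()) == prefix) {
        path.remove_prefix (prefix.size ());
    }

    // Drop the ".so" suffix, if present.
    constexpr std::string_view suffix { ".so" };
    if (path.size () >= suffix.size () && path.substr (path.size () - suffix.size ()) == suffix) {
        path.remove_suffix (suffix.size ());
    }

    return std::string { path };
}
```

Given the path from the first log line, this returns `System.Security.Cryptography.Native.Android`, matching the name printed in the log. Note that undecorating the name does not help with the failure itself: the lookup still depends on which class loader services the call.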
They might be causing this error:

```
08-18 09:32:06.603 5693 5718 I monodroid: Loaded type: Java.Security.Cert.X509Certificate
08-18 09:32:06.603 5693 5718 E droid.NET_Test: JNI ERROR (app bug): accessed deleted Global 0x3a62
08-18 09:32:06.603 5693 5718 F droid.NET_Test: java_vm_ext.cc:570] JNI DETECTED ERROR IN APPLICATION: use of deleted global reference 0x3a62
08-18 09:32:06.604 5693 5718 F droid.NET_Test: java_vm_ext.cc:570] from void crc643df67da7b13bb6b1.TestInstrumentation_1.n_onStart()
```
Fixes: #10324
Fixes: #7616

Context: https://docs.oracle.com/en/java/javase/17/docs/specs/jni/invocation.html#library-and-version-management
Context: #7616 (comment)

The Java language and virtual machine support implementing portions of an API in a language (such as C, C++ or Rust) that compiles to native binary code instead of being JIT-ed or interpreted at run time. Such implementations are contained in native shared libraries which have to conform to a set of rules laid out in the JNI (Java Native Interface) documentation.

Part of the specification describes a function (`JNI_OnLoad`) which may be present in the shared library and which, if present, is called by the JavaVM when the library is loaded. For this to happen, however, the load must be initiated by calling the `System.loadLibrary(string)` Java method. This method finds the named shared library, loads it using an OS-specific mechanism and then calls all the exported functions described in the JNI specification, if they are present.

Until this PR, .NET for Android (and Xamarin.Android before it) loaded all shared libraries the same way, via `dlopen(2)` instead of `System.loadLibrary(string)`, which resulted in some of those libraries not being initialized properly. This PR fixes the issue by identifying shared libraries which contain the `JNI_OnLoad` function and loading them at run time by calling `System.loadLibrary(string)` instead of just `dlopen(2)`. This ensures that the libraries are properly initialized.

However, the Android environment is quite complex, and not everything in this PR is implemented the way it was intended. The problem lies in the ability of `System.loadLibrary(string)` to find the actual shared library file. The file can be found in a number of locations, among them two application-specific ones:

* The APK archive's `lib/{ABI}/` directory, when shared libraries are not extracted from the archive on installation.
* The application-specific library location on the file system, when shared libraries are extracted from the archive on installation.

In either case, the location is not known beforehand, as each time the application is installed it gets a different path where both its archive and its extracted files are located. This requires the Java runtime to provide that information to the application in some way. The way ART (the Java runtime on Android) does it is via class loaders, which are special classes that know how to find and load Java components as well as native libraries. `System.loadLibrary(string)` uses that information to locate the .so files with JavaVM extensions.

The mechanism described above works well as long as the `System.loadLibrary` call is made from a thread that's fully attached to the Java VM, which is to say that the VM environment sets up the class loaders correctly, so that they contain information about the application-specific shared library locations. With correctly configured class loaders, we see a message similar to the following when loading the shared library with `System.loadLibrary`:

```
08-13 12:06:48.269 11989 11989 D nativeloader: Load /data/app/~~Xy-UIVle34c_VksRd2_LEg==/com.xamarin.XAPerfTest.net10-GbhwYcau77FAjV_FW1uZwg==/split_config.arm64_v8a.apk!/lib/arm64-v8a/libSystem.Security.Cryptography.Native.Android.so using class loader ns clns-9 (caller=/data/app/~~Xy-UIVle34c_VksRd2_LEg==/com.xamarin.XAPerfTest.net10-GbhwYcau77FAjV_FW1uZwg==/base.apk): ok
```

The bits to note above are the `class loader ns clns-9` information - it's a dynamically configured loader that is fully informed about application-specific shared library locations - and the name of the caller (the cryptic-looking path ending with `base.apk`). This loader is used during, for instance, our native runtime configuration - when it is being initialized from our (Java) package manager at application startup.

However, the problem is that the above class loader is no longer around when we call `System.loadLibrary` on a thread that's not fully attached to the Java VM:

```
08-13 12:16:10.659 12472 12472 D nativeloader: Load libSystem.Security.Cryptography.Native.Android.so using system ns (caller=/system/framework/framework.jar!classes3.dex): dlopen failed: library "libSystem.Security.Cryptography.Native.Android.so" not found
```

In this case, note that both the class loader (named here just `system ns`) and the caller are generic: they have no knowledge of the application-specific shared library locations.

This observation led to the idea of using the native looper (`ALooper`) interface to post the shared library load request to the main thread from native code, and then call `System.loadLibrary` on it. This assumed that the main thread, which originally had the application-specific class loader, would still be around and able to handle the load properly. Unfortunately, this doesn't appear to be the case. Despite us attaching to the Java thread with the JNI API (`AttachCurrentThread`), the application-specific class loader isn't there. This was discovered a few years ago already (see the link to the issue #7616 comment), but we haven't been able to find a way to fully attach the thread to the Java VM so that the class loaders are correctly set up.

This, unfortunately, leads us to our only remaining option - preloading the JNI libraries at application startup, while we're still on the properly configured main thread.

This PR implements just that, but it also implements and uses code to load JNI shared libraries on demand from any thread by posting the request to the main thread, as it may just happen that a request to load a shared library arrives on a separate managed thread during startup, and we might get lucky and run the loading code on a still-attached main thread.

In the future, more work is required (much more) to investigate the internals of the ART runtime in order to try to find a way to fully attach managed threads so that class loaders are set up properly.
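The preloading strategy described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's code: `LibraryEntry`, `has_jni_onload`, `preload_jni_libraries` and `load_on_demand` are hypothetical names, and both loaders are passed in as callbacks so that the example runs outside Android. On a device, the first callback would stand in for a JNI call to `System.loadLibrary` made on the main thread, and the second for `dlopen`.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical cache entry; field names are illustrative only.
struct LibraryEntry
{
    std::string name;            // library name, e.g. "libfoo.so"
    bool        has_jni_onload;  // assumed to be known before startup
    void       *handle;          // nullptr until the library is loaded
};

// A loader takes a library name and returns a handle, or nullptr on failure.
using Loader = std::function<void* (std::string const&)>;

// Called once at startup, while still on the main thread: load every
// library flagged as containing JNI_OnLoad through the Java-side loader.
// Returns the number of libraries that were loaded successfully.
std::size_t preload_jni_libraries (std::vector<LibraryEntry> &entries, Loader const& java_loader)
{
    std::size_t loaded = 0;
    for (auto &entry : entries) {
        if (!entry.has_jni_onload || entry.handle != nullptr) {
            continue; // plain library, or already loaded
        }
        entry.handle = java_loader (entry.name);
        if (entry.handle != nullptr) {
            loaded++;
        }
    }
    return loaded;
}

// Called later, from any thread: plain libraries are loaded lazily, while
// JNI libraries are expected to have been preloaded already. The main-thread
// dispatch that the PR also implements is deliberately omitted here.
void* load_on_demand (LibraryEntry &entry, Loader const& dlopen_loader)
{
    if (entry.handle == nullptr && !entry.has_jni_onload) {
        entry.handle = dlopen_loader (entry.name);
    }
    return entry.handle;
}
```

The trade-off is the one stated in the description: every JNI library pays its load cost at startup, whether or not the application ever uses it.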
This reverts commit 064f23f.

We are seeing `dotnet new maui` projects fail to debug, as they appear to be stuck in a loop on startup:

    08-22 11:39:14.148 W/monodroid-assembly(10269): Timeout while waiting for shared library '/data/user/0/com.companyname.mauiapp14/files/.__override__/arm64-v8a/libSystem.Security.Cryptography.Native.Android.so' to load.
    08-22 11:39:17.155 W/monodroid-assembly(10269): Timeout while waiting for shared library '/data/app/~~Q5yyfDmzDqX9Z8UwQnLoFA==/com.companyname.mauiapp14-ZPk2_y6fT3b_3leM8xhcAw==/lib/arm64/libSystem.Security.Cryptography.Native.Android.so' to load.
    08-22 11:39:20.164 W/monodroid-assembly(10269): Timeout while waiting for shared library '/data/user/0/com.companyname.mauiapp14/files/.__override__/arm64-v8a/libSystem.Security.Cryptography.Native.Android' to load.
    08-22 11:39:23.172 W/monodroid-assembly(10269): Timeout while waiting for shared library '/data/app/~~Q5yyfDmzDqX9Z8UwQnLoFA==/com.companyname.mauiapp14-ZPk2_y6fT3b_3leM8xhcAw==/lib/arm64/libSystem.Security.Cryptography.Native.Android' to load.
    08-22 11:39:23.172 W/monodroid-assembly(10269): Shared library 'libSystem.Security.Cryptography.Native.Android' not loaded, p/invoke 'AndroidCryptoNative_SSLStreamInitialize' may fail
    08-22 11:39:23.172 F/monodroid-assembly(10269): Failed to load symbol 'AndroidCryptoNative_SSLStreamInitialize' from shared library 'libSystem.Security.Cryptography.Native.Android'
    08-22 11:39:26.187 W/monodroid-assembly(10269): Timeout while waiting for shared library '/data/user/0/com.companyname.mauiapp14/files/.__override__/arm64-v8a/libSystem.Security.Cryptography.Native.Android.so' to load.
    08-22 11:39:29.190 W/monodroid-assembly(10269): Timeout while waiting for shared library '/data/app/~~Q5yyfDmzDqX9Z8UwQnLoFA==/com.companyname.mauiapp14-ZPk2_y6fT3b_3leM8xhcAw==/lib/arm64/libSystem.Security.Cryptography.Native.Android.so' to load.
    08-22 11:39:32.198 W/monodroid-assembly(10269): Timeout while waiting for shared library '/data/user/0/com.companyname.mauiapp14/files/.__override__/arm64-v8a/libSystem.Security.Cryptography.Native.Android' to load.
    repeating...
…oo (#10444)

Context: cba39dc
Context: #10376
Context: #10324

cba39dc implemented preloading of JNI-using native libraries, but it failed to update the alias entries for each preloaded library.

During application build we generate native code that contains a cache entry for each native library packaged with the managed application code. Each library follows the same naming pattern: `lib<NAME>.so`. However, the managed code can refer to those libraries (e.g. when declaring a p/invoke with the `[DllImport]` attribute) using different forms of the name. The request may have the form `lib<NAME>` or `<NAME>` etc.

When the runtime tries to resolve the p/invoke symbol, it first needs to load the shared library. This is done (in our case) by using a callback into our runtime, which then tries to find the library and load it. Should the attempt fail, the runtime will mutate the library name and ask us again until all the possible names have been tried or the library is loaded successfully.

This roundtrip is pretty expensive, so in our native library loader code we implemented (in c227042) a scheme where at build time we mutate the library names ourselves and add a separate entry for each name mutation to the shared library cache. This way, when the runtime request comes, we perform a single search and are able to find the library no matter what name the managed code requested.

Each of the cache entries contains, among other things irrelevant to this PR, a field which stores the native library's handle after it is loaded. cba39dc loaded the library and set that field in just a single cache entry, the one corresponding to the canonical library name (`lib<NAME>.so`), but it failed to set the field in all the aliases. As a result, when the managed code requested the library by a different name, we found the corresponding cache entry, saw that its handle was unset and attempted to load the library again. However, since the request was sent from a different thread, we attempted to load the library on the main thread (described in detail in the cba39dc commit message), and that attempt always failed, leading to an endless loop and an application crash/hang while debugging.

Fix the issue by setting the native shared library handle in all the cache entries corresponding to the various mutations of the library name. This makes sure that further requests to load the library will see the handle set in the cache and use it, instead of attempting to load it again.
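A rough sketch of the fix described above, under stated assumptions: `CacheEntry`, `make_entries`, `set_handle_for_all_aliases` and `find_handle` are hypothetical names chosen for illustration, the set of name mutations is limited to the three forms mentioned in this message, and a linear scan stands in for whatever lookup the generated cache really uses.

```cpp
#include <string>
#include <vector>

// Hypothetical cache entry: one per spelling of a library name.
struct CacheEntry
{
    std::string name;       // the spelling managed code may request
    std::string canonical;  // the canonical name, lib<NAME>.so
    void       *handle;     // nullptr until the library is loaded
};

// Stand-in for the build-time step: one entry per name mutation.
std::vector<CacheEntry> make_entries (std::string const& base)
{
    std::string canonical = "lib" + base + ".so";
    return {
        { canonical,    canonical, nullptr },  // lib<NAME>.so
        { "lib" + base, canonical, nullptr },  // lib<NAME>
        { base,         canonical, nullptr },  // <NAME>
    };
}

// The fix: after loading, record the handle in every alias entry,
// not only in the entry for the canonical name.
void set_handle_for_all_aliases (std::vector<CacheEntry> &cache, std::string const& canonical, void *handle)
{
    for (auto &entry : cache) {
        if (entry.canonical == canonical) {
            entry.handle = handle;
        }
    }
}

// A single search by whatever name was requested; returns nullptr when
// the name is unknown or the library has not been loaded yet.
void* find_handle (std::vector<CacheEntry> const& cache, std::string const& requested)
{
    for (auto const& entry : cache) {
        if (entry.name == requested) {
            return entry.handle;
        }
    }
    return nullptr;
}
```

With the handle recorded in every alias, a later request for `<NAME>` or `lib<NAME>` finds a loaded library immediately and never reaches the main-thread load path that was timing out.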