Description
I recently added tensorflow to a Rust project that already depended on OpenSSL through external crates (reqwest and paho-mqtt), and I immediately started seeing segfaults. The strange thing is that the crypto functions called from paho-mqtt (SSLSocket_initialize in the backtrace below) resolve into libtensorflow_framework.so.2 rather than OpenSSL, and that is where the segfault occurs. If I remove paho-mqtt's SSL dependency, I see similar crashes coming from reqwest.
Relevant Logs
This backtrace occurs reliably every time I run my program.
(gdb) bt
#0  __pthread_rwlock_wrlock_full64 (abstime=0x0, clockid=0, rwlock=0x0)
    at ./nptl/pthread_rwlock_common.c:603
#1  ___pthread_rwlock_wrlock (rwlock=0x0) at ./nptl/pthread_rwlock_wrlock.c:26
#2  0x00007f8ec0e6db69 in CRYPTO_STATIC_MUTEX_lock_write ()
   from /home/myuser/workspace/target/debug/build/tensorflow-sys-b3a831e1f8b18f5e/out/libtensorflow_framework.so.2
#3  0x00007f8ec0df6263 in CRYPTO_get_ex_new_index ()
   from /home/myuser/workspace/target/debug/build/tensorflow-sys-b3a831e1f8b18f5e/out/libtensorflow_framework.so.2
#4  0x0000564ee8a50b43 in SSLSocket_initialize ()
    at /home/myuser/.cargo/registry/src/index.crates.io-6f17d22bba15001f/paho-mqtt-sys-0.8.1/paho.mqtt.c/src/SSLSocket.c:492
#5  0x0000564ee8a440ff in MQTTAsync_createWithOptions (handle=0x7f8ea4bdfe00, 
    serverURI=0x7f8df4004fc0 "tcp://localhost:1883", 
    clientId=0x7f8df4004fe0 "program", persistence_type=1, 
    persistence_context=0x0, options=0x7f8ea4bdfcc8)
    at /home/myuser/.cargo/registry/src/index.crates.io-6f17d22bba15001f/paho-mqtt-sys-0.8.1/paho.mqtt.c/src/MQTTAsync.c:372
#6  0x0000564ee8a22c37 in paho_mqtt::async_client::AsyncClient::new<paho_mqtt::create_options::CreateOptions> (opts=...) at src/async_client.rs:201
#7  0x0000564ee8a2127a in paho_mqtt::create_options::CreateOptionsBuilder::create_client (self=...)
    at src/create_options.rs:444
Interestingly, here's what I see from ldd. Note that libssl.so.3 and libcrypto.so.3 correctly point to the real OpenSSL, so I don't know why the crypto symbols get bound to libtensorflow_framework.so.2 at runtime.
$ldd target/debug/program
        linux-vdso.so.1 (0x00007ffc46ffe000)
        libtensorflow_framework.so.2 => /usr/local/lib/libtensorflow_framework.so.2 (0x00007fb1b0000000)
        libtensorflow.so.2 => /usr/local/lib/libtensorflow.so.2 (0x00007fb19f000000)
        libssl.so.3 => /lib/x86_64-linux-gnu/libssl.so.3 (0x00007fb1b767e000)
        libcrypto.so.3 => /lib/x86_64-linux-gnu/libcrypto.so.3 (0x00007fb19ea00000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fb1b765e000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fb19ef19000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fb19e600000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fb1b773c000)
        libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fb19e200000)
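ldd only lists which shared objects get loaded, not which one a given symbol ends up bound to. A minimal sketch for checking that from inside the process, assuming Linux with glibc (where `RTLD_DEFAULT` is a null handle) and that the symbol is exported dynamically. The helper name `defining_object` is hypothetical, and the FFI declarations are written by hand so no extra crate is needed:

```rust
use std::ffi::{c_char, c_int, c_void, CStr, CString};

/// Mirror of glibc's `Dl_info` struct, as filled in by `dladdr`.
#[repr(C)]
struct DlInfo {
    dli_fname: *const c_char, // path of the object containing the address
    dli_fbase: *mut c_void,   // load address of that object
    dli_sname: *const c_char, // name of the nearest symbol
    dli_saddr: *mut c_void,   // address of the nearest symbol
}

extern "C" {
    fn dlsym(handle: *mut c_void, symbol: *const c_char) -> *mut c_void;
    fn dladdr(addr: *const c_void, info: *mut DlInfo) -> c_int;
}

/// Assumption: glibc defines RTLD_DEFAULT as a null handle, meaning
/// "search in the default global lookup order".
const RTLD_DEFAULT: *mut c_void = std::ptr::null_mut();

/// Hypothetical helper: returns the path of the loaded object that the
/// global lookup order binds `symbol` to, or None if it cannot be found.
fn defining_object(symbol: &str) -> Option<String> {
    let name = CString::new(symbol).ok()?;
    unsafe {
        let addr = dlsym(RTLD_DEFAULT, name.as_ptr());
        if addr.is_null() {
            return None;
        }
        let mut info: DlInfo = std::mem::zeroed();
        if dladdr(addr, &mut info) == 0 || info.dli_fname.is_null() {
            return None;
        }
        Some(CStr::from_ptr(info.dli_fname).to_string_lossy().into_owned())
    }
}
```

Calling `defining_object("CRYPTO_get_ex_new_index")` early in `main` and printing the result should show whether the lookup lands in libcrypto.so.3 or in libtensorflow_framework.so.2. This only diagnoses the binding; it does not change it.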
Note: I am using the latest Rust version and the latest versions of all packages mentioned here. Here's what my uname -a output looks like:
Linux pop-os 6.4.6-76060406-generic #202307241739~1690928105~22.04~d567a38 SMP PREEMPT_DYNAMIC Tue A x86_64 x86_64 x86_64 GNU/Linux
Prior Art
The only other mention of this issue I could find is tensorflow/tensorflow#34742, and I am currently trying to resolve my problem using the steps outlined there.
Goals
A perfect fix would let me use tensorflow and OpenSSL together in a project without any tweaks, but I would consider this issue closed if we could find a workaround (environment variables, a build script, or something similar) that lets my project run without segfaulting.