Tensorflow is taking over my openssl and causing segfaults #417

@msdrigg

I recently added tensorflow to a Rust project that already depended on an external OpenSSL (through reqwest and paho-mqtt), and I immediately started seeing segfaults. The strange thing is that the segfaults come from crypto functions in libtensorflow_framework.so.2 that are being called from paho-mqtt (SSLSocket_initialize in the backtrace below). If I remove paho-mqtt's dependency on SSL, I see similar crashes coming from reqwest.

Relevant Logs

This backtrace occurs reliably every time I run my program.

(gdb) bt
#0  __pthread_rwlock_wrlock_full64 (abstime=0x0, clockid=0, rwlock=0x0)
    at ./nptl/pthread_rwlock_common.c:603
#1  ___pthread_rwlock_wrlock (rwlock=0x0) at ./nptl/pthread_rwlock_wrlock.c:26
#2  0x00007f8ec0e6db69 in CRYPTO_STATIC_MUTEX_lock_write ()
   from /home/myuser/workspace/target/debug/build/tensorflow-sys-b3a831e1f8b18f5e/out/libtensorflow_framework.so.2
#3  0x00007f8ec0df6263 in CRYPTO_get_ex_new_index ()
   from /home/myuser/workspace/target/debug/build/tensorflow-sys-b3a831e1f8b18f5e/out/libtensorflow_framework.so.2
#4  0x0000564ee8a50b43 in SSLSocket_initialize ()
    at /home/myuser/.cargo/registry/src/index.crates.io-6f17d22bba15001f/paho-mqtt-sys-0.8.1/paho.mqtt.c/src/SSLSocket.c:492
#5  0x0000564ee8a440ff in MQTTAsync_createWithOptions (handle=0x7f8ea4bdfe00, 
    serverURI=0x7f8df4004fc0 "tcp://localhost:1883", 
    clientId=0x7f8df4004fe0 "program", persistence_type=1, 
    persistence_context=0x0, options=0x7f8ea4bdfcc8)
    at /home/myuser/.cargo/registry/src/index.crates.io-6f17d22bba15001f/paho-mqtt-sys-0.8.1/paho.mqtt.c/src/MQTTAsync.c:372
#6  0x0000564ee8a22c37 in paho_mqtt::async_client::AsyncClient::new<paho_mqtt::create_options::CreateOptions> (opts=...) at src/async_client.rs:201
#7  0x0000564ee8a2127a in paho_mqtt::create_options::CreateOptionsBuilder::create_client (self=...)
    at src/create_options.rs:444

Interestingly, here is what ldd reports. Note that libssl.so.3 and libcrypto.so.3 correctly resolve to the system OpenSSL, so I don't know why the crypto calls end up in libtensorflow_framework.so.2 at runtime.

$ ldd target/debug/program
        linux-vdso.so.1 (0x00007ffc46ffe000)
        libtensorflow_framework.so.2 => /usr/local/lib/libtensorflow_framework.so.2 (0x00007fb1b0000000)
        libtensorflow.so.2 => /usr/local/lib/libtensorflow.so.2 (0x00007fb19f000000)
        libssl.so.3 => /lib/x86_64-linux-gnu/libssl.so.3 (0x00007fb1b767e000)
        libcrypto.so.3 => /lib/x86_64-linux-gnu/libcrypto.so.3 (0x00007fb19ea00000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fb1b765e000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fb19ef19000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fb19e600000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fb1b773c000)
        libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fb19e200000)
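
One thing the ldd output cannot show is what is actually mapped into the running process: the backtrace above points at a libtensorflow_framework.so.2 under target/debug/build/.../out, while ldd resolves it to /usr/local/lib. A minimal sketch of a runtime check, assuming Linux and that it is called from inside the affected program before the MQTT client is created, reads /proc/self/maps and lists the mapped files whose path mentions ssl, crypto or tensorflow. The function name `mapped_libraries` and the substrings are hypothetical choices for illustration, not part of any of the crates involved.

```rust
use std::collections::BTreeSet;
use std::fs;

/// Returns the distinct file paths in a `/proc/<pid>/maps` listing whose
/// path contains any of the given substrings.
/// Hypothetical helper for illustration; paths containing spaces are
/// truncated at the first space.
fn mapped_libraries(maps: &str, needles: &[&str]) -> BTreeSet<String> {
    maps.lines()
        // Fields: address, perms, offset, dev, inode, then an optional path.
        .filter_map(|line| line.split_whitespace().nth(5))
        // Skip pseudo-entries such as [heap], [stack] and [vdso].
        .filter(|path| path.starts_with('/'))
        .filter(|path| needles.iter().any(|n| path.contains(n)))
        .map(str::to_owned)
        .collect()
}

fn main() {
    // Linux only: /proc/self/maps describes the current process.
    let maps = fs::read_to_string("/proc/self/maps")
        .expect("could not read /proc/self/maps");
    for path in mapped_libraries(&maps, &["ssl", "crypto", "tensorflow"]) {
        println!("{path}");
    }
}
```

This only shows which shared objects are loaded, not which one wins for a given symbol. If both libcrypto.so.3 and libtensorflow_framework.so.2 appear, that would at least be consistent with the two libraries exporting overlapping symbol names such as CRYPTO_get_ex_new_index.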

Note: I am using the latest version of Rust and the latest versions of all packages mentioned here. Here is my uname -a output:

Linux pop-os 6.4.6-76060406-generic #202307241739~1690928105~22.04~d567a38 SMP PREEMPT_DYNAMIC Tue A x86_64 x86_64 x86_64 GNU/Linux

Prior Art

The only other mention of this issue I could find is tensorflow/tensorflow#34742, and I am currently trying to resolve my problem using the steps outlined there.

Goals

A perfect fix would let me use tensorflow and OpenSSL together in one project without any tweaks, but I would consider this issue closed if we could find a workaround (environment variables, a build script, or something similar) that lets my project run without segfaulting.
