|
| 1 | +# Install tensorflow-gpu's C++ interface |
| 2 | +The tensorflow C++ interface will be compiled from source. First, install bazel. Using bazel version 0.24.1 is highly recommended. Full bazel installation instructions can be found [here](https://docs.bazel.build/versions/master/install.html). |
| 3 | +```bash |
| 4 | +cd /some/workspace |
| 5 | +wget https://github.com/bazelbuild/bazel/releases/download/0.24.1/bazel-0.24.1-dist.zip |
| 6 | +mkdir bazel-0.24.1 |
| 7 | +cd bazel-0.24.1 |
| 8 | +unzip ../bazel-0.24.1-dist.zip |
| 9 | +./compile.sh |
| 10 | +export PATH=`pwd`/output:$PATH |
| 11 | +``` |
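Before moving on, it can help to confirm that the bazel on your `PATH` is the recommended version. The sketch below is only an illustration: `check_bazel_version` is a hypothetical helper, not part of bazel or DeePMD-kit, and it assumes that `bazel version` prints a `Build label:` line (a bazel built from the dist archive may report a label with a suffix, such as `0.24.1- (@non-git)`).

```bash
# Hypothetical helper: succeed if the reported version matches the wanted one.
# A suffix after the version (e.g. "0.24.1- (@non-git)") is tolerated.
check_bazel_version() {
    found="$1"
    wanted="${2:-0.24.1}"
    case "$found" in
        "$wanted"|"$wanted"-*|"$wanted "*)
            echo "bazel $found: OK"
            ;;
        *)
            echo "bazel '$found' found, $wanted is recommended" >&2
            return 1
            ;;
    esac
}

# Only query bazel if it is actually on the PATH.
if command -v bazel >/dev/null 2>&1; then
    label=$(bazel version 2>/dev/null | awk '/^Build label:/ {print $3}')
    check_bazel_version "$label" || true
fi
```

If the check reports a different version, make sure that the `output` directory of the bazel build comes first in `PATH`.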
| 12 | + |
| 13 | +Next, get the tensorflow source code: |
| 14 | +```bash |
| 15 | +cd /some/workspace |
| 16 | +git clone https://github.com/tensorflow/tensorflow tensorflow -b v1.14.0 --depth=1 |
| 17 | +cd tensorflow |
| 18 | +``` |
| 19 | + |
| 20 | +DeePMD-kit is built with cmake, so tensorflow must be compiled and integrated into cmake projects. The rest of this section largely follows [the instructions provided by Tuatini](http://tuatini.me/building-tensorflow-as-a-standalone-project/). Now run `./configure` in the tensorflow source directory. |
| 21 | + |
| 22 | +You will be asked a series of questions that configure the tensorflow build. Building for Python 3 is recommended. You may want to answer the questions as follows (replace `$tensorflow_venv` with your virtual environment directory): |
| 23 | +```bash |
| 24 | +./configure |
| 25 | +Please specify the location of python. [Default is xxx]: |
| 26 | + |
| 27 | +Traceback (most recent call last): |
| 28 | + File "<string>", line 1, in <module> |
| 29 | +AttributeError: module 'site' has no attribute 'getsitepackages' |
| 30 | +Found possible Python library paths: |
| 31 | + /xxx/deepmd_gpu/tensorflow_venv/lib/python3.7/site-packages |
| 32 | +Please input the desired Python library path to use. Default is [xxx] |
| 33 | + |
| 34 | +Do you wish to build TensorFlow with XLA JIT support? [Y/n]: |
| 35 | +XLA JIT support will be enabled for TensorFlow. |
| 36 | + |
| 37 | +Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: |
| 38 | +No OpenCL SYCL support will be enabled for TensorFlow. |
| 39 | + |
| 40 | +Do you wish to build TensorFlow with ROCm support? [y/N]: |
| 41 | +No ROCm support will be enabled for TensorFlow. |
| 42 | + |
| 43 | +Do you wish to build TensorFlow with CUDA support? [y/N]: y |
| 44 | +CUDA support will be enabled for TensorFlow. |
| 45 | + |
| 46 | +Do you wish to build TensorFlow with TensorRT support? [y/N]: |
| 47 | +No TensorRT support will be enabled for TensorFlow. |
| 48 | + |
| 49 | +Found CUDA 10.1 in: |
| 50 | + /usr/local/cuda/lib64 |
| 51 | + /usr/local/cuda/include |
| 52 | +Found cuDNN 7 in: |
| 53 | + /usr/local/cuda/lib64 |
| 54 | + /usr/local/cuda/include |
| 55 | + |
| 56 | +Please specify a list of comma-separated CUDA compute capabilities you want to build with. |
| 57 | +You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus. |
| 58 | +Please note that each additional compute capability significantly increases your build time and binary size, and that TensorFlow only supports compute capabilities >= 3.5 [Default is: 6.1,6.1]: |
| 59 | + |
| 60 | +Do you want to use clang as CUDA compiler? [y/N]: |
| 61 | +nvcc will be used as CUDA compiler. |
| 62 | + |
| 63 | +Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: |
| 64 | + |
| 65 | + |
| 66 | +Do you wish to build TensorFlow with MPI support? [y/N]: |
| 67 | +No MPI support will be enabled for TensorFlow. |
| 68 | + |
| 69 | +Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]: |
| 70 | + |
| 71 | +Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: |
| 72 | +Not configuring the WORKSPACE for Android builds. |
| 73 | + |
| 74 | +Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details. |
| 75 | + --config=mkl # Build with MKL support. |
| 76 | + --config=monolithic # Config for mostly static monolithic build. |
| 77 | + --config=gdr # Build with GDR support. |
| 78 | + --config=verbs # Build with libverbs support. |
| 79 | + --config=ngraph # Build with Intel nGraph support. |
| 80 | + --config=numa # Build with NUMA support. |
| 81 | + --config=dynamic_kernels # (Experimental) Build kernels into separate shared objects. |
| 82 | + --config=v2 # Build TensorFlow 2.x instead of 1.x. |
| 83 | +Preconfigured Bazel build configs to DISABLE default on features: |
| 84 | + --config=noaws # Disable AWS S3 filesystem support. |
| 85 | + --config=nogcp # Disable GCP support. |
| 86 | + --config=nohdfs # Disable HDFS support. |
| 87 | + --config=noignite # Disable Apache Ignite support. |
| 88 | + --config=nokafka # Disable Apache Kafka support. |
| 89 | + --config=nonccl # Disable NVIDIA NCCL support. |
| 90 | +Configuration finished |
| 91 | +``` |
| 92 | + |
| 93 | +The library path for Python should be set accordingly. |
| 94 | + |
| 95 | +Now build the shared library of tensorflow: |
| 96 | +```bash |
| 97 | +bazel build -c opt --verbose_failures //tensorflow:libtensorflow_cc.so |
| 98 | +``` |
| 99 | +You may want to add the options `--copt=-msse4.2`, `--copt=-mavx`, `--copt=-mavx2` and `--copt=-mfma` to enable SSE4.2, AVX, AVX2 and FMA SIMD acceleration, respectively. Choose only the options that your CPU supports. If your machine runs short of RAM during the build, you can limit resource usage with `--local_resources 2048,.5,1.0` (available RAM in MB, CPU cores and I/O capacity, respectively). |
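One way to pick the SIMD options is to read the CPU feature flags that Linux lists in `/proc/cpuinfo`. The following is a minimal sketch under that assumption: `copts_for_flags` is a hypothetical helper name, and it maps the flag names `sse4_2`, `avx`, `avx2` and `fma` to the four `--copt` options mentioned above. It only prints a suggested command and does not start a build.

```bash
# Hypothetical helper: translate CPU feature flags into bazel --copt options.
# $1 is a space-separated list of flags, as found in /proc/cpuinfo.
copts_for_flags() {
    opts=""
    for flag in $1; do
        case "$flag" in
            sse4_2) opts="$opts --copt=-msse4.2" ;;
            avx)    opts="$opts --copt=-mavx" ;;
            avx2)   opts="$opts --copt=-mavx2" ;;
            fma)    opts="$opts --copt=-mfma" ;;
        esac
    done
    # Drop the leading space before printing.
    printf '%s\n' "${opts# }"
}

# On Linux, print a suggested build command for the local CPU.
if [ -r /proc/cpuinfo ]; then
    cpu_flags=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)
    echo "bazel build -c opt $(copts_for_flags "$cpu_flags") --verbose_failures //tensorflow:libtensorflow_cc.so"
fi
```

Note that the flags describe the build machine. If the library will run on a different machine, choose the options that match the target CPU.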
| 100 | + |
| 101 | +We assume that you want to install tensorflow in the directory `$tensorflow_root`. Create the directory if it does not exist: |
| 102 | +```bash |
| 103 | +mkdir -p $tensorflow_root |
| 104 | +``` |
| 105 | +Now, copy the libraries to the tensorflow's installation directory: |
| 106 | +```bash |
| 107 | +mkdir -p $tensorflow_root/lib |
| 108 | +cp -d bazel-bin/tensorflow/libtensorflow_cc.so* $tensorflow_root/lib/ |
| 109 | +cp -d bazel-bin/tensorflow/libtensorflow_framework.so* $tensorflow_root/lib/ |
| 110 | +cp -d $tensorflow_root/lib/libtensorflow_framework.so.1 $tensorflow_root/lib/libtensorflow_framework.so |
| 111 | +``` |
| 112 | +Then copy the headers: |
| 113 | +```bash |
| 114 | +mkdir -p $tensorflow_root/include/tensorflow |
| 115 | +cp -r bazel-genfiles/* $tensorflow_root/include/ |
| 116 | +cp -r tensorflow/cc $tensorflow_root/include/tensorflow |
| 117 | +cp -r tensorflow/core $tensorflow_root/include/tensorflow |
| 118 | +cp -r third_party $tensorflow_root/include |
| 119 | +cp -r bazel-tensorflow/external/eigen_archive/Eigen/ $tensorflow_root/include |
| 120 | +cp -r bazel-tensorflow/external/eigen_archive/unsupported/ $tensorflow_root/include |
| 121 | +rsync -avzh --include '*/' --include '*.h' --include '*.inc' --exclude '*' bazel-tensorflow/external/protobuf_archive/src/ $tensorflow_root/include/ |
| 122 | +rsync -avzh --include '*/' --include '*.h' --include '*.inc' --exclude '*' bazel-tensorflow/external/com_google_absl/absl/ $tensorflow_root/include/absl |
| 123 | +``` |
| 124 | +Now clean up the source files in the header directories: |
| 125 | +```bash |
| 126 | +cd $tensorflow_root/include |
| 127 | +find . -name "*.cc" -type f -delete |
| 128 | +``` |
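After copying, a quick check of the installation layout can catch a missed step. The sketch below assumes the layout produced by the copy commands above. `check_tf_install` is a hypothetical helper, and the list of paths is an illustrative selection, not an exhaustive one.

```bash
# Hypothetical helper: report paths that the copy steps should have created.
# $1 is the installation root (the value of $tensorflow_root).
check_tf_install() {
    root="$1"
    status=0
    for path in \
        lib/libtensorflow_cc.so \
        lib/libtensorflow_framework.so \
        include/tensorflow/core \
        include/tensorflow/cc \
        include/Eigen \
        include/unsupported \
        include/absl \
        include/google/protobuf
    do
        if [ ! -e "$root/$path" ]; then
            echo "missing: $root/$path" >&2
            status=1
        fi
    done
    return $status
}

# Run the check only if $tensorflow_root is set in this shell.
if [ -n "${tensorflow_root:-}" ]; then
    if check_tf_install "$tensorflow_root"; then
        echo "tensorflow installation layout looks complete"
    fi
fi
```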
| 129 | + |
| 130 | +# Troubleshooting |
| 131 | +```bash |
| 132 | +git: unknown command -C ... |
| 133 | +``` |
| 134 | +This is likely caused by an outdated git: older versions of git do not support the `-C` option. Upgrading git may help. |
| 135 | + |
| 136 | +```bash |
| 137 | +CMake Error: The following variables are used in this project, but they are set to NOTFOUND. |
| 138 | +Please set them or make sure they are set and tested correctly in the CMake files: |
| 139 | +FFTW_LIB (ADVANCED) |
| 140 | + linked by target "FFTW" in directory xxx |
| 141 | +``` |
| 142 | +As a workaround, when building the eigen package, you can remove FFTW from the cmake file. |
| 143 | + |
| 144 | +```bash |
| 145 | +fatal error: absl/numeric/int128_have_intrinsic.inc: No such file or directory |
| 146 | +``` |
| 147 | +As a workaround, create an empty file named `int128_have_intrinsic.inc` in the same directory as `int128.h`. |
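The workaround can be scripted. In this sketch, `add_missing_int128_inc` is a hypothetical helper that searches a header tree for `absl/numeric/int128.h` and creates the empty `.inc` file next to it if it does not exist yet. It assumes that the abseil headers were copied under `$tensorflow_root/include` as described in the installation steps.

```bash
# Hypothetical helper: create an empty int128_have_intrinsic.inc beside
# every absl/numeric/int128.h found under the directory given as $1.
add_missing_int128_inc() {
    find "$1" -type f -name int128.h -path '*absl/numeric/*' |
    while read -r header; do
        inc="$(dirname "$header")/int128_have_intrinsic.inc"
        # Do not touch the file if it already exists.
        [ -e "$inc" ] || touch "$inc"
    done
}

# Apply the workaround to the installed headers, if they are present.
if [ -n "${tensorflow_root:-}" ] && [ -d "$tensorflow_root/include" ]; then
    add_missing_int128_inc "$tensorflow_root/include"
fi
```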
| 148 | + |
| 149 | + |