Installation on Google Cloud

saganatt edited this page Jul 8, 2021 · 2 revisions

Image: Deep Learning Image: TensorFlow 2.4, m66 CUDA 110

Install from scratch

  1. On first launch, you are asked whether to install the Nvidia drivers; choose 'yes'. If the installation prints errors, wait a while, then try:
sudo /opt/deeplearning/install-driver.sh
  1. (Optional?) Stop the Jupyter notebook service (powered by conda): sudo service jupyter stop
  2. Install pip packages: pip install setuptools wheel virtualenv
  3. Install the aliBuild prerequisites for Ubuntu: link. Note: get default-libmysqlclient-dev instead of libmysqlclient-dev.
  4. If you see ldconfig errors at the end, the CUDA libraries are not symlinked correctly. Run:
sudo ln -sf /usr/local/cuda/lib64/libcudnn.so.8.0.5 /usr/local/cuda/lib64/libcudnn.so.8
sudo ln -sf /usr/local/cuda/lib64/libcudnn.so.8 /usr/local/cuda/lib64/libcudnn.so

Do the same for: libcudnn_{adv,cnn,ops}_{infer,train}.so, /lib/libnvinfer.so, /lib/libnvinfer_plugin.so, /lib/libnvonnxparser.so, /lib/libmyelin.so, /lib/libnvparsers.so.

You can check whether the files are properly symlinked with ls -lha /usr/local/cuda/lib64/libcudnn*. Running ldconfig -v prints the directories and libraries it scans, so any remaining symlink problems show up there.
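The two ln commands above can be wrapped in a small helper and repeated for the other cuDNN libraries. This is only a minimal sketch: link_chain is a hypothetical helper name, and it assumes that every libcudnn_* library carries the same 8.0.5 version suffix as libcudnn, which this page does not confirm. The /lib/libnv* libraries have their own version numbers, which are not listed here, so they are left out.

```shell
#!/usr/bin/env bash
# Sketch: recreate the chain  name.so -> name.so.MAJOR -> name.so.FULL
# link_chain is a hypothetical helper, not part of CUDA or the image.
link_chain() {
    local dir="$1" name="$2" full="$3"
    local major="${full%%.*}"          # "8.0.5" -> "8"
    if [ ! -e "$dir/$name.so.$full" ]; then
        echo "skip: $dir/$name.so.$full not found" >&2
        return 1
    fi
    ln -sf "$dir/$name.so.$full" "$dir/$name.so.$major"
    ln -sf "$dir/$name.so.$major" "$dir/$name.so"
}

# Possible usage, run from a root shell because /usr/local/cuda is root-owned.
# The version 8.0.5 is an assumption for everything except libcudnn itself:
#   for lib in libcudnn libcudnn_{adv,cnn,ops}_{infer,train}; do
#       link_chain /usr/local/cuda/lib64 "$lib" 8.0.5
#   done
```

Libraries whose versioned file is missing are skipped with a message instead of leaving a dangling symlink, so the loop is safe to rerun.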

  1. Additional dependencies: sudo apt-get install libssl-dev libpython3.7 tcl environment-modules
  2. Install aliBuild: sudo pip install alibuild --upgrade. Add the following lines to your ~/.bashrc:
export ALIBUILD_WORK_DIR="/home/jupyter/alice/sw"
eval "`alienv shell-helper`"
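Because the same two lines are mentioned again in the analysis section, it may be convenient to add them in a way that does not create duplicates. The following is a minimal sketch; append_once is a hypothetical helper name, not something shipped with aliBuild.

```shell
#!/usr/bin/env bash
# Sketch: append a line to a file only if that exact line is not there yet.
# append_once is a hypothetical helper name.
append_once() {
    local file="$1" line="$2"
    touch "$file"
    # -x: match whole lines, -F: treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Possible usage with the two lines from this page:
#   append_once ~/.bashrc 'export ALIBUILD_WORK_DIR="/home/jupyter/alice/sw"'
#   append_once ~/.bashrc 'eval "`alienv shell-helper`"'
```

Single quotes keep the backticks literal, so alienv is evaluated when ~/.bashrc is sourced and not when the line is written.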

NOTE: You might prefer to install under /home/jupyter, where the additional large disk is mounted. In that case, it is better to move all Jupyter files to a separate directory, e.g., /home/jupyter/jupyter:

sudo mkdir /home/jupyter/jupyter
sudo chown -R jupyter:jupyter /home/jupyter/jupyter

Then, edit the jupyter paths in /home/jupyter/jupyter/.jupyter/jupyter_notebook_config.py.

  1. Install ROOT and AliPhysics:
screen # Keeps the build running if the connection drops
sudo mkdir -p /home/jupyter/alice
sudo chown -R <user>:<user> /home/jupyter/alice/ # Replace with your username
cd /home/jupyter/alice
aliBuild init AliPhysics@master
aliBuild build AliPhysics --defaults user-next-root6 --force-unknown-architecture

Copy the data

Edit ~/.ssh/config to add a tunnel to aliceml.

NOTE: Create bias / nobias subdirectories and copy only the 90-17-17 or 180-33-33 data. The 180-x-x files take a very long time to transfer!

sudo mkdir /home/jupyter/data
sudo chown -R <user>:<user> /home/jupyter/data
cd /home/jupyter/data
mkdir bias; mkdir nobias
rsync -vaP <user>@aliceml:<path_to_data> <target_path>

Run the analysis

Currently, the installation (the steps above) is ready for use on instance 4.

  1. If you have not done so during the installation, run pip install setuptools wheel virtualenv and add the following lines to your ~/.bashrc:
export ALIBUILD_WORK_DIR="/home/jupyter/alice/sw"
eval "`alienv shell-helper`"
  1. Clone TPCwithDNN into the home directory: git clone https://github.com/AliceO2Group/TPCwithDNN.git
  2. Modify load.sh:
    • check python path: which python and correct the path on line 12
    • change python version (to 3.7) on line 13
    • correct ALICE_ROOT path on line 92
  3. Install the package:
alienv enter AliPhysics/latest
source load.sh
pip install -e .
  1. Install the remaining Python packages: pip install root_pandas root_numpy
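Before editing load.sh, it can help to print the three lines mentioned above next to the python path you intend to use. This is a minimal sketch; show_load_lines is a hypothetical helper name, and the line numbers 12, 13 and 92 are taken from this page, so they may have shifted in newer versions of the repository.

```shell
#!/usr/bin/env bash
# Sketch: print lines 12, 13 and 92 of load.sh plus the python found on PATH.
# show_load_lines is a hypothetical helper; the line numbers come from this page.
show_load_lines() {
    local file="${1:-load.sh}"
    sed -n -e '12p' -e '13p' -e '92p' "$file"
    echo "python on PATH: $(command -v python || echo 'not found')"
}

# Possible usage from inside the TPCwithDNN checkout:
#   show_load_lines load.sh
```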

Check results via X11 forward

sudo apt-get install qpdfview eog

To get X11 forwarding, you need to install the gcloud command-line tool (Google Cloud SDK) on your local machine. During the installation you can specify your default project and zone.

curl -O https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-sdk-333.0.0-linux-x86_64.tar.gz
tar -xvf google-cloud-sdk-333.0.0-linux-x86_64.tar.gz
cd google-cloud-sdk
./install.sh
source ~/.bashrc

Then ssh with gcloud:

gcloud init
gcloud compute ssh --ssh-flag="-Y" --zone <zone> --project <project_name> <username>@<cloud_instance_name>
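Once logged in, a quick way to see whether the -Y flag took effect is to look at the DISPLAY variable, which ssh sets on the remote side when X11 forwarding is active. The following is a minimal sketch; check_x11 is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: report whether DISPLAY is set in the current (remote) shell.
# check_x11 is a hypothetical helper name.
check_x11() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "X11 forwarding looks active: DISPLAY=$DISPLAY"
    else
        echo "DISPLAY is not set; X11 forwarding is probably not active" >&2
        return 1
    fi
}

# Possible usage on the cloud instance, before opening a viewer:
#   check_x11 && eog <some_plot.png>
```

A set DISPLAY does not guarantee that a window will open, but an empty one is a reliable sign that the session was started without forwarding.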

If some commands complain about the locale, set it with sudo dpkg-reconfigure locales and select the matching locale with Space (not Enter!).

Shut down when execution has finished

Add sudo shutdown -h now at the end of your run *.sh script.

NOTE: Piping with ... | shutdown -h now causes an immediate shutdown, because both sides of a pipe are started at the same time!
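One way to make sure the shutdown only happens after the analysis has ended is to run the two commands one after the other. This is a minimal sketch; run_then_shutdown and the FINISH_CMD variable are hypothetical names, with FINISH_CMD added only so the sketch can be tried without powering off the machine.

```shell
#!/usr/bin/env bash
# Sketch: run a command to completion, then shut down.
# run_then_shutdown and FINISH_CMD are hypothetical names; FINISH_CMD
# replaces the real shutdown command when you want a dry run.
run_then_shutdown() {
    local status=0
    "$@" || status=$?                      # wait for the analysis to end
    echo "analysis exited with status $status"
    ${FINISH_CMD:-sudo shutdown -h now}    # runs only after the line above
    return "$status"
}

# Possible usage (dry run first, then for real):
#   FINISH_CMD="echo would shut down now" run_then_shutdown ./<your_run_script>.sh
#   run_then_shutdown ./<your_run_script>.sh
```

The shutdown is issued even if the analysis fails, so a crashed run does not leave the instance running.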