Commit f174246

Merge pull request #202 from smathermather/hpc
2 parents dcdaed8 + 7306905


source/tutorials.rst

Lines changed: 90 additions & 154 deletions
@@ -295,54 +295,6 @@ Cleaning up after Docker
295295

296296
Docker has a lamentable use of space and by default does not clean up excess data and machines when processes are complete. This can be advantageous if we need to access a process that has since terminated, but carries the burden of using increasing amounts of storage over time. Maciej Łebkowski has an `excellent overview of how to manage excess disk usage in docker <https://lebkowski.name/docker-volumes/>`_.
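For example, the following commands (a minimal sketch, not taken from the linked article; review what will be deleted before confirming) reclaim space from stopped containers, dangling images and unused volumes:

.. code:: bash

    # Remove stopped containers, dangling images, unused networks and build cache
    docker system prune

    # Also remove unused volumes (more aggressive; check the prompt carefully)
    docker system prune --volumes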
297297

298-
*****************
299-
Using Singularity
300-
*****************
301-
302-
`Singularity <https://sylabs.io/>`__ is another container platform able to run Docker images.
303-
Singularity can be run both on local machines and in environments where the user does not have root access.
304-
Environments where a user may not have root privileges include HPC clusters and cloud cluster resources.
305-
A container is a single file without anything else to install.
306-
307-
Build Singularity image from Docker image
308-
=========================================
309-
Singularity can use a Docker image to build a SIF image.
310-
311-
For the latest ODM Docker image (recommended):
312-
313-
.. code:: bash
314-
315-
singularity build --disable-cache -f odm_latest.sif docker://opendronemap/odm:latest
316-
317-
For the latest ODM GPU Docker image:
318-
319-
.. code:: bash
320-
321-
singularity build --disable-cache -f odm_gpu.sif docker://opendronemap/odm:gpu
322-
323-
Using Singularity SIF image
324-
===========================
325-
326-
327-
Once you have used one of the above commands to download and create the `odm_latest.sif` image, it can be run using Singularity.
328-
Place your images in a directory named “images” (for example /my/project/images), then simply run:
329-
330-
.. code:: bash
331-
332-
singularity run --bind /my/project:/datasets/code odm_latest.sif --project-path /datasets
333-
334-
As with Docker, additional `Options and Flags <https://docs.opendronemap.org/arguments/>`_ can be added to the command:
335-
336-
.. code:: bash
337-
338-
singularity run --bind /my/project:/datasets/code \
339-
--writable-tmpfs odm_latest.sif \
340-
--orthophoto-png --mesh-octree-depth 12 --ignore-gsd --dtm \
341-
--smrf-threshold 0.4 --smrf-window 24 --dsm --pc-csv --pc-las --orthophoto-kmz \
342-
--matcher-type flann --feature-quality ultra --max-concurrency 16 \
343-
--use-hybrid-bundle-adjustment --build-overviews --time --min-num-features 10000 \
344-
--project-path /datasets
345-
346298
*************************************
347299
Using ODM from low-bandwidth location
348300
*************************************
@@ -869,176 +821,160 @@ For instance, point clouds properties can be modified to show elevation and also
869821

870822
`Learn to edit <https://github.com/opendronemap/docs#how-to-make-your-first-contribution>`_ and help improve `this page <https://github.com/OpenDroneMap/docs/blob/publish/source/tutorials.rst>`_!
871823

824+
*****************
825+
Using Singularity
826+
*****************
872827

873-
***************************************************
874-
ClusterODM, NodeODM, SLURM, with Singularity on HPC
875-
***************************************************
828+
`Singularity <https://sylabs.io/>`__ is another container platform able to run Docker images.
829+
Singularity can be run both on local machines and in environments where the user does not have root access.
830+
Environments where a user may not have root privileges include HPC clusters and cloud cluster resources.
831+
A container is a single file without anything else to install.
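Before building images, a quick sanity check (assuming Singularity, or its successor Apptainer, is already installed on the machine or cluster) is to print the runtime's version:

.. code:: bash

    singularity --version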
876832

877-
Let's say that we will keep the ClusterODM and NodeODM images in the same folder.
833+
Build Singularity image from Docker image
834+
=========================================
835+
Singularity can use a Docker image to build a SIF image.
878836

879-
Downloading and installing the images
880-
=====================================
837+
For the latest ODM Docker image (recommended):
881838

882-
In this example ClusterODM and NodeODM will be installed in $HOME/git
839+
.. code:: bash
883840
884-
ClusterODM
885-
----------
841+
singularity build --disable-cache -f odm_latest.sif docker://opendronemap/odm:latest
886842
887-
::
843+
For the latest ODM GPU Docker image:
888844

889-
cd $HOME/git
890-
git clone https://github.com/OpenDroneMap/ClusterODM
891-
cd ClusterODM
892-
singularity pull --force --disable-cache docker://opendronemap/clusterodm:latest
845+
.. code:: bash
846+
847+
singularity build --disable-cache -f odm_gpu.sif docker://opendronemap/odm:gpu
893848
894-
ClusterODM image needs to be "installed"
895-
::
849+
Using Singularity SIF image
850+
===========================
896851

897-
singularity shell --bind $PWD:/var/www clusterodm_latest.sif
898852

899-
And then in the Singularity shell
900-
::
853+
Once you have used one of the above commands to download and create the `odm_latest.sif` image, it can be run using Singularity.
854+
Place your images in a directory named “images” (for example /my/project/images), then simply run:
901855

902-
cd /var/www
903-
npm install --production
904-
exit
856+
.. code:: bash
905857
906-
NodeODM
907-
-------
858+
singularity run --bind /my/project:/datasets/code odm_latest.sif --project-path /datasets
908859
909-
::
860+
As with Docker, additional `Options and Flags <https://docs.opendronemap.org/arguments/>`_ can be added to the command:
910861

911-
cd $HOME/git
912-
git clone https://github.com/OpenDroneMap/NodeODM
913-
cd NodeODM
914-
singularity pull --force --disable-cache docker://opendronemap/nodeodm:latest
862+
.. code:: bash
915863
916-
NodeODM image needs to be "installed"
917-
::
864+
singularity run --bind /my/project:/datasets/code \
865+
--writable-tmpfs odm_latest.sif \
866+
--orthophoto-png --mesh-octree-depth 12 --ignore-gsd --dtm \
867+
--smrf-threshold 0.4 --smrf-window 24 --dsm --pc-csv --pc-las --orthophoto-kmz \
868+
--matcher-type flann --feature-quality ultra --max-concurrency 16 \
869+
--use-hybrid-bundle-adjustment --build-overviews --time --min-num-features 10000 \
870+
--project-path /datasets
918871
919-
singularity shell --bind $PWD:/var/www nodeodm_latest.sif
920872
921-
And then in the Singularity shell
922-
::
873+
***************************************************
874+
ClusterODM, NodeODM, SLURM, with Singularity on HPC
875+
***************************************************
923876

924-
cd /var/www
925-
npm install --production
926-
exit
927877

878+
If you are on an HPC, you can write a SLURM script to schedule and set up the available nodes with NodeODM so that ClusterODM can be wired to them. Using SLURM decreases the time and the number of steps needed to set up nodes for ClusterODM each time. This provides an easier way for users to use ODM on the HPC.
928879

880+
To set up the HPC with SLURM, you must first make sure SLURM is installed.
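A quick way to verify this (the exact module or path will depend on your cluster) is to print the version of the SLURM client tools:

::

    sinfo --version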
929881

882+
The SLURM script will differ from cluster to cluster, depending on which nodes your cluster has. However, the main idea is to run NodeODM once on each node; by default, each NodeODM will run on port 3000. Apptainer takes available ports starting from port 3000, so if a node's port 3000 is open, NodeODM will run there by default. After that, we run ClusterODM on the head node and connect the running NodeODMs to it. With that, we have a functional ClusterODM running on the HPC.
930883

931-
Launching
932-
=========
933-
In two different terminals connected to the HPC, or with tmux (or screen...), a SLURM script will start the NodeODM instances.
934-
Then ClusterODM can be started.
884+
Here is an example of a SLURM script assigning nodes 48, 50 and 51 to run NodeODM. You can freely adapt it to your system:
935885

936-
NodeODM
937-
-------
938-
Create a nodeodm.slurm script in $HOME/git/NodeODM with
939886
::
940887

941-
#!/usr/bin/bash
942-
#source .bashrc
943-
888+
#!/usr/bin/bash
889+
#source .bashrc
890+
#SBATCH --partition=8core
891+
#SBATCH --nodelist=node[48,50,51]
892+
#SBATCH --time 20:00:00
944893

945-
#SBATCH -J NodeODM
946-
#SBATCH --partition=ncpulong,ncpu
947-
#SBATCH --nodes=2
948-
#SBATCH --mem=10G
949-
#SBATCH --output logs_nodeodm-%j.out
894+
cd $HOME
895+
cd ODM/NodeODM/
950896

951-
cd $HOME/git/NodeODM
897+
#Launch on Node 48
898+
srun --nodes=1 apptainer run --writable node/ &
952899

953-
#Launched on first node
954-
srun --nodes=1 singularity run --bind $PWD:/var/www nodeodm_latest.sif &
900+
#Launch on node 50
901+
srun --nodes=1 apptainer run --writable node/ &
955902

956-
#Launch on second node
903+
#Launch on node 51
904+
srun --nodes=1 apptainer run --writable node/ &
905+
wait
957906

958-
srun --nodes=1 singularity run --bind $PWD:/var/www nodeodm_latest.sif &
959907

960-
wait
908+
You can check for available nodes using sinfo:
961909

962-
Start this script with
963910
::
964911

965-
sbatch $HOME/git/NodeODM/nodeodm.slurm
912+
sinfo
966913

967-
Logs of this script are written to $HOME/git/NodeODM/logs_nodeodm-XXX.out, where XXX is the SLURM job number.
914+
Run the following command to schedule the job using the SLURM script:
968915

916+
::
969917

918+
sbatch sample.slurm
970919

971-
ClusterODM
972-
----------
973-
Then you can start ClusterODM on the head node with
920+
921+
You can also check for currently running jobs using squeue:
974922

975923
::
976924

977-
cd $HOME/git/ClusterODM
978-
singularity run --bind $PWD:/var/www clusterodm_latest.sif
925+
squeue -u $USER
926+
927+
928+
Unfortunately, SLURM does not handle assigning jobs to the head node. Hence, if we want to run ClusterODM on the head node, we have to run it locally. After that, you can connect to the CLI and wire the NodeODMs to ClusterODM. Here is an example following the sample SLURM script:
979929

980-
Connecting Nodes to ClusterODM
981-
==============================
982-
Use the following command to get the node names where NodeODM is running
983930
::
984931

985-
squeue -u $USER
932+
telnet localhost 8080
933+
> NODE ADD node48 3000
934+
> NODE ADD node50 3000
935+
> NODE ADD node51 3000
936+
> NODE LIST
986937

987-
For example: squeue -u $USER
988-
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
989-
1829323 ncpu NodeODM bonaime R 24:19 2 ncpu[015-016]
990938

991-
In this case, NodeODM runs on ncpu015 and ncpu016.
939+
If ClusterODM is not wired correctly, always check which ports NodeODM is actually running on.
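One quick, hypothetical check, reusing the node names from the sample script above, is to query a NodeODM instance's info endpoint from the head node:

::

    curl http://node48:3000/info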
940+
941+
It is also possible to pre-populate nodes using JSON. If starting ClusterODM from Apptainer or Docker, the relevant JSON file is available at `docker/data/nodes.json`. Its contents might look similar to the following:
992942

993-
Web interface
994-
-------------
995-
ClusterODM's administrative web interface can be used to wire NodeODMs to the ClusterODM.
996-
Open another shell window on your local machine and tunnel to the HPC using the following command:
997943
::
998944

999-
ssh -L localhost:10000:localhost:10000 yourusername@hpc-address
1000-
Replace yourusername and hpc-address with your username and the HPC address.
945+
[
946+
{"hostname":"node48","port":"3000","token":""},
947+
{"hostname":"node50","port":"3000","token":""},
948+
{"hostname":"node51","port":"3000","token":""}
949+
]
1001950

1002-
This command tunnels the HPC's port to your local port.
1003-
After this, open a browser on your local machine and connect to http://localhost:10000.
1004-
Port 10000 is where ClusterODM's administrative web interface is hosted.
1005-
Then NodeODMs can be added to or deleted from ClusterODM.
1006-
This is what it looks like:
1007951

1008-
.. figure:: images/clusterodm-admin-interface.png
1009-
:alt: Clusterodm admin interface
1010-
:align: center
952+
After hosting ClusterODM on the head node and wiring it to the NodeODMs, you can try tunneling to see if ClusterODM works as expected. Open another shell window on your local machine and tunnel to the HPC using the following command:
1011953

954+
::
1012955

956+
ssh -L localhost:10000:localhost:10000 user@hostname
1013957

1014-
telnet
1015-
------
1016-
You can connect to the ClusterODM CLI and wire the NodeODMs. For the previous example:
1017958

1018-
telnet localhost 8080
1019-
> NODE ADD ncpu015 3000
1020-
> NODE ADD ncpu016 3000
1021-
> NODE LIST
959+
Replace user and hostname with your username and the HPC address. This command tunnels the HPC's port to your local port. After this, open a browser on your local machine and connect to `http://localhost:10000`. Port 10000 is where ClusterODM's administrative web interface is hosted. This is what it looks like:
1022960

961+
.. figure:: https://user-images.githubusercontent.com/70782465/214938402-707bee90-ea17-4573-82f8-74096d9caf03.png
962+
:alt: Screenshot of ClusterODM's administrative web interface
963+
:align: center
1023964

1024965

966+
Here you can check the NodeODMs' status and even add or delete working nodes.
1025967

1026-
Using ClusterODM and its NodeODMs
1027-
=================================
968+
After that, tunnel port 3000 of the HPC to your local machine:
1028969

1029-
Open another shell window on your local machine and tunnel to the HPC using the following command:
1030970
::
1031971

1032-
ssh -L localhost:10000:localhost:10000 yourusername@hpc-address
1033-
Replace yourusername and hpc-address with your username and the HPC address.
972+
ssh -L localhost:3000:localhost:3000 user@hostname
1034973

1035-
After this, open a browser on your local machine and connect to http://localhost:3000.
1036-
Here, you can Assign Tasks and observe their progress.
974+
Port 3000 is ClusterODM's proxy. This is where we assign tasks to ClusterODM. Once again, connect to `http://localhost:3000` with your browser after tunneling. Here, you can Assign Tasks and observe their progress.
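If the page does not load, a quick optional check (assuming the tunnel above is still open) is to query the proxy's NodeODM-compatible info endpoint from your local machine:

::

    curl http://localhost:3000/info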
1037975

1038-
.. figure:: images/clusterodm-user-interface.png
1039-
:alt: Clusterodm user interface
976+
.. figure:: https://user-images.githubusercontent.com/70782465/214938234-113f99dc-f69e-4e78-a782-deaf94e986b0.png
977+
:alt: Screenshot of ClusterODM's jobs interface
1040978
:align: center
1041979

1042-
1043-
1044980
After adding images in this browser, you can press Start Task and see ClusterODM assigning tasks to the nodes you have wired to. Go for a walk and check the progress.
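As an alternative sketch to the web interface (reusing the odm_latest.sif image and the /my/project paths from the Singularity section above, and assuming the port 3000 tunnel is open), ODM's split-merge flags can point a local run at the ClusterODM proxy; adjust the split size to your dataset:

.. code:: bash

    singularity run --bind /my/project:/datasets/code odm_latest.sif \
        --project-path /datasets \
        --split 400 --split-overlap 100 \
        --sm-cluster http://localhost:3000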
