Downloading and installing the images
=====================================

Let's say the ClusterODM and NodeODM images will be downloaded and installed in the same folder.

You can write a SLURM script to schedule and set up the available nodes with NodeODM so that ClusterODM can be wired to them on the HPC. Using SLURM reduces the time and the number of steps needed to set up the nodes for ClusterODM each time, and makes it easier for users to run ODM on the HPC.

In this example, ClusterODM and NodeODM will be installed in $HOME/git.
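
The download step itself is not shown above; as a minimal sketch, assuming the NodeODM and ClusterODM repositories on GitHub and the opendronemap images on Docker Hub are used, and that Apptainer/Singularity can pull them directly, the ``.sif`` files referenced later could be produced like this:

::

    # Sketch only: clone the projects and pull the container images next to them.
    # The resulting nodeodm_latest.sif and clusterodm_latest.sif file names match
    # the ones used by the SLURM script and the ClusterODM launch command below.
    cd $HOME/git
    git clone https://github.com/OpenDroneMap/NodeODM
    cd NodeODM && singularity pull docker://opendronemap/nodeodm:latest && cd ..
    git clone https://github.com/OpenDroneMap/ClusterODM
    cd ClusterODM && singularity pull docker://opendronemap/clusterodm:latest && cd ..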

To set up the HPC with SLURM, you must first make sure SLURM is installed on the cluster.
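
A quick way to confirm that SLURM is available is to query the scheduler from a login node; both commands below are standard SLURM client tools and simply print a version number:

::

    # Both should print a SLURM version if the scheduler is installed.
    sinfo --version
    sbatch --version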

The SLURM script will differ from cluster to cluster, depending on which nodes your cluster has. However, the main idea is to run NodeODM once on each node; by default, each NodeODM instance runs on port 3000. Apptainer takes available ports starting from port 3000, so if a node's port 3000 is open, NodeODM will run on that node by default. After that, we run ClusterODM on the head node and connect the running NodeODM instances to it. With that, we have a functional ClusterODM running on the HPC.

Launching
=========

On two different terminals connected to the HPC, or in a tmux (or screen) session, a SLURM script is used to schedule and start the NodeODM instances. Then ClusterODM can be started.

NodeODM
-------

Create a nodeodm.slurm script in $HOME/git/NodeODM with

::

    #!/usr/bin/bash
    #source .bashrc

    #SBATCH -J NodeODM
    #SBATCH --partition=ncpulong,ncpu
    #SBATCH --nodes=2
    #SBATCH --mem=10G
    #SBATCH --output logs_nodeodm-%j.out

    cd $HOME/git/NodeODM

    # Launch on the first node
    srun --nodes=1 singularity run --bind $PWD:/var/www nodeodm_latest.sif &

    # Launch on the second node
    srun --nodes=1 singularity run --bind $PWD:/var/www nodeodm_latest.sif &

    # Wait for both background NodeODM instances
    wait

Start this script with

::

    sbatch $HOME/git/NodeODM/nodeodm.slurm

You can also check the currently running jobs with squeue:

::

    squeue -u $USER

Logs of this script are written to $HOME/git/NodeODM/logs_nodeodm-XXX.out, where XXX is the SLURM job number.
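
For example, to follow a job's log live (1829323 is only a placeholder job number, the same one that appears in the squeue output further below):

::

    # Stream the NodeODM job log as it is written.
    tail -f $HOME/git/NodeODM/logs_nodeodm-1829323.out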

ClusterODM
----------

Unfortunately, SLURM does not handle assigning jobs to the head node, so ClusterODM has to be started on the head node manually. Once it is running, you can connect to its CLI and wire the running NodeODM instances to it (see the next section). Start ClusterODM on the head node with

::

    cd $HOME/git/ClusterODM
    singularity run --bind $PWD:/var/www clusterodm_latest.sif
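
Because this command stays in the foreground, the head-node session has to remain open. A possible workaround (a sketch, not part of the original instructions) is to run it inside tmux/screen or to background it with nohup; the log file name below is only illustrative:

::

    # Keep ClusterODM alive after the shell closes.
    cd $HOME/git/ClusterODM
    nohup singularity run --bind $PWD:/var/www clusterodm_latest.sif > clusterodm.log 2>&1 &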

Connecting Nodes to ClusterODM
==============================

With ClusterODM running on the head node, connect to its CLI (telnet on port 8080) and add the nodes on which NodeODM is running. For example, if NodeODM is running on nodes named node48, node50 and node51:

::

    telnet localhost 8080
    > NODE ADD node48 3000
    > NODE ADD node50 3000
    > NODE ADD node51 3000
    > NODE LIST

Use squeue to get the names of the nodes where NodeODM is running on your own cluster:

::

    squeue -u $USER

      JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
    1829323      ncpu  NodeODM  bonaime  R      24:19      2 ncpu[015-016]

In this case, NodeODM runs on ncpu015 and ncpu016. If ClusterODM is not wired correctly, always check which ports the NodeODM instances are actually running on.
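
One way to see which port a NodeODM instance actually took is to inspect the listening sockets on that compute node; a sketch, assuming ``ss`` is available there and reusing the node name from the example above:

::

    # List listening TCP ports on ncpu015 and look for NodeODM around port 3000.
    srun --nodes=1 --nodelist=ncpu015 ss -tln | grep 300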

It is also possible to pre-populate nodes using JSON. If starting ClusterODM from apptainer or docker, the relevant file is ``docker/data/nodes.json``. Its contents might look similar to the following:

::

    [
        {"hostname":"node48","port":"3000","token":""},
        {"hostname":"node50","port":"3000","token":""},
        {"hostname":"node51","port":"3000","token":""}
    ]

Web interface
-------------

ClusterODM's administrative web interface can also be used to wire NodeODM instances to ClusterODM. After ClusterODM is hosted on the head node and wired to the NodeODM instances, you can tunnel to the HPC from your local machine to check that everything works as expected. Open another shell window on your local machine and create the tunnel with an SSH command like the sketch below, replacing the username and the HPC address with your own.
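
A minimal tunnel sketch, assuming the administrative interface listens on port 10000 on the head node; ``user`` and ``hostname`` below are placeholders for your HPC account and address:

::

    # Forward local port 10000 to port 10000 on the HPC head node.
    ssh -N -L 10000:localhost:10000 user@hostname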
-
Basically, this command will tunnel the port of the hpc to your local port.
1003
-
After this, open a browser in your local machine and connect to http://localhost:10000.
1004
-
Port 10000 is where ClusterODM's administrative web interface is hosted at.
1005
-
Then NodeODMs could be add/deleted to ClusterODM
1006
-
This is what it looks like :
1007
951
1008
-
.. figure:: images/clusterodm-admin-interface.png
1009
-
:alt:Clusterodm admin interface
1010
-
:align:center
952
+

The wiring can also be checked or changed from the ClusterODM CLI at any time. For the previous squeue example:

::

    telnet localhost 8080
    > NODE ADD ncpu015 3000
    > NODE ADD ncpu016 3000
    > NODE LIST
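
If you wire the same nodes every time, the CLI commands can also be fed in non-interactively. This is only a rough sketch and assumes the ClusterODM CLI accepts commands piped over the telnet connection:

::

    # Hypothetical scripted wiring; adjust node names and the delay to your cluster.
    {
      echo "NODE ADD ncpu015 3000"
      echo "NODE ADD ncpu016 3000"
      echo "NODE LIST"
      sleep 2
    } | telnet localhost 8080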

Port 3000 is ClusterODM's proxy; this is where tasks are assigned to ClusterODM. Once again, connect to http://localhost:3000 with your browser after tunneling that port as well (a sketch follows). Here, you can Assign Tasks and observe the tasks' progress.
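
A tunneling sketch for the proxy port, under the same assumptions as before (``user`` and ``hostname`` are placeholders; both ports can be forwarded in a single command):

::

    # Forward ClusterODM's proxy (3000) together with the admin interface (10000).
    ssh -N -L 3000:localhost:3000 -L 10000:localhost:10000 user@hostname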

After adding images in this browser, you can press Start Task and watch ClusterODM assign the task to the nodes you have wired up. Go for a walk and check the progress.