Storage: local drives
=====================

Local disks on computing nodes are the preferred place for doing your
IO. The general idea is to use network storage as a backend and local
disk for actual data processing. **Some nodes have no disks** (local
storage comes out of the job memory), **some older nodes have HDDs**
(spinning disks), and some have **SSDs**.

A general use pattern (sketched below):

- In the beginning of the job, copy the needed input from WRKDIR to ``/tmp``.
- Run your calculation normally, reading input from and writing output
  to ``/tmp``.
- In the end, copy the relevant output back to WRKDIR for analysis and
  further usage.
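
A condensed sketch of this pattern (the data paths and ``my_program``
are placeholders; complete job scripts are in the Examples section
below):

.. code-block:: console

   $ cp $WRKDIR/mydata/input.dat /tmp/                # 1. input to the local disk
   $ my_program /tmp/input.dat -o /tmp/output.dat     # 2. compute, reading and writing /tmp
   $ cp /tmp/output.dat $WRKDIR/mydata/               # 3. results back to WRKDIR
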

Pros

- You get better and steadier IO performance. WRKDIR is shared among all
  users, making per-user performance actually rather poor.
- You save WRKDIR performance for those who cannot use local disks.
- You get much better performance when using many small files or doing
  random access (Lustre works poorly here).
- It saves your quota if your code generates lots of data but in the end
  you need only part of it.
- In general, it is an excellent choice for single-node runs (that is,
  all the job's tasks run on the same node).

Cons

- NOT for long-term data. It is cleaned every time your job finishes.
- Space is more limited (but can still be TBs on some nodes).
- Needs some awareness of what is on each node, since the nodes differ.
- Small learning curve (you must copy files before and after the job).
- Not feasible for cross-node IO (MPI jobs where different tasks
  write to the same files). Use WRKDIR instead.


How to use local drives on compute nodes
----------------------------------------

``/tmp`` is the temporary directory. It is per-user (not per-job): if
you get two jobs running on the same node, you get the same ``/tmp``.
It is automatically removed once the last job on the node finishes.
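
Because all of your jobs on a node share that one ``/tmp``, a common
convention (used in the examples below) is a per-job subdirectory; a
small sketch:

.. code-block:: console

   (node)$ df -h /tmp                 # check how much local space is currently free
   (node)$ mkdir /tmp/$SLURM_JOB_ID   # per-job subdirectory, so jobs do not mix their files
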


Nodes with local disks
~~~~~~~~~~~~~~~~~~~~~~

You can see the nodes with local disks on :doc:`../overview`. (To
double-check from within the cluster, you can verify node info with
``scontrol show node NODENAME``, or see the ``localdisk`` tag in
``slurm features``.) Disk sizes vary greatly, from hundreds of GB to
tens of TB.
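
For example, a quick way to check (``csl23`` is a placeholder node
name, and the exact fields depend on the Slurm configuration):

.. code-block:: console

   $ sinfo -N -o "%N %f %d"                                   # node, features, TMP_DISK size (MB)
   $ scontrol show node csl23 | grep -E 'Features|TmpDisk'    # details for one node
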

You have to use ``--constraint=localdisk`` to ensure that you get a
node with a local disk. You can use ``--tmp=NNNG`` (for example
``--tmp=100G``) to request a node with at least that much temporary
space. But ``--tmp`` doesn't allocate this space just for you: it's
shared among all users, including those who didn't request storage
space. So you *might* not have as much as you think. Beware, and handle
running out of space gracefully.
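
A minimal sketch of the relevant directives (the sizes are placeholders
and, as noted above, ``--tmp`` refers to the node's total temporary
space, not a per-job reservation):

.. code-block:: slurm

   #SBATCH --constraint=localdisk   # only nodes with a physical local disk
   #SBATCH --tmp=100G               # only nodes with at least 100 GB of temporary space
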


Nodes without local disks
~~~~~~~~~~~~~~~~~~~~~~~~~

You can still use ``/tmp``, but it is an in-memory ramdisk. This
means it is *very* fast, but it uses the same main memory as your
programs. It comes out of your job's memory allocation, so use a
``--mem`` amount with enough room for both your job and its temporary
storage.
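
On such a node the memory request has to cover both; a sketch with
hypothetical numbers:

.. code-block:: slurm

   # e.g. ~6 GB for the program itself plus ~4 GB of files kept in /tmp
   #SBATCH --mem=10G
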


Examples
--------

Interactively
~~~~~~~~~~~~~

How to use ``/tmp`` when you log in interactively:

.. code-block:: console

   $ sinteractive --time=1:00:00       # request a node for one hour
   (node)$ mkdir /tmp/$SLURM_JOB_ID    # create a unique directory, here we use


In batch script
~~~~~~~~~~~~~~~

This batch job example prevents data loss in case the program gets
terminated (either because of ``scancel`` or due to the time limit).

.. code-block:: slurm

   #!/bin/bash

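   # NOTE: a hedged sketch of the body of such a script; the directive values,
   # directory names and my_program are placeholders -- adapt them to your job.
   #SBATCH --time=01:00:00
   #SBATCH --mem=2G
   #SBATCH --constraint=localdisk

   mkdir /tmp/$SLURM_JOB_ID                        # unique directory for this job
   cd /tmp/$SLURM_JOB_ID
   cp $WRKDIR/SOMEDIR/input .                      # bring the input to the local disk

   # if the job is terminated early (scancel or time limit), Slurm normally
   # delivers SIGTERM first, so copy whatever output already exists back:
   trap "mv /tmp/$SLURM_JOB_ID/output $WRKDIR/SOMEDIR; exit 1" TERM

   srun my_program input > output                  # placeholder program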

   mv /tmp/$SLURM_JOB_ID/output $WRKDIR/SOMEDIR    # move your output fully or partially


Batch script for thousands of input/output files
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


Sample code is below:

.. code-block:: slurm

   #!/bin/bash

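   # NOTE: a hedged sketch of the tar-ball approach for many small files;
   # all names below are placeholders -- adapt them to your data and program.
   #SBATCH --time=01:00:00
   #SBATCH --constraint=localdisk

   mkdir /tmp/$SLURM_JOB_ID && cd /tmp/$SLURM_JOB_ID
   tar xf $WRKDIR/SOMEDIR/input_files.tar      # unpack the many input files onto the local disk
   srun my_program                             # placeholder: reads and writes the local files
   tar cf output_files.tar output_dir          # pack the results into a single tar ball
   mv output_files.tar $WRKDIR/SOMEDIR/        # move one big file back to WRKDIR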