Storage: local drives
=====================

Local disks on computing nodes are the preferred place for doing your
IO. The general idea is to use network storage as a backend and local
disk for actual data processing. **Some nodes have no disks** (local
storage comes out of the job memory), **some older nodes have HDDs**
(spinning disks), and some have **SSDs**.

A general use pattern (sketched below):

- In the beginning of the job, copy the needed input from WRKDIR to ``/tmp``.
- Run your calculation normally, reading input from and writing output
  to ``/tmp``.
- In the end, copy the relevant output back to WRKDIR for analysis and
  further usage.
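
A condensed sketch of this pattern (the data paths and ``my_program``
are placeholders; complete job scripts are in the Examples section
below):

.. code-block:: console

   $ cp $WRKDIR/mydata/input.dat /tmp/                # 1. input to the local disk
   $ my_program /tmp/input.dat -o /tmp/output.dat     # 2. compute, reading and writing /tmp
   $ cp /tmp/output.dat $WRKDIR/mydata/               # 3. results back to WRKDIR
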

Pros

- You get better and steadier IO performance. WRKDIR is shared among all
  users, making per-user performance actually rather poor.
- You save WRKDIR performance for those who cannot use local disks.
- You get much better performance when using many small files or doing
  random access (Lustre works poorly here).
- It saves your quota if your code generates lots of data but in the end
  you need only part of it.
- In general, it is an excellent choice for single-node runs (that is,
  all the job's tasks run on the same node).

Cons

- NOT for long-term data. It is cleaned every time your job finishes.
- Space is more limited (but can still be TBs on some nodes).
- Needs some awareness of what is on each node, since the nodes differ.
- Small learning curve (you must copy files before and after the job).
- Not feasible for cross-node IO (MPI jobs where different tasks
  write to the same files). Use WRKDIR instead.


How to use local drives on compute nodes
----------------------------------------

``/tmp`` is the temporary directory. It is per-user (not per-job): if
you get two jobs running on the same node, you get the same ``/tmp``.
It is automatically removed once the last job on the node finishes.
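
Because all of your jobs on a node share that one ``/tmp``, a common
convention (used in the examples below) is a per-job subdirectory; a
small sketch:

.. code-block:: console

   (node)$ df -h /tmp                 # check how much local space is currently free
   (node)$ mkdir /tmp/$SLURM_JOB_ID   # per-job subdirectory, so jobs do not mix their files
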


Nodes with local disks
~~~~~~~~~~~~~~~~~~~~~~

You can see the nodes with local disks on :doc:`../overview`. (To
double-check from within the cluster, you can verify node info with
``scontrol show node NODENAME``, or see the ``localdisk`` tag in
``slurm features``.) Disk sizes vary greatly, from hundreds of GB to
tens of TB.
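
For example, a quick way to check (``csl23`` is a placeholder node
name, and the exact fields depend on the Slurm configuration):

.. code-block:: console

   $ sinfo -N -o "%N %f %d"                                   # node, features, TMP_DISK size (MB)
   $ scontrol show node csl23 | grep -E 'Features|TmpDisk'    # details for one node
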

You have to use ``--constraint=localdisk`` to ensure that you get a
node with a local disk. You can use ``--tmp=NNNG`` (for example
``--tmp=100G``) to request a node with at least that much temporary
space. But ``--tmp`` doesn't allocate this space just for you: it's
shared among all users, including those who didn't request storage
space. So you *might* not have as much as you think. Beware, and handle
running out of space gracefully.
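
A minimal sketch of the relevant directives (the sizes are placeholders
and, as noted above, ``--tmp`` refers to the node's total temporary
space, not a per-job reservation):

.. code-block:: slurm

   #SBATCH --constraint=localdisk   # only nodes with a physical local disk
   #SBATCH --tmp=100G               # only nodes with at least 100 GB of temporary space
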


Nodes without local disks
~~~~~~~~~~~~~~~~~~~~~~~~~

You can still use ``/tmp``, but it is an in-memory ramdisk. This
means it is *very* fast, but it uses the same main memory as your
programs. It comes out of your job's memory allocation, so use a
``--mem`` amount with enough room for both your job and its temporary
storage.
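
On such a node the memory request has to cover both; a sketch with
hypothetical numbers:

.. code-block:: slurm

   # e.g. ~6 GB for the program itself plus ~4 GB of files kept in /tmp
   #SBATCH --mem=10G
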


Examples
--------

Interactively
~~~~~~~~~~~~~

How to use ``/tmp`` when you log in interactively:

.. code-block:: console

   $ sinteractive --time=1:00:00       # request a node for one hour
   (node)$ mkdir /tmp/$SLURM_JOB_ID    # create a unique directory, here we use


In batch script
~~~~~~~~~~~~~~~

This batch job example prevents data loss in case the program gets
terminated (either because of ``scancel`` or due to the time limit).

.. code-block:: slurm

   #!/bin/bash

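   # NOTE: a hedged sketch of the body of such a script; the directive values,
   # directory names and my_program are placeholders -- adapt them to your job.
   #SBATCH --time=01:00:00
   #SBATCH --mem=2G
   #SBATCH --constraint=localdisk

   mkdir /tmp/$SLURM_JOB_ID                        # unique directory for this job
   cd /tmp/$SLURM_JOB_ID
   cp $WRKDIR/SOMEDIR/input .                      # bring the input to the local disk

   # if the job is terminated early (scancel or time limit), Slurm normally
   # delivers SIGTERM first, so copy whatever output already exists back:
   trap "mv /tmp/$SLURM_JOB_ID/output $WRKDIR/SOMEDIR; exit 1" TERM

   srun my_program input > output                  # placeholder program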

   mv /tmp/$SLURM_JOB_ID/output $WRKDIR/SOMEDIR    # move your output fully or partially


Batch script for thousands of input/output files
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


Sample code is below:

.. code-block:: slurm

   #!/bin/bash

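   # NOTE: a hedged sketch of the tar-ball approach for many small files;
   # all names below are placeholders -- adapt them to your data and program.
   #SBATCH --time=01:00:00
   #SBATCH --constraint=localdisk

   mkdir /tmp/$SLURM_JOB_ID && cd /tmp/$SLURM_JOB_ID
   tar xf $WRKDIR/SOMEDIR/input_files.tar      # unpack the many input files onto the local disk
   srun my_program                             # placeholder: reads and writes the local files
   tar cf output_files.tar output_dir          # pack the results into a single tar ball
   mv output_files.tar $WRKDIR/SOMEDIR/        # move one big file back to WRKDIR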