Commit 81dd4df

A full documentation for stashing (#6936)
Following commit b2a6e2, this commit updates the documentation on stashing for calculation jobs.
1 parent cb5348e commit 81dd4df

File tree

1 file changed

+154
-24
lines changed

docs/source/topics/calculations/usage.rst

Lines changed: 154 additions & 24 deletions
@@ -603,15 +603,67 @@ The order can be controlled through the ``file_copy_operation_order`` attribute
603603
604604
.. _topics:calculations:usage:calcjobs:stashing:
605605
606-
Stashing on the remote
607-
~~~~~~~~~~~~~~~~~~~~~~
608606
609-
The ``stash`` option namespace allows a user to specify certain files and/or folders that are created by the calculation job to be stashed somewhere on the remote where the job is run.
610-
This can be useful if these need to be stored for a longer time on a machine where the scratch space is cleaned regularly, but they need to be kept on the remote machine and not retrieved.
611-
Examples are files that are necessary to restart a calculation but are too big to be retrieved and stored permanently in the local file repository.
607+
Stashing Files on the Remote
608+
----------------------------
609+
610+
611+
In many scientific workflows, calculations produce files that are either too large to retrieve to your local AiiDA repository or simply not needed locally. You may still want to keep these files on the remote machine, for example to restart a calculation, debug it, or archive its results, but outside the compute or scratch directory, which may be cleaned up regularly.
612+
613+
AiiDA offers a stashing mechanism to help with this: it can automatically copy or archive specified files to a persistent location on the remote computer, either immediately after the calculation completes or as a separate follow-up calcjob.
614+
615+
Below, we briefly describe the two supported methods for remote stashing and provide guidance on how to choose the best approach for your use case.
616+
617+
Which method should I use?
618+
~~~~~~~~~~~~~~~~~~~~~~~~~~
619+
620+
.. list-table::
621+
:header-rows: 1
622+
623+
* - Scenario
624+
- Recommended method
625+
* - Stash files regardless of calculation outcome (even if failed)
626+
- Method 1: Stashing **Immediately After Job Completion on HPC**
627+
* - Stash files from an already completed calculation
628+
- Method 2: Stashing via a **Separate Calculation Job**
629+
* - I want to submit my own custom script for stashing
630+
- Method 2: Stashing via a **Separate Calculation Job**
631+
632+
Quick comparison between these methods:
633+
634+
::
635+
636+
(Method 1) Immediate stashing:
637+
+---------------------+ +--------------------------------+
638+
| Calculation job | ---> | Stash files with no submission |
639+
+---------------------+ | (before retrieve) |
640+
+--------------------------------+
641+
|
642+
v
643+
+------------------------+
644+
| Retrieve & parse files |
645+
+------------------------+
646+
647+
(Method 2) Post-completion stashing:
648+
+---------------------+ +------------------------+
649+
| Calculation job | ---> | Retrieve & parse files | ->
650+
+---------------------+ +------------------------+
651+
652+
+---------------------+ +---------------------------------+
653+
| StashCalculation | ---> | Stash files with no submission |
654+
+---------------------+ | or |
655+
| Submit as a custom script |
656+
+---------------------------------+
657+
612658
613-
The files/folder that need to be stashed are specified through their relative filepaths within the working directory in the ``stash.source_list`` option.
614-
Using the ``COPY`` mode, the target path defines another location (on the same filesystem as the calculation) to copy the files to, and is set through the ``stash.target_base`` option, for example:
659+
Method 1: Stashing Immediately After Job Completion on HPC
660+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
661+
662+
This approach performs stashing as soon as the calculation finishes, but **before** any files are retrieved or parsed. It is available for the following stash modes: ``COPY``, ``COMPRESS_TAR``, ``COMPRESS_TARBZ2``, ``COMPRESS_TARGZ``, and ``COMPRESS_TARXZ``.
663+
664+
**Typical use case:** You need to preserve output files from all runs, even failed ones, for debugging or restarting purposes.
665+
666+
Specify which files or folders to stash (by relative paths) using the ``stash.source_list`` option, and the destination on the remote using ``stash.target_base``. Example:
615667
616668
.. code-block:: python
617669
@@ -623,28 +675,28 @@ Using the ``COPY`` mode, the target path defines another location (on the same f
623675
'metadata': {
624676
'options': {
625677
'stash': {
626-
'source_list': ['aiida.out', 'output.txt'],
627-
'target_base': '/storage/project/stash_folder',
628678
'stash_mode': StashMode.COPY.value,
679+
'target_base': '/storage/project/stash_folder',
680+
'source_list': ['aiida.out', 'output.txt'],
629681
}
630682
}
631683
}
632684
}
633685
634-
.. note::
635-
In addition to the ``COPY`` mode, the following storage-efficient modes are also available:
636-
``COMPRESS_TAR``, ``COMPRESS_TARBZ2``, ``COMPRESS_TARGZ``, ``COMPRESS_TARXZ``.
637-
638-
The stashed files and folders are represented by an output node that is attached to the calculation node through the label ``remote_stash``, as a ``RemoteStashFolderData`` node.
639-
Just like the ``remote_folder`` node, this represents a location or files on a remote machine and so is equivalent to a "symbolic link".
686+
The stashed files are represented by an output node with the label ``remote_stash`` (an instance of ``RemoteStashFolderData``), attached to the calculation node. This node acts like a "symbolic link" pointing to the location on the remote system.
640687
641688
.. important::
642689
643-
If the ``stash`` option namespace is defined for a generic calculation job, the daemon will perform the stashing operations before the files are retrieved.
644-
This means that the stashing happens before the parsing of the output files (which occurs after the retrieving step), such that the files will be stashed independent of the final exit status that the parser will assign to the calculation job.
645-
This may cause files to be stashed for calculations that will later be considered to have failed.
690+
The stashing operation occurs *before* any file retrieval or parsing. As a result, files may be stashed even for calculations that later turn out to have failed.
691+
692+
Method 2: Stashing via a Separate Calculation Job
693+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
694+
695+
This approach lets you stash files **only after a successful calculation**. This is done by running a follow-up ``core.stash`` calculation that copies or archives files from the remote folder of a finished calculation job.
646696
647-
To avoid this scenario, you can instead stash via a separate calculation job, for example:
697+
**Typical use case:** You want to avoid keeping files from failed calculations, or need to run custom post-processing scripts.
698+
699+
This method requires specifying the ``remote_folder`` of the original calculation as ``source_node``. Example:
648700
649701
.. code-block:: python
650702
@@ -658,12 +710,12 @@ To avoid this scenario, you can instead, stash via a separate calculation job, f
658710
659711
inputs = {
660712
'metadata': {
661-
'computer': load_computer(label="localhost"),
713+
'computer': load_computer(label=<COMPUTER_LABEL>),
662714
'options': {
663715
'stash': {
664-
'source_list': ['aiida.out', 'output.txt'],
665-
'target_base': '/scratch/',
666-
'stash_mode': StashMode.COPY.value,
716+
'stash_mode': StashMode.COPY.value,
717+
'target_base': '/scratch/',
718+
'source_list': ['aiida.out', 'output.txt'],
667719
},
668720
},
669721
},
@@ -672,10 +724,88 @@ To avoid this scenario, you can instead, stash via a separate calculation job, f
672724
673725
result = run(StashCalculation, **inputs)
674726
727+
Custom script stashing (advanced)
728+
.................................
729+
730+
You can run your own script as part of the stashing step, using the ``SUBMIT_CUSTOM_CODE`` stash mode.
731+
First, place your script on the remote machine and define it as an AiiDA code:
732+
733+
.. code-block:: python
734+
735+
code = InstalledCode(
736+
label='<MY_CODE>',
737+
default_calc_job_plugin='core.stash',
738+
computer=load_computer(<COMPUTER_LABEL>),
739+
filepath_executable=str(<Path_to_script.sh>),
740+
)
741+
code.store()
742+
743+
Run the custom stashing job with:
744+
745+
.. code-block:: python
746+
747+
StashCalculation = CalculationFactory('core.stash')
748+
inputs = {
749+
'metadata': {
750+
'computer': load_computer(<COMPUTER_LABEL>),
751+
'options': {
752+
'resources': {'num_machines': 1},
753+
'stash': {
754+
'stash_mode': StashMode.SUBMIT_CUSTOM_CODE.value,
755+
'target_base': str(target_base),
756+
'source_list': ['aiida.out', 'output.txt'],
757+
},
758+
},
759+
},
760+
'source_node': <orm.RemoteData>,
761+
'code': load_code(label='<MY_CODE>'),
762+
}
763+
submit(StashCalculation, **inputs)
764+
765+
766+
767+
This calculation produces an ``aiida.in`` file in JSON format with the stashing parameters, for example:
768+
769+
.. code-block:: none
770+
771+
{"working_directory": <orm.RemoteData>.get_remote_path(),
772+
"source_list": ["aiida.out", "output.txt"],
773+
"target_base": "/path/to/stash"}
774+
775+
This file is passed to your script on standard input:
776+
777+
::
778+
779+
./script.sh < aiida.in > aiida.out
780+
781+
Your script should therefore parse the JSON and implement the stashing by whatever means suit your setup. For example:
782+
783+
.. code-block:: bash
784+
785+
json=$(cat)
786+
working_directory=$(echo "$json" | jq -r '.working_directory')
787+
mapfile -t source_list < <(echo "$json" | jq -r '.source_list[]')
788+
target_base=$(echo "$json" | jq -r '.target_base')
789+
790+
mkdir -p "$target_base"
791+
for item in "${source_list[@]}"; do
792+
cp "$working_directory/$item" "$target_base/"
793+
echo "$working_directory/$item copied successfully."
794+
done
795+
796+
This way you can implement any custom logic in your script, such as issuing tape commands, handling errors, or filtering files dynamically.
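The same contract can also be met by a script in another language. Below is a minimal sketch of a Python counterpart, assuming only what is described above: the parameters arrive as JSON on standard input with the keys ``working_directory``, ``source_list`` and ``target_base``. The function names ``stash`` and ``main`` are hypothetical, and copying directories recursively is an addition that the bash example does not make.

```python
import json
import shutil
import sys
from pathlib import Path


def stash(params):
    """Copy every entry of ``source_list`` into ``target_base`` and return the new paths."""
    working_directory = Path(params["working_directory"])
    target_base = Path(params["target_base"])
    target_base.mkdir(parents=True, exist_ok=True)

    copied = []
    for item in params["source_list"]:
        source = working_directory / item
        # Like ``cp source target_base/``, the entry lands under its base name.
        destination = target_base / source.name
        if source.is_dir():
            # Assumption: directories are copied recursively (requires Python 3.8+).
            shutil.copytree(source, destination, dirs_exist_ok=True)
        else:
            # Raises FileNotFoundError if the source is missing.
            shutil.copy2(source, destination)
        copied.append(destination)
    return copied


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read the JSON parameters from ``stdin`` and report each copied path on ``stdout``."""
    for path in stash(json.load(stdin)):
        print(f"{path} copied successfully.", file=stdout)
```

To use this as the stash executable, the script would end with a call to ``main()`` and be registered in place of ``script.sh``. A missing source file raises an exception, so the script exits with a non-zero status instead of reporting success.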
797+
798+
Caveats and best practices
799+
""""""""""""""""""""""""""
675800
676801
.. important::
677802
678-
AiiDA does not actually control the files in the remote stash, and so the contents may disappear at some point.
803+
- **AiiDA does not manage the files in the remote stash after creation.** Files may be deleted or lost at any time, depending on the cluster's configuration or cleanup policies.
804+
- **Check quotas and permissions**: Make sure you have write access and sufficient quota in the target stash directory.
805+
- **Handle errors**: If the stashing operation fails (e.g., due to missing files or lack of permissions), AiiDA will log the issue, but will not raise an exception. It is your responsibility to check and recover as needed.
806+
- **Source files are not deleted after stashing**: This is to prevent unwanted data loss.
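The permission and space checks can be partly automated. The following is a rough sketch, not part of AiiDA: ``check_stash_target`` is a hypothetical helper that must run on the machine hosting the stash directory, and it inspects the free space of the filesystem, which is not the same as a per-user quota (quota tools are site-specific).

```python
import os
import shutil
from pathlib import Path


def check_stash_target(target_base, required_bytes):
    """Return a list of detected problems; an empty list means none were found."""
    problems = []

    # The target may not exist yet, so inspect its closest existing ancestor.
    existing = Path(target_base)
    while not existing.exists() and existing != existing.parent:
        existing = existing.parent

    # Creating entries in a directory needs write and execute permission on it.
    if not os.access(existing, os.W_OK | os.X_OK):
        problems.append(f"no write access to {existing}")

    # Free space on the filesystem; a user quota may be stricter than this.
    free = shutil.disk_usage(existing).free
    if free < required_bytes:
        problems.append(
            f"only {free} bytes free on the filesystem of {existing}, "
            f"{required_bytes} required"
        )
    return problems
```

An empty return value only means that no problem was detected at the time of the check; the stashing step can still fail later, so the outcome should be verified afterwards as well.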
807+
808+
679809
680810
.. _topics:calculations:usage:calcjobs:options:
681811
