@@ -603,15 +603,67 @@ The order can be controlled through the ``file_copy_operation_order`` attribute
603
603
604
604
.. _topics:calculations:usage:calcjobs:stashing:
605
605
606
-
Stashing on the remote
607
-
~~~~~~~~~~~~~~~~~~~~~~
608
606
609
-
The ``stash`` option namespace allows a user to specify certain files and/or folders that are created by the calculation job to be stashed somewhere on the remote where the job is run.
610
-
This can be useful if these need to be stored for a longer time on a machine where the scratch space is cleaned regularly, but they need to be kept on the remote machine and not retrieved.
611
-
Examples are files that are necessary to restart a calculation but are too big to be retrieved and stored permanently in the local file repository.
607
+
Stashing Files on the Remote
608
+
----------------------------
609
+
610
+
611
+
In many scientific workflows, calculations produce files that are either too large to retrieve to your local AiiDA repository or simply not needed locally. You may still want to keep these files on the remote machine (for example, to facilitate restarts, enable debugging, or for archiving), but outside the compute or scratch directory, which might be cleaned up regularly.
612
+
613
+
AiiDA offers a stashing mechanism to help with this: it can automatically copy or archive specified files to a persistent location on the remote computer, either immediately after the calculation completes or as a separate follow-up calcjob.
614
+
615
+
Below, we briefly describe the two supported methods for remote stashing and provide guidance on how to choose the best approach for your use case.
616
+
617
+
Which method should I use?
618
+
~~~~~~~~~~~~~~~~~~~~~~~~~~
619
+
620
+
.. list-table::
621
+
:header-rows: 1
622
+
623
+
* - Scenario
624
+
- Recommended method
625
+
* - Stash files regardless of calculation outcome (even if failed)
626
+
- Method 1: Stashing **Immediately After Job Completion on HPC**
627
+
* - Stash files from an already completed calculation
628
+
- Method 2: Stashing via a **Separate Calculation Job**
629
+
* - Submit your own custom script for stashing
630
+
- Method 2: Stashing via a **Separate Calculation Job**
| StashCalculation | ---> | Stash files with no submission |
654
+
+---------------------+ | or |
655
+
| Submit as a custom script |
656
+
+---------------------------------+
657
+
612
658
613
-
The files/folder that need to be stashed are specified through their relative filepaths within the working directory in the ``stash.source_list`` option.
614
-
Using the ``COPY`` mode, the target path defines another location (on the same filesystem as the calculation) to copy the files to, and is set through the ``stash.target_base`` option, for example:
659
+
Method 1: Stashing Immediately After Job Completion on HPC
This approach performs stashing as soon as the calculation finishes, but **before** any files are retrieved or parsed. It is available for the following stash modes: ``COPY``, ``COMPRESS_TAR``, ``COMPRESS_TARBZ2``, ``COMPRESS_TARGZ``, and ``COMPRESS_TARXZ``.
663
+
664
+
**Typical use case:** You need to preserve output files from all runs, even failed ones, for debugging or restarting purposes.
665
+
666
+
Specify which files or folders to stash (by relative paths) using the ``stash.source_list`` option, and the destination on the remote using ``stash.target_base``. Example:
615
667
616
668
.. code-block:: python
617
669
@@ -623,28 +675,28 @@ Using the ``COPY`` mode, the target path defines another location (on the same f
623
675
'metadata': {
624
676
'options': {
625
677
'stash': {
626
-
'source_list': ['aiida.out', 'output.txt'],
627
-
'target_base': '/storage/project/stash_folder',
628
678
'stash_mode': StashMode.COPY.value,
679
+
'target_base': '/storage/project/stash_folder',
680
+
'source_list': ['aiida.out', 'output.txt'],
629
681
}
630
682
}
631
683
}
632
684
}
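The option layout above can also be assembled by a small helper that checks the inputs before submission. The following is a minimal sketch and not part of AiiDA: the function name ``build_stash_options`` is hypothetical, and the requirement that ``target_base`` be an absolute path is an assumption made here for illustration. The relative-path check on ``source_list`` follows the description above. The mode is passed in as a plain string; in real use it would be a value such as ``StashMode.COPY.value``.

```python
from pathlib import PurePosixPath


def build_stash_options(source_list, target_base, stash_mode):
    """Return a ``metadata`` dictionary containing a ``stash`` namespace.

    Hypothetical helper, not part of AiiDA. ``stash_mode`` is the string value
    of a stash mode, e.g. ``StashMode.COPY.value`` in real use.
    """
    if not source_list:
        raise ValueError('source_list must contain at least one path')
    for path in source_list:
        # Sources are given relative to the working directory of the job.
        if PurePosixPath(path).is_absolute():
            raise ValueError(f'source path must be relative: {path!r}')
    # Assumption for this sketch: the stash target is an absolute remote path.
    if not PurePosixPath(target_base).is_absolute():
        raise ValueError(f'target_base must be absolute: {target_base!r}')
    return {
        'options': {
            'stash': {
                'stash_mode': stash_mode,
                'target_base': target_base,
                'source_list': list(source_list),
            }
        }
    }


# 'copy' is a placeholder standing in for ``StashMode.COPY.value``.
metadata = build_stash_options(
    source_list=['aiida.out', 'output.txt'],
    target_base='/storage/project/stash_folder',
    stash_mode='copy',
)
```

The returned dictionary has the same nesting as the ``metadata`` entry in the example above, so it could be passed as the ``metadata`` input of the calculation job.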
633
685
634
-
.. note::
635
-
In addition to the ``COPY`` mode, the following storage-efficient modes are also available:
The stashed files and folders are represented by an output node that is attached to the calculation node through the label ``remote_stash``, as a ``RemoteStashFolderData`` node.
639
-
Just like the ``remote_folder`` node, this represents a location or files on a remote machine and so is equivalent to a "symbolic link".
686
+
The stashed files are represented by an output node with the label ``remote_stash`` (an instance of ``RemoteStashFolderData``), attached to the calculation node. This node acts like a "symbolic link" pointing to the location on the remote system.
640
687
641
688
.. important::
642
689
643
-
If the ``stash`` option namespace is defined for a generic calculation job, the daemon will perform the stashing operations before the files are retrieved.
644
-
This means that the stashing happens before the parsing of the output files (which occurs after the retrieving step), such that the files will be stashed independently of the final exit status that the parser will assign to the calculation job.
645
-
This may cause files to be stashed for calculations that will later be considered to have failed.
690
+
The stashing operation occurs *before* any file retrieval or parsing. As a result, files may be stashed even for calculations that later turn out to have failed.
691
+
692
+
Method 2: Stashing via a Separate Calculation Job
693
+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
694
+
695
+
This approach lets you stash files **only after a successful calculation**. This is done by running a follow-up ``core.stash`` calculation that copies or archives files from the remote folder of a finished calculation job.
646
696
647
-
To avoid this scenario, you can instead stash via a separate calculation job, for example:
697
+
**Typical use case:** You want to avoid keeping files from failed calculations, or need to run custom post-processing scripts.
698
+
699
+
This method requires specifying the ``remote_folder`` of the original calculation as ``source_node``. Example:
648
700
649
701
.. code-block:: python
650
702
@@ -658,12 +710,12 @@ To avoid this scenario, you can instead, stash via a separate calculation job, f
This way you can implement any custom logic in your script, such as tape commands, handling errors, or filtering files dynamically.
797
+
798
+
Caveats and best practices
799
+
""""""""""""""""""""""""""
675
800
676
801
.. important::
677
802
678
-
AiiDA does not actually control the files in the remote stash, and so the contents may disappear at some point.
803
+
- **AiiDA does not manage the files in the remote stash after creation.** Files may be deleted or lost at any time, depending on the cluster's configuration or cleanup policies.
804
+
- **Check quotas and permissions**: Make sure you have write access and sufficient quota in the target stash directory.
805
+
- **Handle errors**: If the stashing operation fails (e.g., due to missing files or lack of permissions), AiiDA will log the issue, but will not raise an exception. It is your responsibility to check and recover as needed.
806
+
- **Source files are not deleted after stashing**: This is to prevent unwanted data loss.
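
The quota and permission check mentioned above can be partly automated with a short script run on the remote machine before stashing. This is a minimal sketch using only the Python standard library; the function name ``check_stash_target`` and the ``required_bytes`` parameter are hypothetical. Note that ``shutil.disk_usage`` reports free space on the filesystem, which is not the same as a per-user quota, so a cluster-specific quota command may still be needed.

```python
import os
import shutil


def check_stash_target(target_dir, required_bytes):
    """Return a list of problems found with ``target_dir`` (empty if none).

    Hypothetical pre-flight check, meant to be run on the remote machine.
    """
    problems = []
    if not os.path.isdir(target_dir):
        problems.append(f'{target_dir} is not an existing directory')
        return problems
    # Creating entries in a directory needs both write and execute permission.
    if not os.access(target_dir, os.W_OK | os.X_OK):
        problems.append(f'no write access to {target_dir}')
    # Free space on the filesystem; this does not account for per-user quotas.
    free_bytes = shutil.disk_usage(target_dir).free
    if free_bytes < required_bytes:
        problems.append(f'only {free_bytes} bytes free, {required_bytes} required')
    return problems
```

An empty list means that no problem was detected by these two checks; it does not guarantee that the stashing operation itself will succeed.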