
Commit 6d8e765

MLP storage advice (#145)
1 parent 3f0bdb9 commit 6d8e765


1 file changed: +21 -3 lines changed


docs/platforms/mlp/index.md

Lines changed: 21 additions & 3 deletions
@@ -35,8 +35,9 @@ There are three main file systems mounted on the MLP clusters Clariden and Brist
 
 | type |mount | filesystem |
 | -- | -- | -- |
-| Home | /users/$USER | [VAST][ref-alps-vast] |
+| Home | `/users/$USER` | [VAST][ref-alps-vast] |
 | Scratch | `/iopsstor/scratch/cscs/$USER` | [Iopsstor][ref-alps-iopsstor] |
+| | `/capstor/scratch/cscs/$USER` | [Capstor][ref-alps-capstor] |
 | Project | `/capstor/store/cscs/swissai/<project>` | [Capstor][ref-alps-capstor] |
 
 ### Home
@@ -51,14 +52,31 @@ Use scratch to store datasets that will be accessed by jobs, and for job output.
 Scratch is per user - each user gets a separate scratch path and quota.
 
 * The environment variable `SCRATCH=/iopsstor/scratch/cscs/$USER` is set automatically when you log into the system, and can be used as a shortcut to access scratch.
+* There is an additional scratch path mounted on [Capstor][ref-alps-capstor] at `/capstor/scratch/cscs/$USER`.
 
 !!! warning "scratch cleanup policy"
     Files that have not been accessed in 30 days are automatically deleted.
 
     **Scratch is not intended for permanent storage**: transfer files back to the capstor project storage after job runs.
 
-!!! note
-    There is an additional scratch path mounted on [Capstor][ref-alps-capstor] at `/capstor/scratch/cscs/$USER`, however this is not recommended for ML workloads for performance reasons.
+!!! note "file system suitability"
+    The Capstor scratch filesystem is based on HDDs and is optimized for large, sequential read and write operations.
+    We recommend using Capstor for storing **checkpoint files** and other **large, contiguous outputs** generated by your training runs.
+    In contrast, Iopsstor uses high-performance NVMe drives, which excel at handling **IOPS-intensive workloads** involving frequent, random access. This makes it a better choice for storing **training datasets**, especially when accessed randomly during machine learning training.
+
+### Scratch Usage Recommendations
+
+Use Iopsstor scratch (`$SCRATCH`) for:
+
+* Training and validation datasets that are read frequently and non-sequentially.
+* Workloads that perform many small, random I/O operations.
+
+Use Capstor scratch (`/capstor/scratch/cscs/$USER`) for:
+
+* Storing model checkpoints.
+* Outputs from simulations or training jobs that involve large, contiguous I/O.
+
+After your job completes, remember to transfer any important results to your permanent project storage.
 
 ### Project
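
The placement advice added in this change could be followed from a job script along these lines. This is only a minimal sketch, not part of the commit: the helper names `scratch_for` and `archive_to_project`, the `kind` labels, the `results` subdirectory, and the `store_root` override (included so the copy step can be tried outside the cluster) are all hypothetical. The paths are the ones listed in the storage table, and `<project>` is left as a caller-supplied value.

```python
import os
import shutil
from pathlib import Path

# Mount points as listed in the storage table; user and project are supplied
# by the caller.
IOPSSTOR_SCRATCH = "/iopsstor/scratch/cscs/{user}"
CAPSTOR_SCRATCH = "/capstor/scratch/cscs/{user}"
PROJECT_STORE = "/capstor/store/cscs/swissai/{project}"


def scratch_for(kind, user, env=None):
    """Return the scratch directory suggested for a kind of data.

    `kind` is a hypothetical label: "dataset" for data that is read
    frequently and randomly, "checkpoint" for large sequential outputs.
    """
    env = os.environ if env is None else env
    if kind == "dataset":
        # $SCRATCH points at Iopsstor scratch when it is set by the system;
        # otherwise fall back to the documented path.
        return Path(env.get("SCRATCH", IOPSSTOR_SCRATCH.format(user=user)))
    if kind == "checkpoint":
        return Path(CAPSTOR_SCRATCH.format(user=user))
    raise ValueError(f"unknown kind: {kind!r}")


def archive_to_project(results_dir, project, subdir="results", store_root=None):
    """Copy results from scratch into project storage and return the destination.

    `store_root` is an assumption added for illustration: it overrides the
    project storage root, which is useful when trying this off the cluster.
    """
    root = Path(store_root) if store_root else Path(PROJECT_STORE.format(project=project))
    dest = root / subdir
    # dirs_exist_ok (Python 3.8+) lets repeated runs add to an existing directory.
    shutil.copytree(results_dir, dest, dirs_exist_ok=True)
    return dest
```

A training job might read its dataset from `scratch_for("dataset", user)`, write checkpoints under `scratch_for("checkpoint", user)`, and call `archive_to_project` once the run finishes, since scratch files that are not accessed for 30 days are deleted.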
