**docs/alps/storage.md** (45 additions, 1 deletion)
@@ -31,7 +31,51 @@ These separate clusters are on the same Slingshot 11 network as the Alps.
## Capstor

Capstor is the largest file system, for storing large amounts of input and output data.
It is used to provide [scratch][ref-storage-scratch] and [store][ref-storage-store].

!!! todo "add information about metadata services, and their distribution over scratch and store"

[](){#ref-alps-capstor-scratch}
### Scratch

All users on Alps get their own scratch path, `/capstor/scratch/cscs/$USER`.

[](){#ref-alps-capstor-store}
### Store

The Store mount point on Capstor provides stable storage with [backups][ref-storage-backups] and no [cleaning policy][ref-storage-cleanup].
It is mounted on clusters at the `/capstor/store` mount point, with folders created for each project.

To accommodate the different customers and projects on Alps, the directory structure is more complicated than the per-user paths on Scratch.
Project paths are organised as follows:

```
/capstor/store/<tenant>/<customer>/<group_id>
```

!!! question "What are `tenant`, `customer` and `group_id` in this context?"

* **`tenant`**: there are currently two tenants, `cscs` and `mch`:
    * the vast majority of projects are hosted by the `cscs` tenant.
* **`customer`**: refers to the contractual partner responsible for the project.
  Examples of customers include:
    * `userlab`: projects allocated in the CSCS User Lab through open calls. The majority of projects are hosted here, particularly on the [HPC platform][ref-platform-hpcp].
    * `swissai`: most projects allocated on the [Machine Learning Platform][ref-platform-mlp].
    * `2go`: projects allocated under the [CSCS2GO](https://2go.cscs.ch) scheme.
* **`group_id`**: refers to the Linux group created for the project.
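For example, under this layout a hypothetical User Lab project whose Linux group is `g152` (the group name is illustrative) would have the store path:

```
/capstor/store/cscs/userlab/g152
```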

Users are often part of multiple projects, and by extension members of the associated `group_id` groups.
You can get a list of your groups using the `id` command in the terminal:
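(The output below is a sketch of what this looks like; the numeric IDs and the project groups other than `g152` are illustrative.)

```
$ id
uid=23406(bobsmith) gid=32819(g152) groups=32819(g152),32910(g174),33245(g203),34001(vasp6)
```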

Here the user `bobsmith` is in three projects, with the project `g152` being their **primary project** (which can also be determined using `id -gn $USER`).

* They are also in the `vasp6` group, which contains users who have been granted access to the [VASP][ref-uenv-vasp] application.
!!! info "The `$PROJECT` environment variable"
78
+
On some clusters, for example [Eiger][ref-cluster-eiger] and [Eiger][ref-cluster-daint], the project folder for your primary project can be accessed using the `$PROJECT` environment variable.
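    A quick way to check is to print the variable; the path shown below is illustrative and follows the `/capstor/store/<tenant>/<customer>/<group_id>` layout described above:

    ```
    $ echo $PROJECT
    /capstor/store/cscs/userlab/g152
    ```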

**docs/storage/filesystems.md** (16 additions, 15 deletions)
@@ -82,13 +82,12 @@ Daily [snapshots][ref-storage-snapshots] for the last seven days are provided in
[](){#ref-storage-scratch}
## Scratch

The scratch file system is a fast workspace tuned for use by parallel jobs, with an emphasis on performance over reliability, hosted on the [Capstor][ref-alps-capstor] Lustre filesystem.

All users on Alps get their own scratch path, `/capstor/scratch/cscs/$USER`, which is pointed to by the variable `$SCRATCH` on the [HPC Platform][ref-platform-hpcp] and [Climate and Weather Platform][ref-platform-cwp] clusters Eiger, Daint and Santis.

!!! info "`$SCRATCH` on MLP points to Iopsstor"
    On the Machine Learning Platform (MLP) systems [clariden][ref-cluster-clariden] and [bristen][ref-cluster-bristen] the `$SCRATCH` variable points to storage on [Iopsstor][ref-alps-iopsstor].
    See the [MLP docs][ref-mlp-storage] for more information.
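To check where `$SCRATCH` points on the cluster you are currently logged in to, simply print the variable; the output below is for the example user `bobsmith` on one of the HPC Platform clusters:

```
$ echo $SCRATCH
/capstor/scratch/cscs/bobsmith
```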

### Cleanup and Expiration
@@ -127,21 +126,21 @@ Space on Store is allocated per-project, with a path created for each project:
* the capacity and inode limit is per-project, based on the initial resource request;
* users have read and write access to the store paths for each project that they are a member of.

!!! info
    More information about how per-project paths are organised on store is available in the [Capstor][ref-alps-capstor-store] documentation.

!!! warning "Avoid using store for jobs"
    Store is tuned for storing results and shared datasets; specifically, it has fewer metadata servers assigned to it.

    Use the Scratch file systems, which are tuned for fast parallel I/O, for storing input and output for jobs.

### Cleanup and Expiration

There is no [cleanup policy][ref-storage-cleanup] on store, and the contents are retained for three months after the project ends.

### Quota

Space on Store is allocated per-project, with a path created for each project:

* the capacity and inode limit is per-project, based on the initial resource request;
* users have read and write access to the store paths for each project that they are a member of.
@@ -156,7 +155,7 @@ Space on Store is allocated per-project, with a path is created for each project
[](){#ref-storage-quota}
## Quota

Storage quota is a limit on available storage applied to:

* **capacity**: the total size of files;
* and **inodes**: the total number of files and directories.
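CSCS reports your usage and quotas for each file system (see the usage output described below); on a Lustre file system such as store you can also query a group quota directly with the standard Lustre client tool. A minimal sketch, where the group name `g152` is illustrative:

```
# report usage and limits for the group g152 on the store file system
$ lfs quota -h -g g152 /capstor/store
```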
@@ -219,7 +218,7 @@ Usage data updated on: 2025-05-21 11:10:02
The available capacity and used capacity are shown for each file system that you have access to.
If you are in multiple projects, information for the [store][ref-storage-store] path for each project that you are a member of will be shown.
In the example above, the user is in two projects, namely `g33` and `csstaff`.
@@ -275,7 +274,7 @@ A snapshot is a full copy of a file system at a certain point in time, that can
## Cleanup policies

The performance of Lustre file systems is affected by file system occupancy and the number of files.
Ideally occupancy should not exceed 60%, with severe performance degradation for all users when occupancy exceeds 80% and when there are too many small files.

File cleanup removes files that are not being used to ensure that occupancy and file counts do not affect file system performance.
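As a rough, illustrative way to see which of your scratch files an age-based cleanup would target, you can list files by access time; the 30-day threshold below is an assumption for illustration only, so check the stated policy for the actual value:

```
# list files under $SCRATCH not accessed in the last 30 days (threshold illustrative)
$ find $SCRATCH -type f -atime +30
```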
@@ -305,8 +304,10 @@ In addition to the automatic deletion of old files, if occupancy exceeds 60% the
??? question "My files are gone, but the directories are still there"
    When the [cleanup policy][ref-storage-cleanup] is applied on Lustre file systems, the files are removed, but the directories remain.

??? question "What do messages like `mkdir: cannot create directory 'test': Disk quota exceeded` mean?"
    You have run out of quota on the target file system.
    Consider deleting unneeded files, or moving data to a different file system.
    Specifically, if you see this message when using [home][ref-storage-home], which has a relatively small 50 GB limit, consider moving the data to your project's [store][ref-storage-store] path.
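    For example, to move a directory out of home and into the store path of one of your projects (the destination below uses the placeholder layout from the [Capstor][ref-alps-capstor-store] documentation, so substitute your project's actual path):

    ```
    # replace the placeholders with your project's actual store path
    $ mv $HOME/large-dataset /capstor/store/<tenant>/<customer>/<group_id>/
    ```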

!!! todo
    FAQ question: [writing with specific group access](https://confluence.cscs.ch/spaces/KB/pages/276955350/Writing+on+project+if+you+belong+to+more+than+one+group)