Commit b4a8789

lucamar and Luca Marsella authored
Update policies introducing maintenance policies (#249)
I provide an update of the policies as described in [SRM-274](https://jira.cscs.ch/browse/SRM-274) in view of the presentation "General Information for Users on Alps" at the UserLab Day 2025.

---------

Co-authored-by: Luca Marsella <[email protected]>
1 parent 1cbb3ca commit b4a8789

4 files changed: +51 -16 lines changed

docs/policies/index.md

Lines changed: 4 additions & 2 deletions
@@ -6,7 +6,9 @@ The CSCS [code of conduct](code-of-conduct.md) outlines the responsibilities and
 
 The [User Regulations](regulations.md) define the basic guidelines for the usage of CSCS computing resources. The right to access CSCS resources may be revoked to whoever breaches any of the user regulations.
 
-## Computing Budget
+The [User Support Policies](support.md), the [Slack Code of Conduct](slack.md) and the [Scheduled Maintenance and System Unavailability Policies](maintenance.md) provide additional information on support services, the regulations of the Users Slack space and the scheduled maintenance events.
+
+## Resource Allocation Policies
 
 Compute time on Alps systems is measured in node hours. Currently, we only support exclusive node allocations. This means that even if you utilize only a portion of a node’s resources (e.g., a single GPU), your account will still be charged for the entire node.

@@ -18,7 +20,7 @@ Please note that resources at CSCS are assigned over three-months windows
 
 ## Data Retention Policies
 
-Data belonging to active projects in the filesystems /users, /project, /store are under backup. There is no backup for data under the scratch filesystem, therefore no data recovery is possible in case of accidental loss or for data deleted due to the cleaning policy implemented on this filesystem.
+Data belonging to active projects in the filesystems `/users` and `/capstor/store` are under backup. There is no backup for data under the scratch filesystem, therefore no data recovery is possible in case of accidental loss or for data deleted due to the cleaning policy implemented on this filesystem.
 
 Please note that the long term storage service is granted as long as your project is active, and the data will be removed without further notice 3 months after the expiration of the project: please check the applicable filesystem policies for the grace period granted after the expiration of the project.

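Reviewer note on the node-hour wording in the Resource Allocation Policies hunk above: a minimal sketch of the accounting it describes, assuming the charge is simply the number of allocated nodes times the wall-clock hours. The helper names and the `gpus_per_node` value are hypothetical illustrations and are not taken from the CSCS documentation or tooling.

```python
import math


def nodes_needed(gpus_requested: int, gpus_per_node: int) -> int:
    """Whole nodes required to cover a GPU request (nodes are allocated exclusively)."""
    if gpus_requested < 1 or gpus_per_node < 1:
        raise ValueError("GPU counts must be positive")
    return math.ceil(gpus_requested / gpus_per_node)


def charged_node_hours(nodes: int, walltime_hours: float) -> float:
    """Node hours charged, assuming charge = nodes x wall-clock hours.

    The charge does not depend on how much of each node the job actually uses.
    """
    if nodes < 1 or walltime_hours < 0:
        raise ValueError("nodes must be >= 1 and walltime_hours >= 0")
    return nodes * walltime_hours


if __name__ == "__main__":
    # Assumption for illustration only: 4 GPUs per node.
    nodes = nodes_needed(gpus_requested=1, gpus_per_node=4)
    print(charged_node_hours(nodes, walltime_hours=2.0))
```

Under these assumptions, a job that uses a single GPU for two hours is charged the same as one that uses every GPU on the node for two hours.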
docs/policies/maintenance.md

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
+[](){#ref-maintenance}
+# Scheduled Maintenance and System Unavailability Policy
+
+To ensure the reliability and performance of the Alps production vClusters, CSCS continues to implement rolling updates aimed at reducing downtime during routine maintenance. However, regular interventions are still necessary at this stage.
+
+## Advance notice
+
+We strive to announce scheduled system unavailability at least one week in advance. In some cases, earlier notice may be possible, although this depends on external factors and internal approval processes.
+
+## Shared infrastructure
+
+Alps is a shared research infrastructure supporting a diverse range of research communities, partners, and projects. Occasionally, the system may be temporarily dedicated to specific scientific projects to enable large-scale capability runs.
+
+[](){#ref-maintenance-cadence}
+## Maintenance and availability cadence
+
+To help users plan their activities within each allocation quarter, we provide a tentative schedule of system unavailability. Please note that this schedule is subject to change based on operational requirements:
+
+### Routine maintenance
+* __Cadence__: Occurs weekly, depending on need
+* __Typical duration__: Half a day; occasionally up to one full day
+
+
+### Extraordinary maintenance
+* __Cadence__: At least once per quarter
+* __Typical duration__: Two days; may be extended if necessary
+
+### Dedicated large-scale capability runs of scientific projects
+* __Cadence__: At most once per quarter
+* __Typical duration__: One week
+
+[](){#ref-maintenance-feedback}
+## Communication and feedback
+
+CSCS values the constructive feedback provided by users. We will use this input to enhance our communication practices and to develop mitigation strategies for scheduled events that may significantly impact system usability.

docs/policies/support.md

Lines changed: 10 additions & 13 deletions
@@ -1,13 +1,11 @@
 [](){#ref-support}
-# User Support Policy
-
-## 1. User Support Policy
+# User Support Policies
 
 CSCS operates an advanced research infrastructure dedicated to High-Performance Computing (HPC) and other scientific applications.
 Our infrastructure encompasses a wide array of resources including compute, network, supporting software and tools, and several software applications used by a broad user base.
-Our user support policy outlines the level of assistance users can expect, the types of support offered, and the guidelines for requesting and receiving assistance.
+Our user support policies outline the level of assistance users can expect, the types of support offered, and the guidelines for requesting and receiving assistance.
 
-## 2. Best Effort Support
+## Best Effort Support
 
 CSCS is committed to offering best effort support to our users.
 Our goal is to provide responsive and effective assistance, ensuring the hardware and software infrastructure operates at a high level to satisfy the majority of the scientific community’s needs.
@@ -25,7 +23,7 @@ Support will be focused on ensuring that the resources are used in alignment wit
 Requests that significantly deviate from the original proposal may not be accommodated.
 
 [](){#ref-support-user-apps}
-## 3. User Applications
+## User Applications
 
 User applications are those brought to CSCS systems by the users, whether they are developed by the users themselves or another third-party.
 Packages or applications not provided by CSCS are considered user applications.
@@ -35,7 +33,7 @@ While we can assist with infrastructure-related issues, we can not configure, op
 Users are responsible for resolving application-specific issues themselves or contacting the respective developers.
 
 [](){#ref-support-apps}
-## 4. Officially Supported Applications
+## Officially Supported Applications
 
 CSCS offers a range of officially supported applications and their respective versions and configurations, which are packaged and released by CSCS or its supply partners.
 These packages benefit from our resources, expertise, and comprehensive documentation.
@@ -46,33 +44,32 @@ This support also extends to common tools and libraries provided by CSCS for the
 While CSCS provides enhanced support for third-party software included in our officially supported applications, our ability to resolve issues is contingent on the extent of our expertise and control.
 Bugs or other problems that fall outside of our immediate control will be escalated to the relevant third-party vendors, but further resolution will depend on their response and capabilities, limiting our ability to fully address such issues.
 
-## 5. Prioritisation Criteria
+## Prioritisation Criteria
 
 Support cases will be prioritised based on factors such as the impact on CSCS's overall mission and services, potential for knowledge transfer, degree of expertise required, and time and effort required to provide support.
 Issues directly concerning products and services offered by CSCS will be given higher priority.
 
-## 6. Collaborative Support
+## Collaborative Support
 
 The effectiveness and efficiency of our support are greatly enhanced when users work collaboratively with us. By providing thorough information users enable us to deliver more effective and timely assistance. To facilitate effective support, users are expected to:
 
 * *Consult Documentation*: Users are encouraged to review the provided documentation and indicate what they have consulted before seeking support.
 * *Provide Detailed Information*: Users should offer, to the best of their ability, sufficient documentation and information about their software and the issues they are experiencing.
 This includes detailing previous attempts to resolve the issue and any relevant error messages or logs. Clear and precise communication of the problem and steps already taken helps us diagnose and address issues more efficiently.
 
-## 7. Closure of Support Tickets
+## Closure of Support Tickets
 
 Support tickets related to user applications will be closed if, after providing all feasible guidance and troubleshooting within our support scope and capacity, it is determined that the issue lies beyond the control of CSCS, such as in the user’s application code or third-party dependencies.
 In such cases, the ticket will be closed after the user has been informed of the situation and provided with any relevant recommendations or resources for further investigation.
 Users are welcome to reopen the ticket if new, actionable information becomes available.
 
-## 8. Communication Channels
+## Communication Channels
 
 Users can request support through the CSCS Service Desk. Updates and communication with support staff will be provided through e-mail or via the Service Desk. Users are also encouraged to communicate with each other via our community channels. CSCS reserves the right to make other forms of communication also available.
 
-## 9. Continuous Improvement
+## Continuous Improvement
 
 We are committed to continuously improving our support services.
 Feedback from users is welcomed and will be used to refine our support policies and procedures to better meet the needs of our community.
 
 By adhering to this user support policy, we aim to ensure a consistent and satisfactory support experience for all users at CSCS.
-

mkdocs.yml

Lines changed: 2 additions & 1 deletion
@@ -141,8 +141,9 @@ nav:
 - policies/index.md
 - 'User Regulations': policies/regulations.md
 - 'Code of Conduct': policies/code-of-conduct.md
-- 'UserLab Support Policy': policies/support.md
+- 'User Support Policies': policies/support.md
 - 'Slack Code of Conduct': policies/slack.md
+- 'Scheduled Maintenance and System Unavailability Policies': policies/maintenance.md
 - 'Contributing':
 - contributing/index.md
