Commit 2cb5d69

Separated Setup Instructions
1 parent 3e88f1b commit 2cb5d69

1 file changed

+152
-0
lines changed
@@ -0,0 +1,152 @@
1+
---
2+
title: Cloud Bursting Setup Instructions
3+
description: Learn how to set up cloud bursting using Azure CycleCloud and Slurm.
4+
author: vinil-v
5+
ms.date: 12/23/2024
6+
ms.author: padmalathas
7+
---
8+
9+
## Setup Instructions
10+
11+
Once the prerequisites are in place, follow these steps to integrate the external Slurm scheduler node with the CycleCloud cluster:
12+
13+
### Importing a Cluster Using the Slurm Headless Template in CycleCloud
14+
15+
- This step must be executed on the **CycleCloud VM**.
16+
- Make sure that the **CycleCloud 8.6.4 VM** is running and accessible via the `cyclecloud` CLI.
17+
- Run the `cyclecloud-project-build.sh` script and enter the desired cluster name (e.g., `hpc1`) when prompted. The script sets up a custom project based on `cyclecloud-slurm` version 3.0.9 and imports the cluster using the Slurm headless template.
18+
- In the example provided, `hpc1` is used as the cluster name. You can choose any cluster name, but be consistent and use the same name throughout the entire setup.
19+
20+
21+
```bash
git clone https://github.com/Azure/cyclecloud-slurm.git
cd cyclecloud-slurm/cloud_bursting/slurm-23.11.9-1/cyclecloud
sh cyclecloud-project-build.sh
```
26+
27+
Output:
28+
29+
```bash
[user1@cc86vm ~]$ cd cyclecloud-slurm/cloud_bursting/slurm-23.11.9-1/cyclecloud
[user1@cc86vm cyclecloud]$ sh cyclecloud-project-build.sh
Enter Cluster Name: hpc1
Cluster Name: hpc1
Use the same cluster name: hpc1 in building the scheduler
Importing Cluster
Importing cluster Slurm_HL and creating cluster hpc1....
----------
hpc1 : off
----------
Resource group:
Cluster nodes:
Total nodes: 0
Locker Name: cyclecloud_storage
Fetching CycleCloud project
Uploading CycleCloud project to the locker
```
47+
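Before moving on, you may want to confirm that the import actually created the cluster. The following is a minimal sketch, assuming the `cyclecloud` CLI is installed and initialized for the current user on the CycleCloud VM. The helper name `check_cluster_imported` is hypothetical and is not part of the repository scripts; it relies on `cyclecloud show_cluster` printing a `<name> : <state>` line like the one in the output above.

```shell
#!/bin/sh
# Hypothetical helper (not part of cyclecloud-slurm): confirm that a cluster
# with the given name is known to CycleCloud.
# Assumes the cyclecloud CLI is installed and initialized for the current user.
check_cluster_imported() {
    cluster="$1"
    if ! command -v cyclecloud >/dev/null 2>&1; then
        echo "cyclecloud CLI not found in PATH" >&2
        return 2
    fi
    # show_cluster prints the cluster name and state (e.g. "hpc1 : off").
    if cyclecloud show_cluster "$cluster" | grep -q "^$cluster :"; then
        echo "Cluster $cluster is imported"
    else
        echo "Cluster $cluster was not found" >&2
        return 1
    fi
}
```

For the example in this article, you would call `check_cluster_imported hpc1`. A freshly imported cluster is expected to be in the `off` state until you start it from the CycleCloud UI.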
48+
### Slurm Scheduler Installation and Configuration
49+
50+
- Deploy a VM using the **AlmaLinux HPC 8.7** or **Ubuntu HPC 22.04** image.
51+
- If you already have a Slurm Scheduler installed, you may skip this step. However, it is recommended to review the script to ensure compatibility with your existing setup.
52+
- Run the Slurm scheduler installation script (`slurm-scheduler-builder.sh`) and provide the cluster name (`hpc1`) when prompted.
53+
- This script sets up an NFS server and installs and configures the Slurm scheduler.
54+
- If you are using an external NFS server, you can remove the NFS setup entries from the script.
55+
56+
57+
```bash
git clone https://github.com/Azure/cyclecloud-slurm.git
cd cyclecloud-slurm/cloud_bursting/slurm-23.11.9-1/scheduler
sh slurm-scheduler-builder.sh
```
62+
Output:
63+
64+
```bash
------------------------------------------------------------------------------------------------------------------------------
Building Slurm scheduler for cloud bursting with Azure CycleCloud
------------------------------------------------------------------------------------------------------------------------------

Enter Cluster Name: hpc1
------------------------------------------------------------------------------------------------------------------------------

Summary of entered details:
Cluster Name: hpc1
Scheduler Hostname: masternode2
NFSServer IP Address: 10.222.xxx.xxx
```
77+
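Because the CycleCloud cluster later mounts `/sched` and `/shared` from this NFS server, it can help to check that both directories are exported before you continue. The sketch below assumes the script exports them through `/etc/exports`, which is an assumption about the script's behavior rather than something stated here; the helper name `exports_include` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: check that every given directory has an entry in an
# exports file (default location on an NFS server: /etc/exports).
# Assumption: the scheduler script exports /sched and /shared this way.
exports_include() {
    exports_file="$1"
    shift
    missing=0
    for dir in "$@"; do
        # An export entry starts with the directory path followed by whitespace.
        if grep -Eq "^${dir}[[:space:]]" "$exports_file"; then
            echo "exported: $dir"
        else
            echo "missing:  $dir"
            missing=1
        fi
    done
    return "$missing"
}
```

On the scheduler VM you could run `exports_include /etc/exports /sched /shared`. The function returns a nonzero status if any directory is missing, so it can gate the next step in a wrapper script. If you use an external NFS server, run the check there instead.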
78+
### CycleCloud UI Configuration
79+
80+
- Access the **CycleCloud UI** and navigate to the settings for the `hpc1` cluster.
81+
- Edit the cluster settings to configure the VM SKUs and networking options as needed.
82+
- In the **Network Attached Storage** section, enter the NFS server IP address for the `/sched` and `/shared` mounts.
83+
- On the **Advanced Settings** tab, select the OS (**Ubuntu 22.04** or **AlmaLinux 8**) from the drop-down list to match the scheduler VM.
84+
- Once all settings are configured, click **Save** and then **Start** the `hpc1` cluster.
85+
86+
![NFS settings](../images/slurm-cloud-burst/cyclecloud-ui-config.png)
87+
88+
### CycleCloud Autoscaler Integration on Slurm Scheduler
89+
90+
- Integrate Slurm with CycleCloud using the `cyclecloud-integrator.sh` script.
91+
- Provide the CycleCloud details (cluster name, username, password, and IP address) when prompted.
92+
93+
```bash
cd cyclecloud-slurm/cloud_bursting/slurm-23.11.9-1/scheduler
sh cyclecloud-integrator.sh
```
97+
Output:
98+
99+
```bash
[root@masternode2 scripts]# sh cyclecloud-integrator.sh
Please enter the CycleCloud details to integrate with the Slurm scheduler

Enter Cluster Name: hpc1
Enter CycleCloud Username: user1
Enter CycleCloud Password:
Enter CycleCloud IP (e.g., 10.220.x.xx): 10.220.x.xx
------------------------------------------------------------------------------------------------------------------------------

Summary of entered details:
Cluster Name: hpc1
CycleCloud Username: user1
CycleCloud URL: https://10.220.x.xx

------------------------------------------------------------------------------------------------------------------------------
```
116+
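The integrator reaches CycleCloud over HTTPS at the IP address you enter, as the summary above shows. If the integration fails, a quick connectivity check from the scheduler node can rule out network problems. This is a minimal sketch, not part of the repository scripts; the function names are hypothetical, and the `-k` option assumes CycleCloud is using a self-signed certificate.

```shell
#!/bin/sh
# Hypothetical pre-check (not part of the repository scripts).
# Build the HTTPS URL that corresponds to a CycleCloud IP address or hostname.
cyclecloud_url() {
    printf 'https://%s\n' "$1"
}

# Return success if the CycleCloud web server answers at all.
# -k skips certificate verification (assumes a self-signed certificate);
# --max-time bounds the wait so the check cannot hang.
cyclecloud_reachable() {
    curl -k -s -o /dev/null --max-time 10 "$(cyclecloud_url "$1")"
}
```

Call `cyclecloud_reachable` with the same IP address you pass to the integrator. A nonzero status means the scheduler node could not complete an HTTPS request to CycleCloud, which usually points to routing or firewall rules rather than credentials.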
117+
### User and Group Setup (Optional)
118+
119+
- Ensure consistent user and group IDs across all nodes.
120+
- It is advisable to use a centralized user management system such as LDAP to maintain consistent UIDs and GIDs across all nodes.
121+
- This example uses the `useradd_example.sh` script to create a test user, `user1`, and a group for job submission. (The user `user1` already exists in CycleCloud.)
122+
123+
```bash
cd cyclecloud-slurm/cloud_bursting/slurm-23.11.9-1/scheduler
sh useradd_example.sh
```
127+
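To check that a user really has the same UID and GID on two nodes, you can compare the passwd entries from each node. The following is a minimal sketch with hypothetical helper names. It assumes you have saved the output of `getent passwd` from each node to a file, and it does not reflect what `useradd_example.sh` does internally.

```shell
#!/bin/sh
# Hypothetical helper: print "uid:gid" for a user from passwd-format input
# (the format produced by `getent passwd`), read from standard input.
passwd_ids() {
    awk -F: -v user="$1" '$1 == user { print $3 ":" $4; found = 1 }
                          END { exit !found }'
}

# Compare the IDs reported by two sources; succeed only if the user exists
# in both and the uid:gid pairs match.
same_ids() {
    user="$1"
    a=$(passwd_ids "$user" < "$2") || return 1
    b=$(passwd_ids "$user" < "$3") || return 1
    [ "$a" = "$b" ]
}
```

For example, after saving `getent passwd` from the scheduler and from a compute node to two files (the file names are up to you), `same_ids user1 scheduler.passwd node.passwd` succeeds only when both nodes agree. Mismatched IDs typically show up later as permission errors on the shared NFS mounts.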
128+
### Testing the Setup
129+
130+
- Log in as a test user (e.g., `user1`) on the Scheduler node.
131+
- Submit a test job to verify that the setup is functioning correctly.
132+
133+
```bash
su - user1
srun hostname &
```
137+
Output:
138+
```bash
[root@masternode2 scripts]# su - user1
Last login: Tue May 14 04:54:51 UTC 2024 on pts/0
[user1@masternode2 ~]$ srun hostname &
[1] 43448
[user1@masternode2 ~]$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
                 1       hpc hostname    user1 CF       0:04      1 hpc1-hpc-1
[user1@masternode2 ~]$ hpc1-hpc-1
```
148+
![Node Creation](../images/slurm-cloud-burst/cyclecloud-ui-new-node.png)
149+
150+
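The job first appears in the `CF` (configuring) state while CycleCloud provisions the node. Once the node is ready, `srun` prints the node's hostname (`hpc1-hpc-1` in this example), which confirms a successful integration with CycleCloud.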
151+
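Because a burst node can take several minutes to start, you may prefer to poll the job state instead of running `squeue` by hand. The sketch below is one way to do that, under the assumption that `squeue` is on the `PATH` of the test user. The helper name `wait_for_job` and the `POLL_INTERVAL` and `MAX_POLLS` variables are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: wait until a Slurm job reaches the RUNNING state or
# leaves the queue. POLL_INTERVAL (seconds) and MAX_POLLS are illustrative.
wait_for_job() {
    job_id="$1"
    polls=0
    while [ "$polls" -lt "${MAX_POLLS:-60}" ]; do
        # -h drops the header; -o %T prints the long state name (e.g. CONFIGURING).
        state=$(squeue -h -j "$job_id" -o %T 2>/dev/null)
        case "$state" in
            RUNNING) echo "job $job_id is running"; return 0 ;;
            "")      echo "job $job_id has left the queue"; return 0 ;;
            *)       echo "job $job_id state: $state" ;;
        esac
        polls=$((polls + 1))
        sleep "${POLL_INTERVAL:-10}"
    done
    echo "timed out waiting for job $job_id" >&2
    return 1
}
```

One way to use it is to submit a batch job with `job_id=$(sbatch --parsable --wrap=hostname)` and then call `wait_for_job "$job_id"`. A very short job such as `hostname` may finish between polls, which is why an empty state is treated as success rather than as an error.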
152+
For further details and advanced configurations, refer to the scripts and documentation within this repository.
