Commit 361f42c

Merge pull request #302108 from tfitzmac/0701edit4
copy edit
2 parents: 8200ccf + 3d4dd78

10 files changed: +276 additions, −287 deletions

articles/cyclecloud/how-to/bursting/setup-instructions-for-cloud-bursting.md

Lines changed: 27 additions & 27 deletions
@@ -2,20 +2,20 @@
 title: Cloud Bursting Setup Instruction
 description: Learn how to set up Cloud bursting using Azure CycleCloud and Slurm.
 author: vinil-v
-ms.date: 04/17/2025
+ms.date: 07/01/2025
 ms.author: padmalathas
 ---
 
-# Setup Instructions
+# Setup instructions
 
-After we have the prerequisites ready, we can follow these steps to integrate the external Slurm Scheduler node with the CycleCloud cluster:
+After you prepare the prerequisites, follow these steps to integrate the external Slurm Scheduler node with the CycleCloud cluster:
 
-## Importing a Cluster Using the Slurm Headless Template in CycleCloud
+## Import a cluster by using the Slurm headless template in CycleCloud
 
-- This step must be executed on the **CycleCloud VM**.
-- Make sure that the **CycleCloud 8.6.4 VM** is running and accessible via the `cyclecloud` CLI.
-- Execute the `cyclecloud-project-build.sh` script and provide the desired cluster name (for example, `hpc1`). This sets a custom project based on the `cyclecloud-slurm-3.0.9` version and import the cluster using the Slurm headless template.
-- In the example provided, `<clustername>` is used as the cluster name. Choose any cluster name you like, but same name must be consistently used throughout the entire setup.
+- Run this step on the **CycleCloud VM**.
+- Make sure that the **CycleCloud 8.6.4 VM** is running and accessible through the `cyclecloud` CLI.
+- Run the `cyclecloud-project-build.sh` script and enter the cluster name you want (for example, `hpc1`). The script sets a custom project based on the `cyclecloud-slurm-3.0.9` version and imports the cluster by using the Slurm headless template.
+- In the following example, `<clustername>` is the cluster name. You can choose any cluster name you want, but you must use the same name throughout the entire setup.
 
 
 ```bash
@@ -45,20 +45,20 @@ Fetching CycleCloud project
 Uploading CycleCloud project to the locker
 ```
 
-## Slurm Scheduler Installation and Configuration
+## Slurm scheduler installation and configuration
 
-- A VM should be deployed using the specified **AlmaLinux HPC 8.7** or **Ubuntu HPC 22.04** image.
-- If you already have a Slurm Scheduler installed, you can skip this step. However, it's advisable to review the script to make sure it's compatible with your current setup.
-- Run the Slurm scheduler installation script (`slurm-scheduler-builder.sh`) and provide the cluster name (`<clustername>`) when prompted.
-- This script sets up the NFS server and installs and configures the Slurm Scheduler.
+- Deploy a VM by using the **AlmaLinux HPC 8.7** or **Ubuntu HPC 22.04** image.
+- If you already have a Slurm scheduler installed, you can skip this step. However, we recommend that you review the script to make sure it's compatible with your current setup.
+- Run the Slurm scheduler installation script (`slurm-scheduler-builder.sh`) and enter the cluster name (`<clustername>`) when prompted.
+- The script sets up the NFS server and installs and configures the Slurm scheduler.
 - If you're using an external NFS server, you can delete the NFS setup entries from the script.
 
 ```bash
 git clone https://github.com/Azure/cyclecloud-slurm.git
 cd cyclecloud-slurm/cloud_bursting/slurm-23.11.9-1/scheduler
 sh slurm-scheduler-builder.sh
 ```
-Output:
+Output:
 
 ```bash
 ------------------------------------------------------------------------------------------------------------------------------
@@ -74,20 +74,20 @@ Scheduler Hostname: <scheduler hostname>
 NFSServer IP Address: 10.222.xxx.xxx
 ```
 
-## CycleCloud UI Configuration
+## CycleCloud UI configuration
 
-- Access the **CycleCloud UI** and navigate to the settings for the `<clustername>` cluster.
-- Edit the cluster settings to configure the VM SKUs and networking options as needed.
+- Access the **CycleCloud UI** and go to the settings for the `<clustername>` cluster.
+- Edit the cluster settings to set up the VM versions and networking options.
 - In the **Network Attached Storage** section, enter the NFS server IP address for the `/sched` and `/shared` mounts.
 - On the Advance setting tab, from the dropdown menu choose the OS: either **Ubuntu 22.04** or **AlmaLinux 8** based on the scheduler VM.
-- Once all settings are configured, click **Save** and then **Start** the `<clustername>` cluster.
+- When you finish configuring the settings, select **Save** and then **Start** the `<clustername>` cluster.
 
 ![NFS settings](../../images/slurm-cloud-burst/cyclecloud-ui-config.png)
 
 ## CycleCloud Autoscaler Integration on Slurm Scheduler
 
-- Integrate Slurm with CycleCloud using the `cyclecloud-integrator.sh` script.
-- Provide CycleCloud details (username, password, and ip address) when prompted.
+- Integrate Slurm with CycleCloud by using the `cyclecloud-integrator.sh` script.
+- Enter your CycleCloud username, password, and IP address when prompted.
 
 ```bash
 cd cyclecloud-slurm/cloud_bursting/slurm-23.11.9-1/scheduler
@@ -115,19 +115,19 @@ CycleCloud URL: https://<ip address>
 
 ## User and Group Setup (Optional)
 
-- Ensure consistent user and group IDs across all nodes.
-- It's advisable to use a centralized User Management system like LDAP to maintain consistent UID and GID across all nodes.
-- In this example, we're using the `useradd_example.sh` script to create a test user `<username>` and a group for job submission. (User `<username>` already exists in CycleCloud)
+- Make sure user and group IDs are consistent across all nodes.
+- To keep UID and GID consistent across all nodes, use a centralized User Management system like LDAP.
+- In this example, use the `useradd_example.sh` script to create a test user `<username>` and a group for job submission. (User `<username>` already exists in CycleCloud)
 
 ```bash
 cd cyclecloud-slurm/cloud_bursting/slurm-23.11.9-1/scheduler
 sh useradd_example.sh
 ```
 
-## Testing the Setup
+## Testing the setup
 
-- Log in as a test user (example, `<username>`) on the Scheduler node.
-- Submit a test job to verify that the setup is functioning correctly.
+- Sign in as a test user (for example, `<username>`) on the scheduler node.
+- Submit a test job to verify that the setup is working correctly.
 
 ```bash
 su - <username>
@@ -146,6 +146,6 @@ Last login: Tue May 14 04:54:51 UTC 2024 on pts/0
 ```
 ![Node Creation](../../images/slurm-cloud-burst/cyclecloud-ui-new-node.png)
 
-You should see the job running successfully, indicating a successful integration with CycleCloud.
+You should see the job running successfully, which indicates a successful integration with CycleCloud.
 
 For more information and advanced configurations, see the scripts and documentation within this repository.
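The testing section in this file says to submit a test job as the test user but doesn't show one. A minimal batch script for that check might look like the following sketch; the job name, task count, and output path are illustrative assumptions, not taken from the docs:

```shell
#!/bin/bash
#SBATCH --job-name=burst-test        # illustrative name, not from the docs
#SBATCH --ntasks=2                   # requesting tasks prompts CycleCloud to autoscale nodes
#SBATCH --output=burst-test.%j.out   # %j expands to the Slurm job ID

# Print the execute-node hostnames so you can confirm the job ran on
# CycleCloud-provisioned compute nodes rather than on the scheduler itself.
# 'srun' only exists on a Slurm node; fall back to plain hostname elsewhere.
if command -v srun >/dev/null 2>&1; then
  srun hostname
else
  hostname
fi
```

Submit it with `sbatch` after switching to the test user (`su - <username>`), then watch `squeue` and the CycleCloud UI for the new nodes.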

articles/cyclecloud/how-to/bursting/slurm-cloud-bursting-setup.md

Lines changed: 28 additions & 29 deletions
@@ -2,70 +2,69 @@
 title: Cloud Bursting Using Azure CycleCloud and Slurm
 description: Learn how to configure Cloud bursting using Azure CycleCloud and Slurm.
 author: vinil-v
-ms.date: 04/17/2025
+ms.date: 07/01/2025
 ms.author: padmalathas
 ---
 
-# What is Cloud Bursting?
+# What is cloud bursting?
 
-Cloud bursting is a configuration in cloud computing that allows an organization to handle peaks in IT demand by using a combination of private and public clouds. When the resources in a private cloud reach their maximum capacity, the overflow traffic is directed to a public cloud to ensure there's no interruption in services. This setup provides flexibility and cost savings, as you only pay for the supplemental resources when there's a demand for them.
+Cloud bursting is a configuration in cloud computing that helps your organization handle peaks in IT demand by using a combination of private and public clouds. When the resources in a private cloud reach their maximum capacity, the configuration directs the overflow traffic to a public cloud. This setup ensures there's no interruption in services. Cloud bursting provides flexibility and cost savings, as you only pay for the supplemental resources when there's a demand for them.
 
-For example, an application can run on a private cloud and "burst" to a public cloud only when necessary to meet peak demands. This approach helps avoid the costs associated with maintaining extra capacity that isn't always in use.
+For example, an application can run on a private cloud and "burst" to a public cloud only when necessary to meet peak demands. This approach helps you avoid the costs associated with maintaining extra capacity that isn't always in use.
 
-Cloud bursting can be used in various scenarios, such as enabling on-premises workloads to be sent to the cloud for processing, known as hybrid HPC (High-Performance Computing). It allows users to optimize their resource utilization and cost efficiency while accessing the scalability and flexibility of the cloud.
+You can use cloud bursting in various scenarios, such as enabling on-premises workloads to be sent to the cloud for processing, known as hybrid HPC (High-Performance Computing). It allows you to optimize your resource utilization and cost efficiency while accessing the scalability and flexibility of the cloud.
 
 ## Overview
 
 This document offers a step-by-step guide on installing and configuring a Slurm scheduler to burst computing resources into the cloud using Azure CycleCloud. It explains how to create a hybrid HPC environment by extending on-premises Slurm clusters into Azure, allowing for seamless access to scalable and flexible cloud computing resources. The guide provides a practical example of optimizing compute capacity by integrating local infrastructure with cloud-based solutions.
 
 
-## Requirements to Setup Slurm Cloud Bursting Using CycleCloud on Azure
+## Requirements to Set Up Slurm Cloud Bursting Using CycleCloud on Azure
 
 ## Azure subscription account
-You must obtain an Azure subscription or be assigned as an Owner role of the subscription.
+You must have an Azure subscription or be assigned the Owner role for a subscription.
 
-* To create an Azure subscription, go to the [Create a Subscription](/azure/cost-management-billing/manage/create-subscription#create-a-subscription) documentation.
+* To create an Azure subscription, see [Create a Subscription](/azure/cost-management-billing/manage/create-subscription#create-a-subscription).
 * To access an existing subscription, go to the [Azure portal](https://portal.azure.com/).
 
 ## Network infrastructure
-If you intend to create a Slurm cluster entirely within Azure, you must deploy both the head nodes and the CycleCloud compute nodes within a single Azure Virtual Network (VNET).
+To create a Slurm cluster entirely within Azure, deploy both the head nodes and the CycleCloud compute nodes within a single Azure Virtual Network (VNET).
 
 ![Slurm cluster](../../images/slurm-cloud-burst/slurm-cloud-burst-architecture.png)
 
-To create a hybrid HPC cluster with head nodes on your on-premises corporate network and compute nodes in Azure, set up a [Site-to-Site](/azure/vpn-gateway/tutorial-site-to-site-portal) VPN or an [ExpressRoute](/azure/expressroute/) connection. This links your network to the Azure VNET. The head nodes must be able to connect to Azure services online. You might need to work with your network administrator to set this up.
-
-## Network Ports and Security
-The following NSG rules must be configured for successful communication between Master node, CycleCloud server, and compute nodes.
+To create a hybrid HPC cluster with head nodes on your on-premises corporate network and compute nodes in Azure, set up a [Site-to-Site](/azure/vpn-gateway/tutorial-site-to-site-portal) VPN or an [ExpressRoute](/azure/expressroute/) connection. This setup links your network to the Azure VNET. The head nodes must be able to connect to Azure services online. You might need to work with your network administrator to set up this connection.
 
+## Network ports and security
+To enable communication between the primary node, CycleCloud server, and compute nodes, configure the following NSG rules.
 
 | **Service** | **Port** | **Protocol** | **Direction** | **Purpose** | **Requirement** |
 |-------------|----------|--------------|---------------|-------------|-----------------|
-| **SSH (Secure Shell)** | 22 | TCP | Inbound/Outbound | Secure command-line access to the Slurm Master node | Open on both on-premises firewall and Azure NSGs |
-| **Slurm Control (slurmctld, slurmd)** | 6817, 6818 | TCP | Inbound/Outbound | Communication between Slurm Master and compute nodes | Open in on-premises firewall and Azure NSGs |
-| **Munge Authentication Service** | 4065 | TCP | Inbound/Outbound | Authentication between Slurm Master and compute nodes | Open on both on-premises network and Azure NSGs |
-| **CycleCloud Service** | 443 | TCP | Outbound | Communication between Slurm Master node and Azure CycleCloud | Allow outbound connections to Azure CycleCloud services from the Slurm Master node |
-| **NFS ports** | 2049 | TCP | Inbound/Outbound | Shared filesystem access between Master node and Azure CycleCloud | Open on both on-premises network and Azure NSGs |
-| **LDAP port** (Optional) | 389 | TCP | Inbound/Outbound | Centralized authentication mechanism for user management | Open on both on-premises network and Azure NSGs
+| **SSH (Secure Shell)** | 22 | TCP | Inbound/Outbound | Secure command-line access to the Slurm primary node | Open on both on-premises firewall and Azure NSGs |
+| **Slurm Control (slurmctld, slurmd)** | 6817, 6818 | TCP | Inbound/Outbound | Communication between Slurm primary and compute nodes | Open in on-premises firewall and Azure NSGs |
+| **Munge Authentication Service** | 4065 | TCP | Inbound/Outbound | Authentication between Slurm primary and compute nodes | Open on both on-premises network and Azure NSGs |
+| **CycleCloud Service** | 443 | TCP | Outbound | Communication between Slurm primary node and Azure CycleCloud | Allow outbound connections to Azure CycleCloud services from the Slurm primary node |
+| **NFS ports** | 2049 | TCP | Inbound/Outbound | Shared filesystem access between primary node and Azure CycleCloud | Open on both on-premises network and Azure NSGs |
+| **LDAP port** (Optional) | 389 | TCP | Inbound/Outbound | Centralized authentication mechanism for user management | Open on both on-premises network and Azure NSGs
 
-Refer [Slurm Network Configuration Guide](https://slurm.schedmd.com/network.html)
+See [Slurm Network Configuration Guide](https://slurm.schedmd.com/network.html).
 
-## Software Requirement
+## Software requirements
 
-- **OS Version**: AlmaLinux release 8.x or Ubuntu 22.04
-- **CycleCloud Version**: 8.x or later
-- **CycleCloud-Slurm Project Version**: 3.0.x
+- **OS version**: AlmaLinux release 8.x or Ubuntu 22.04
+- **CycleCloud version**: 8.x or later
+- **CycleCloud-Slurm project version**: 3.0.x
 
-## NFS File server
-A shared file system between the external Slurm Scheduler node and the CycleCloud cluster. You can use Azure NetApp Files, Azure Files, NFS, or other methods to mount the same file system on both sides. In this example, we're using a Scheduler VM as an NFS server.
+## NFS file server
+A shared file system between the external Slurm scheduler node and the CycleCloud cluster. You can use Azure NetApp Files, Azure Files, NFS, or other methods to mount the same file system on both sides. In this example, use a scheduler VM as an NFS server.
 
-## Centralized User management system (LDAP or AD)
+## Centralized user management system (LDAP or AD)
 In HPC environments, maintaining consistent user IDs (UIDs) and group IDs (GIDs) across the cluster is critical for seamless user access and resource management. A centralized user management system, such as LDAP or Active Directory (AD), ensures that UIDs and GIDs are synchronized across all compute nodes and storage systems.
 
 > [!Important]
 >
-> For more information on how to setup and instructions, see the blog post about [Slurm Cloud Bursting Using CycleCloud on Azure](https://techcommunity.microsoft.com/blog/azurehighperformancecomputingblog/setting-up-slurm-cloud-bursting-using-cyclecloud-on-azure/4140922).
+> For more information on how to set up a centralized user management system, see the blog post about [Slurm Cloud Bursting Using CycleCloud on Azure](https://techcommunity.microsoft.com/blog/azurehighperformancecomputingblog/setting-up-slurm-cloud-bursting-using-cyclecloud-on-azure/4140922).
 
-### Next Steps
+### Next steps
 
 * [GitHub repo - cyclecloud-slurm](https://github.com/Azure/cyclecloud-slurm/tree/master)
 * [Azure CycleCloud Documentation](../../overview.md)
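The NSG port table in this file maps directly onto Azure NSG rules. As a rough sketch, the inbound side could be generated with a loop like the one below; the resource group and NSG names are placeholders, and the loop only prints the `az network nsg rule create` commands rather than running them:

```shell
# Emit one 'az network nsg rule create' command per TCP port from the table.
# RG and NSG are placeholder names; review names and priorities before running for real.
RG="my-rg"
NSG="slurm-nsg"
PRIO=100
for entry in "ssh:22" "slurmctld:6817" "slurmd:6818" "munge:4065" "nfs:2049" "ldap:389"; do
  name="${entry%%:*}"
  port="${entry##*:}"
  echo az network nsg rule create --resource-group "$RG" --nsg-name "$NSG" \
    --name "allow-$name" --priority "$PRIO" --direction Inbound \
    --access Allow --protocol Tcp --destination-port-ranges "$port"
  PRIO=$((PRIO + 10))   # NSG rules need unique priorities
done
```

Port 443 to the CycleCloud service is outbound-only in the table, so it isn't in this inbound loop. Pipe the output to a file, review it, and run it with the Azure CLI once it matches your environment.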
