|
1 | | -# oci-hpc-terraform-arch |
| 1 | +# hpc-quickstart |
| 2 | + |
2 | 3 | <pre> |
3 | 4 | `-/+++++++++++++++++/-.` |
4 | 5 | `/syyyyyyyyyyyyyyyyyyyyyyys/. |
5 | 6 | :yyyyo/-...............-/oyyyy/ |
6 | 7 | /yyys- .oyyy+ |
7 | 8 | .yyyy` `syyy- |
8 | | -:yyyo /yyy/ Oracle Cloud HPC cluster demo |
9 | | -.yyyy` `syyy- https://github.com/oci-hpc/oci-hpc-terraform-arch |
| 9 | +:yyyo /yyy/ Oracle Cloud HPC Cluster Demo |
| 10 | +.yyyy` `syyy- https://github.com/oracle/hpc-quickstart |
10 | 11 | /yyys. .oyyyo |
11 | 12 | /yyyyo:-...............-:oyyyy/` |
12 | 13 | `/syyyyyyyyyyyyyyyyyyyyyyys+. |
13 | 14 | `.:/+ooooooooooooooo+/:.` |
14 | 15 | ` |
15 | 16 | </pre> |
| 17 | + |
16 | 18 | High Performance Computing and storage in the cloud can be confusing, and it can be difficult to determine where to start. This repository is designed to be a first step in exploring a cloud-based HPC storage and compute architecture. There are many different configurations and deployment methods that could be used, but this repository focuses on a bare metal compute system deployed with Terraform. After deployment, a fully independent and functioning IaaS HPC compute cluster is available, based on the architecture below. |
17 | 19 |
|
18 | 20 | This deployment is an example of cluster provisioning using Terraform and SaltStack. Terraform is used to provision infrastructure, while Salt is a configuration and cluster management system. |
19 | 21 |
|
20 | 22 | Salt configuration is stored under the ./salt directory, which contains pillar/ (variables) and salt/ (state) information. Read more about Salt in the documentation: https://docs.saltstack.com/en/latest/ |
21 | 23 |
|
22 | | -### Architecture |
23 | | - |
| 24 | +## Architecture |
24 | 25 |  |
25 | 26 |
|
26 | | -### Authentication |
27 | | - |
| 27 | +## Authentication |
28 | 28 | terraform.tvars contains the required authentication variables. |
29 | 29 |
|
30 | | -### Operations |
| 30 | +## Operations |
31 | 31 | Salt commands should be executed from the headnode. |
32 | 32 | IntelMPI installation: sudo salt '*' state.apply intelmpi |
33 | 33 |
|
34 | | -### SSH Key |
| 34 | +## SSH Key |
35 | 35 | An SSH key is generated each time for the environment and stored in the ./key.pem file. |
36 | 36 |
|
37 | | -### Networking |
38 | | - |
39 | | -- Public subnet |
40 | | - Headnode acts a jump host and it's placed in the public subnet. The subnet is open to SSH connections from everywhere. Other ports are closed and can be opened using custom-security-list in the OCI console/cli. |
41 | | - All connections from VCN are accepted. Host firewall service is disabled by default. |
42 | | - |
43 | | -- Private subnet |
44 | | - All connections from VCN are accepted. Public IP's are prohibited in the subnet. Internet access is provided by NAT gateway. |
| 37 | +## Networking |
| 38 | +* Public subnet - The headnode acts as a jump host and is placed in the public subnet. The subnet is open to SSH connections from everywhere. Other ports are closed and can be opened using custom-security-list in the OCI console/CLI. All connections from the VCN are accepted. The host firewall service is disabled by default. |
| 39 | +* Private subnet - All connections from the VCN are accepted. Public IPs are prohibited in the subnet. Internet access is provided by a NAT gateway. |
45 | 40 |
|
46 | | -### Roles |
47 | | - |
48 | | -Roles are set in variables.tf as |
49 | | -additional_headnode_roles, additional_worker_roles, additional_storage_roles or additional_role_all |
50 | | -Additional roles provide ability to install and configure applications defined as Salt states. |
| 41 | +## Roles |
| 42 | +Roles are set in variables.tf as additional_headnode_roles, additional_worker_roles, additional_storage_roles or additional_role_all. Additional roles provide the ability to install and configure applications defined as Salt states. |
51 | 43 |
|
52 | | -Example roles: |
53 | | -- intelmpi: provides configured Intel yum repository and installs IntelMPI distribution |
54 | | -- openmpi: installs OpenMPI from OL repository |
| 44 | +Example roles: |
| 45 | +* intelmpi: provides configured Intel yum repository and installs IntelMPI distribution |
| 46 | +* openmpi: installs OpenMPI from OL repository |
55 | 47 |
|
56 | | -### Storage |
57 | | -- Storage node require to be DenseIO shape (NVME devices are detected and configured). |
| 48 | +## Storage |
| 49 | +* Storage nodes must use a DenseIO shape (NVMe devices are detected and configured). |
58 | 50 |
|
59 | | -#### Filesystems |
| 51 | +### Filesystems |
60 | 52 |
|
61 | 53 | Storage role servers will be configured as filesystem nodes, while the headnode and worker nodes will act as clients. |
| 54 | +* GlusterFS (requires storage role) - To use GlusterFS, set storage_type to glusterfs. The filesystem will be created as :/gfs and mounted under /mnt/gluster. |
| 55 | +* BeeGFS (requires storage role) - To use BeeGFS, set storage_type to beegfs. The filesystem will be mounted under /mnt/beegfs. |
62 | 56 |
|
63 | | -- GlusterFS (requires storage role) |
64 | | - |
65 | | - To use BeeGFS set storage_type to beegfs. |
66 | | - Filesystem will be greated as :/gfs and mounted under /mnt/gluster |
67 | | - |
68 | | -- BeeGFS (requires storage role) |
69 | | - To use BeeGFS set storage_type to beegfs. |
70 | | - |
71 | | - Filesystem will be mounted under /mnt/beegfs |
72 | | - |
73 | | -#### NFS |
74 | | - |
75 | | -- Block volumes |
76 | | - Each node type can be configured with block volumes in the variables.tf |
| 57 | +### NFS |
| 58 | +* Block volumes - Each node type can be configured with block volumes in variables.tf. |
77 | 59 | The headnode will export the first block volume as an NFS share under /mnt/share (configured in salt/salt/nfs.sls). |
78 | 60 | Other block volume attachments need to be configured manually after cluster provisioning. |
79 | 61 |
|
80 | | -- FSS |
81 | | - File system service endpoint will be created in the private subnet and mounted on each node under /mnt/fss |
82 | | - |
83 | | - |
| 62 | +* FSS - File system service endpoint will be created in the private subnet and mounted on each node under /mnt/fss |
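
The Operations, SSH Key and NFS sections of the README in this diff can be tied together with a small helper, sketched below. It is a minimal illustration, not part of the repository: `HEADNODE_IP`, `SSH_USER`, `KEY_FILE`, `RUN` and the function name `run_on_headnode` are hypothetical, the `opc` login is an assumption about the image in use, and the address is a documentation-range placeholder. Only `./key.pem`, the `state.apply intelmpi` command and the `/mnt/share` mount point come from the README text.

```shell
#!/usr/bin/env bash
# Sketch only: these variables are hypothetical placeholders, not values from the repository.
HEADNODE_IP="${HEADNODE_IP:-203.0.113.10}"  # documentation-range address; replace with the headnode's public IP
SSH_USER="${SSH_USER:-opc}"                 # assumed login user; adjust for the image in use
KEY_FILE="${KEY_FILE:-./key.pem}"           # key file named in the README's SSH Key section

# Print the ssh command for a headnode command by default;
# set RUN=1 to execute it over SSH instead.
run_on_headnode() {
  if [ "${RUN:-0}" = "1" ]; then
    # The remote command is passed as one string so quotes such as '*' reach the remote shell intact.
    ssh -i "$KEY_FILE" "$SSH_USER@$HEADNODE_IP" "$1"
  else
    printf 'ssh -i %s %s@%s "%s"\n' "$KEY_FILE" "$SSH_USER" "$HEADNODE_IP" "$1"
  fi
}

run_on_headnode "sudo salt '*' test.ping"             # check that the minions respond
run_on_headnode "sudo salt '*' state.apply intelmpi"  # IntelMPI installation, as given in the README
run_on_headnode "df -h /mnt/share"                    # confirm the NFS share is mounted
```

Running the script as-is only prints the three commands, which makes it safe to review before setting `RUN=1` against a real cluster.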