docs/README.md: 52 additions & 24 deletions
@@ -335,10 +335,14 @@ destroy the cluster or change it manually on the Puppet server.
 
 Since Magic Castle configuration is managed with git, it is possible to specify
 which version of the configuration you wish to use. Typically, it will match the
-version number of the release you have downloaded (i.e: `9.3`).
+version number of the release you have downloaded (e.g., `15.1.0`).
 
 **Requirement**: Must refer to a git commit, tag or branch existing
-in the git repository pointed by `config_git_url`.
+in the git repository pointed to by `config_git_url`. It cannot be an empty string.
+
+**Warning**: The validity of the string as a git reference is not verified. In the
+event it is invalid, Magic Castle defaults to using the latest release tag available
+and logs a warning in the puppet server message of the day (`/etc/motd`).
 
 **Post build modification effect**: none. To change the Puppet configuration version,
 destroy the cluster or change it manually on the Puppet server.
@@ -617,7 +621,7 @@ available models per region
 
 ##### Incus
 
-- `target`: name of the [specific cluster member](https://linuxcontainers.org/incus/docs/main/howto/cluster_manage_instance/#launch-an-instance-on-a-specific-cluster-member) to deploy the instance. **Only use with Incus cluster.**
+- `target`: name of the [specific cluster member](https://linuxcontainers.org/incus/docs/main/howto/cluster_manage_instance/#launch-an-instance-on-a-specific-cluster-member) to deploy the instance. **Only use with Incus cluster.**
 
 #### 4.7.3 Post build modification effect
 
@@ -1383,35 +1387,59 @@ for more information.
 
 ## 8. Deployment
 
-To create the resources defined by your main, enter the following command
-```
+To create the resources defined in your Terraform configuration, run:
+
+```bash
 terraform apply
 ```
 
-The command will produce the same output as the `plan` command, but after
-the output it will ask for a confirmation to perform the proposed actions.
-Enter `yes`.
+This command will first display the execution plan (equivalent to `terraform plan`) and then prompt you to confirm the proposed actions. Type `yes` to proceed.
+
+Terraform will then create the infrastructure resources defined in the configuration. This step typically takes a few minutes. Once completed, Terraform will output:
+
+- Guest account usernames and passwords
+- The sudo-enabled username
+- The floating IP address of the login node
+
+### Important: Cluster Readiness
+
+Although Terraform reports completion once the connection information is displayed,
+**the cluster is not immediately ready for use**.
+
+Instance creation is only the first phase of the cluster build. A second, automated configuration phase follows, during which Magic Castle installs and configures core services such as:
+user accounts, FreeIPA, Slurm, JupyterHub, etc.
 
-Terraform will then proceed to create the resources defined by the
-configuration file. It should take a few minutes. Once the creation process
-is completed, Terraform will output the guest account usernames and password,
-the sudoer username and the floating ip of the login
-node.
+This configuration phase typically takes **approximately 15 minutes** after the instances are created.
 
-**Warning**: although the instance creation process is finished once Terraform
-outputs the connection information, you will not be able to
-connect and use the cluster immediately. The instance creation is only the
-first phase of the cluster-building process. The configuration: the
-creation of the user accounts, installation of FreeIPA, Slurm, configuration
-of JupyterHub, etc.; takes around 15 minutes after the instances are created.
+### Instance Configuration Process
+
+Each instance goes through a two-stage configuration process:
+
+1. **cloud-init**
+   - Upgrades operating system packages
+   - Installs Puppet
+2. **Puppet**
+   - Installs and configures software based on the instance role, as defined by instance tags (e.g. `node`)
+If an error occurs during the first (cloud-init) stage, a warning is displayed in the instance
+message of the day (e.g., `/etc/motd`). The failed commands are recorded in:
+
+```
+/run/cloud-init-failed
+```
 
-Once it is booted, you can follow an instance configuration process by looking at:
+Because successful completion of the first stage is required for the second stage to proceed, the configuration process halts if cloud-init fails.
 
-* `/var/log/cloud-init-output.log`
-* `journalctl -u puppet`
+You may resume the configuration by manually re-running the failed commands listed in `/run/cloud-init-failed` once the underlying issue has been resolved.
 
-If unexpected problems occur during configuration, you can provide these
-logs to the authors of Magic Castle to help you debug.
+Failures during the first stage are rare and are most often caused by external dependencies, such as temporary unavailability of GitHub or package repositories.
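
The recovery step described in the added text (inspecting `/run/cloud-init-failed` before re-running commands) could be sketched as a small shell helper. This is only an illustration: the function name `report_cloud_init_failures` is hypothetical and not part of Magic Castle, and it assumes that a missing or empty file means no failure was recorded.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of Magic Castle.
# Reports whether the cloud-init stage recorded any failed commands.
# Assumption: a missing or empty file means no failure was recorded.
report_cloud_init_failures() {
  # Path taken from the documentation; can be overridden for testing.
  local failed_file="${1:-/run/cloud-init-failed}"

  # -s is true only if the file exists and is not empty.
  if [ -s "$failed_file" ]; then
    echo "cloud-init recorded failed commands in $failed_file:"
    cat "$failed_file"
    return 1
  fi

  echo "No failed cloud-init commands recorded in $failed_file."
  return 0
}
```

Calling `report_cloud_init_failures` on an instance prints the recorded commands and returns a non-zero status when there is something to fix and re-run; a zero status suggests the first stage did not record a failure.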