
Commit 64c68ba

Merge branch 'aws:main' into add-debug-flag
2 parents 2ff708a + 36140e3 commit 64c68ba

10 files changed: +722 -178 lines changed

README.md

Lines changed: 13 additions & 0 deletions
@@ -826,6 +826,19 @@ status = stack.get_status(region="us-west-2")
 print(status)
 ```
 
+#### Deleting a Cluster Stack
+
+```python
+# Delete with custom logger
+import logging
+logger = logging.getLogger(__name__)
+HpClusterStack.delete("my-stack-name", region="us-west-2", logger=logger)
+
+# Delete with retained resources (only works on DELETE_FAILED stacks)
+HpClusterStack.delete("my-stack-name", retain_resources=["S3Bucket", "EFSFileSystem"])
+
+```
+
 ### Training SDK
 
 #### Creating a Training Job
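
Reviewer note on the `retain_resources` comment in the added README example: CloudFormation's `DeleteStack` API only honors a `RetainResources` list for stacks already in the `DELETE_FAILED` state, which is presumably where the restriction comes from. The sketch below shows how a wrapper might assemble that request. It is an assumption about the shape of the call, not the SDK's actual implementation of `HpClusterStack.delete`. `build_delete_request` is a hypothetical helper; `StackName` and `RetainResources` are real parameters of boto3's `delete_stack`.

```python
from typing import Dict, List, Optional


def build_delete_request(
    stack_name: str,
    retain_resources: Optional[List[str]] = None,
) -> Dict[str, object]:
    """Assemble keyword arguments for a CloudFormation delete_stack call.

    Hypothetical helper for illustration; not part of the HyperPod SDK.
    """
    if not stack_name:
        raise ValueError("stack_name must be non-empty")

    request: Dict[str, object] = {"StackName": stack_name}
    # RetainResources is only meaningful for DELETE_FAILED stacks,
    # so leave the key out entirely when nothing is being retained.
    if retain_resources:
        request["RetainResources"] = list(retain_resources)
    return request


request = build_delete_request("my-stack-name", ["S3Bucket", "EFSFileSystem"])
print(request)
```

With boto3 installed and credentials configured, the result could be passed on as `boto3.client("cloudformation", region_name="us-west-2").delete_stack(**request)`.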

doc/cli/cluster_management/cli_cluster_management.md

Lines changed: 125 additions & 111 deletions
@@ -9,18 +9,20 @@ Complete reference for SageMaker HyperPod cluster management parameters and conf
 ```
 
 * [Initialize Configuration](#hyp-init)
+* [Configure Parameters](#hyp-configure)
+* [Validate Configuration](#hyp-validate)
+* [Reset Configuration](#hyp-reset)
 * [Create Cluster Stack](#hyp-create)
+
 * [Update Cluster](#hyp-update-cluster)
 * [List Cluster Stacks](#hyp-list-cluster-stack)
 * [Describe Cluster Stack](#hyp-describe-cluster-stack)
+* [Delete Cluster Stack](#hyp-delete-cluster-stack)
 * [List HyperPod Clusters](#hyp-list-cluster)
 * [Set Cluster Context](#hyp-set-cluster-context)
 * [Get Cluster Context](#hyp-get-cluster-context)
 * [Get Monitoring](#hyp-get-monitoring)
 
-* [Configure Parameters](#hyp-configure)
-* [Validate Configuration](#hyp-validate)
-* [Reset Configuration](#hyp-reset)
 
 ## hyp init
 
@@ -46,6 +48,108 @@ The `resource_name_prefix` parameter in the generated `config.yaml` file serves
 **Cluster stack names must be unique within each AWS region.** If you attempt to create a cluster stack with a name that already exists in the same region, the deployment will fail.
 ```
 
+## hyp configure
+
+Configure cluster parameters interactively or via command line.
+
+```{important}
+**Pre-Deployment Configuration**: This command modifies local `config.yaml` files **before** cluster creation. For updating **existing, deployed clusters**, use `hyp update cluster` instead.
+```
+
+#### Syntax
+
+```bash
+hyp configure [OPTIONS]
+```
+
+#### Parameters
+
+This command dynamically supports all configuration parameters available in the current template's schema. Common parameters include:
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| `--resource-name-prefix` | TEXT | No | Prefix for all AWS resources |
+| `--create-hyperpod-cluster-stack` | BOOLEAN | No | Create HyperPod Cluster Stack |
+| `--hyperpod-cluster-name` | TEXT | No | Name of SageMaker HyperPod Cluster |
+| `--create-eks-cluster-stack` | BOOLEAN | No | Create EKS Cluster Stack |
+| `--kubernetes-version` | TEXT | No | Kubernetes version |
+| `--eks-cluster-name` | TEXT | No | Name of the EKS cluster |
+| `--create-helm-chart-stack` | BOOLEAN | No | Create Helm Chart Stack |
+| `--namespace` | TEXT | No | Namespace to deploy HyperPod Helm chart |
+| `--node-provisioning-mode` | TEXT | No | Continuous provisioning mode |
+| `--node-recovery` | TEXT | No | Node recovery setting ("Automatic" or "None") |
+| `--create-vpc-stack` | BOOLEAN | No | Create VPC Stack |
+| `--vpc-id` | TEXT | No | Existing VPC ID |
+| `--vpc-cidr` | TEXT | No | VPC CIDR block |
+| `--create-security-group-stack` | BOOLEAN | No | Create Security Group Stack |
+| `--enable-hp-inference-feature` | BOOLEAN | No | Enable inference operator |
+| `--stage` | TEXT | No | Deployment stage ("gamma" or "prod") |
+| `--create-fsx-stack` | BOOLEAN | No | Create FSx Stack |
+| `--storage-capacity` | INTEGER | No | FSx storage capacity in GiB |
+| `--tags` | JSON | No | Resource tags as JSON object |
+
+**Note:** The exact parameters available depend on your current template type and version. Run `hyp configure --help` to see all available options for your specific configuration.
+
+## hyp validate
+
+Validate the current directory's configuration file syntax and structure.
+
+#### Syntax
+
+```bash
+# Validate current configuration syntax
+hyp validate
+
+# Example output on success
+✔️ config.yaml is valid!
+
+# Example output with syntax errors
+❌ Config validation errors:
+– kubernetes_version: Field is required
+– vpc_cidr: Expected string, got number
+```
+
+#### Parameters
+
+No parameters required.
+
+```{note}
+This command performs **syntactic validation only** of the `config.yaml` file against the appropriate schema. It checks:
+
+- **YAML syntax**: Ensures file is valid YAML
+- **Required fields**: Verifies all mandatory fields are present
+- **Data types**: Confirms field values match expected types (string, number, boolean, array)
+- **Schema structure**: Validates against the template's defined structure
+
+This command performs syntactic validation only and does **not** verify the actual validity of values (e.g., whether AWS regions exist, instance types are available, or resources can be created).
+
+**Prerequisites**
+
+- Must be run in a directory where `hyp init` has created configuration files
+- A `config.yaml` file must exist in the current directory
+
+**Output**
+
+- **Success**: Displays confirmation message if syntax is valid
+- **Errors**: Lists specific syntax errors with field names and descriptions
+```
+
+
+## hyp reset
+
+Reset the current directory's config.yaml to default values.
+
+#### Syntax
+
+```bash
+hyp reset
+```
+
+#### Parameters
+
+No parameters required.
+
+
 ## hyp create
 
 Create a new HyperPod cluster stack using the provided configuration.
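
Reviewer note on the relocated `hyp validate` section: the syntactic checks it lists (required fields, data types) can be sketched in a few lines. This is a stdlib-only illustration under stated assumptions, not the CLI's real validator, which may use a schema library. `SCHEMA`, `describe_type`, and `validate_config` are hypothetical names, the schema contents are made up for the example, and YAML parsing is left out so the config is passed as an already-loaded dict.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical schema: field name -> (expected type name, required?)
SCHEMA: Dict[str, Tuple[str, bool]] = {
    "kubernetes_version": ("string", True),
    "vpc_cidr": ("string", False),
    "storage_capacity": ("number", False),
}


def describe_type(value: Any) -> str:
    """Map a Python value to the type names used in the docs."""
    # bool is a subclass of int, so it has to be checked first.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return type(value).__name__


def validate_config(config: Dict[str, Any],
                    schema: Dict[str, Tuple[str, bool]] = SCHEMA) -> List[str]:
    """Return error messages; an empty list means the config is valid."""
    errors: List[str] = []
    for field, (expected, required) in schema.items():
        if config.get(field) is None:
            if required:
                errors.append(f"{field}: Field is required")
            continue
        actual = describe_type(config[field])
        if actual != expected:
            errors.append(f"{field}: Expected {expected}, got {actual}")
    return errors


errors = validate_config({"vpc_cidr": 10})
for message in errors:
    print(f"- {message}")
```

As in the documented behavior, a check like this says nothing about whether a value is usable (for example, whether a CIDR block is well formed); it only confirms presence and type.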
@@ -128,6 +232,24 @@ hyp describe cluster-stack STACK-NAME [OPTIONS]
 | `--region` | TEXT | No | AWS region of the stack |
 | `--debug` | FLAG | No | Enable debug logging |
 
+
+## hyp delete cluster-stack
+
+Delete a HyperPod cluster stack. Removes the specified CloudFormation stack and all associated AWS resources. This operation cannot be undone.
+
+#### Syntax
+```bash
+hyp delete cluster-stack <stack-name> [OPTIONS]
+```
+
+#### Parameters
+| Option | Requirement | Description |
+|--------|-------------|-------------|
+| `--region <region>` | Required | The AWS region where the stack exists. |
+| `--retain-resources <list>` | Optional | Comma-separated list of logical resource IDs to retain during deletion (only works on DELETE_FAILED stacks). Resource names are shown in failed deletion output, or use AWS CLI: `aws cloudformation list-stack-resources --stack-name STACK_NAME --region REGION`. Example: `S3Bucket-TrainingData,EFSFileSystem-Models` |
+| `--debug` | Optional | Enable debug mode for detailed logging. |
+
+
 ## hyp list-cluster
 
 List SageMaker HyperPod clusters with capacity information.
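
Reviewer note on `--retain-resources`: the option takes a comma-separated list and only applies to `DELETE_FAILED` stacks, so a CLI would need to split the value and guard on the stack status before calling CloudFormation. The sketch below shows one way to do both. `parse_retain_resources` and `check_retain_allowed` are hypothetical helpers written for this note; they are not taken from the `hyp` source.

```python
from typing import List


def parse_retain_resources(raw: str) -> List[str]:
    """Split a comma-separated option value into logical resource IDs.

    Surrounding whitespace and empty entries (e.g. a trailing comma) are dropped.
    """
    return [item.strip() for item in raw.split(",") if item.strip()]


def check_retain_allowed(stack_status: str, retain: List[str]) -> None:
    """Reject a retain list unless the stack is already in DELETE_FAILED."""
    if retain and stack_status != "DELETE_FAILED":
        raise ValueError(
            "retain resources can only be used on DELETE_FAILED stacks; "
            f"current status is {stack_status}"
        )


retain = parse_retain_resources("S3Bucket, EFSFileSystem,")
print(retain)

# Allowed: the stack has already failed a deletion attempt.
check_retain_allowed("DELETE_FAILED", retain)
```

Failing early with a clear message avoids sending CloudFormation a request it would reject anyway.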
@@ -201,114 +323,6 @@ hyp get-monitoring [OPTIONS]
 | `--prometheus` | FLAG | No | Return Prometheus workspace URL |
 | `--list` | FLAG | No | Return list of available metrics |
 
-## hyp configure
-
-Configure cluster parameters interactively or via command line.
-
-```{important}
-**Pre-Deployment Configuration**: This command modifies local `config.yaml` files **before** cluster creation. For updating **existing, deployed clusters**, use `hyp update cluster` instead.
-```
-
-#### Syntax
-
-```bash
-hyp configure [OPTIONS]
-```
-
-#### Parameters
-
-This command dynamically supports all configuration parameters available in the current template's schema. Common parameters include:
-
-| Parameter | Type | Required | Description |
-|-----------|------|----------|-------------|
-| `--resource-name-prefix` | TEXT | No | Prefix for all AWS resources |
-| `--create-hyperpod-cluster-stack` | BOOLEAN | No | Create HyperPod Cluster Stack |
-| `--hyperpod-cluster-name` | TEXT | No | Name of SageMaker HyperPod Cluster |
-| `--create-eks-cluster-stack` | BOOLEAN | No | Create EKS Cluster Stack |
-| `--kubernetes-version` | TEXT | No | Kubernetes version |
-| `--eks-cluster-name` | TEXT | No | Name of the EKS cluster |
-| `--create-helm-chart-stack` | BOOLEAN | No | Create Helm Chart Stack |
-| `--namespace` | TEXT | No | Namespace to deploy HyperPod Helm chart |
-| `--node-provisioning-mode` | TEXT | No | Continuous provisioning mode |
-| `--node-recovery` | TEXT | No | Node recovery setting ("Automatic" or "None") |
-| `--create-vpc-stack` | BOOLEAN | No | Create VPC Stack |
-| `--vpc-id` | TEXT | No | Existing VPC ID |
-| `--vpc-cidr` | TEXT | No | VPC CIDR block |
-| `--create-security-group-stack` | BOOLEAN | No | Create Security Group Stack |
-| `--enable-hp-inference-feature` | BOOLEAN | No | Enable inference operator |
-| `--stage` | TEXT | No | Deployment stage ("gamma" or "prod") |
-| `--create-fsx-stack` | BOOLEAN | No | Create FSx Stack |
-| `--storage-capacity` | INTEGER | No | FSx storage capacity in GiB |
-| `--tags` | JSON | No | Resource tags as JSON object |
-
-**Note:** The exact parameters available depend on your current template type and version. Run `hyp configure --help` to see all available options for your specific configuration.
-
-## hyp validate
-
-Validate the current directory's configuration file syntax and structure.
-
-#### Syntax
-
-```bash
-hyp validate
-```
-
-#### Parameters
-
-No parameters required.
-
-```{note}
-This command performs **syntactic validation only** of the `config.yaml` file against the appropriate schema. It checks:
-
-- **YAML syntax**: Ensures file is valid YAML
-- **Required fields**: Verifies all mandatory fields are present
-- **Data types**: Confirms field values match expected types (string, number, boolean, array)
-- **Schema structure**: Validates against the template's defined structure
-
-This command performs syntactic validation only and does **not** verify the actual validity of values (e.g., whether AWS regions exist, instance types are available, or resources can be created).
-
-**Prerequisites**
-
-- Must be run in a directory where `hyp init` has created configuration files
-- A `config.yaml` file must exist in the current directory
-
-**Output**
-
-- **Success**: Displays confirmation message if syntax is valid
-- **Errors**: Lists specific syntax errors with field names and descriptions
-```
-
-
-#### Syntax
-
-```bash
-# Validate current configuration syntax
-hyp validate
-
-# Example output on success
-✔️ config.yaml is valid!
-
-# Example output with syntax errors
-❌ Config validation errors:
-– kubernetes_version: Field is required
-– vpc_cidr: Expected string, got number
-```
-
-## hyp reset
-
-Reset the current directory's config.yaml to default values.
-
-#### Syntax
-
-```bash
-hyp reset
-```
-
-#### Parameters
-
-No parameters required.
-
-
 
 ## Parameter Reference
 