Commit de45978

add example notebooks to documentation, add delete SDK command to readme, update init experience documentation flow
1 parent 6f192e8 commit de45978

9 files changed: +450 -166 lines changed

README.md

Lines changed: 13 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -416,6 +416,19 @@ status = stack.get_status(region="us-west-2")
416416
print(status)
417417
```
418418

419+
#### Deleting a Cluster Stack
420+
421+
```python
422+
# Delete with custom logger
423+
import logging
424+
logger = logging.getLogger(__name__)
425+
HpClusterStack.delete("my-stack-name", region="us-west-2", logger=logger)
426+
427+
# Delete with retained resources (only works on DELETE_FAILED stacks)
428+
HpClusterStack.delete("my-stack-name", retain_resources=["S3Bucket", "EFSFileSystem"])
429+
430+
```
431+
419432
### Training SDK
420433

421434
#### Creating a Training Job

doc/cli/cluster_management/cli_cluster_management.md

Lines changed: 110 additions & 113 deletions
@@ -9,7 +9,11 @@ Complete reference for SageMaker HyperPod cluster management parameters and conf
99
```
1010

1111
* [Initialize Configuration](#hyp-init)
12+
* [Configure Parameters](#hyp-configure)
13+
* [Validate Configuration](#hyp-validate)
14+
* [Reset Configuration](#hyp-reset)
1215
* [Create Cluster Stack](#hyp-create)
16+
1317
* [Update Cluster](#hyp-update-cluster)
1418
* [List Cluster Stacks](#hyp-list-cluster-stack)
1519
* [Describe Cluster Stack](#hyp-describe-cluster-stack)
@@ -19,9 +23,6 @@ Complete reference for SageMaker HyperPod cluster management parameters and conf
1923
* [Get Cluster Context](#hyp-get-cluster-context)
2024
* [Get Monitoring](#hyp-get-monitoring)
2125

22-
* [Configure Parameters](#hyp-configure)
23-
* [Validate Configuration](#hyp-validate)
24-
* [Reset Configuration](#hyp-reset)
2526

2627
## hyp init
2728

@@ -47,6 +48,108 @@ The `resource_name_prefix` parameter in the generated `config.yaml` file serves
4748
**Cluster stack names must be unique within each AWS region.** If you attempt to create a cluster stack with a name that already exists in the same region, the deployment will fail.
4849
```
4950

51+
## hyp configure
52+
53+
Configure cluster parameters interactively or via command line.
54+
55+
```{important}
56+
**Pre-Deployment Configuration**: This command modifies local `config.yaml` files **before** cluster creation. For updating **existing, deployed clusters**, use `hyp update cluster` instead.
57+
```
58+
59+
#### Syntax
60+
61+
```bash
62+
hyp configure [OPTIONS]
63+
```
64+
65+
#### Parameters
66+
67+
This command dynamically supports all configuration parameters available in the current template's schema. Common parameters include:
68+
69+
| Parameter | Type | Required | Description |
70+
|-----------|------|----------|-------------|
71+
| `--resource-name-prefix` | TEXT | No | Prefix for all AWS resources |
72+
| `--create-hyperpod-cluster-stack` | BOOLEAN | No | Create HyperPod Cluster Stack |
73+
| `--hyperpod-cluster-name` | TEXT | No | Name of SageMaker HyperPod Cluster |
74+
| `--create-eks-cluster-stack` | BOOLEAN | No | Create EKS Cluster Stack |
75+
| `--kubernetes-version` | TEXT | No | Kubernetes version |
76+
| `--eks-cluster-name` | TEXT | No | Name of the EKS cluster |
77+
| `--create-helm-chart-stack` | BOOLEAN | No | Create Helm Chart Stack |
78+
| `--namespace` | TEXT | No | Namespace to deploy HyperPod Helm chart |
79+
| `--node-provisioning-mode` | TEXT | No | Continuous provisioning mode |
80+
| `--node-recovery` | TEXT | No | Node recovery setting ("Automatic" or "None") |
81+
| `--create-vpc-stack` | BOOLEAN | No | Create VPC Stack |
82+
| `--vpc-id` | TEXT | No | Existing VPC ID |
83+
| `--vpc-cidr` | TEXT | No | VPC CIDR block |
84+
| `--create-security-group-stack` | BOOLEAN | No | Create Security Group Stack |
85+
| `--enable-hp-inference-feature` | BOOLEAN | No | Enable inference operator |
86+
| `--stage` | TEXT | No | Deployment stage ("gamma" or "prod") |
87+
| `--create-fsx-stack` | BOOLEAN | No | Create FSx Stack |
88+
| `--storage-capacity` | INTEGER | No | FSx storage capacity in GiB |
89+
| `--tags` | JSON | No | Resource tags as JSON object |
90+
91+
**Note:** The exact parameters available depend on your current template type and version. Run `hyp configure --help` to see all available options for your specific configuration.
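
When `hyp configure` is driven from a script, the options in the table above can be assembled programmatically. The following is a minimal sketch, not part of the CLI: `build_configure_command` is a hypothetical helper, and it assumes that BOOLEAN options accept a literal `true`/`false` value and that `--tags` takes a JSON string. Check `hyp configure --help` for your template before relying on either assumption.

```python
import json
import shlex


def build_configure_command(options: dict) -> list[str]:
    """Build an argv list for `hyp configure` from option names and values.

    Hypothetical helper for illustration only. Keys use underscores
    (as in config.yaml) and are converted to dashed CLI flags.
    """
    argv = ["hyp", "configure"]
    for name, value in options.items():
        flag = "--" + name.replace("_", "-")
        if isinstance(value, bool):
            # Assumption: BOOLEAN options take an explicit true/false value.
            rendered = "true" if value else "false"
        elif isinstance(value, (dict, list)):
            # Assumption: JSON-typed options such as --tags take a JSON string.
            rendered = json.dumps(value)
        else:
            rendered = str(value)
        argv += [flag, rendered]
    return argv


if __name__ == "__main__":
    argv = build_configure_command(
        {
            "resource_name_prefix": "demo",
            "create_vpc_stack": False,
            "storage_capacity": 1200,
            "tags": {"team": "ml"},
        }
    )
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(argv))
```

Returning a list rather than a single string keeps the JSON value for `--tags` as one argument, so it can be passed to `subprocess.run` without shell quoting issues.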
92+
93+
## hyp validate
94+
95+
Validate the current directory's configuration file syntax and structure.
96+
97+
#### Syntax
98+
99+
```bash
100+
# Validate current configuration syntax
101+
hyp validate
102+
103+
# Example output on success
104+
✔️ config.yaml is valid!
105+
106+
# Example output with syntax errors
107+
❌ Config validation errors:
108+
– kubernetes_version: Field is required
109+
– vpc_cidr: Expected string, got number
110+
```
111+
112+
#### Parameters
113+
114+
No parameters required.
115+
116+
```{note}
117+
This command performs **syntactic validation only** of the `config.yaml` file against the appropriate schema. It checks:
118+
119+
- **YAML syntax**: Ensures file is valid YAML
120+
- **Required fields**: Verifies all mandatory fields are present
121+
- **Data types**: Confirms field values match expected types (string, number, boolean, array)
122+
- **Schema structure**: Validates against the template's defined structure
123+
124+
This command performs syntactic validation only and does **not** verify the actual validity of values (e.g., whether AWS regions exist, instance types are available, or resources can be created).
125+
126+
**Prerequisites**
127+
128+
- Must be run in a directory where `hyp init` has created configuration files
129+
- A `config.yaml` file must exist in the current directory
130+
131+
**Output**
132+
133+
- **Success**: Displays confirmation message if syntax is valid
134+
- **Errors**: Lists specific syntax errors with field names and descriptions
135+
```
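
The required-field and data-type checks described in the note can be illustrated with a small sketch. This is not the `hyp validate` implementation: `validate_config`, `EXAMPLE_SCHEMA`, and the schema layout are hypothetical, and the config is represented as an already-parsed dictionary rather than read from `config.yaml`. The error strings simply mirror the example output shown in the Syntax section.

```python
# Names used in error messages for each Python type.
TYPE_NAMES = {str: "string", int: "number", float: "number",
              bool: "boolean", list: "array"}

# Hypothetical schema: field name -> expected type and whether it is required.
EXAMPLE_SCHEMA = {
    "kubernetes_version": {"type": str, "required": True},
    "vpc_cidr": {"type": str, "required": False},
}


def type_name(value) -> str:
    """Return a schema-style name for the value's type."""
    return TYPE_NAMES.get(type(value), type(value).__name__)


def validate_config(config: dict, schema: dict) -> list[str]:
    """Return a list of syntactic errors; an empty list means the config is valid."""
    errors = []
    for field, rule in schema.items():
        value = config.get(field)
        if value is None:
            if rule.get("required", False):
                errors.append(f"{field}: Field is required")
            continue
        expected = rule["type"]
        # Exact type match, so that True is not accepted where a number is expected.
        if type(value) is not expected:
            errors.append(
                f"{field}: Expected {TYPE_NAMES[expected]}, got {type_name(value)}"
            )
    return errors


if __name__ == "__main__":
    for message in validate_config({"vpc_cidr": 10}, EXAMPLE_SCHEMA):
        print(message)
```

As in the note, a check of this kind says nothing about whether the values are usable: a string in `vpc_cidr` passes even if it is not a valid CIDR block.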
136+
137+
138+
## hyp reset
139+
140+
Reset the current directory's config.yaml to default values.
141+
142+
#### Syntax
143+
144+
```bash
145+
hyp reset
146+
```
147+
148+
#### Parameters
149+
150+
No parameters required.
151+
152+
50153
## hyp create
51154

52155
Create a new HyperPod cluster stack using the provided configuration.
@@ -130,18 +233,20 @@ hyp describe cluster-stack STACK-NAME [OPTIONS]
130233
| `--debug` | FLAG | No | Enable debug logging |
131234

132235

133-
#### hyp delete cluster-stack
236+
## hyp delete cluster-stack
134237

135238
Delete a HyperPod cluster stack. Removes the specified CloudFormation stack and all associated AWS resources. This operation cannot be undone.
136239

240+
#### Syntax
137241
```bash
138242
hyp delete cluster-stack <stack-name>
139243
```
140244

245+
#### Parameters
141246
| Option | Type | Description |
142247
|--------|------|-------------|
143248
| `--region <region>` | Required | The AWS region where the stack exists. |
144-
| `--retain-resources S3Bucket-TrainingData,EFSFileSystem-Models` | Optional | Comma-separated list of logical resource IDs to retain during deletion (only works on DELETE_FAILED stacks). Resource names are shown in failed deletion output, or use AWS CLI: `aws cloudformation list-stack-resources STACK_NAME --region REGION`. |
249+
| `--retain-resources <list>` | Optional | Comma-separated list of logical resource IDs to retain during deletion (only works on DELETE_FAILED stacks). Resource names are shown in failed deletion output, or use AWS CLI: `aws cloudformation list-stack-resources STACK_NAME --region REGION`. Example: `S3Bucket-TrainingData,EFSFileSystem-Models` |
145250
| `--debug` | Optional | Enable debug mode for detailed logging. |
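
The CLI takes `--retain-resources` as a comma-separated string, while the SDK call in the README takes `retain_resources` as a list. A minimal sketch of converting one form to the other follows; `parse_retain_resources` is a hypothetical helper, and how the CLI itself parses the value is not shown in this commit.

```python
def parse_retain_resources(raw: str) -> list[str]:
    """Split a comma-separated list of logical resource IDs.

    Hypothetical helper: surrounding whitespace is trimmed and
    empty entries (for example from a trailing comma) are dropped.
    """
    return [item.strip() for item in raw.split(",") if item.strip()]


if __name__ == "__main__":
    ids = parse_retain_resources("S3Bucket-TrainingData,EFSFileSystem-Models")
    print(ids)
```

The resulting list has the shape expected by the `retain_resources` argument of `HpClusterStack.delete` shown in the README example.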
146251

147252

@@ -218,114 +323,6 @@ hyp get-monitoring [OPTIONS]
218323
| `--prometheus` | FLAG | No | Return Prometheus workspace URL |
219324
| `--list` | FLAG | No | Return list of available metrics |
220325

221-
## hyp configure
222-
223-
Configure cluster parameters interactively or via command line.
224-
225-
```{important}
226-
**Pre-Deployment Configuration**: This command modifies local `config.yaml` files **before** cluster creation. For updating **existing, deployed clusters**, use `hyp update cluster` instead.
227-
```
228-
229-
#### Syntax
230-
231-
```bash
232-
hyp configure [OPTIONS]
233-
```
234-
235-
#### Parameters
236-
237-
This command dynamically supports all configuration parameters available in the current template's schema. Common parameters include:
238-
239-
| Parameter | Type | Required | Description |
240-
|-----------|------|----------|-------------|
241-
| `--resource-name-prefix` | TEXT | No | Prefix for all AWS resources |
242-
| `--create-hyperpod-cluster-stack` | BOOLEAN | No | Create HyperPod Cluster Stack |
243-
| `--hyperpod-cluster-name` | TEXT | No | Name of SageMaker HyperPod Cluster |
244-
| `--create-eks-cluster-stack` | BOOLEAN | No | Create EKS Cluster Stack |
245-
| `--kubernetes-version` | TEXT | No | Kubernetes version |
246-
| `--eks-cluster-name` | TEXT | No | Name of the EKS cluster |
247-
| `--create-helm-chart-stack` | BOOLEAN | No | Create Helm Chart Stack |
248-
| `--namespace` | TEXT | No | Namespace to deploy HyperPod Helm chart |
249-
| `--node-provisioning-mode` | TEXT | No | Continuous provisioning mode |
250-
| `--node-recovery` | TEXT | No | Node recovery setting ("Automatic" or "None") |
251-
| `--create-vpc-stack` | BOOLEAN | No | Create VPC Stack |
252-
| `--vpc-id` | TEXT | No | Existing VPC ID |
253-
| `--vpc-cidr` | TEXT | No | VPC CIDR block |
254-
| `--create-security-group-stack` | BOOLEAN | No | Create Security Group Stack |
255-
| `--enable-hp-inference-feature` | BOOLEAN | No | Enable inference operator |
256-
| `--stage` | TEXT | No | Deployment stage ("gamma" or "prod") |
257-
| `--create-fsx-stack` | BOOLEAN | No | Create FSx Stack |
258-
| `--storage-capacity` | INTEGER | No | FSx storage capacity in GiB |
259-
| `--tags` | JSON | No | Resource tags as JSON object |
260-
261-
**Note:** The exact parameters available depend on your current template type and version. Run `hyp configure --help` to see all available options for your specific configuration.
262-
263-
## hyp validate
264-
265-
Validate the current directory's configuration file syntax and structure.
266-
267-
#### Syntax
268-
269-
```bash
270-
hyp validate
271-
```
272-
273-
#### Parameters
274-
275-
No parameters required.
276-
277-
```{note}
278-
This command performs **syntactic validation only** of the `config.yaml` file against the appropriate schema. It checks:
279-
280-
- **YAML syntax**: Ensures file is valid YAML
281-
- **Required fields**: Verifies all mandatory fields are present
282-
- **Data types**: Confirms field values match expected types (string, number, boolean, array)
283-
- **Schema structure**: Validates against the template's defined structure
284-
285-
This command performs syntactic validation only and does **not** verify the actual validity of values (e.g., whether AWS regions exist, instance types are available, or resources can be created).
286-
287-
**Prerequisites**
288-
289-
- Must be run in a directory where `hyp init` has created configuration files
290-
- A `config.yaml` file must exist in the current directory
291-
292-
**Output**
293-
294-
- **Success**: Displays confirmation message if syntax is valid
295-
- **Errors**: Lists specific syntax errors with field names and descriptions
296-
```
297-
298-
299-
#### Syntax
300-
301-
```bash
302-
# Validate current configuration syntax
303-
hyp validate
304-
305-
# Example output on success
306-
✔️ config.yaml is valid!
307-
308-
# Example output with syntax errors
309-
❌ Config validation errors:
310-
– kubernetes_version: Field is required
311-
– vpc_cidr: Expected string, got number
312-
```
313-
314-
## hyp reset
315-
316-
Reset the current directory's config.yaml to default values.
317-
318-
#### Syntax
319-
320-
```bash
321-
hyp reset
322-
```
323-
324-
#### Parameters
325-
326-
No parameters required.
327-
328-
329326

330327
## Parameter Reference
331328
