
Commit 3cdd28d

update documentation to add init experience for all templates
1 parent d08aefb commit 3cdd28d

5 files changed: +289 −45 lines changed


doc/cli/cluster_management/cli_cluster_management.md

Lines changed: 17 additions & 0 deletions
@@ -13,6 +13,7 @@ Complete reference for SageMaker HyperPod cluster management parameters and conf
 * [Update Cluster](#hyp-update-cluster)
 * [List Cluster Stacks](#hyp-list-cluster-stack)
 * [Describe Cluster Stack](#hyp-describe-cluster-stack)
+* [Delete Cluster Stack](#hyp-delete-cluster-stack)
 * [List HyperPod Clusters](#hyp-list-cluster)
 * [Set Cluster Context](#hyp-set-cluster-context)
 * [Get Cluster Context](#hyp-get-cluster-context)
@@ -128,6 +129,22 @@ hyp describe cluster-stack STACK-NAME [OPTIONS]
 | `--region` | TEXT | No | AWS region of the stack |
 | `--debug` | FLAG | No | Enable debug logging |
 
+
+#### Delete Cluster Stack
+
+Delete a HyperPod cluster stack. This removes the specified CloudFormation stack and all associated AWS resources. The operation cannot be undone.
+
+```bash
+hyp delete cluster-stack <stack-name>
+```
+
+| Option | Type | Description |
+|--------|------|-------------|
+| `--region <region>` | Required | The AWS region where the stack exists. |
+| `--retain-resources <resource-ids>` | Optional | Comma-separated list of logical resource IDs to retain during deletion (e.g. `S3Bucket-TrainingData,EFSFileSystem-Models`); only valid for stacks in the `DELETE_FAILED` state. Resource names are shown in the failed-deletion output, or can be listed with the AWS CLI: `aws cloudformation list-stack-resources --stack-name STACK_NAME --region REGION`. |
+| `--debug` | Optional | Enable debug logging. |
+
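As an illustrative sketch (the stack name and region below are placeholders, not values from this commit), a deletion followed by a retry that retains resources after a failed delete might look like:

``` bash
# Delete the stack and all associated AWS resources
hyp delete cluster-stack my-hyperpod-stack --region us-east-2

# If the stack ends up in DELETE_FAILED, retry while retaining specific resources
hyp delete cluster-stack my-hyperpod-stack --region us-east-2 \
  --retain-resources S3Bucket-TrainingData,EFSFileSystem-Models
```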
 ## hyp list-cluster
 
 List SageMaker HyperPod clusters with capacity information.

doc/cli/inference/cli_inference.md

Lines changed: 37 additions & 4 deletions
@@ -4,10 +4,8 @@
 
 Complete reference for SageMaker HyperPod inference parameters and configuration options.
 
-```{note}
-**Region Configuration**: For commands that accept the `--region` option, if no region is explicitly provided, the command will use the default region from your AWS credentials configuration.
-```
-
+* [Initialize Configuration](#hyp-init)
+* [Create with Configuration](#hyp-create)
 * [Create JumpStart Endpoint](#hyp-create-hyp-jumpstart-endpoint)
 * [Create Custom Endpoint](#hyp-create-hyp-custom-endpoint)
 
@@ -28,6 +26,41 @@ Complete reference for SageMaker HyperPod inference parameters and configuration
 * [Get Custom Operator Logs](#hyp-get-operator-logs-hyp-custom-endpoint)
 
 
+## hyp init
+
+Initialize a template scaffold in the current directory.
+
+#### Syntax
+
+```bash
+hyp init TEMPLATE [DIRECTORY] [OPTIONS]
+```
+
+#### Parameters
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| `TEMPLATE` | CHOICE | Yes | Template type (`cluster-stack`, `hyp-pytorch-job`, `hyp-custom-endpoint`, `hyp-jumpstart-endpoint`) |
+| `DIRECTORY` | PATH | No | Target directory (default: current directory) |
+| `--version` | TEXT | No | Schema version to use |
+
+
+## hyp create
+
+Create a new HyperPod endpoint using the provided configuration.
+
+#### Syntax
+
+```bash
+hyp create [OPTIONS]
+```
+
+#### Parameters
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| `--debug` | FLAG | No | Enable debug logging |
+
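Putting the two commands together, a typical workflow might look like the following sketch (the directory name is an illustrative placeholder):

``` bash
mkdir my-endpoint && cd my-endpoint
hyp init hyp-jumpstart-endpoint
# Fill in the generated config.yaml (model_id, instance_type, endpoint_name), then:
hyp create
```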
 
 ## hyp create hyp-jumpstart-endpoint

doc/cli/training/cli_training.md

Lines changed: 38 additions & 4 deletions
@@ -5,10 +5,8 @@
 
 Complete reference for SageMaker HyperPod PyTorch training job parameters and configuration options.
 
-```{note}
-**Region Configuration**: For commands that accept the `--region` option, if no region is explicitly provided, the command will use the default region from your AWS credentials configuration.
-```
-
+* [Initialize Configuration](#hyp-init)
+* [Create with Configuration](#hyp-create)
 * [Create PyTorch Job](#hyp-create-hyp-pytorch-job)
 * [List Jobs](#hyp-list-hyp-pytorch-job)
 * [Describe Job](#hyp-describe-hyp-pytorch-job)
@@ -17,6 +15,42 @@ Complete reference for SageMaker HyperPod PyTorch training job parameters and co
 * [Get Logs](#hyp-get-logs-hyp-pytorch-job)
 
 
+## hyp init
+
+Initialize a template scaffold in the current directory.
+
+#### Syntax
+
+```bash
+hyp init TEMPLATE [DIRECTORY] [OPTIONS]
+```
+
+#### Parameters
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| `TEMPLATE` | CHOICE | Yes | Template type (`cluster-stack`, `hyp-pytorch-job`, `hyp-custom-endpoint`, `hyp-jumpstart-endpoint`) |
+| `DIRECTORY` | PATH | No | Target directory (default: current directory) |
+| `--version` | TEXT | No | Schema version to use |
+
+
+## hyp create
+
+Create a new HyperPod training job using the provided configuration.
+
+#### Syntax
+
+```bash
+hyp create [OPTIONS]
+```
+
+#### Parameters
+
+| Parameter | Type | Required | Description |
+|-----------|------|----------|-------------|
+| `--debug` | FLAG | No | Enable debug logging |
+
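As a sketch of how the same two commands combine for training (the directory name is an illustrative placeholder):

``` bash
mkdir my-training-job && cd my-training-job
hyp init hyp-pytorch-job
# Edit the generated config.yaml, then submit the job with debug logging:
hyp create --debug
```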
 ## hyp create hyp-pytorch-job
 
 Create distributed PyTorch training jobs on SageMaker HyperPod clusters.

doc/getting_started/inference.md

Lines changed: 119 additions & 35 deletions
@@ -15,11 +15,95 @@ SageMaker HyperPod inference endpoints allow you to:
 - Invoke endpoints for real-time predictions
 - Monitor endpoint performance
 
+## Creating Inference Endpoints -- CLI Init Experience
+
+The following is a step-by-step guide to creating a JumpStart endpoint with the `hyp-jumpstart-endpoint` template through the init experience. To create a custom endpoint, use the `hyp-custom-endpoint` template in the init command; the init experience is the same across templates.
+
+### 1. Start with a Clean Directory
+
+It's recommended to start with a new, clean directory for each endpoint configuration:
+
+``` bash
+mkdir my-endpoint
+cd my-endpoint
+```
+
+### 2. Initialize a New Endpoint Configuration
+
+`````{tab-set}
+````{tab-item} CLI
+``` bash
+hyp init hyp-jumpstart-endpoint
+```
+````
+`````
 ```{note}
-**Region Configuration**: For commands that accept the `--region` option, if no region is explicitly provided, the command will use the default region from your AWS credentials configuration.
+To create a custom endpoint, simply use `hyp init hyp-custom-endpoint`.
 ```
 
-## Creating Inference Endpoints
+This creates three files:
+
+- `config.yaml`: The main configuration file you'll use to customize your endpoint
+- `k8s.jinja`: A reference template that maps parameters into the Kubernetes payload
+- `README.md`: Usage guide with instructions and examples
+
+### 3. Configure Your Endpoint
+
+You can configure your endpoint in two ways:
+
+**Option 1: Edit config.yaml directly**
+
+The config.yaml file contains key parameters like:
+
+``` yaml
+template: hyp-jumpstart-endpoint
+version: 1.0
+model_id:
+instance_type:
+endpoint_name:
+```
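For reference, a completed config.yaml might look like the following sketch; the values are illustrative placeholders, not from this commit:

``` yaml
template: hyp-jumpstart-endpoint
version: 1.0
model_id: <jumpstart-model-id>
instance_type: ml.g5.8xlarge
endpoint_name: <my-endpoint-name>
```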
+**Option 2: Use CLI command (Pre-Deployment)**
+
+`````{tab-set}
+````{tab-item} CLI
+``` bash
+hyp configure --endpoint-name your-endpoint-name
+```
+````
+`````
+
+```{note}
+The `hyp configure` command only modifies local configuration files. It does not affect existing deployed endpoints.
+```
+
+### 4. Create the Endpoint
+
+`````{tab-set}
+````{tab-item} CLI
+``` bash
+hyp create
+```
+````
+`````
+
+This will:
+
+- Validate your configuration
+- Create a timestamped folder in the `run` directory
+- Initialize the endpoint creation process
+
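The four steps above condense into a short shell session (directory and endpoint names are placeholders):

``` bash
mkdir my-endpoint && cd my-endpoint
hyp init hyp-jumpstart-endpoint
hyp configure --endpoint-name my-endpoint
hyp create
```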
+## Creating Inference Endpoints -- CLI/SDK
 
 You can create inference endpoints using either JumpStart models or custom models:

@@ -80,48 +164,48 @@ hyp create hyp-custom-endpoint \
 
 ````{tab-item} SDK
 ```python
-from sagemaker.hyperpod.inference.config.hp_custom_endpoint_config import Model, Server, SageMakerEndpoint, TlsConfig, EnvironmentVariables
+from sagemaker.hyperpod.inference.config.hp_endpoint_config import CloudWatchTrigger, Dimensions, AutoScalingSpec, Metrics, S3Storage, ModelSourceConfig, TlsConfig, EnvironmentVariables, ModelInvocationPort, ModelVolumeMount, Resources, Worker
 from sagemaker.hyperpod.inference.hp_endpoint import HPEndpoint
 
-model = Model(
-    model_source_type="s3",
-    model_location="test-pytorch-job",
-    s3_bucket_name="my-bucket",
-    s3_region="us-east-2",
-    prefetch_enabled=True
-)
+model_source_config = ModelSourceConfig(
+    model_source_type='s3',
+    model_location="<my-model-folder-in-s3>",
+    s3_storage=S3Storage(
+        bucket_name='<my-model-artifacts-bucket>',
+        region='us-east-2',
+    ),
+)
 
-server = Server(
-    instance_type="ml.g5.8xlarge",
-    image_uri="763104351884.dkr.ecr.us-east-2.amazonaws.com/huggingface-pytorch-tgi-inference:2.4.0-tgi2.3.1-gpu-py311-cu124-ubuntu22.04-v2.0",
-    container_port=8080,
-    model_volume_mount_name="model-weights"
-)
-
-resources = {
-    "requests": {"cpu": "30000m", "nvidia.com/gpu": 1, "memory": "100Gi"},
-    "limits": {"nvidia.com/gpu": 1}
-}
-
-env = EnvironmentVariables(
-    HF_MODEL_ID="/opt/ml/model",
-    SAGEMAKER_PROGRAM="inference.py",
-    SAGEMAKER_SUBMIT_DIRECTORY="/opt/ml/model/code",
-    MODEL_CACHE_ROOT="/opt/ml/model",
-    SAGEMAKER_ENV="1"
-)
+environment_variables = [
+    EnvironmentVariables(name="HF_MODEL_ID", value="/opt/ml/model"),
+    EnvironmentVariables(name="SAGEMAKER_PROGRAM", value="inference.py"),
+    EnvironmentVariables(name="SAGEMAKER_SUBMIT_DIRECTORY", value="/opt/ml/model/code"),
+    EnvironmentVariables(name="MODEL_CACHE_ROOT", value="/opt/ml/model"),
+    EnvironmentVariables(name="SAGEMAKER_ENV", value="1"),
+]
+
+worker = Worker(
+    image='763104351884.dkr.ecr.us-east-2.amazonaws.com/huggingface-pytorch-tgi-inference:2.4.0-tgi2.3.1-gpu-py311-cu124-ubuntu22.04-v2.0',
+    model_volume_mount=ModelVolumeMount(
+        name='model-weights',
+    ),
+    model_invocation_port=ModelInvocationPort(container_port=8080),
+    resources=Resources(
+        requests={"cpu": "30000m", "nvidia.com/gpu": 1, "memory": "100Gi"},
+        limits={"nvidia.com/gpu": 1}
+    ),
+    environment_variables=environment_variables,
+)
 
-endpoint_name = SageMakerEndpoint(name="endpoint-custom-pytorch")
-
-tls_config = TlsConfig(tls_certificate_output_s3_uri="s3://sample-bucket")
+tls_config = TlsConfig(tls_certificate_output_s3_uri='s3://<my-tls-bucket-name>')
 
 custom_endpoint = HPEndpoint(
-    model=model,
-    server=server,
-    resources=resources,
-    environment=env,
-    sage_maker_endpoint=endpoint_name,
+    endpoint_name='<my-endpoint-name>',
+    instance_type='ml.g5.8xlarge',
+    model_name='deepseek15b-test-model-name',
     tls_config=tls_config,
+    model_source_config=model_source_config,
+    worker=worker,
 )
 
 custom_endpoint.create()
