Alternatively, if you are using Amazon SageMaker HyperPod recipes, you can follow the following instructions:
 
+Prerequisites: you need ``git`` installed on your client to access Amazon SageMaker HyperPod recipes code.
 
-Call the fit Method
-===================
+When using a recipe, you must set the ``training_recipe`` arg in place of providing a training script.
+This can be a recipe from `here <https://github.com/aws/sagemaker-hyperpod-recipes>`_
+or a local file or a custom url. Please note that you must override the following using
+``recipe_overrides``:
+
+* directory paths for the local container in the recipe as appropriate for Python SDK
+* the output s3 URIs
+* Huggingface access token
+* any other recipe fields you wish to edit
+
+The code snippet below shows an example.
+Please refer to `SageMaker docs <https://docs.aws.amazon.com/sagemaker/latest/dg/model-train-storage.html>`_
+for more details about the expected local paths in the container and the Amazon SageMaker
+HyperPod recipes tutorial for more examples.
+You can override the fields by either setting ``recipe_overrides`` or
+providing a modified ``training_recipe`` through a local file or a custom url.
+When using the recipe, any provided ``entry_point`` will be ignored.
+
+SageMaker will automatically set up the distribution args.
+It will also determine the image to use for your model and device type,
+but you can override this with the ``image_uri`` arg.
+
+You can also override the number of nodes in the recipe with the ``instance_count`` arg to estimator.
+``source_dir`` will default to current working directory unless specified.
+A local copy of training scripts and recipe will be saved in the ``source_dir``.
+You can specify any additional packages you want to install for training in an optional ``requirements.txt`` in the ``source_dir``.
+
+Note for llama3.2 multi-modal models, you need to upgrade transformers library by providing a ``requirements.txt`` in the source file with ``transformers==4.45.2``.
+Please refer to the Amazon SageMaker HyperPod recipes documentation for more details.
+
+
+Here is an example usage for recipe ``hf_llama3_8b_seq8k_gpu_p5x16_pretrain``.
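
The diff is cut off before the example it announces. As a rough sketch only, the arguments named in the added text (``training_recipe``, ``recipe_overrides``, ``image_uri``, ``instance_count``) might be assembled as below. The override key names, the ``training/llama/`` recipe path prefix, the ``ml.p5.48xlarge`` instance type, and both helper functions are assumptions made for illustration and are not taken from this PR; check them against the recipe you actually use.

```python
def recipe_estimator_kwargs(role, output_s3_uri, hf_access_token=None,
                            instance_count=None, image_uri=None):
    """Build keyword arguments for a recipe-based PyTorch estimator (sketch)."""
    # Container-local paths. The key names below are assumed; they depend on
    # the fields that the chosen recipe actually defines.
    recipe_overrides = {
        "run": {"results_dir": "/opt/ml/model"},
        "exp_manager": {"checkpoint_dir": "/opt/ml/checkpoints"},
        "model": {
            "data": {
                "train_dir": "/opt/ml/input/data/train",
                "val_dir": "/opt/ml/input/data/val",
            },
        },
    }
    if hf_access_token is not None:
        # Hypothetical key name for the Hugging Face access token.
        recipe_overrides["model"]["hf_access_token"] = hf_access_token

    kwargs = {
        "role": role,
        "output_path": output_s3_uri,
        "instance_type": "ml.p5.48xlarge",  # assumed instance type
        # Recipe name from the diff; the path prefix is an assumption.
        "training_recipe": "training/llama/hf_llama3_8b_seq8k_gpu_p5x16_pretrain",
        "recipe_overrides": recipe_overrides,
    }
    # Optional overrides described in the added documentation.
    if instance_count is not None:
        kwargs["instance_count"] = instance_count
    if image_uri is not None:
        kwargs["image_uri"] = image_uri
    return kwargs


def launch(kwargs, inputs=None):
    """Create the estimator and start training (needs sagemaker and AWS access)."""
    # Imported lazily so the helper above works without the SDK installed.
    from sagemaker.pytorch import PyTorch

    estimator = PyTorch(**kwargs)
    estimator.fit(inputs=inputs)
    return estimator


example_kwargs = recipe_estimator_kwargs(
    role="<execution-role-arn>",             # placeholder
    output_s3_uri="s3://<bucket>/<prefix>",  # placeholder
    instance_count=2,
)
```

No ``entry_point`` is passed, since the added text says it is ignored when a recipe is used. Keeping the argument dictionary separate from ``launch`` makes it possible to inspect the overrides before any job is submitted.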
doc/overview.rst: 43 additions & 2 deletions
@@ -4,6 +4,7 @@ Using the SageMaker Python SDK
 
 SageMaker Python SDK provides several high-level abstractions for working with Amazon SageMaker. These are:
 
+- **ModelTrainer**: New interface encapsulating training on SageMaker.
 - **Estimators**: Encapsulate training on SageMaker.
 - **Models**: Encapsulate built ML models.
 - **Predictors**: Provide real-time inference and transformation using Python data-types against a SageMaker endpoint.
@@ -24,8 +25,8 @@ Train a Model with the SageMaker Python SDK
 To train a model by using the SageMaker Python SDK, you:
 
 1. Prepare a training script
-2. Create an estimator
-3. Call the ``fit`` method of the estimator
+2. Create a ModelTrainer or Estimator
+3. Call the ``train`` method of the ModelTrainer or the ``fit`` method of the Estimator
 
 After you train a model, you can save it, and then serve the model as an endpoint to get real-time inferences or get inferences for an entire dataset by using batch transform.
 
@@ -85,6 +86,46 @@ If you want to use, for example, boolean hyperparameters, you need to specify ``
 For more on training environment variables, please visit `SageMaker Containers <https://github.com/aws/sagemaker-containers>`_.
 
 
+Using ModelTrainer
+==================
+
+To use the ModelTrainer class, you need to provide a few essential parameters such as the training image URI and the source code configuration. The class allows you to spin up a SageMaker training job with minimal parameters, particularly by specifying the source code and training image.
+
+For more information about class definitions see `ModelTrainer <https://sagemaker.readthedocs.io/en/stable/api/training/model_trainer.html>`_.
+
+Example: Launching a Training Job with Custom Script
+
+.. code:: python
+
+    from sagemaker.modules.train import ModelTrainer
+    from sagemaker.modules.configs import SourceCode, InputData
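
The hunk is truncated after the two imports. The following is a hedged sketch of how the pieces named in the added text (a training image URI, a source code configuration, and the ``train`` method) could fit together. It is not the example from the PR: the directory name, script name, and both helper functions are hypothetical, and the image URI and S3 path are placeholders.

```python
def model_trainer_settings(training_image, source_dir, entry_script, train_data_uri):
    """Collect the values a ModelTrainer job needs as plain dictionaries."""
    return {
        "training_image": training_image,
        "source_code": {"source_dir": source_dir, "entry_script": entry_script},
        "input_data": {"channel_name": "train", "data_source": train_data_uri},
    }


def launch_training(settings, wait=False):
    """Start a training job (needs the sagemaker package and AWS credentials)."""
    # Imported lazily so the helper above can be used without the SDK installed.
    from sagemaker.modules.train import ModelTrainer
    from sagemaker.modules.configs import SourceCode, InputData

    trainer = ModelTrainer(
        training_image=settings["training_image"],
        source_code=SourceCode(**settings["source_code"]),
    )
    # ``train`` is the ModelTrainer counterpart of the Estimator ``fit`` method.
    trainer.train(
        input_data_config=[InputData(**settings["input_data"])],
        wait=wait,
    )
    return trainer


settings = model_trainer_settings(
    training_image="<training-image-uri>",          # placeholder
    source_dir="basic-script-mode",                 # hypothetical directory
    entry_script="custom_script.py",                # hypothetical script
    train_data_uri="s3://<bucket>/<prefix>/train",  # placeholder
)
```

Calling ``launch_training(settings)`` would submit the job; it is left out here because it requires AWS access.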