aws
diff --git a/.github/workflows/codebuild-canaries.yml b/.github/workflows/codebuild-canaries.yml
Lines changed: 24 additions & 0 deletions
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 126 additions & 0 deletions
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
Lines changed: 6 additions & 2 deletions
diff --git a/VERSION b/VERSION
Lines changed: 1 addition & 1 deletion
diff --git a/doc/frameworks/pytorch/using_pytorch.rst b/doc/frameworks/pytorch/using_pytorch.rst
Lines changed: 3 additions & 2 deletions
diff --git a/doc/frameworks/tensorflow/deploying_tensorflow_serving.rst b/doc/frameworks/tensorflow/deploying_tensorflow_serving.rst
Lines changed: 2 additions & 2 deletions
diff --git a/doc/frameworks/tensorflow/using_tf.rst b/doc/frameworks/tensorflow/using_tf.rst
Lines changed: 9 additions & 6 deletions
diff --git a/doc/overview.rst b/doc/overview.rst
Lines changed: 10 additions & 2 deletions
diff --git a/doc/v2.rst b/doc/v2.rst
Lines changed: 2 additions & 2 deletions
diff --git a/doc/workflows/step_functions/index.rst b/doc/workflows/step_functions/index.rst
Lines changed: 1 addition & 1 deletion
@@ -0,0 +1,24 @@
+name: Canaries
+on:
+  schedule:
+    - cron: "0 */3 * * *"
+  workflow_dispatch:
+
+permissions:
+    id-token: write # This is required for requesting the JWT
+
+jobs:
+  tests:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Configure AWS Credentials
+        uses: aws-actions/configure-aws-credentials@v4
+        with:
+          role-to-assume: ${{ secrets.CI_AWS_ROLE_ARN }}
+          aws-region: us-west-2
+          role-duration-seconds: 10800
+      - name: Run Integ Tests
+        uses: aws-actions/aws-codebuild-run-build@v1
+        id: codebuild
+        with:
+          project-name: sagemaker-python-sdk-canaries
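
The schedule and the credential lifetime in this workflow appear to line up: the cron expression fires every three hours, and `role-duration-seconds: 10800` is also three hours. The following is a minimal sketch that checks this arithmetic. The helper `hours_matching_step` is hypothetical and only understands the `*/N` hour field used here, not general cron syntax.

```python
def hours_matching_step(hour_field: str) -> list[int]:
    """Expand a cron hour field of the form '*/N' into the matching hours (0-23)."""
    if not hour_field.startswith("*/"):
        raise ValueError("this sketch only understands '*/N' hour fields")
    step = int(hour_field[2:])
    return list(range(0, 24, step))


# Values copied from the workflow above.
cron = "0 */3 * * *"
role_duration_seconds = 10800

# The cron fields are: minute, hour, day of month, month, day of week.
minute, hour, *_ = cron.split()
run_hours = hours_matching_step(hour)
role_duration_hours = role_duration_seconds / 3600

print("runs at UTC hours:", run_hours)
print("role duration in hours:", role_duration_hours)
```

If the interval in the cron expression changes, the role duration is worth revisiting so that a long-running build does not outlive its credentials.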
@@ -1,5 +1,131 @@
 # Changelog
 
+## v2.242.0 (2025-03-14)
+
+### Features
+
+ * add integ tests for training JumpStart models in private hub
+
+### Bug Fixes and Other Changes
+
+ * Torch upgrade
+ * Prevent RunContext overlap between test_run tests
+ * remove s3 output location requirement from hub class init
+ * Fixing Pytorch training python version in tests
+ * update image_uri_configs  03-11-2025 07:18:09 PST
+ * resolve infinite loop in _find_config on Windows systems
+ * pipeline definition function doc update
+
+## v2.241.0 (2025-03-06)
+
+### Features
+
+ * Make DistributedConfig Extensible
+ * support training for JumpStart model references as part of Curated Hub Phase 2
+ * Allow ModelTrainer to accept hyperparameters file
+
+### Bug Fixes and Other Changes
+
+ * Skip tests with deprecated instance type
+ * Ensure Model.is_repack() returns a boolean
+ * Fix error when there is no session to call _create_model_request()
+ * Use sagemaker session's s3_resource in download_folder
+ * Added check for the presence of model package group before creating one
+ * Fix key error in _send_metrics()
+
+## v2.240.0 (2025-02-25)
+
+### Features
+
+ * Add support for TGI Neuronx 0.0.27 and HF PT 2.3.0 image in PySDK
+
+### Bug Fixes and Other Changes
+
+ * Remove main function entrypoint in ModelBuilder dependency manager.
+ * forbid extras in Configs
+ * altconfig hubcontent and reenable integ test
+ * Merge branch 'master-rba' into local_merge
+ * py_version doc fixes
+ * Add backward compatibility for RecordSerializer and RecordDeserializer
+ * update image_uri_configs  02-21-2025 06:18:10 PST
+ * update image_uri_configs  02-20-2025 06:18:08 PST
+
+### Documentation Changes
+
+ * Removed a line about python version requirements of training script which can misguide users.
+
+## v2.239.3 (2025-02-19)
+
+### Bug Fixes and Other Changes
+
+ * added ap-southeast-7 and mx-central-1 for Jumpstart
+ * update image_uri_configs  02-19-2025 06:18:15 PST
+
+## v2.239.2 (2025-02-18)
+
+### Bug Fixes and Other Changes
+
+ * Add warning about not supporting torch.nn.SyncBatchNorm
+ * pass in inference_ami_version to model_based endpoint type
+ * Fix hyperparameter strategy docs
+ * Add framework_version to all TensorFlowModel examples
+ * Move RecordSerializer and RecordDeserializer to sagemaker.serializers and sagemaker.deserializers
+
+## v2.239.1 (2025-02-14)
+
+### Bug Fixes and Other Changes
+
+ * keep sagemaker_session from being overridden to None
+ * Fix all type hint and docstrings for callable
+ * Fix the workshop link for Step Functions
+ * Fix Tensorflow doc link
+ * Fix FeatureGroup docstring
+ * Add type hint for ProcessingOutput
+ * Fix sourcedir.tar.gz filenames in docstrings
+ * Fix documentation for local mode
+ * bug in get latest version was getting the max sorted alphabetically
+ * Add cleanup logic to model builder integ tests for endpoints
+ * Fixed pagination failing while listing collections
+ * fix ValueError when updating a data quality monitoring schedule
+ * Add docstring for image_uris.retrieve
+ * Create GitHub action to trigger canaries
+ * update image_uri_configs  02-04-2025 06:18:00 PST
+
+## v2.239.0 (2025-02-01)
+
+### Features
+
+ * Add support for deepseek recipes
+
+### Bug Fixes and Other Changes
+
+ * mpirun protocol - distributed training with @remote decorator
+ * Allow telemetry only in supported regions
+ * Fix ssh host policy
+
+## v2.238.0 (2025-01-29)
+
+### Features
+
+ * use jumpstart deployment config image as default optimization image
+
+### Bug Fixes and Other Changes
+
+ * chore: add new images for HF TGI
+ * update image_uri_configs  01-29-2025 06:18:08 PST
+ * skip TF tests for unsupported versions
+ * Merge branch 'master-rba' into local_merge
+ * Add missing attributes to local resourceconfig
+ * update image_uri_configs  01-27-2025 06:18:13 PST
+ * update image_uri_configs  01-24-2025 06:18:11 PST
+ * add missing schema definition in docs
+ * Omegaconf upgrade
+ * SageMaker @remote function: Added multi-node functionality
+ * remove option
+ * fix typo
+ * fix tests
+ * Add an option for user to remove inputs and container artifacts when using local model trainer
+
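
The v2.239.1 entry about "get latest version" describes a common pitfall: taking the maximum of version strings compares them character by character rather than numerically. The following is a minimal sketch of the pitfall and one possible fix, not the SDK's actual code. `latest_version` is a hypothetical helper and assumes purely numeric dotted versions (no `.dev0` or similar suffixes).

```python
def latest_version(versions):
    """Return the highest version, comparing each dotted component as an integer."""
    return max(versions, key=lambda v: tuple(int(part) for part in v.split(".")))


versions = ["2.9.0", "2.10.0", "2.237.3"]

# String comparison: '9' sorts after '1' and '2', so "2.9.0" wins incorrectly.
alphabetical_max = max(versions)

# Numeric comparison: (2, 237, 3) is the largest tuple.
numeric_max = latest_version(versions)

print(alphabetical_max)
print(numeric_max)
```

For versions with pre-release or development suffixes, a dedicated parser such as `packaging.version.Version` would be a more robust choice than this tuple-based key.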
 ## v2.237.3 (2025-01-09)
 
 ### Bug Fixes and Other Changes
 
@@ -61,6 +61,10 @@ Before sending us a pull request, please ensure that:
    1. Follow the instructions at [Modifying an EBS Volume Using Elastic Volumes (Console)](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/requesting-ebs-volume-modifications.html#modify-ebs-volume) to increase the EBS volume size associated with the newly created EC2 instance.
    1. Wait 5-10min for the new EBS volume increase to finalize.
    1. Allow EC2 to claim the additional space by stopping and then starting your EC2 host.
+2. Set up a venv to manage dependencies:
+   1. `python -m venv ~/.venv/myproject-env` to create the venv
+   2. `source ~/.venv/myproject-env/bin/activate` to activate the venv
+   3. `deactivate` to exit the venv
 
 
 ### Pull Down the Code
@@ -74,8 +78,8 @@ Before sending us a pull request, please ensure that:
 ### Run the Unit Tests
 
 1. Install tox using `pip install tox`
-1. Install coverage using `pip install .[test]`
-1. cd into the sagemaker-python-sdk folder: `cd sagemaker-python-sdk` or `cd /environment/sagemaker-python-sdk`
+1. cd into the github project sagemaker-python-sdk folder: `cd sagemaker-python-sdk` or `cd /environment/sagemaker-python-sdk`
+1. Install coverage using `pip install '.[test]'`
 1. Run the following tox command and verify that all code checks and unit tests pass: `tox tests/unit`
 1. You can also run a single test with the following command: `tox -e py310 -- -s -vv <path_to_file><file_name>::<test_function_name>`
 1. You can run coverage via the runcoverage env: `tox -e runcoverage -- tests/unit` or `tox -e py310 -- tests/unit --cov=sagemaker --cov-append --cov-report xml`
 
@@ -1 +1 @@
-2.237.4.dev0
+2.242.1.dev0
@@ -28,8 +28,6 @@ To train a PyTorch model by using the SageMaker Python SDK:
 Prepare a PyTorch Training Script
 =================================
 
-Your PyTorch training script must be a Python 3.6 compatible source file.
-
 Prepare your script in a separate source file than the notebook, terminal session, or source file you're
 using to submit the script to SageMaker via a ``PyTorch`` Estimator. This will be discussed in further detail below.
 
@@ -375,6 +373,9 @@ To initialize distributed training in your script, call
 `torch.distributed.init_process_group
 <https://pytorch.org/docs/master/distributed.html#torch.distributed.init_process_group>`_
 with the desired backend and the rank of the current host.
+Warning: Some torch features, such as (and likely not limited to) ``torch.nn.SyncBatchNorm``,
+are not supported, and their presence in ``init_process_group`` will cause an exception during
+distributed training.
 
 .. code:: python
 
 
@@ -64,7 +64,7 @@ If you already have existing model artifacts in S3, you can skip training and de
 
   from sagemaker.tensorflow import TensorFlowModel
 
-  model = TensorFlowModel(model_data='s3://mybucket/model.tar.gz', role='MySageMakerRole')
+  model = TensorFlowModel(model_data='s3://mybucket/model.tar.gz', role='MySageMakerRole', framework_version='x.x.x')
 
   predictor = model.deploy(initial_instance_count=1, instance_type='ml.c5.xlarge')
 
@@ -74,7 +74,7 @@ Python-based TensorFlow serving on SageMaker has support for `Elastic Inference
 
     from sagemaker.tensorflow import TensorFlowModel
 
-    model = TensorFlowModel(model_data='s3://mybucket/model.tar.gz', role='MySageMakerRole')
+    model = TensorFlowModel(model_data='s3://mybucket/model.tar.gz', role='MySageMakerRole', framework_version='x.x.x')
 
     predictor = model.deploy(initial_instance_count=1, instance_type='ml.c5.xlarge', accelerator_type='ml.eia1.medium')
 
 
@@ -246,7 +246,7 @@ Training with parameter servers
 
 If you specify parameter_server as the value of the distribution parameter, the container launches a parameter server
 thread on each instance in the training cluster, and then executes your training code. You can find more information on
-TensorFlow distributed training at `TensorFlow docs <https://www.tensorflow.org/deploy/distributed>`__.
+TensorFlow distributed training at `TensorFlow docs <https://www.tensorflow.org/guide/distributed_training>`__.
 To enable parameter server training:
 
 .. code:: python
@@ -468,7 +468,7 @@ If you already have existing model artifacts in S3, you can skip training and de
 
   from sagemaker.tensorflow import TensorFlowModel
 
-  model = TensorFlowModel(model_data='s3://mybucket/model.tar.gz', role='MySageMakerRole')
+  model = TensorFlowModel(model_data='s3://mybucket/model.tar.gz', role='MySageMakerRole', framework_version='x.x.x')
 
   predictor = model.deploy(initial_instance_count=1, instance_type='ml.c5.xlarge')
 
@@ -478,7 +478,7 @@ Python-based TensorFlow serving on SageMaker has support for `Elastic Inference
 
     from sagemaker.tensorflow import TensorFlowModel
 
-    model = TensorFlowModel(model_data='s3://mybucket/model.tar.gz', role='MySageMakerRole')
+    model = TensorFlowModel(model_data='s3://mybucket/model.tar.gz', role='MySageMakerRole', framework_version='x.x.x')
 
     predictor = model.deploy(initial_instance_count=1, instance_type='ml.c5.xlarge', accelerator_type='ml.eia1.medium')
 
@@ -767,7 +767,8 @@ This customized Python code must be named ``inference.py`` and is specified thro
 
     model = TensorFlowModel(entry_point='inference.py',
                             model_data='s3://mybucket/model.tar.gz',
-                            role='MySageMakerRole')
+                            role='MySageMakerRole',
+                            framework_version='x.x.x')
 
 In the example above, ``inference.py`` is assumed to be a file inside ``model.tar.gz``. If you want to use a local file instead, you must add the ``source_dir`` argument. See the documentation on `TensorFlowModel <https://sagemaker.readthedocs.io/en/stable/frameworks/tensorflow/sagemaker.tensorflow.html#sagemaker.tensorflow.model.TensorFlowModel>`_.
 
@@ -923,7 +924,8 @@ processing. There are 2 ways to do this:
     model = TensorFlowModel(entry_point='inference.py',
                             dependencies=['requirements.txt'],
                             model_data='s3://mybucket/model.tar.gz',
-                            role='MySageMakerRole')
+                            role='MySageMakerRole',
+                            framework_version='x.x.x')
 
 
 2. If you are working in a network-isolation situation or if you don't
@@ -941,7 +943,8 @@ processing. There are 2 ways to do this:
     model = TensorFlowModel(entry_point='inference.py',
                            dependencies=['/path/to/folder/named/lib'],
                            model_data='s3://mybucket/model.tar.gz',
-                           role='MySageMakerRole')
+                           role='MySageMakerRole',
+                           framework_version='x.x.x')
 
 For more information, see: https://github.com/aws/sagemaker-tensorflow-serving-container#prepost-processing
 
 
@@ -30,6 +30,11 @@ To train a model by using the SageMaker Python SDK, you:
 
 After you train a model, you can save it, and then serve the model as an endpoint to get real-time inferences or get inferences for an entire dataset by using batch transform.
 
+
+Important Note:
+
+*  When using torch to load models, we recommend torch>=2.6.0 and torchvision>=0.17.0.
+
 Prepare a Training script
 =========================
 
@@ -1958,15 +1963,15 @@ Make sure to have a Compose Version compatible with your Docker Engine installat
 Local mode configuration
 ========================
 
-The local mode uses a YAML configuration file located at ``~/.sagemaker/config.yaml`` to define the default values that are automatically passed to the ``config`` attribute of ``LocalSession``. This is an example of the configuration, for the full schema, see `sagemaker.config.config_schema.SAGEMAKER_PYTHON_SDK_LOCAL_MODE_CONFIG_SCHEMA <https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/config/config_schema.py>`_.
+The local mode uses a YAML configuration file located at ``${user_config_directory}/sagemaker/config.yaml`` to define the default values that are automatically passed to the ``config`` attribute of ``LocalSession``. This is an example of the configuration, for the full schema, see `sagemaker.config.config_schema.SAGEMAKER_PYTHON_SDK_LOCAL_MODE_CONFIG_SCHEMA <https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/config/config_schema.py>`_.
 
 .. code:: yaml
 
     local:
         local_code: true # Using everything locally
         region_name: "us-west-2" # Name of the region
         container_config: # Additional docker container config
-            shm_size: "128M
+            shm_size: "128M"
 
 If you want to keep everything local, and not use Amazon S3 either, you can enable "local code" in one of two ways:
 
@@ -2565,6 +2570,9 @@ set default values for. For the full schema, see `sagemaker.config.config_schema
           KmsKeyId: 'kmskeyid10'
         TransformResources:
           VolumeKmsKeyId: 'volumekmskeyid4'
+        Tags:
+        - Key: 'tag_key'
+          Value: 'tag_value'
       CompilationJob:
       # https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateCompilationJob.html
         OutputConfig:
 
@@ -324,9 +324,9 @@ The follow serializer/deserializer classes have been renamed and/or moved:
 +--------------------------------------------------------+-------------------------------------------------------+
 | ``sagemaker.predictor._NPYSerializer``                 | ``sagemaker.serializers.NumpySerializer``             |
 +--------------------------------------------------------+-------------------------------------------------------+
-| ``sagemaker.amazon.common.numpy_to_record_serializer`` | ``sagemaker.amazon.common.RecordSerializer``          |
+| ``sagemaker.amazon.common.numpy_to_record_serializer`` | ``sagemaker.serializers.RecordSerializer``            |
 +--------------------------------------------------------+-------------------------------------------------------+
-| ``sagemaker.amazon.common.record_deserializer``        | ``sagemaker.amazon.common.RecordDeserializer``        |
+| ``sagemaker.amazon.common.record_deserializer``        | ``sagemaker.deserializers.RecordDeserializer``        |
 +--------------------------------------------------------+-------------------------------------------------------+
 | ``sagemaker.predictor._JsonDeserializer``              | ``sagemaker.deserializers.JSONDeserializer``          |
 +--------------------------------------------------------+-------------------------------------------------------+
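
Code that must run against SDK versions from both before and after this move could resolve the class from whichever module provides it. The following is a minimal sketch of that idea. `resolve_first` is a hypothetical helper, not part of the SageMaker SDK, and the module paths are the ones listed in the table above.

```python
import importlib


def resolve_first(candidates, attr):
    """Return attribute `attr` from the first importable module that defines it."""
    for module_path in candidates:
        try:
            module = importlib.import_module(module_path)
        except ImportError:
            continue  # module missing in this environment, try the next path
        if hasattr(module, attr):
            return getattr(module, attr)
    raise ImportError(f"{attr} not found in any of: {', '.join(candidates)}")


# Prefer the new location and fall back to the legacy one.
try:
    RecordSerializer = resolve_first(
        ["sagemaker.serializers", "sagemaker.amazon.common"],
        "RecordSerializer",
    )
except ImportError:
    RecordSerializer = None  # the sagemaker package is not installed here
```

The changelog for v2.240.0 mentions added backward compatibility for these two classes, so a fallback like this may be unnecessary on recent releases.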
 
@@ -11,5 +11,5 @@ without having to provision and integrate the AWS services separately.
 The AWS Step Functions Python SDK uses the SageMaker Python SDK as a dependency.
 To get started with step functions, try the workshop or visit the SDK's website:
 
-* `Workshop on using AWS Step Functions with SageMaker <https://www.sagemakerworkshop.com/step/>`__
+* `Create and manage Amazon SageMaker AI jobs with Step Functions <https://docs.aws.amazon.com/step-functions/latest/dg/connect-sagemaker.html>`__
 * `AWS Step Functions Python SDK website <https://aws-step-functions-data-science-sdk.readthedocs.io/en/stable/>`__