Closed
54 commits
e608801
fix: Allow Py 3.7 for MMS Test Docker env (#3080)
shreyapandit Dec 2, 2022
16be276
refactoring : using with statement (#3286)
maldil Dec 2, 2022
12e500d
Update local_requirements.txt PyYAML version (#3095)
shreyapandit Dec 2, 2022
738c75e
feature: Update TF 2.9 and TF 2.10 inference DLCs (#3465)
arjkesh Dec 2, 2022
ab19cfc
feature: Added transform with monitoring pipeline step in transformer…
keshav-chandak Dec 2, 2022
39dd6bb
fix: Fix bug forcing uploaded tar to be named sourcedir (#3412)
claytonparnell Dec 2, 2022
e6e64a3
feature: Add Code Owners file (#3503)
navinsoni Dec 2, 2022
bccf062
prepare release v2.119.0
Dec 3, 2022
e6f9f2b
update development version to v2.119.1.dev0
Dec 3, 2022
3cc428a
feature: Add DXB region to frameworks by DLC (#3387)
RadhikaB-97 Dec 5, 2022
a217f28
fix: support idempotency for framework and spark processors (#3460)
brockwade633 Dec 5, 2022
18a2d76
feature: Update registries with new region account number mappings. (…
kenny-ezirim Dec 6, 2022
20a4ade
feature: Adding support for SageMaker Training Compiler in PyTorch es…
Lokiiiiii Dec 7, 2022
9982f57
feature: Add Neo image uri config for Pytorch 1.12 (#3507)
HappyAmazonian Dec 7, 2022
ecee181
prepare release v2.120.0
Dec 7, 2022
cab178f
update development version to v2.120.1.dev0
Dec 7, 2022
a5124b3
feature: Algorithms Region Expansion OSU/DXB (#3508)
malav-shastri Dec 7, 2022
5bb0e1f
fix: Add constraints file for apache-airflow (#3510)
navinsoni Dec 7, 2022
881c113
fix: FrameworkProcessor S3 uploads (#3493)
brockwade633 Dec 8, 2022
593eafa
prepare release v2.121.0
Dec 8, 2022
33a0f2b
update development version to v2.121.1.dev0
Dec 8, 2022
aa45047
Fix: Differentiate SageMaker Training Compiler's PT DLCs from base PT…
Lokiiiiii Dec 8, 2022
3b1cd36
fix: Fix failing jumpstart cache unit tests (#3514)
evakravi Dec 8, 2022
ba7761a
fix: Pop out ModelPackageName from pipeline definition (#3472)
qidewenwhen Dec 9, 2022
86d7dc3
prepare release v2.121.1
Dec 9, 2022
602579b
update development version to v2.121.2.dev0
Dec 9, 2022
36d6134
fix: Skip Bad Transform Test (#3521)
amzn-choeric Dec 9, 2022
a8aaf6f
change: Update for Tensorflow Serving 2.11 inference DLCs (#3509)
hballuru Dec 9, 2022
e8df8a1
prepare release v2.121.2
Dec 12, 2022
d0df280
update development version to v2.121.3.dev0
Dec 12, 2022
be8c416
feature: Add OSU region to frameworks for DLC (#3532)
kace Dec 12, 2022
fc4f033
fix: Remove content type image/jpg from analysis configuration schema…
xgchena Dec 12, 2022
ceafb18
fix: unpin packaging version (#3533)
claytonparnell Dec 13, 2022
2d35567
fix: the Hyperband support fix for the HPO (#3516)
repushko Dec 13, 2022
d7867dd
feature: Feature Store dataset builder, delete_record, get_record, li…
mizanfiu Dec 14, 2022
1fba9b9
prepare release v2.122.0
Dec 14, 2022
975c32d
update development version to v2.122.1.dev0
Dec 14, 2022
46fcc16
feature: Add SageMaker Experiment (#3536)
qidewenwhen Dec 14, 2022
1b76191
feature: Add support for TF2.9.2 training images (#3178)
tejaschumbalkar Dec 14, 2022
9d051ea
prepare release v2.123.0
Dec 15, 2022
fc86e7a
update development version to v2.123.1.dev0
Dec 15, 2022
136b87b
feature: Added doc update for dataset builder (#3539)
mizanfiu Dec 15, 2022
70f88f4
feature: Add disable_profiler field in config and propagate changes (…
mariumof Dec 15, 2022
06ae740
Use Async Inference Config when available for endpoint update (#3124)
shreyapandit Dec 15, 2022
70c80de
feature: Add p4de to smddp supported instance types (#3483)
carolynwang Dec 15, 2022
bd742a6
documentation: smdistributed libraries release notes (#3543)
mchoi8739 Dec 15, 2022
318462d
feature: Doc update for TableFormatEnum (#3542)
mizanfiu Dec 15, 2022
8bcb760
prepare release v2.124.0
Dec 16, 2022
3f116a3
update development version to v2.124.1.dev0
Dec 16, 2022
2112d92
fix: Correct SageMaker Clarify API docstrings by changing JSONPath to…
xgchena Dec 16, 2022
3b01689
feature: add RandomSeed to support reproducible HPO (#3519)
timyber Dec 16, 2022
ffb9475
prepare release v2.125.0
Dec 19, 2022
c557a77
update development version to v2.125.1.dev0
Dec 19, 2022
4326568
build(deps): bump apache-airflow in /requirements/extras
dependabot[bot] Dec 19, 2022
5 changes: 3 additions & 2 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -30,5 +30,6 @@ env/
.vscode/
**/tmp
.python-version
**/_repack_model.py
**/_repack_script_launcher.sh
**/_repack_script_launcher.sh
tests/data/**/_repack_model.py
tests/data/experiment/sagemaker-dev-1.0.tar.gz
114 changes: 114 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,119 @@
# Changelog

## v2.125.0 (2022-12-19)

### Features

* add RandomSeed to support reproducible HPO

### Bug Fixes and Other Changes

* Correct SageMaker Clarify API docstrings by changing JSONPath to JMESPath

## v2.124.0 (2022-12-16)

### Features

* Doc update for TableFormatEnum
* Add p4de to smddp supported instance types
* Add disable_profiler field in config and propagate changes
* Added doc update for dataset builder

### Bug Fixes and Other Changes

* Use Async Inference Config when available for endpoint update

### Documentation Changes

* smdistributed libraries release notes

## v2.123.0 (2022-12-15)

### Features

* Add support for TF2.9.2 training images
* Add SageMaker Experiment

## v2.122.0 (2022-12-14)

### Features

* Feature Store dataset builder, delete_record, get_record, list_feature_group
* Add OSU region to frameworks for DLC

### Bug Fixes and Other Changes

* the Hyperband support fix for the HPO
* unpin packaging version
* Remove content type image/jpg from analysis configuration schema

## v2.121.2 (2022-12-12)

### Bug Fixes and Other Changes

* Update for Tensorflow Serving 2.11 inference DLCs
* Revert "fix: type hint of PySparkProcessor __init__"
* Skip Bad Transform Test

## v2.121.1 (2022-12-09)

### Bug Fixes and Other Changes

* Pop out ModelPackageName from pipeline definition
* Fix failing jumpstart cache unit tests

## v2.121.0 (2022-12-08)

### Features

* Algorithms Region Expansion OSU/DXB

### Bug Fixes and Other Changes

* FrameworkProcessor S3 uploads
* Add constraints file for apache-airflow

## v2.120.0 (2022-12-07)

### Features

* Add Neo image uri config for Pytorch 1.12
* Adding support for SageMaker Training Compiler in PyTorch estimator starting 1.12
* Update registries with new region account number mappings.
* Add DXB region to frameworks by DLC

### Bug Fixes and Other Changes

* support idempotency for framework and spark processors

## v2.119.0 (2022-12-03)

### Features

* Add Code Owners file
* Added transform with monitoring pipeline step in transformer
* Update TF 2.9 and TF 2.10 inference DLCs
* make estimator accept json file as modelparallel config
* SageMaker Training Compiler does not support p4de instances
* Add support for SparkML v3.3

### Bug Fixes and Other Changes

* Fix bug forcing uploaded tar to be named sourcedir
* Update local_requirements.txt PyYAML version
* refactoring : using with statement
* Allow Py 3.7 for MMS Test Docker env
* fix PySparkProcessor __init__ params type
* type hint of PySparkProcessor __init__
* Return ARM XGB/SKLearn tags if `image_scope` is `inference_graviton`
* Update scipy to 1.7.3 to support M1 development envs
* Fixing type hints for Spark processor that has instance type/count params in reverse order
* Add DeepAR ap-northeast-3 repository.
* Fix AsyncInferenceConfig documentation typo
* fix ml_inf to ml_inf1 in Neo multi-version support
* Fix type annotations
* add neo mvp region accounts

## v2.118.0 (2022-12-01)

### Features
1 change: 1 addition & 0 deletions CODEOWNERS
@@ -0,0 +1 @@
* @aws/sagemaker-ml-frameworks
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
2.118.1.dev0
2.125.1.dev0
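
For illustration only (not part of this diff): the commit list pairs every "prepare release vX.Y.Z" commit with an "update development version to vX.Y.(Z+1).dev0" commit, which is how VERSION moves from 2.118.1.dev0 to 2.125.1.dev0. A minimal sketch of that convention, assuming plain three-part release numbers; `next_dev_version` is a hypothetical helper, not the project's release tooling.

```python
def next_dev_version(released: str) -> str:
    """Return the development version that follows a release.

    Hypothetical helper: assumes a MAJOR.MINOR.PATCH string, optionally
    prefixed with "v" as in the commit messages.
    """
    major, minor, patch = (int(part) for part in released.lstrip("v").split("."))
    # The development version bumps the patch number and appends ".dev0".
    return f"{major}.{minor}.{patch + 1}.dev0"


print(next_dev_version("v2.125.0"))
```

Applied to the last release in this range, v2.125.0, the helper yields the value now stored in VERSION.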
12 changes: 12 additions & 0 deletions doc/api/prep_data/feature_store.rst
Expand Up @@ -72,3 +72,15 @@ Inputs
.. autoclass:: sagemaker.feature_store.inputs.FeatureValue
    :members:
    :show-inheritance:

.. autoclass:: sagemaker.feature_store.inputs.TableFormatEnum
    :members:
    :show-inheritance:


Dataset Builder
***************

.. autoclass:: sagemaker.feature_store.dataset_builder.DatasetBuilder
    :members:
    :show-inheritance:
4 changes: 2 additions & 2 deletions doc/api/training/sdp_versions/latest.rst
Expand Up @@ -26,8 +26,8 @@ depending on the version of the library you use.
<https://docs.aws.amazon.com/sagemaker/latest/dg/data-parallel-use-api.html#data-parallel-use-python-skd-api>`_
for more information.

Version 1.4.0, 1.4.1, 1.5.0 (Latest)
====================================
Version 1.4.0, 1.4.1, 1.5.0, 1.6.0 (Latest)
===========================================

.. toctree::
:maxdepth: 1
Expand Up @@ -7,9 +7,51 @@ Release Notes
New features, bug fixes, and improvements are regularly made to the SageMaker
distributed data parallel library.

SageMaker Distributed Data Parallel 1.5.0 Release Notes
SageMaker Distributed Data Parallel 1.6.0 Release Notes
=======================================================

*Date: Dec. 15. 2022*

**New Features**

* New optimized SMDDP AllGather collective to complement the sharded data parallelism technique
  in the SageMaker model parallelism library. For more information, see `Sharded data parallelism with SMDDP Collectives
  <https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-extended-features-pytorch-sharded-data-parallelism.html#model-parallel-extended-features-pytorch-sharded-data-parallelism-smddp-collectives>`_
  in the *Amazon SageMaker Developer Guide*.
* Added support for Amazon EC2 ``ml.p4de.24xlarge`` instances. You can run data parallel training jobs
  on ``ml.p4de.24xlarge`` instances with the SageMaker data parallelism library’s AllReduce collective.

**Improvements**

* General performance improvements of the SMDDP AllReduce collective communication operation.

**Migration to AWS Deep Learning Containers**

This version passed benchmark testing and is migrated to the following AWS Deep Learning Containers (DLC):

- SageMaker training container for PyTorch v1.12.1

  .. code::

    763104351884.dkr.ecr.<region>.amazonaws.com/pytorch-training:1.12.1-gpu-py38-cu113-ubuntu20.04-sagemaker


Binary file of this version of the library for `custom container
<https://docs.aws.amazon.com/sagemaker/latest/dg/data-parallel-use-api.html#data-parallel-bring-your-own-container>`_ users:

.. code::

  https://smdataparallel.s3.amazonaws.com/binary/pytorch/1.12.1/cu113/2022-12-05/smdistributed_dataparallel-1.6.0-cp38-cp38-linux_x86_64.whl
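
For illustration only (not part of this diff): a minimal sketch of estimator settings that would run a data parallel job on the newly supported ``ml.p4de.24xlarge`` instances. The helper name ``smddp_estimator_kwargs`` and the default instance count are assumptions made for this example. The framework and Python versions are read off the DLC tag in these release notes, and the ``distribution`` dictionary follows the form the SageMaker Python SDK documents for enabling the data parallelism library.

```python
def smddp_estimator_kwargs(instance_count: int = 2) -> dict:
    """Build keyword arguments for a SageMaker PyTorch estimator.

    Hypothetical helper for illustration; it only assembles a dictionary
    and does not call any AWS service.
    """
    return {
        # Versions taken from the DLC tag: 1.12.1-gpu-py38-cu113-...
        "framework_version": "1.12.1",
        "py_version": "py38",
        # Instance type added in SMDDP 1.6.0.
        "instance_type": "ml.p4de.24xlarge",
        "instance_count": instance_count,
        # Turns on the SageMaker distributed data parallel library.
        "distribution": {"smdistributed": {"dataparallel": {"enabled": True}}},
    }


kwargs = smddp_estimator_kwargs()
print(kwargs["distribution"])
```

With the SDK installed and AWS credentials configured, these arguments could be passed along with an entry point and an IAM role to ``sagemaker.pytorch.PyTorch``; whether a given SDK release accepts this exact framework version should be checked against its image URI configuration.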


----

Release History
===============

SageMaker Distributed Data Parallel 1.5.0 Release Notes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

*Date: Jul. 26. 2022*

**Currency Updates**
Expand Down Expand Up @@ -38,12 +80,6 @@ Binary file of this version of the library for `custom container

https://smdataparallel.s3.amazonaws.com/binary/pytorch/1.12.0/cu113/2022-07-01/smdistributed_dataparallel-1.5.0-cp38-cp38-linux_x86_64.whl


----

Release History
===============

SageMaker Distributed Data Parallel 1.4.1 Release Notes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Expand Up @@ -6,9 +6,60 @@ New features, bug fixes, and improvements are regularly made to the SageMaker
distributed model parallel library.


SageMaker Distributed Model Parallel 1.11.0 Release Notes
SageMaker Distributed Model Parallel 1.13.0 Release Notes
=========================================================

*Date: Dec. 15. 2022*

**New Features**

* Sharded data parallelism now supports a new backend for collectives called *SMDDP Collectives*.
  For supported scenarios, SMDDP Collectives are on by default for the AllGather operation.
  For more information, see
  `Sharded data parallelism with SMDDP Collectives
  <https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-extended-features-pytorch-sharded-data-parallelism.html#model-parallel-extended-features-pytorch-sharded-data-parallelism-smddp-collectives>`_
  in the *Amazon SageMaker Developer Guide*.
* Introduced FlashAttention for DistributedTransformer to improve memory usage and computational
  performance of models such as GPT2, GPTNeo, GPTJ, GPTNeoX, BERT, and RoBERTa.

**Bug Fixes**

* Fixed initialization of ``lm_head`` in DistributedTransformer to use a provided range
  for initialization, when weights are not tied with the embeddings.

**Improvements**

* Introduced an optimization that executes a module with no parameters
  on the same rank as its parent during pipeline parallelism.

**Migration to AWS Deep Learning Containers**

This version passed benchmark testing and is migrated to the following AWS Deep Learning Containers (DLC):

- SageMaker training container for PyTorch v1.12.1

  .. code::

    763104351884.dkr.ecr.<region>.amazonaws.com/pytorch-training:1.12.1-gpu-py38-cu113-ubuntu20.04-sagemaker


Binary file of this version of the library for `custom container
<https://docs.aws.amazon.com/sagemaker/latest/dg/model-parallel-sm-sdk.html#model-parallel-bring-your-own-container>`_ users:

- For PyTorch 1.12.1

  .. code::

    https://sagemaker-distributed-model-parallel.s3.us-west-2.amazonaws.com/pytorch-1.12.1/build-artifacts/2022-12-08-21-34/smdistributed_modelparallel-1.13.0-cp38-cp38-linux_x86_64.whl
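
For illustration only (not part of this diff): a minimal sketch of a ``distribution`` configuration that turns on sharded data parallelism in the model parallel library, the feature that SMDDP Collectives now back by default for AllGather. The helper name ``sharded_data_parallel_distribution`` and the default of eight processes per host are assumptions for this example. The parameter names ``ddp`` and ``sharded_data_parallel_degree`` are taken from the Developer Guide page linked above as recalled here, so verify them against that page before use.

```python
def sharded_data_parallel_distribution(degree: int, processes_per_host: int = 8) -> dict:
    """Build a `distribution` dict for a SageMaker PyTorch estimator.

    Hypothetical helper for illustration; it only assembles a dictionary.
    """
    if degree < 1:
        raise ValueError("sharded data parallel degree must be at least 1")
    return {
        "smdistributed": {
            "modelparallel": {
                "enabled": True,
                "parameters": {
                    # Sharded data parallelism runs on top of data parallelism.
                    "ddp": True,
                    # Number of GPUs across which training state is sharded.
                    "sharded_data_parallel_degree": degree,
                },
            }
        },
        # The model parallel library launches its workers through MPI.
        "mpi": {"enabled": True, "processes_per_host": processes_per_host},
    }


distribution = sharded_data_parallel_distribution(degree=16)
print(distribution["smdistributed"]["modelparallel"]["parameters"])
```

The sketch sets no option for SMDDP Collectives themselves, since the release notes state that they are on by default for supported scenarios.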

----

Release History
===============

SageMaker Distributed Model Parallel 1.11.0 Release Notes
---------------------------------------------------------

*Date: August. 17. 2022*

**New Features**
Expand Down Expand Up @@ -41,12 +92,7 @@ Binary file of this version of the library for `custom container

.. code::

https://sagemaker-distributed-model-parallel.s3.us-west-2.amazonaws.com/pytorch-1.12.0/build-artifacts/2022-08-12-16-58/smdistributed_modelparallel-1.11.0-cp38-cp38-linux_x86_64.whl

----

Release History
===============
https://sagemaker-distribu

SageMaker Distributed Model Parallel 1.10.1 Release Notes
---------------------------------------------------------
4 changes: 2 additions & 2 deletions doc/api/training/smp_versions/latest.rst
Expand Up @@ -10,8 +10,8 @@ depending on which version of the library you need to use.
To use the library, reference the
**Common API** documentation alongside the framework specific API documentation.

Version 1.11.0 (Latest)
===========================================
Version 1.11.0, 1.13.0 (Latest)
===============================

To use the library, reference the Common API documentation alongside the framework specific API documentation.

10 changes: 10 additions & 0 deletions doc/experiments/index.rst
@@ -0,0 +1,10 @@
############################
Amazon SageMaker Experiments
############################

The SageMaker Python SDK lets you track and organize your machine learning workflows, whether they run as SageMaker jobs (such as Processing, Training, and Transform) or locally.

.. toctree::
    :maxdepth: 2

    sagemaker.experiments
20 changes: 20 additions & 0 deletions doc/experiments/sagemaker.experiments.rst
@@ -0,0 +1,20 @@
Experiments
============

Run
-------------

.. autoclass:: sagemaker.experiments.Run
    :members:

.. automethod:: sagemaker.experiments.load_run

.. automethod:: sagemaker.experiments.list_runs

.. autoclass:: sagemaker.experiments.SortByType
    :members:
    :undoc-members:

.. autoclass:: sagemaker.experiments.SortOrderType
    :members:
    :undoc-members:
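
For illustration only (not part of this diff): a minimal sketch of how the ``Run`` class documented above might be used to record hyperparameters and a per-step metric. The function name ``track_training_run``, the metric name ``train_loss``, and the example arguments are assumptions for this sketch. Running it requires the ``sagemaker`` package (v2.123.0 or later, where Experiments support was added) and configured AWS credentials.

```python
def track_training_run(experiment_name: str, run_name: str, params: dict, losses: list) -> None:
    """Log parameters and per-step losses to a SageMaker Experiments run.

    Hypothetical helper for illustration only.
    """
    # Imported lazily so this module can be loaded without the SDK installed.
    from sagemaker.experiments import Run

    # The context manager creates (or loads) the run and closes it on exit.
    with Run(experiment_name=experiment_name, run_name=run_name) as run:
        for name, value in params.items():
            run.log_parameter(name, value)
        for step, loss in enumerate(losses):
            run.log_metric(name="train_loss", value=loss, step=step)
```

A call such as ``track_training_run("demo-experiment", "run-1", {"lr": 0.01}, [0.9, 0.5, 0.3])`` would then create or reuse the named experiment and run in the account that the default session resolves to.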
10 changes: 10 additions & 0 deletions doc/index.rst
Expand Up @@ -60,6 +60,16 @@ Orchestrate your SageMaker training and inference workflows with Airflow and Kub
workflows/index


****************************
Amazon SageMaker Experiments
****************************
You can use Amazon SageMaker Experiments to track machine learning experiments.

.. toctree::
    :maxdepth: 2

    experiments/index

*************************
Amazon SageMaker Debugger
*************************
4 changes: 3 additions & 1 deletion requirements/extras/test_requirements.txt
Expand Up @@ -11,11 +11,13 @@ contextlib2==21.6.0
awslogs==0.14.0
black==22.3.0
stopit==1.1.2
apache-airflow==2.4.1
# Update tox.ini to have correct version of airflow constraints file
apache-airflow==2.4.3
apache-airflow-providers-amazon==4.0.0
attrs==22.1.0
fabric==2.6.0
requests==2.27.1
sagemaker-experiments==0.1.35
Jinja2==3.0.3
pandas>=1.3.5,<1.5
scikit-learn==1.0.2
2 changes: 1 addition & 1 deletion setup.py
Expand Up @@ -48,7 +48,7 @@ def read_requirements(filename):
# Declare minimal set for installation
required_packages = [
"attrs>=20.3.0,<23",
"boto3>=1.26.20,<2.0",
"boto3>=1.26.28,<2.0",
"google-pasta",
"numpy>=1.9.0,<2.0",
"protobuf>=3.1,<4.0",