Feat: Optimize() validations across TRT, VLLM, Neuron container optimizations #4927

gwang111 · 2024-11-13T21:30:40Z

Issue #, if available:

Description of changes:

Testing done:

Merge Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.

General

I have read the CONTRIBUTING doc
I certify that the changes I am introducing will be backward compatible, and I have discussed concerns about this, if any, with the Python SDK team
I used the commit message format described in CONTRIBUTING
I have passed the region in to all S3 and STS clients that I've initialized as part of this change.
I have updated any necessary documentation, including READMEs and API docs (if appropriate)

Tests

I have added tests that prove my fix is effective or that my feature works (if appropriate)
I have added unit and/or integration tests as appropriate to ensure backward compatibility of the changes
I have checked that my tests are not configured for a specific region or account (if appropriate)
I have used unique_name_from_base to create resource names in integ tests (if appropriate)
If adding any dependency in requirements.txt files, I have spell checked and ensured they exist in PyPi

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

gwang111 · 2024-11-13T23:43:25Z

src/sagemaker/serve/builder/model_builder.py

            Model: A deployable ``Model`` object.
        """

+        # TODO: ideally these dictionaries need to be sagemaker_core shapes


TODO: Ideally, validations should happen at the start of the flow, not after everything is set up in the wrapper. Let's fast-follow on this and gradually shift the validations to helper functions in the outer scope here.
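
A minimal sketch of the fail-fast shape this TODO describes, assuming a standalone helper that runs before any setup work. The names `_validate_optimization_inputs` and `optimize_sketch`, and the two checks themselves, are hypothetical and are not the SDK's actual functions or rules:

```python
from typing import Any, Dict, Optional


def _validate_optimization_inputs(
    instance_type: Optional[str],
    configs: Dict[str, Optional[Dict[str, Any]]],
) -> None:
    """Hypothetical helper: reject bad input before any setup happens."""
    if not instance_type:
        raise ValueError("instance_type is required for optimization.")
    if not any(configs.values()):
        raise ValueError("At least one optimization config must be provided.")


def optimize_sketch(
    instance_type: Optional[str] = None,
    quantization_config: Optional[Dict[str, Any]] = None,
    compilation_config: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical entry point: validate first, then do the costly work."""
    configs = {
        "quantization": quantization_config,
        "compilation": compilation_config,
    }
    _validate_optimization_inputs(instance_type, configs)
    # Expensive setup (session, model artifacts, job arguments) would start here.
    requested = sorted(name for name, cfg in configs.items() if cfg)
    return {"instance_type": instance_type, "requested": requested}
```

With this shape, an invalid call raises before any resources are touched, and each new rule can be added to the helper without growing the wrapper.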

Lokiiiiii · 2024-11-14T20:07:15Z

src/sagemaker/serve/validations/optimization.py

+logger = logging.getLogger(__name__)
+
+
+class OptimizationContainer(Enum):


We can use these for internal organization. But we should not burden customers with knowing how our optimizations work, which containers are used under which scenarios, etc.

Consider the following error messages:

Optimizations that use Compilation and Speculative Decoding are not currently supported on GPU & Neuron instances.

Optimizations that use Compilation and Quantization are not currently supported on Neuron instances.

Optimizations that use Quantization:SmoothQuant are only supported with Compilation on GPU instances.

Sharding is mutually exclusive and only supported on GPU instances.

The general theme we want to maintain is:

Optimizations {A,B,C} are {not|only} supported with optimizations {X,Y,Z} on instances {L,M,N}
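
One way to keep every message on this theme is to build them from a single formatter. The sketch below is illustrative only; `format_support_message` and its parameters are hypothetical names, not part of the PR:

```python
from typing import Sequence


def format_support_message(
    optimizations: Sequence[str],
    instances: Sequence[str],
    only: bool = False,
    with_optimizations: Sequence[str] = (),
) -> str:
    """Build a customer-facing message that names optimizations and
    instance families, never the underlying containers."""
    subject = " and ".join(optimizations)
    qualifier = "only supported" if only else "not currently supported"
    # The "with ..." clause is optional and only appears when companions are given.
    companion = ""
    if with_optimizations:
        companion = " with " + " and ".join(with_optimizations)
    targets = " & ".join(instances)
    return (
        f"Optimizations that use {subject} are {qualifier}{companion} "
        f"on {targets} instances."
    )


print(format_support_message(["Compilation", "Speculative Decoding"], ["GPU", "Neuron"]))
print(
    format_support_message(
        ["Quantization:SmoothQuant"],
        ["GPU"],
        only=True,
        with_optimizations=["Compilation"],
    )
)
```

The two calls reproduce the first and third example messages above, so the wording stays consistent as rules are added.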

cj-zhang · 2024-11-18T18:20:07Z

The serve UTs are failing. I see TypeError: unsupported operand type(s) for |: 'type' and 'NoneType'. Can you look into this, please?
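
For context, this is the error Python versions before 3.10 raise when an annotation such as `SomeType | None` is evaluated at definition time. The failing line is not quoted in this thread, so the cause is an assumption, and the function below is a hypothetical stand-in. A version-independent spelling uses `typing.Optional`:

```python
from typing import Any, Dict, Optional


# On Python < 3.10, writing the parameter as `config: dict | None = None`
# fails at import time with:
#   TypeError: unsupported operand type(s) for |: 'type' and 'NoneType'
# Optional[...] expresses the same type and works on older interpreters.
def describe_config(config: Optional[Dict[str, Any]] = None) -> str:
    """Return a short summary of the keys in an optional config dict."""
    if config is None:
        return "no config"
    return ", ".join(sorted(config))


print(describe_config())
print(describe_config({"b": 1, "a": 2}))
```

Adding `from __future__ import annotations` at the top of the module is another option, since it defers annotation evaluation, but it does not help where the annotations are evaluated at runtime.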

tests/unit/sagemaker/serve/builder/test_model_builder.py

changes for blackbird - model sharding add more tests fix sharded model flag add optimization validations fix formatting and msging fixing validation bugs add UTs simplify logic update messaging formatting fix UTs add more UTs fix validations update ruleset update formatting update validation logic update bug fixes Disable network isolation if using sharded models. check sharding + network iso pre optimization add more UTs for sharding add more UTs

mufaddal-rohawala · 2024-11-19T19:56:31Z

src/sagemaker/serve/validations/optimization.py

+}
+
+
+def _validate_optimization_configuration(


This is such a big method. Should we look to optimize it?

You should be able to view the UTs here: https://github.com/aws/sagemaker-python-sdk/pull/4927/files#diff-5f58a16d071696e26bf4081f3f01294336f6c669407a9474ed7c0747af0e629fR3479. And ack, we will look to optimize this after re:Invent.

gwang111 requested a review from Lokiiiiii November 13, 2024 21:30

gwang111 requested a review from a team as a code owner November 13, 2024 21:30

gwang111 requested a review from mufaddal-rohawala November 13, 2024 21:30

gwang111 removed the request for review from mufaddal-rohawala November 13, 2024 21:30

gwang111 force-pushed the garywan-blackbird branch from 49bd078 to bf55587 Compare November 13, 2024 21:56

gwang111 commented Nov 13, 2024

Lokiiiiii suggested changes Nov 14, 2024

gwang111 force-pushed the garywan-blackbird branch from bf55587 to 3d04384 Compare November 15, 2024 17:19

gwang111 force-pushed the garywan-blackbird branch from f1d3649 to 57123c9 Compare November 16, 2024 00:28

gwang111 force-pushed the garywan-blackbird branch from a4dba03 to d1074eb Compare November 16, 2024 00:38

gwang111 force-pushed the garywan-blackbird branch from ccad6cd to 76a4102 Compare November 16, 2024 05:49

Lokiiiiii suggested changes Nov 18, 2024

Lokiiiiii suggested changes Nov 18, 2024

gwang111 force-pushed the garywan-blackbird branch from f999ace to 06ed0e3 Compare November 18, 2024 20:07

gwang111 force-pushed the garywan-blackbird branch from c9487bd to daa3bc0 Compare November 18, 2024 20:42

Lokiiiiii suggested changes Nov 18, 2024

gwang111 force-pushed the garywan-blackbird branch from daa3bc0 to 3e97708 Compare November 18, 2024 21:22

gwang111 force-pushed the garywan-blackbird branch from 9a779fe to 7e8e237 Compare November 18, 2024 21:45

Ashish Gupta and others added 2 commits November 19, 2024 05:48

fix rebase issues

0707798

gwang111 force-pushed the garywan-blackbird branch from 29a2242 to 0707798 Compare November 19, 2024 06:28

Merge branch 'master' into garywan-blackbird

dac3edd

cj-zhang approved these changes Nov 19, 2024

mufaddal-rohawala reviewed Nov 19, 2024

Lokiiiiii approved these changes Nov 19, 2024

mufaddal-rohawala approved these changes Nov 19, 2024

mufaddal-rohawala merged commit efd6c80 into aws:master Nov 19, 2024
13 of 14 checks passed
