feat: fixing instance to config auto resolution support #5251
The JumpStart inference configuration auto-detection problem occurred because users had to manually specify the `config_name` parameter when deploying models to different instance types. This led to deployment failures when GPU configurations were incorrectly applied to Neuron instances, or vice versa. The root cause was that four critical artifact retrieval modules (`image_uris.py`, `model_uris.py`, `environment_variables.py`, and `resource_requirements.py`) lacked auto-detection logic, so the system defaulted to the first available configuration instead of selecting the one appropriate for the target instance type.

I implemented a universal auto-detection algorithm that extracts the instance type family, discovers all available inference configurations, and uses the JumpStart ranking system to select the highest-priority configuration that supports the instance type. This automatically routes ml.inf2.24xlarge instances to the neuron configuration and ml.g5.12xlarge instances to the tgi configuration, eliminating the manual configuration requirement while maintaining backward compatibility.

To verify the implementation, I created 19 unit tests across two test files (`test_auto_detection.py` with 13 tests and `test_config_auto_detection.py` with 6 tests) covering image URI selection, model artifact retrieval, environment variables, resource requirements, ranking-system priority, edge cases, and integration patterns. All tests pass, confirming that users can now deploy JumpStart models by specifying only the `model_id` and `instance_type`, without needing knowledge of specific inference configurations.
Tests
- Used `unique_name_from_base` to create resource names in integ tests (if appropriate)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.