Describe the bug
Hi all!
I'm using SageMaker and overriding the S3 resource to use lakeFS, similar to what is shown here, with different AWS credentials for the S3 endpoint. Mostly all is going well, but I've run into an issue when running any processing/training that uses the `download_folder` function from the utils package under the hood.
I'm passing my `sagemaker_session` (with the custom S3 resource) to the function below, but the download-folder code instantiates a new S3 resource that does not have the correct keys or endpoint configuration.
sagemaker-python-sdk/src/sagemaker/utils.py, line 388 in 533f30a:
s3 = boto_session.resource("s3", region_name=boto_session.region_name)
I might be missing something, but could `sagemaker_session.s3_resource` be used there instead? When a Session is created, it looks like it initializes the same S3 resource by default:
sagemaker-python-sdk/src/sagemaker/session.py, line 336 in 533f30a:
self.s3_resource = self.boto_session.resource("s3", region_name=self.boto_region_name)
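As a sketch of what I mean (a hypothetical helper, not the SDK's actual API), `download_folder` could prefer the resource already attached to the session and only fall back to building a new one:

```python
def resolve_s3_resource(sagemaker_session):
    """Prefer the S3 resource already attached to the SageMaker Session;
    fall back to building one from the boto session (current behavior)."""
    s3 = getattr(sagemaker_session, "s3_resource", None)
    if s3 is not None:
        return s3
    boto_session = sagemaker_session.boto_session
    return boto_session.resource("s3", region_name=boto_session.region_name)
```

That way a session whose `s3_resource` was replaced (as in the repro below) keeps its endpoint and credentials throughout the download.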
To reproduce
Create a SageMaker session and override its S3 resource after initializing the session:
import boto3
import sagemaker
from sagemaker import utils

s3_resource = boto3.resource(
    "s3",
    endpoint_url=different_endpoint_url,
    aws_access_key_id=different_aws_access_key_id,
    aws_secret_access_key=different_aws_secret_access_key,
)
session = sagemaker.Session(
    boto3.Session(),
    s3_endpoint_url=different_endpoint_url,
)
session.s3_resource = s3_resource

utils.download_folder("bucket", "path", "target", session)
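In the meantime, a workaround sketch (my own patch, not part of the SDK; the endpoint/key names are the ones from the repro above) is to wrap the boto session's `resource()` factory so any S3 resource the SDK builds internally inherits the alternate endpoint and credentials:

```python
def patch_s3_resource_factory(boto_session, endpoint_url, access_key, secret_key):
    """Wrap boto_session.resource so that any S3 resource created internally
    (e.g. by download_folder) uses the alternate endpoint and keys,
    unless the caller already supplied its own values."""
    original = boto_session.resource

    def patched(service_name, **kwargs):
        if service_name == "s3":
            kwargs.setdefault("endpoint_url", endpoint_url)
            kwargs.setdefault("aws_access_key_id", access_key)
            kwargs.setdefault("aws_secret_access_key", secret_key)
        return original(service_name, **kwargs)

    # Shadow the bound method on this instance only.
    boto_session.resource = patched
```

After calling `patch_s3_resource_factory(session.boto_session, ...)`, the resource that `download_folder` creates via `boto_session.resource("s3", region_name=...)` picks up the overrides, since it goes through the patched factory.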
Expected behavior
The S3 resource assigned to the session would be used.
System information
A description of your system. Please provide:
- SageMaker Python SDK version: 2.217.0
- Framework name (eg. PyTorch) or algorithm (eg. KMeans):
- Framework version:
- Python version: 3.8
- CPU or GPU:
- Custom Docker image (Y/N):