Utils' 'download_folder' function doesn't use sagemaker session's s3 resource #4663

@DarrenStack

Describe the bug
Hi all!

I'm using SageMaker and overriding the S3 resource to point at LakeFS, similar to what is shown here, with different AWS credentials for the S3 endpoint. Mostly this works well, but I've run into an issue when running any processing or training job that uses the 'download_folder' function from the utils package under the hood.

I pass my 'sagemaker_session', which carries my custom S3 resource, to that function. However, the download_folder code then instantiates a new S3 resource that has neither the correct keys nor the endpoint configuration:

s3 = boto_session.resource("s3", region_name=boto_session.region_name)

I might be missing something, but could sagemaker_session.s3_resource be used above instead? When a Session is created, it appears to initialize the same S3 resource by default:

self.s3_resource = self.boto_session.resource("s3", region_name=self.boto_region_name)
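
The suggestion above could be sketched roughly as follows. This is only an illustration, not the SDK's code: the helper name `resolve_s3_resource` is hypothetical, and it assumes the session exposes `s3_resource` and `boto_session` as in the two lines quoted above.

```python
def resolve_s3_resource(sagemaker_session):
    """Hypothetical helper: prefer the S3 resource already on the session.

    Falls back to building a resource from the boto session, which mirrors
    the line quoted from download_folder above.
    """
    s3 = getattr(sagemaker_session, "s3_resource", None)
    if s3 is not None:
        # Honors a custom endpoint/credentials assigned by the caller.
        return s3
    boto_session = sagemaker_session.boto_session
    return boto_session.resource("s3", region_name=boto_session.region_name)
```

With something like this, a resource assigned via `session.s3_resource = ...` would be picked up, while sessions without one would behave as they do today.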

To reproduce
Create a SageMaker session and overwrite its S3 resource after initializing the session.

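import boto3
import sagemaker
from sagemaker import utils

# different_endpoint_url and the two credential values are placeholders
# for the LakeFS endpoint and its keys.
s3_resource = boto3.resource(
    's3',
    endpoint_url=different_endpoint_url,
    aws_access_key_id=different_aws_access_key_id,
    aws_secret_access_key=different_aws_secret_access_key,
)

session = sagemaker.Session(
    boto3.Session(),
    s3_endpoint_url=different_endpoint_url,
)

# Replace the default S3 resource with the custom one.
session.s3_resource = s3_resource

# download_folder builds its own S3 resource and ignores the one set above.
utils.download_folder('bucket', 'path', 'target', session)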

Expected behavior
download_folder should use the S3 resource that was assigned to the session.
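
Until that is the case, a possible workaround is to download the prefix directly with the session's resource. The sketch below is an assumption-laden illustration: `download_prefix` is a hypothetical function, not part of the SageMaker SDK, and it relies only on the standard boto3 resource calls `Bucket(...)`, `objects.filter(Prefix=...)`, and `download_file(...)`.

```python
import os


def download_prefix(s3_resource, bucket_name, prefix, target):
    """Hypothetical workaround: fetch every object under `prefix`
    using the caller-supplied S3 resource. Returns the local paths written."""
    bucket = s3_resource.Bucket(bucket_name)
    prefix = prefix.lstrip("/")
    written = []
    for obj in bucket.objects.filter(Prefix=prefix):
        if obj.key.endswith("/"):
            continue  # skip "directory" placeholder keys
        # Path relative to the prefix; a single-file prefix keeps its basename.
        rel = obj.key[len(prefix):].lstrip("/") or os.path.basename(obj.key)
        dest = os.path.join(target, rel)
        os.makedirs(os.path.dirname(dest) or ".", exist_ok=True)
        bucket.download_file(obj.key, dest)
        written.append(dest)
    return written
```

For the reproduction above, the call would look like `download_prefix(session.s3_resource, 'bucket', 'path', 'target')`. This only helps for code I call myself; it does not change what processing or training jobs do internally.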

System information
A description of your system. Please provide:

  • SageMaker Python SDK version: 2.217.0
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans):
  • Framework version:
  • Python version: 3.8
  • CPU or GPU:
  • Custom Docker image (Y/N):

