MIDRC-1226 STREAMING-UNSIGNED-PAYLOAD-TRAILER support #102

Merged: paulineribeyre merged 25 commits into master from debug-chunk on Feb 27, 2026

@paulineribeyre (Collaborator) commented Feb 25, 2026

Link to JIRA ticket if there is one: https://ctds-planx.atlassian.net/browse/MIDRC-1226

New Features

  • The S3 endpoint now supports the STREAMING-UNSIGNED-PAYLOAD-TRAILER method used by the AWS CLI
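
For context, with STREAMING-UNSIGNED-PAYLOAD-TRAILER the request body uses the aws-chunked encoding: each chunk is a hexadecimal size line followed by the data, a zero-size chunk ends the payload, and trailing headers such as x-amz-checksum-crc32 come last. The sketch below shows one way such a body could be decoded. It is not the code from this PR: `decode_aws_chunked` and `crc32_b64` are hypothetical names, and the whole body is assumed to be in memory already.

```python
import base64
import zlib


def decode_aws_chunked(body: bytes) -> tuple[bytes, dict[str, str]]:
    """Split an aws-chunked body into (payload, trailing headers)."""
    payload = bytearray()
    pos = 0
    while True:
        # Each chunk starts with "<hex size>\r\n". Signed variants append
        # an extension after ";", which is ignored here.
        line_end = body.index(b"\r\n", pos)
        size = int(body[pos:line_end].split(b";", 1)[0], 16)
        pos = line_end + 2
        if size == 0:  # the zero-size chunk marks the end of the payload
            break
        payload += body[pos : pos + size]
        pos += size
        if body[pos : pos + 2] != b"\r\n":
            raise ValueError("chunk data is not terminated by CRLF")
        pos += 2

    # Whatever follows the last chunk is a list of "name:value" trailers.
    trailers = {}
    for line in body[pos:].split(b"\r\n"):
        if line:
            name, _, value = line.partition(b":")
            trailers[name.decode().strip().lower()] = value.decode().strip()
    return bytes(payload), trailers


def crc32_b64(data: bytes) -> str:
    """Base64 of the big-endian CRC32, the format used by x-amz-checksum-crc32."""
    return base64.b64encode(zlib.crc32(data).to_bytes(4, "big")).decode()


# Usage: a two-chunk body built by hand, followed by its checksum trailer.
data = b"hello world"
body = (
    b"5\r\nhello\r\n6\r\n world\r\n0\r\n"
    + f"x-amz-checksum-crc32:{crc32_b64(data)}\r\n\r\n".encode()
)
payload, trailers = decode_aws_chunked(body)
assert payload == data
assert trailers["x-amz-checksum-crc32"] == crc32_b64(payload)
```

A proxy that forwards such uploads would typically need to strip this framing (or pass it through untouched) before the payload reaches the bucket; which of the two this PR does is not stated in the description.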

Breaking Changes

  • The status endpoint is no longer reachable at /, only at /_status

Bug Fixes

Improvements

Dependency updates

Deployment changes

@github-actions

The style in this PR agrees with black. ✔️

This formatting comment was generated automatically by a script in uc-cdis/wool.

@paulineribeyre requested a review from nss10 on February 25, 2026, 19:03

def _image_to_regex(image: str) -> str:
-     """
+     r"""
Collaborator Author

Fix for SyntaxWarning: invalid escape sequence '\Z'
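
As a small illustration of why the `r` prefix fixes the warning: in a regular string literal, `\Z` is not a recognized escape sequence, and recent Python versions report it as a SyntaxWarning. A raw string keeps the backslash as written. The function below is a hypothetical stand-in, not the real `_image_to_regex`.

```python
import re


def exact_match_regex(text: str) -> str:
    r"""
    Return a pattern that matches `text` and nothing after it.

    Example: "nginx" becomes "nginx\Z". Without the r prefix on this
    docstring, Python would flag "\Z" as an invalid escape sequence.
    """
    # The same rule applies to the pattern itself: r"\Z" is a backslash
    # followed by "Z", which the re module reads as "end of string".
    return re.escape(text) + r"\Z"


pattern = exact_match_regex("nginx")
assert re.match(pattern, "nginx")
assert re.match(pattern, "nginx:latest") is None
```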

@github-actions

Test summary after running integration tests

filepath                     passed  failed  SUBTOTAL
tests/test_gen3_workflow.py      12       1        13
TOTAL                            12       1        13

Test summary after rerunning failed integration tests

filepath                     failed  SUBTOTAL
tests/test_gen3_workflow.py       1         1
TOTAL                             1         1

Please find the detailed integration test report here

Please find the detailed integration test report after rerunning failed tests here

Please find the Github Action logs here

@github-actions

filepath                     passed  SUBTOTAL
tests/test_gen3_workflow.py      13        13
TOTAL                            13        13

Please find the detailed integration test report here

Please find the Github Action logs here

@nss10 (Contributor) left a comment

A couple of questions before approving.

# get the name of the user's bucket and ensure the user is making a call to their own bucket
logger.info(f"Incoming S3 request from user '{user_id}': '{request.method} {path}'")
user_bucket = aws_utils.get_safe_name_from_hostname(user_id)
if request.method == "GET" and path == "s3":
Contributor

Oh! Can someone request the names of all the buckets in an account when they interact via gen3-workflow? Would it be possible to add a test around it? And will the path variable be exactly equal to "s3"?

Also, what would happen when someone tries "ls s3://{someone else's bucket}"?

Collaborator Author

Yes, the path is exactly "s3". Right now, listing the buckets doesn't work. I tried to get it working so that it only returns the user's bucket, but it wasn't easy and I didn't think it was worth the trouble, since "ls s3://{user_bucket}" works.

"ls s3://{someone else's bucket}" would return a 403 through the same check as other calls (L188).

I'll have a look at adding a unit test for ls.
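
The access rules discussed in this thread can be summarized in a short sketch. It is only an illustration, not the code in routes/s3.py: the function and exception names are hypothetical, and it assumes path-style requests where the bucket is the first path segment after an optional "s3" prefix. The empty-path case reflects the later revision of the check in this PR.

```python
class ListBucketsNotSupported(Exception):
    """Raised for "list all buckets" requests, which are rejected."""


class BucketAccessDenied(Exception):
    """Raised when a user targets a bucket other than their own (a 403 in the thread)."""


def check_s3_request(method: str, path: str, user_bucket: str) -> str:
    """Return the object key part of `path` if the request is allowed."""
    # "aws s3 ls" without a bucket arrives as a GET on "" or "s3".
    if method == "GET" and path in ("", "s3"):
        raise ListBucketsNotSupported(
            "'s3 ls' not supported, use 's3 ls s3://<your bucket>' instead"
        )

    # Assumption: "<bucket>/<key>" with an optional leading "s3/" segment.
    parts = path.split("/")
    if parts[0] == "s3":
        parts = parts[1:]

    # Users may only act on their own bucket.
    if not parts or parts[0] != user_bucket:
        raise BucketAccessDenied(f"'{path}' is outside bucket '{user_bucket}'")
    return "/".join(parts[1:])


# Usage with a made-up bucket name:
assert check_s3_request("GET", "my-bucket/dir/file.txt", "my-bucket") == "dir/file.txt"
assert check_s3_request("PUT", "s3/my-bucket/file.txt", "my-bucket") == "file.txt"
```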

Collaborator Author

It turns out the path is "s3" only when hitting "/s3"; otherwise it is empty. I added a commit that rejects all "list all buckets" requests, plus a unit test.

"ls s3://{user_bucket}" and "ls s3://{someone else's bucket}" are already tested by existing unit tests, for example this one.

- logger.debug(
-     f"Received a failure status code from AWS: {response.status_code}. {response.text=}"
- )
+ logger.debug(f"Received a failure status code from AWS: {response.status_code}")
Contributor

What do we get in the {response.text=} field? Is it not useful?

Collaborator Author

It's also logged on L347, so it's duplicated, except in the case of a 404, in which case we don't need extra details.

@coveralls commented Feb 27, 2026

Pull Request Test Coverage Report for Build 22506519115

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 16 unchanged lines in 1 file lost coverage.
  • Overall coverage decreased (-0.5%) to 88.34%

Files with Coverage Reduction  New Missed Lines  %
routes/s3.py                   16                86.43%

Totals (Coverage Status)
Change from base Build 22202944189: -0.5%
Covered Lines: 644
Relevant Lines: 729

💛 - Coveralls

@paulineribeyre requested a review from nss10 on February 27, 2026, 22:15
@nss10 (Contributor) left a comment

Great work! Just one more question. I can approve right after that :)

# "All buckets" listing requests also land here and are not supported, since users can only
# access their own bucket.
if request.method == "GET" and path in ("", "s3"):
    err_msg = f"If you are using the S3 endpoint: 's3 ls' not supported, use 's3 ls s3://<your bucket>' instead. If you are trying to reach the Gen3-Workflow API, try '/_status'."
Contributor

Okay! I see we are explicitly asking them to try /_status if they need it. Can't we still support get_status when path is empty? Do you think that would be complicated?

Collaborator Author

It gets confusing because if you run "aws s3 ls" against the S3 endpoint exposed at root, the AWS CLI can't parse the response from "get status".
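
To make the parsing problem concrete: S3's "list buckets" response is an XML document, so a client expecting that shape would presumably fail on any other kind of body. The sketch below uses Python's standard XML parser as a stand-in for the AWS CLI's own parsing. The JSON status body and the bucket name are hypothetical, since the thread does not show what the status endpoint returns.

```python
import json
import xml.etree.ElementTree as ET

S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}

# Shape of a "list buckets" response, trimmed to the parts used below.
LIST_BUCKETS_XML = (
    '<ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">'
    "<Buckets><Bucket><Name>example-bucket</Name></Bucket></Buckets>"
    "</ListAllMyBucketsResult>"
)

# Hypothetical status body; the real one is not shown in this thread.
STATUS_BODY = json.dumps({"status": "OK"})


def bucket_names(body: str) -> list[str]:
    """Extract bucket names from a "list buckets" XML response."""
    root = ET.fromstring(body)
    return [el.text for el in root.findall("s3:Buckets/s3:Bucket/s3:Name", S3_NS)]


assert bucket_names(LIST_BUCKETS_XML) == ["example-bucket"]

try:
    bucket_names(STATUS_BODY)
except ET.ParseError:
    # A non-XML status body cannot be read as a bucket listing.
    print("status body is not a valid list-buckets response")
```

Serving the status only at /_status avoids having one path answer in two different formats depending on who is asking.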



@pytest.mark.parametrize("client", [{"get_url": True}], indirect=True)
def test_s3_endpoint_list_buckets(s3_client):
Contributor

👍

@github-actions

filepath                     passed  SUBTOTAL
tests/test_gen3_workflow.py      13        13
TOTAL                            13        13

Please find the detailed integration test report here

Please find the Github Action logs here

@paulineribeyre merged commit 78a6f7a into master on Feb 27, 2026
13 checks passed
@paulineribeyre deleted the debug-chunk branch on March 4, 2026, 20:38
3 participants