Refactor authz resource tree #103

Merged
paulineribeyre merged 31 commits into master from authz-refactor
Mar 2, 2026

@paulineribeyre paulineribeyre commented Feb 26, 2026

Link to JIRA ticket if there is one:

New Features

Breaking Changes

Bug Fixes

Improvements

Dependency updates

Deployment changes

@github-actions

The style in this PR agrees with black. ✔️

This formatting comment was generated automatically by a script in uc-cdis/wool.

@paulineribeyre paulineribeyre changed the base branch from debug-chunk to master February 27, 2026 23:04

coveralls commented Feb 27, 2026

Pull Request Test Coverage Report for Build 22594355466

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 40 unchanged lines in 4 files lost coverage.
  • Overall coverage decreased (-0.2%) to 88.102%

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| auth.py | 3 | 89.89% |
| routes/storage.py | 7 | 86.0% |
| routes/ga4gh_tes.py | 13 | 86.71% |
| routes/s3.py | 17 | 84.83% |

| Totals | Coverage Status |
|---|---|
| Change from base Build 22507255459 | -0.2% |
| Covered Lines | 659 |
| Relevant Lines | 748 |

💛 - Coveralls

Contributor

@nss10 nss10 left a comment

Left some comments and questions. The rest looks great. :tada:

token_claims = await auth.get_token_claims()
user_id = token_claims.get("sub")
await auth.authorize(
"delete", [f"/services/workflow/gen3-workflow/tasks/{user_id}"]
Contributor

I think you meant

Suggested change
- "delete", [f"/services/workflow/gen3-workflow/tasks/{user_id}"]
+ "delete", [f"/services/workflow/gen3-workflow/storage/{user_id}"]

token_claims = await auth.get_token_claims()
user_id = token_claims.get("sub")
await auth.authorize(
"delete", [f"/services/workflow/gen3-workflow/tasks/{user_id}"]
Contributor

Same as above!

Suggested change
- "delete", [f"/services/workflow/gen3-workflow/tasks/{user_id}"]
+ "delete", [f"/services/workflow/gen3-workflow/storage/{user_id}"]

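One way to make this kind of tasks/storage mix-up harder to introduce is to build both paths in a single place. The sketch below is only an illustration: `BASE` is taken from the paths in this PR, while `user_resource_path` is a hypothetical helper that does not exist in the codebase.

```python
BASE = "/services/workflow/gen3-workflow"  # prefix taken from the paths in this PR


def user_resource_path(kind: str, user_id: str) -> str:
    """Return the per-user authz resource path for 'tasks' or 'storage'."""
    # Hypothetical helper: rejects anything other than the two known subtrees
    if kind not in ("tasks", "storage"):
        raise ValueError(f"unknown resource kind: {kind!r}")
    return f"{BASE}/{kind}/{user_id}"


# "123" is a made-up user ID, for illustration only
print(user_resource_path("storage", "123"))
```

The storage endpoints would then call `user_resource_path("storage", user_id)` and the task endpoints `user_resource_path("tasks", user_id)`, so the subtree is named explicitly at each call site.
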
logger.info(
f"Ensuring user '{user_id}' has access to their own tasks and storage"
)
resource_path1 = f"/services/workflow/gen3-workflow/tasks/{user_id}"
Contributor

I know I'm nitpicking, but if it is not too much trouble, can we update the function to reduce code duplication and use variable names like tasks_path and storage_path instead of resource_path{1,2}?

Maybe something like

base = "/services/workflow/gen3-workflow"
resources_to_create = [
    (
        f"{base}/tasks",
        tasks_path,
        f"Represents workflow tasks owned by user '{username}'",
    ),
    (
        f"{base}/storage",
        storage_path,
        f"Represents task storage owned by user '{username}'",
    ),
]

for parent_path, resource_path, description in resources_to_create:
    logger.debug("Attempting to create resource '%s' in Arborist", resource_path)
    await self.arborist_client.create_resource(
        parent_path,
        {"name": user_id, "description": description},
        create_parents=True,
    )

Again, I understand this is nitpicking, and we may not update this function at all in the future, so it is fine if you choose to stick to your design.

Collaborator Author

IMO sometimes trying too hard to reduce code duplication just makes the code more convoluted and harder to read. Your version is not really shorter, and I think it's hard to understand at a glance, so I'll stick to mine.
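
For comparison, here is a self-contained sketch of the table-driven variant discussed in this thread. It assumes the client exposes `create_resource(parent_path, body, create_parents=True)` as in the suggestion above; `create_user_resources` is a hypothetical name, and an `AsyncMock` stands in for the real Arborist client.

```python
import asyncio
from unittest.mock import AsyncMock

BASE = "/services/workflow/gen3-workflow"


async def create_user_resources(arborist_client, user_id: str, username: str) -> list[str]:
    """Create the per-user tasks and storage resources and return their paths."""
    resources_to_create = [
        (f"{BASE}/tasks", f"Represents workflow tasks owned by user '{username}'"),
        (f"{BASE}/storage", f"Represents task storage owned by user '{username}'"),
    ]
    created = []
    for parent_path, description in resources_to_create:
        # Signature assumed from the review suggestion, not verified against the client
        await arborist_client.create_resource(
            parent_path,
            {"name": user_id, "description": description},
            create_parents=True,
        )
        created.append(f"{parent_path}/{user_id}")
    return created


# Stand-in for the real Arborist client; user ID and username are made up
fake_client = AsyncMock()
paths = asyncio.run(create_user_resources(fake_client, "123", "test-user"))
print(paths)
```

Whether this reads better than two explicit calls is the judgment call debated above; the sketch only shows that both shapes produce the same two resources.
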

_, kms_key_arn = get_existing_kms_key_for_bucket(bucket_name)
if not kms_key_arn:
- err_msg = "Bucket misconfigured. Hit the `GET /storage/info` endpoint and try again."
+ err_msg = "Bucket misconfigured. Hit the `GET /storage/setup` endpoint and try again."
Contributor

I always felt that GET /storage/info was doing more than its name says. This is better :D

@pytest.mark.parametrize(
"access_token_patcher", [{"user_id": NEW_TEST_USER_ID}], indirect=True
)
async def test_storage_info(
Contributor

Suggested change
- async def test_storage_info(
+ async def test_storage_setup(

"/storage/setup", headers={"Authorization": f"bearer {TEST_USER_TOKEN}"}
)
bucket_name = res.json()["bucket"]

Contributor

Can we also add an assertion on mock_arborist_request after line 307 to ensure that Arborist is also called with the correct path?

        method="POST",
        path=f"/resource/services/workflow/gen3-workflow/storage",
        body=f'{{"name":"{NEW_TEST_USER_ID}","description":"Represents task storage owned by user \'test-username-{NEW_TEST_USER_ID}\'"}}',
        authorized=True,
    )
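
A minimal sketch of what such an assertion could look like, assuming `mock_arborist_request` behaves like a `unittest.mock` mock that is called with these keyword arguments. The user ID value is made up, and the simulated call stands in for what the service would do during the request; none of this is the actual test code.

```python
import json
from unittest.mock import MagicMock

NEW_TEST_USER_ID = "784"  # hypothetical value; the real constant lives in the test suite
mock_arborist_request = MagicMock()  # stand-in for the fixture named above

# Compact JSON (no spaces) to match the body format quoted in the comment above
expected_body = json.dumps(
    {
        "name": NEW_TEST_USER_ID,
        "description": f"Represents task storage owned by user 'test-username-{NEW_TEST_USER_ID}'",
    },
    separators=(",", ":"),
)

# Simulate the request the service is expected to send to Arborist
mock_arborist_request(
    method="POST",
    path="/resource/services/workflow/gen3-workflow/storage",
    body=expected_body,
    authorized=True,
)

# Passes if any recorded call matches these arguments exactly; raises AssertionError otherwise
mock_arborist_request.assert_any_call(
    method="POST",
    path="/resource/services/workflow/gen3-workflow/storage",
    body=expected_body,
    authorized=True,
)
```

`assert_any_call` is used rather than `assert_called_once_with` because the endpoint may make other Arborist calls (for example for the tasks resource) in the same request.
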

Comment on lines +181 to +182
# S3 bucket. It could be supported by hitting the "GET task" endpoint to get the list of
# files for a specific task that a user has access to in another user's bucket.
Contributor

Okay, this is a little confusing to me.

> It could be supported by hitting the "GET task" endpoint to get the list of files for a specific task that a user has access to in another user's bucket

Did you mean that it could be supported in the future by hitting the "GET task" endpoint, or am I misunderstanding this?

Collaborator Author

Yes, in the future.

"logs": [{"system_logs": ["blah"]}],
"tags": {
"_AUTHZ": f"/users/OTHER_USER/gen3-workflow/tasks/TASK_ID_PLACEHOLDER"
"_AUTHZ": f"/services/workflow/gen3-workflow/tasks/OTHER_USER/TASK_ID_PLACEHOLDER"
Contributor

Nothing important, just an unnecessary f-string format

Suggested change
- "_AUTHZ": f"/services/workflow/gen3-workflow/tasks/OTHER_USER/TASK_ID_PLACEHOLDER"
+ "_AUTHZ": "/services/workflow/gen3-workflow/tasks/OTHER_USER/TASK_ID_PLACEHOLDER"


## Other Gen3-Workflow functionality
- To download inputs and upload outputs, the Funnel workers need `create` access to resource `/services/workflow/gen3-workflow/tasks` on service `gen3-workflow`, like end-users.
- To empty or delete their own S3 bucket, a user needs `delete` access to the resource `/services/workflow/gen3-workflow/user-bucket` on the `gen3-workflow` service -- a special privilege useful for automated testing but not intended for the average user.
Contributor

Do you think we no longer need to specify this part: "a special privilege useful for automated testing but not intended for the average user"?

Contributor

Is this going to be a general user feature? We aren't sure about the scalability of this action. [here](https://github.com/uc-cdis/gen3-workflow/blob/f5ae28fd6a7157fdbd0ba0fe74c9722ef7a1239a/gen3workflow/aws_utils.py#L427C5-L428C37)

Collaborator Author

Thanks, I forgot about that.

With the refactor, it's not a special privilege anymore since I got rid of the "user-bucket" resource. You need the same access as when you use the s3 endpoint to delete a file, and all users have that access.

Since end users can access the endpoint and have permission to delete files, stating in the docs that it shouldn't be used doesn't help: end users don't read those docs. We should just update that code when we have a chance. It's low priority: https://ctds-planx.atlassian.net/browse/MIDRC-1233

{
"id": "gen3-workflow-deleter",
"action": {"service": "gen3-workflow", "method": "delete"},
"id": "gen3_workflow_admin_action",
Contributor

Shouldn't we only give gen3_workflow_reader_action and gen3_workflow_creator_action instead of gen3_workflow_admin_action to all the users?

Collaborator Author

Users should be able to cancel tasks and delete inputs/outputs from s3, for example to lower costs.

@paulineribeyre paulineribeyre requested a review from nss10 March 2, 2026 20:28

github-actions bot commented Mar 2, 2026

| filepath | error | SUBTOTAL |
|---|---|---|
| tests/test_gen3_workflow.py | 13 | 13 |
| TOTAL | 13 | 13 |

Please find the detailed integration test report here

Please find the Github Action logs here

@paulineribeyre paulineribeyre merged commit 3720331 into master Mar 2, 2026
12 of 13 checks passed
@paulineribeyre paulineribeyre deleted the authz-refactor branch March 2, 2026 21:10
3 participants