Conversation
|
The style in this PR agrees with This formatting comment was generated automatically by a script in uc-cdis/wool. |
1e10af2 to
ce0b098
Compare
4ed49d4 to
1402e9e
Compare
Pull Request Test Coverage Report for Build 22594355466Details
💛 - Coveralls |
nss10
left a comment
There was a problem hiding this comment.
Left some comments and questions. Rest looks great. :tada
gen3workflow/routes/storage.py
Outdated
| token_claims = await auth.get_token_claims() | ||
| user_id = token_claims.get("sub") | ||
| await auth.authorize( | ||
| "delete", [f"/services/workflow/gen3-workflow/tasks/{user_id}"] |
There was a problem hiding this comment.
I think you meant
| "delete", [f"/services/workflow/gen3-workflow/tasks/{user_id}"] | |
| "delete", [f"/services/workflow/gen3-workflow/storage/{user_id}"] |
gen3workflow/routes/storage.py
Outdated
| token_claims = await auth.get_token_claims() | ||
| user_id = token_claims.get("sub") | ||
| await auth.authorize( | ||
| "delete", [f"/services/workflow/gen3-workflow/tasks/{user_id}"] |
There was a problem hiding this comment.
Same as above!
| "delete", [f"/services/workflow/gen3-workflow/tasks/{user_id}"] | |
| "delete", [f"/services/workflow/gen3-workflow/storage/{user_id}"] |
| logger.info( | ||
| f"Ensuring user '{user_id}' has access to their own tasks and storage" | ||
| ) | ||
| resource_path1 = f"/services/workflow/gen3-workflow/tasks/{user_id}" |
There was a problem hiding this comment.
I know I'm nit picking right now, but, if it is not too much trouble, can we update the function to have less code duplication and variable names like tasks_path and storage_path instead of resource_path{1,2}?
Maybe something like
base = "/services/workflow/gen3-workflow"
resources_to_create = [
(
f"{base}/tasks",
tasks_path,
f"Represents workflow tasks owned by user '{username}'",
),
(
f"{base}/storage",
storage_path,
f"Represents task storage owned by user '{username}'",
),
]
for parent_path, resource_path, description in resources_to_create:
logger.debug("Attempting to create resource '%s' in Arborist", resource_path)
await self.arborist_client.create_resource(
parent_path,
{"name": user_id, "description": description},
create_parents=True,
)Again, I understand this is nitpicking, and we may not update this function at all in the future, so it is fine if you choose to stick to your design.
There was a problem hiding this comment.
IMO sometimes trying too hard to reduce code duplication just makes the code more convoluted and harder to read. Your version is not really shorter, and i think it's hard to understand at a glance, so i'll stick to mine
| _, kms_key_arn = get_existing_kms_key_for_bucket(bucket_name) | ||
| if not kms_key_arn: | ||
| err_msg = "Bucket misconfigured. Hit the `GET /storage/info` endpoint and try again." | ||
| err_msg = "Bucket misconfigured. Hit the `GET /storage/setup` endpoint and try again." |
There was a problem hiding this comment.
I always felt, the GET /storage/info is doing more than what its name says. This is better :D
tests/test_misc.py
Outdated
| @pytest.mark.parametrize( | ||
| "access_token_patcher", [{"user_id": NEW_TEST_USER_ID}], indirect=True | ||
| ) | ||
| async def test_storage_info( |
There was a problem hiding this comment.
| async def test_storage_info( | |
| async def test_storage_setup( |
| "/storage/setup", headers={"Authorization": f"bearer {TEST_USER_TOKEN}"} | ||
| ) | ||
| bucket_name = res.json()["bucket"] | ||
|
|
There was a problem hiding this comment.
Can we also do an assert to mock_arborist_request after line 307 to ensure that the arborist call with the correct path is also being hit
method="POST",
path=f"/resource/services/workflow/gen3-workflow/storage",
body=f'{{"name":"{NEW_TEST_USER_ID}","description":"Represents task storage owned by user \'test-username-{NEW_TEST_USER_ID}\'"}}',
authorized=True,
)
gen3workflow/routes/s3.py
Outdated
| # S3 bucket. It could be supported by hitting the "GET task" endpoint to get the list of | ||
| # files for a specific task that a user has access to in another user's bucket. |
There was a problem hiding this comment.
Okay, this is a little confusing to me.
It could be supported by hitting the "GET task" endpoint to get the list of
# files for a specific task that a user has access to in another user's bucket
Did you mean, It could be supported in the future by hitting the "GET task", or am I misunderstanding this?
There was a problem hiding this comment.
Yes in the future
tests/conftest.py
Outdated
| "logs": [{"system_logs": ["blah"]}], | ||
| "tags": { | ||
| "_AUTHZ": f"/users/OTHER_USER/gen3-workflow/tasks/TASK_ID_PLACEHOLDER" | ||
| "_AUTHZ": f"/services/workflow/gen3-workflow/tasks/OTHER_USER/TASK_ID_PLACEHOLDER" |
There was a problem hiding this comment.
Nothing important, just an unnecessary f-string format
| "_AUTHZ": f"/services/workflow/gen3-workflow/tasks/OTHER_USER/TASK_ID_PLACEHOLDER" | |
| "_AUTHZ": "/services/workflow/gen3-workflow/tasks/OTHER_USER/TASK_ID_PLACEHOLDER" |
|
|
||
| ## Other Gen3-Workflow functionality | ||
| - To download inputs and upload outputs, the Funnel workers need `create` access to resource `/services/workflow/gen3-workflow/tasks` on service `gen3-workflow`, like end-users. | ||
| - To empty or delete their own S3 bucket, a user needs `delete` access to the resource `/services/workflow/gen3-workflow/user-bucket` on the `gen3-workflow` service -- a special privilege useful for automated testing but not intended for the average user. |
There was a problem hiding this comment.
Do you think we don't need to specify this any more -- -- a special privilege useful for automated testing but not intended for the average user.
There was a problem hiding this comment.
Is this going to be a general user feature? We aren't sure about the scalability of this action. (here)[https://github.com/uc-cdis/gen3-workflow/blob/f5ae28fd6a7157fdbd0ba0fe74c9722ef7a1239a/gen3workflow/aws_utils.py#L427C5-L428C37]
There was a problem hiding this comment.
Thanks I forgot about that
With the refactor, it's not a special privilege anymore since i got rid of the "user-bucket" resource. You need the same access as when you use the s3 endpoint to delete a file, and all users have that access.
Since end-users can access the endpoint and have permission to delete files, stating in the docs that it shouldn't be used doesn't help - end users don't read those docs. We should just update that code when we have a chance, it's a low priority https://ctds-planx.atlassian.net/browse/MIDRC-1233
| { | ||
| "id": "gen3-workflow-deleter", | ||
| "action": {"service": "gen3-workflow", "method": "delete"}, | ||
| "id": "gen3_workflow_admin_action", |
There was a problem hiding this comment.
Shouldn't we only give gen3_workflow_reader_action and gen3_workflow_creator_action instead of gen3_workflow_admin_action to all the users?
There was a problem hiding this comment.
Users should be able to cancel tasks and delete inputs/outputs from s3 to lower costs for example
Link to JIRA ticket if there is one:
New Features
Breaking Changes
/storage/infoto/storage/setupsince it now also does user authz setup in addition to bucket setupBug Fixes
Improvements
Dependency updates
Deployment changes