Skip to content

Pmansour/snow 1927956 wif#2195

Closed
sfc-gh-pfus wants to merge 29 commits intomainfrom
pmansour/SNOW-1927956-wif
Closed

Pmansour/snow 1927956 wif#2195
sfc-gh-pfus wants to merge 29 commits intomainfrom
pmansour/SNOW-1927956-wif

Conversation

@sfc-gh-pfus
Copy link

Please answer these questions before submitting your pull requests. Thanks!

  1. What GitHub issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

    Fixes #NNNN

  2. Fill out the following pre-review checklist:

    • I am adding a new automated test(s) to verify correctness of my new code
    • I am adding new logging messages
    • I am adding a new telemetry message
    • I am modifying authorization mechanisms
    • I am adding new credentials
    • I am modifying OCSP code
    • I am adding a new dependency
  3. Please describe how your code solves the related issue.

    Please write a short description of how your code change solves the related issue.

  4. (Optional) PR for stored-proc connector:

@github-actions
Copy link

github-actions bot commented Mar 5, 2025


Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by just posting a Pull Request Comment same as the below format.


I have read the CLA Document and I hereby sign the CLA


You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

packages = find_namespace:
install_requires =
asn1crypto>0.24.0,<2.0.0
boto3>=1.0
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have risk assesment for these new libraries?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do not -- what's the process for getting that?

FWIW they're the standard AWS client libraries and are maintained by AWS, so I suspect it'll be fine, but if you can point me at a process I'd be happy to follow that.

return ValueError(f"Unknown attestation provider '{attestation.provider}'")


class AuthByWorkloadIdentity(AuthByPlugin):
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this class meant to be used in multithreaded environment? It is not thread safe.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you please elaborate on this part? I believe it's using the same pattern as all the other authenticator plugins.

I am not sure if authenticator plugins generally need to be thread-safe, or if the other ones are (or aren't), or if this one is different.


# Set of authenticators allowing empty user.
empty_user_allowed_authenticators = {OAUTH_AUTHENTICATOR, NO_AUTH_AUTHENTICATOR}
empty_user_allowed_authenticators = {OAUTH_AUTHENTICATOR, NO_AUTH_AUTHENTICATOR, WORKLOAD_IDENTITY_AUTHENTICATOR}
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, I believe external browser and PAT also don't need user... @sfc-gh-mmishchenko wdyt?

Copy link
Contributor

@sfc-gh-mmishchenko sfc-gh-mmishchenko Mar 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth checking, but that's not probably a subject of this review.

)


def create_autodetect_attestation(account: str, token: str | None = None) -> Union[WorkloadIdentityAttestation, None]:
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having in mind 3s timeouts, that function is not a good idea IMHO :( What is the use case here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use-case is somebody writing the same application that uses the driver, and allowing it to run in multiple environments (e.g. on GCP, AWS, Azure, Kubernetes). It's a big part of the usability push here.

Of course this mode is optional, and customers who want to avoid potential startup delays should specify an explicit CSP provider. But this is a baseline for customers who just want to write once and run everywhere without configuration, and are OK with the potential latency hit.

def fake_sign_aws_req(request: AWSRequest):
"""Produces a fake Sigv4 signature on an AWS request."""
request.headers.add_header("X-Amz-Date", datetime.time().isoformat())
request.headers.add_header("X-Amz-Security-Token", "<TOKEN>")
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this <TOKEN> string somehow special?

Copy link
Contributor

@sfc-gh-pmansour sfc-gh-pmansour Mar 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just a random string for the test. The real (non-stubbed) implementation will put a real token here. I'm just trying to make my stub as close to reality as possible.

)


def verify_aws_token(token: str):
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it a token or the whole HTTP response?

Copy link
Contributor

@sfc-gh-pmansour sfc-gh-pmansour Mar 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're calling this a "token" for consistency purposes with Azure and GCP, but in reality the attestation we produce for AWS isn't exactly a token (though for authentication purposes it fulfills the same purpose). In reality it's a description of a signed HTTP request that our backend can execute and get the caller's identity. This is described in go/wif

assert decoded_token["method"] == "POST"

headers = decoded_token["headers"]
assert set(headers.keys()) == set(["Host", "X-Snowflake-Audience", "X-Amz-Date", "X-Amz-Security-Token", "Authorization"])
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to verify all headers? Maybe instead we can check if the relevant header exists? But what is the point of this check at all?

Copy link
Contributor

@sfc-gh-pmansour sfc-gh-pmansour Mar 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's exactly what this is doing -- it's checking that the relevant headers exist (aka the set of keys is the expected set), but not checking their values.

The only header whose value is being verified in the test is the "X-Snowflake-Audience" header, because we're setting it ourselves in the driver code.

@github-actions github-actions bot locked and limited conversation to collaborators Mar 6, 2025
@sfc-gh-pmansour
Copy link
Contributor

Will create a new PR shortly with a full description, all completed TODOs, a short accompanying doc and a broader list of reviewers.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants