Conversation

@roryjbd

@roryjbd roryjbd commented Sep 4, 2025

resolves #1234
Issue raised for updating docs
dbt-labs/docs.getdbt.com#7866

Problem

Allows for authentication to Snowflake using Workload Identity Federation

Solution

Allows the necessary parameters to be passed to the Snowflake connector's connect method.
Updates to the latest supported snowflake-connector-python package.
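
As a rough illustration of the parameters involved, here is a minimal sketch of the keyword arguments that could be handed to the connector's connect method for WIF. The helper name `build_wif_connect_kwargs` and the account value are hypothetical. The `authenticator` and `workload_identity_provider` names are taken from the Snowflake Python connector documentation and should be checked against the installed connector version. This is not the code in this PR.

```python
from typing import Any, Dict

# Providers handled by this sketch. OIDC is left out because it also needs
# a token to be supplied, which this example does not cover.
SUPPORTED_PROVIDERS = {"AWS", "AZURE", "GCP"}


def build_wif_connect_kwargs(account: str, provider: str) -> Dict[str, Any]:
    """Build connect() keyword arguments for Workload Identity Federation.

    Hypothetical helper for illustration; not part of dbt-snowflake.
    """
    normalized = provider.upper()
    if normalized not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported WIF provider: {provider!r}")
    return {
        "account": account,
        "authenticator": "WORKLOAD_IDENTITY",
        "workload_identity_provider": normalized,
    }


kwargs = build_wif_connect_kwargs("my_account", "aws")

# With snowflake-connector-python installed and WIF configured on the
# Snowflake user, the connection would then be opened roughly like this:
#   import snowflake.connector
#   conn = snowflake.connector.connect(**kwargs)
```

No password or key pair appears in the arguments; the connector is expected to obtain the workload's identity from the environment it runs in.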

Checklist

  • I have read the contributing guide and understand what's expected of me
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX

@roryjbd roryjbd requested a review from a team as a code owner September 4, 2025 15:14
@cla-bot

cla-bot bot commented Sep 4, 2025

Thanks for your pull request, and welcome to our community! We require contributors to sign our Contributor License Agreement and we don't seem to have your signature on file. Check out this article for more information on why we have a CLA.

In order for us to review and merge your code, please submit the Individual Contributor License Agreement form attached above. If you have questions about the CLA, or if you believe you've received this message in error, please reach out through a comment on this PR.

CLA has not been signed by users: @roryjbd

@roryjbd
Author

roryjbd commented Sep 4, 2025

CLA now signed

@cla-bot cla-bot bot added the cla:yes The PR author has signed the CLA label Sep 4, 2025
Contributor

@colin-rogers-dbt colin-rogers-dbt left a comment

A couple questions but this looks good, will work on getting OIDC set up this week to test

@colin-rogers-dbt
Contributor

Also can you add the environment variables your tests need to the github action?

env:
  SNOWFLAKE_TEST_ACCOUNT: ${{ secrets.SNOWFLAKE_TEST_ACCOUNT }}

@roryjbd
Author

roryjbd commented Sep 9, 2025

I've just updated the integration tests to use SNOWFLAKE_TEST_USER instead of SNOWFLAKE_TEST_WIF_USER, so no additional env vars are required now. It'll be possible to add WIF as an additional auth method to the existing user that you have for integration tests. Just alter the user instead of creating a new one.

@github-actions
Contributor

github-actions bot commented Sep 9, 2025

Thank you for your pull request! We could not find a changelog entry for this change in the dbt-snowflake package. For details on how to document a change, see the Contributing Guide.

@roryjbd
Author

roryjbd commented Sep 9, 2025

@colin-rogers-dbt Let me sort the f-string and changelog entry!

@bsalerno

Anything else needed to get this over the line? I'm happy to help, very interested in getting this working!

@roryjbd
Author

roryjbd commented Sep 18, 2025 via email

@rorydonaldson

Hi @colin-rogers-dbt - have you had a chance to look at this again? I've been using my (alter ego roryjbd) fork in work and all is running ok. But keen to get this released.

@anentropic

@sfc-gh-pmansour I'm intrigued by the new parameter, but I read the docs here https://docs.snowflake.com/en/developer-guide/python-connector/python-connector-api and I still don't understand what it is/how it is used

Are there more docs somewhere?

I'm currently in the process of converting all my project code to use AWS WIF auth

@sfc-gh-pmansour

There's a little bit more about this parameter recorded here: https://docs.snowflake.com/en/developer-guide/python-connector/python-connector-api#:~:text=workload_identity_impersonation_path

For example, if you have a GCE VM with WIF configured, with an attached service account [email protected], you can use this parameter like this:

workload_identity_impersonation_path=["[email protected]", "[email protected]"]

What would happen is:

  1. Client gets a token for the attached service account (A)
  2. Client uses that token to impersonate B then C
  3. Client uses the token for C to authenticate to Snowflake

Snowflake only sees service account C. All the others are exchanged entirely on the client-side.

This lets you support a more dynamic binding of your workloads, without necessarily tying your Snowflake user to your execution identity. For example, a single workload can use WIF to authenticate to Snowflake as many different users (based on different impersonated Service Accounts), and many workloads can authenticate to Snowflake as the same user (based on impersonating that Service Account).
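
The three steps above can be sketched as a small simulation. No real tokens are exchanged here: `trace_impersonation` is a hypothetical helper, and the single-letter names A, B and C stand in for the service accounts in the example.

```python
from typing import List, Sequence, Tuple


def trace_impersonation(attached: str, path: Sequence[str]) -> Tuple[str, List[str]]:
    """Walk an impersonation path and describe each client-side exchange.

    Returns the identity that Snowflake would finally see, plus a
    human-readable list of the steps taken to get there.
    """
    steps = [f"get token for attached identity {attached}"]
    current = attached
    for target in path:
        # Each hop uses the token held so far to obtain one for the next identity.
        steps.append(f"use token for {current} to impersonate {target}")
        current = target
    steps.append(f"authenticate to Snowflake as {current}")
    return current, steps


final_identity, steps = trace_impersonation("A", ["B", "C"])
for step in steps:
    print(step)
```

With an empty path the attached identity is used directly, which corresponds to plain WIF without impersonation.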

@anentropic

I feel like I'm still missing some important part.

It's not clear to me exactly what type of thing the entries in the list are.

Is there a concept or feature on the AWS side that this corresponds to? I guess there is something I have to set up there to relate these entries back to the role that was attached to WIF auth in Snowflake?

@sfc-gh-pmansour

The actual data type is always a string, but the semantic "type of thing" depends on the WIF provider you're using:

  • For AWS, these are IAM role ARNs, e.g. ["arn:aws:iam::123456789012:role/role-1", "arn:aws:iam::123456789012:role/role-2"]
  • For GCP, these can be either service account emails (e.g. ["[email protected]", ...]) or service account uniqueIds (e.g. ["7432810928", "38457809230"])
  • For Azure, this feature is not supported as Azure doesn't support impersonation but lets you solve a similar problem with user-assigned managed identities.
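
To make the per-provider shapes concrete, here is a loose client-side sanity check. It is only a sketch: `check_impersonation_path` and the regular expressions are hypothetical and deliberately permissive, and the connector itself may validate these values differently or not at all.

```python
import re
from typing import List, Sequence

# Loose patterns for illustration only; they are assumptions, not the
# connector's own validation rules.
AWS_ROLE_ARN = re.compile(r"^arn:aws[\w-]*:iam::\d{12}:role/.+$")
GCP_SA_EMAIL = re.compile(r"^[^@\s]+@[^@\s]+$")
GCP_UNIQUE_ID = re.compile(r"^\d+$")


def check_impersonation_path(provider: str, path: Sequence[str]) -> List[str]:
    """Return the entries in `path` that do not look right for `provider`."""
    normalized = provider.upper()
    if normalized == "AZURE":
        # Impersonation paths are not supported for Azure.
        raise ValueError("impersonation path is not supported for Azure")
    if normalized == "AWS":
        def looks_ok(entry: str) -> bool:
            return bool(AWS_ROLE_ARN.match(entry))
    elif normalized == "GCP":
        def looks_ok(entry: str) -> bool:
            return bool(GCP_SA_EMAIL.match(entry) or GCP_UNIQUE_ID.match(entry))
    else:
        raise ValueError(f"unknown provider: {provider!r}")
    return [entry for entry in path if not looks_ok(entry)]


aws_path = [
    "arn:aws:iam::123456789012:role/role-1",
    "arn:aws:iam::123456789012:role/role-2",
]
bad_aws = check_impersonation_path("aws", aws_path)
bad_gcp = check_impersonation_path("gcp", ["7432810928", "38457809230"])
```

Both example paths come back with no offending entries, while something like a bare role name passed for AWS would be reported.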

@anentropic

@sfc-gh-pmansour

thanks, I am still struggling a bit to understand concretely what this means though

you say "Snowflake only sees service account C"... so I guess the prior steps are happening between snowflake-connector-python and AWS ?

is it a list of "assumed" roles? i.e. first assume role A, which allows you to assume B, which allows you to assume C, which is the role configured in Snowflake for WIF auth?

never mind, I should have looked in the source, yes looks like it's that:
https://github.com/snowflakedb/snowflake-connector-python/blob/main/src/snowflake/connector/wif_util.py#L161-L171

and so the benefit is I don't have to be running a task as role C, it just has to be a role which has permission to ultimately assume role C

I guess "impersonation" is GCP lingo which was making it a bit obscure as an AWS user
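
For AWS readers, the chain of assumed roles described above might look roughly like the following. This is a sketch and not the connector's actual code: `assume_role_chain`, `boto3_sts_client` and `FakeSts` are hypothetical names, the role ARNs are the example values from earlier in the thread, and the fake client's response only mimics the `Credentials` key of a real STS response.

```python
from typing import Any, Callable, Dict, Optional, Sequence

Credentials = Dict[str, Any]


def assume_role_chain(
    make_sts_client: Callable[[Optional[Credentials]], Any],
    role_arns: Sequence[str],
    session_name: str = "wif-sketch",
) -> Optional[Credentials]:
    """Assume each role in turn, using the previous role's credentials."""
    creds: Optional[Credentials] = None  # None means "use the task's own role"
    for arn in role_arns:
        sts = make_sts_client(creds)
        response = sts.assume_role(RoleArn=arn, RoleSessionName=session_name)
        creds = response["Credentials"]
    return creds


def boto3_sts_client(creds: Optional[Credentials]) -> Any:
    """Factory for real use; needs boto3 and AWS credentials at runtime."""
    import boto3  # imported lazily so the offline demo below runs without it

    if creds is None:
        return boto3.client("sts")
    return boto3.client(
        "sts",
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )


class FakeSts:
    """Offline stand-in that records which identity made each call."""

    def __init__(self, creds: Optional[Credentials]) -> None:
        self.caller = creds["RoleArn"] if creds else "task-role"

    def assume_role(self, RoleArn: str, RoleSessionName: str) -> Dict[str, Any]:
        return {"Credentials": {"RoleArn": RoleArn, "AssumedBy": self.caller}}


final = assume_role_chain(
    FakeSts,
    [
        "arn:aws:iam::123456789012:role/role-1",
        "arn:aws:iam::123456789012:role/role-2",
    ],
)
```

Passing `boto3_sts_client` instead of `FakeSts` would perform the same walk against real STS; the task's own role only needs permission to assume the first role in the list.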

@roryjbd
Author

roryjbd commented Oct 24, 2025

@colin-rogers-dbt should hopefully be good to go now. @sfc-gh-pmansour will get an issue raised for updating to v4.0 and then hopefully can work on that this weekend

@roryjbd
Author

roryjbd commented Oct 24, 2025

Test passing in AWS EC2 (screenshot attached).

Labels

cla:yes The PR author has signed the CLA

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature] Add support for Snowflake Workload Identity Federation

6 participants