Conversation

@sfc-gh-turbaszek
Contributor

Please answer these questions before submitting your pull request. Thanks!

  1. What GitHub issue is this PR addressing? Make sure that there is an accompanying issue for your PR.

    Fixes #NNNN

  2. Fill out the following pre-review checklist:

    • I am adding new automated test(s) to verify the correctness of my new code
    • I am adding new logging messages
    • I am adding a new telemetry message
    • I am modifying authorization mechanisms
    • I am adding new credentials
    • I am modifying OCSP code
    • I am adding a new dependency
  3. Please describe how your code solves the related issue.

    Please write a short description of how your code change solves the related issue.

  4. (Optional) PR for stored-proc connector:

sfc-gh-rebeling and others added 30 commits July 30, 2025 15:57
adjust no_auth usage for async
Co-authored-by: Jenkins User <900904>
Comment on lines 248 to 379
```diff
-    name: Test FIPS linux-3.8-${{ matrix.cloud-provider }}
+    name: Test FIPS linux-3.9-${{ matrix.cloud-provider }}
     needs: build
     runs-on: ubuntu-latest
     strategy:
       fail-fast: false
       matrix:
         cloud-provider: [aws]
     steps:
       - uses: actions/checkout@v4
       - name: Setup parameters file
         shell: bash
         env:
           PARAMETERS_SECRET: ${{ secrets.PARAMETERS_SECRET }}
         run: |
           gpg --quiet --batch --yes --decrypt --passphrase="$PARAMETERS_SECRET" \
             .github/workflows/parameters/public/parameters_${{ matrix.cloud-provider }}.py.gpg > test/parameters.py
       - name: Setup private key file
         shell: bash
         env:
           PYTHON_PRIVATE_KEY_SECRET: ${{ secrets.PYTHON_PRIVATE_KEY_SECRET }}
         run: |
           gpg --quiet --batch --yes --decrypt --passphrase="$PYTHON_PRIVATE_KEY_SECRET" \
             .github/workflows/parameters/public/rsa_keys/rsa_key_python_${{ matrix.cloud-provider }}.p8.gpg > test/rsa_key_python_${{ matrix.cloud-provider }}.p8
       - name: Download wheel(s)
         uses: actions/download-artifact@v4
         with:
-          name: manylinux_x86_64_py3.8
+          name: manylinux_x86_64_py3.9
           path: dist
       - name: Show wheels downloaded
         run: ls -lh dist
         shell: bash
       - name: Run tests
         run: ./ci/test_fips_docker.sh
         env:
-          PYTHON_VERSION: 3.8
+          PYTHON_VERSION: 3.9
           cloud_provider: ${{ matrix.cloud-provider }}
           PYTEST_ADDOPTS: --color=yes --tb=short
           TOX_PARALLEL_NO_SPINNER: 1
         shell: bash
       - uses: actions/upload-artifact@v4
         with:
           include-hidden-files: true
-          name: coverage_linux-fips-3.8-${{ matrix.cloud-provider }}
+          name: coverage_linux-fips-3.9-${{ matrix.cloud-provider }}
           path: |
             .coverage
             coverage.xml
       - uses: actions/upload-artifact@v4
         with:
           include-hidden-files: true
           name: junit_linux-fips-3.9-${{ matrix.cloud-provider }}
           path: |
             junit.*.xml
   test-lambda:
     name: Test Lambda linux-${{ matrix.python-version }}-${{ matrix.cloud-provider }}
     needs: build
     runs-on: ubuntu-latest
     strategy:
       fail-fast: false
       matrix:
         python-version: ["3.8", "3.9", "3.10", "3.11", "3.12"]
         # TODO: temporarily reduce number of jobs: SNOW-2311643
         # python-version: ["3.9", "3.10", "3.11", "3.12", "3.13"]
```

Check warning

Code scanning / CodeQL

Workflow does not contain permissions Medium

Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}
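As a sketch of that minimal starting point (placement and scope are a judgment call; this shows the workflow level, which applies to every job unless a job overrides it):

```yaml
# Minimal GITHUB_TOKEN permissions block per the CodeQL suggestion:
# restrict the token to read-only access to repository contents.
permissions:
  contents: read
```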
Comment on lines +464 to +527
```yaml
        python-version: ["3.13"]
        cloud-provider: [aws, azure, gcp]
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
      - name: Display Python version
        run: python -c "import sys; print(sys.version)"
      - name: Set up Java
        uses: actions/setup-java@v4 # for wiremock
        with:
          java-version: 11
          distribution: 'temurin'
          java-package: 'jre'
      - name: Fetch Wiremock
        shell: bash
        run: curl https://repo1.maven.org/maven2/org/wiremock/wiremock-standalone/3.11.0/wiremock-standalone-3.11.0.jar --output .wiremock/wiremock-standalone.jar
      - name: Setup parameters file
        shell: bash
        env:
          PARAMETERS_SECRET: ${{ secrets.PARAMETERS_SECRET }}
        run: |
          gpg --quiet --batch --yes --decrypt --passphrase="$PARAMETERS_SECRET" \
            .github/workflows/parameters/public/parameters_${{ matrix.cloud-provider }}.py.gpg > test/parameters.py
      - name: Setup private key file
        shell: bash
        env:
          PYTHON_PRIVATE_KEY_SECRET: ${{ secrets.PYTHON_PRIVATE_KEY_SECRET }}
        run: |
          gpg --quiet --batch --yes --decrypt --passphrase="$PYTHON_PRIVATE_KEY_SECRET" \
            .github/workflows/parameters/public/rsa_keys/rsa_key_python_${{ matrix.cloud-provider }}.p8.gpg > test/rsa_key_python_${{ matrix.cloud-provider }}.p8
      - name: Download wheel(s)
        uses: actions/download-artifact@v4
        with:
          name: ${{ matrix.os.download_name }}_py${{ matrix.python-version }}
          path: dist
      - name: Show wheels downloaded
        run: ls -lh dist
        shell: bash
      - name: Upgrade setuptools, pip and wheel
        run: python -m pip install -U setuptools pip wheel
      - name: Install tox
        run: python -m pip install tox>=4
      - name: Run tests
        run: python -m tox run -e aio
        env:
          PYTHON_VERSION: ${{ matrix.python-version }}
          cloud_provider: ${{ matrix.cloud-provider }}
          PYTEST_ADDOPTS: --color=yes --tb=short
          TOX_PARALLEL_NO_SPINNER: 1
        shell: bash
      - name: Combine coverages
        run: python -m tox run -e coverage --skip-missing-interpreters false
        shell: bash
      - uses: actions/upload-artifact@v4
        with:
          name: coverage_aio_${{ matrix.os.download_name }}-${{ matrix.python-version }}-${{ matrix.cloud-provider }}
          path: |
            .tox/.coverage
            .tox/coverage.xml
  test-unsupporeted-aio:
    name: Test unsupported asyncio ${{ matrix.os.download_name }}-${{ matrix.python-version }}
    runs-on: ${{ matrix.os.image_name }}
    strategy:
      fail-fast: false
      matrix:
        os:
          - image_name: ubuntu-latest
            download_name: manylinux_x86_64
        python-version: [ "3.9", ]
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
```

Check warning

Code scanning / CodeQL

Workflow does not contain permissions Medium

Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}
Comment on lines +148 to +151
```python
            {
                k: v if k in AUTHENTICATION_REQUEST_KEY_WHITELIST else "******"
                for (k, v) in body["data"].items()
            },
```

Check failure

Code scanning / CodeQL

Clear-text logging of sensitive information High

This expression logs sensitive data (password) as clear text. (CodeQL reports this finding six times for the expression above.)

Copilot Autofix

AI 3 days ago

To fix the problem, we must ensure that sensitive data such as passwords, passcodes, authentication tokens, and related credentials is never logged, regardless of whitelist configuration. The best way to do this is to explicitly identify and mask all such keys in the data before logging, overriding any whitelist logic: define a set of sensitive keys ({"PASSCODE", ...}) and always redact them in any log output. The edit is localized to the debug log statement, so add a hardcoded redaction for sensitive keys regardless of the whitelist, e.g., always replacing the value of keys like "PASSCODE" with "******".

In src/snowflake/connector/aio/auth/_auth.py:

  • Add a definition near the log statement for a set of sensitive keys.
  • Change the dictionary comprehension for logging so that if a key is in the set of sensitive keys (case-insensitive), the value is always masked as "******", regardless of the whitelist. For non-sensitive keys, continue with the previous logic.
  • No new dependencies are required.

Only code in the shown snippet (lines 141-152) needs modification.


Suggested changeset 1 — src/snowflake/connector/aio/auth/_auth.py

Autofix patch. Run the following command in your local git repository to apply this patch:

```shell
cat << 'EOF' | git apply
diff --git a/src/snowflake/connector/aio/auth/_auth.py b/src/snowflake/connector/aio/auth/_auth.py
--- a/src/snowflake/connector/aio/auth/_auth.py
+++ b/src/snowflake/connector/aio/auth/_auth.py
@@ -143,10 +143,14 @@
         if session_parameters:
             body["data"]["SESSION_PARAMETERS"] = session_parameters
 
+        # Always redact sensitive keys (e.g. PASSCODE), even if they're in the whitelist
+        _SENSITIVE_AUTH_KEYS = {"PASSCODE", "PASSWORD", "TOKEN", "SECRET"}
         logger.debug(
             "body['data']: %s",
             {
-                k: v if k in AUTHENTICATION_REQUEST_KEY_WHITELIST else "******"
+                k: "******"
+                if k.upper() in _SENSITIVE_AUTH_KEYS
+                else (v if k in AUTHENTICATION_REQUEST_KEY_WHITELIST else "******")
                 for (k, v) in body["data"].items()
             },
         )
EOF
```
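To see the suggested redaction logic in isolation, here is a self-contained sketch; the whitelist contents below are made up for illustration and are not the connector's actual AUTHENTICATION_REQUEST_KEY_WHITELIST:

```python
# Illustrative stand-ins; the real whitelist lives in the connector source.
AUTHENTICATION_REQUEST_KEY_WHITELIST = {"ACCOUNT_NAME", "LOGIN_NAME", "PASSCODE"}
_SENSITIVE_AUTH_KEYS = {"PASSCODE", "PASSWORD", "TOKEN", "SECRET"}

data = {"ACCOUNT_NAME": "acme", "PASSCODE": "123456", "CLIENT_ENVIRONMENT": "..."}
masked = {
    # Sensitive keys are always masked, even when whitelisted; everything
    # else falls back to the original whitelist check.
    k: "******"
    if k.upper() in _SENSITIVE_AUTH_KEYS
    else (v if k in AUTHENTICATION_REQUEST_KEY_WHITELIST else "******")
    for (k, v) in data.items()
}
print(masked)
# {'ACCOUNT_NAME': 'acme', 'PASSCODE': '******', 'CLIENT_ENVIRONMENT': '******'}
```

Note that PASSCODE is masked even though it is in the (hypothetical) whitelist, which is exactly the behavior the patch adds.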
Comment on lines +367 to +370
```python
            "please open the URL manually then paste the "
            "URL you are redirected to into the terminal:\n"
            f"{authorization_request}"
        )
```

Check failure

Code scanning / CodeQL

Clear-text logging of sensitive information High

This expression logs sensitive data (password) as clear text (four findings) and sensitive data (secret) as clear text (three findings).

Copilot Autofix

AI about 19 hours ago

To address this issue, avoid printing the entire authorization request URL containing sensitive OAuth parameters (especially client_id) directly to STDOUT. Instead, redact sensitive fields such as client_id from the printed URL. The best method is to parse and reconstruct the URL, replacing the value of any sensitive query parameters (e.g., client_id) with a placeholder or asterisks before displaying to the user.

How to fix:

  • In AuthByOauthCode._ask_authorization_callback_from_user, before printing the authorization_request string, parse its query parameters, identify any that are considered sensitive (at minimum client_id; extendable to others like client_secret if present), and replace their values with a masked version.
  • Reconstruct the URL with redacted values for display.

Where to fix:

  • The main change is in the body of _ask_authorization_callback_from_user in src/snowflake/connector/auth/oauth_code.py, where authorization_request is printed.
  • Introduce a helper function (within the class or file) named redact_query_params_from_url(url, sensitive_keys) -> str which redacts specified keys from the query string.

Details:

  • The helper can use urllib.parse to split and process the URL.
  • After redacting, use the masked URL in the print statement.

Imports/Requirements:

  • No new external dependencies are needed; only the standard urllib.parse (already imported).

Suggested changeset 1 — src/snowflake/connector/auth/oauth_code.py

Autofix patch. Run the following command in your local git repository to apply this patch:

```shell
cat << 'EOF' | git apply
diff --git a/src/snowflake/connector/auth/oauth_code.py b/src/snowflake/connector/auth/oauth_code.py
--- a/src/snowflake/connector/auth/oauth_code.py
+++ b/src/snowflake/connector/auth/oauth_code.py
@@ -37,6 +37,18 @@
 BUF_SIZE = 16384
 
 
+def redact_query_params_from_url(url: str, sensitive_keys: set[str]) -> str:
+    """Redacts sensitive query params from a URL."""
+    import urllib.parse
+    parsed_url = urllib.parse.urlparse(url)
+    query_params = urllib.parse.parse_qs(parsed_url.query, keep_blank_values=True)
+    for key in sensitive_keys:
+        if key in query_params:
+            query_params[key] = ["*****"]
+    redacted_query = urllib.parse.urlencode(query_params, doseq=True)
+    redacted_url = parsed_url._replace(query=redacted_query).geturl()
+    return redacted_url
+
 def _get_query_params(
     url: str,
 ) -> dict[str, list[str]]:
@@ -363,11 +375,15 @@
         connection: SnowflakeConnection,
     ) -> (str | None, str | None):
         logger.debug("requesting authorization redirected url from user")
+        redacted_url = redact_query_params_from_url(
+            authorization_request,
+            sensitive_keys={"client_id", "client_secret"}
+        )
         print(
             "We were unable to open a browser window for you, "
             "please open the URL manually then paste the "
             "URL you are redirected to into the terminal:\n"
-            f"{redacted_url}"
         )
         received_redirected_request = input(
             "Enter the URL the OAuth flow redirected you to: "
EOF
```
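Assuming the helper behaves as patched above, a quick standalone check is possible (the example URL and parameter values are made up):

```python
from urllib.parse import urlparse, parse_qs, urlencode

def redact_query_params_from_url(url, sensitive_keys):
    """Redacts sensitive query params from a URL (mirrors the patch above)."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query, keep_blank_values=True)
    for key in sensitive_keys:
        if key in params:
            params[key] = ["*****"]
    # Rebuild the URL with the masked query string.
    return parsed._replace(query=urlencode(params, doseq=True)).geturl()

url = "https://login.example.com/oauth/authorize?client_id=abc123&state=xyz"
redacted = redact_query_params_from_url(url, {"client_id", "client_secret"})
print(redacted)
```

The client_id value no longer appears in the output, while non-sensitive parameters such as state survive intact (urlencode percent-encodes the asterisk mask, which is harmless for display purposes).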
Comment on lines +685 to +687
```python
                    c[0][1].startswith(
                        "https://test-account.snowflakecomputing.com:443"
                    )
```

Check failure

Code scanning / CodeQL

Incomplete URL substring sanitization High test

The string https://test-account.snowflakecomputing.com:443 may be at an arbitrary position in the sanitized URL.

Copilot Autofix

AI about 19 hours ago

To fix the issue, we should parse the URL before making assertions about its structure. Instead of checking if the URL string starts with "https://test-account.snowflakecomputing.com:443", the code should use urllib.parse.urlparse to break the URL into its components and then check that the scheme, netloc, hostname, and port are all precisely what we expect. This guarantees that the domain check cannot be bypassed by clever string-prefix tricks.

In the context of test_dashed_url_account_name, we need to:

  • Import urllib.parse.urlparse if not present.
  • In the assertion, parse each URL from the mocked calls and assert that scheme, hostname, and port equal what is expected ("https", "test-account.snowflakecomputing.com", and 443).

Modify only the relevant region (lines 717–724), leaving the rest of the code unchanged.

Suggested changeset 1 — test/integ/aio_it/test_connection_async.py

Autofix patch. Run the following command in your local git repository to apply this patch:

```shell
cat << 'EOF' | git apply
diff --git a/test/integ/aio_it/test_connection_async.py b/test/integ/aio_it/test_connection_async.py
--- a/test/integ/aio_it/test_connection_async.py
+++ b/test/integ/aio_it/test_connection_async.py
@@ -714,10 +714,18 @@
             cnx.commit = cnx.rollback = lambda: asyncio.sleep(
                 0
             )  # Skip tear down, there's only a mocked rest api
+            from urllib.parse import urlparse
             assert any(
                 [
-                    c[0][1].startswith(
-                        "https://test-account.snowflakecomputing.com:443"
+                    (
+                        urlparse(c[0][1]).scheme == "https"
+                        and urlparse(c[0][1]).hostname == "test-account.snowflakecomputing.com"
+                        and (
+                            urlparse(c[0][1]).port == 443
+                            or (
+                                urlparse(c[0][1]).port is None and ":443" in urlparse(c[0][1]).netloc
+                            )
+                        )
                     )
                     for c in mocked_fetch.call_args_list
                 ]
EOF
```
```python
    # azure_request_present = False
    expected_token_prefix = "sig="
    for line in caplog.text.splitlines():
        if "blob.core.windows.net" in line and expected_token_prefix in line:
```

Check failure

Code scanning / CodeQL

Incomplete URL substring sanitization High test

The string blob.core.windows.net may be at an arbitrary position in the sanitized URL.

Copilot Autofix

AI 3 days ago

To correctly check whether the log line includes an Azure blob storage URL (i.e., whether the hostname is actually blob.core.windows.net or a subdomain such as *.blob.core.windows.net), parse any URLs found in the log line, and check if any of their parsed hostnames end with .blob.core.windows.net or are exactly blob.core.windows.net.

Since log lines may contain more than just URLs, search each line for substrings that look like URLs (e.g., using re.findall with a URL pattern), and then use urllib.parse.urlparse to extract and analyze the hostname.

Required changes:

  • Add an import for re and for urlparse (from urllib.parse import urlparse).
  • Replace the condition at line 97 so that, for each line in caplog.text.splitlines(), all URLs within the line are parsed, and their hostnames checked to ensure that hostname is exactly or ends with .blob.core.windows.net.
  • If any such URL is found and contains the expected token, trigger the assertion.

Suggested changeset 1 — test/integ/aio_it/test_put_get_with_azure_token_async.py

Autofix patch. Run the following command in your local git repository to apply this patch:

```shell
cat << 'EOF' | git apply
diff --git a/test/integ/aio_it/test_put_get_with_azure_token_async.py b/test/integ/aio_it/test_put_get_with_azure_token_async.py
--- a/test/integ/aio_it/test_put_get_with_azure_token_async.py
+++ b/test/integ/aio_it/test_put_get_with_azure_token_async.py
@@ -15,6 +15,9 @@
 
 import pytest
 
+import re
+from urllib.parse import urlparse
+
 from snowflake.connector.constants import UTF8
 from snowflake.connector.file_transfer_agent import (
     SnowflakeAzureProgressPercentage,
@@ -94,7 +97,15 @@
     # azure_request_present = False
     expected_token_prefix = "sig="
     for line in caplog.text.splitlines():
-        if "blob.core.windows.net" in line and expected_token_prefix in line:
+        urls = re.findall(r'(https?://[^\s\'"\\]+)', line)
+        has_azure_blob_url = any(
+            (hostname := urlparse(url).hostname)
+            and (
+                hostname == "blob.core.windows.net" or hostname.endswith(".blob.core.windows.net")
+            )
+            for url in urls
+        )
+        if has_azure_blob_url and expected_token_prefix in line:
             # azure_request_present = True
             # getattr is used to stay compatible with old driver - before SECRET_STARRED_MASK_STR was added
             assert (
EOF
```
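As a standalone illustration of the hostname check this patch introduces (the URLs below are made up for the example):

```python
from urllib.parse import urlparse

def is_azure_blob_host(url):
    # Accept blob.core.windows.net itself or any subdomain of it.
    hostname = urlparse(url).hostname or ""
    return hostname == "blob.core.windows.net" or hostname.endswith(
        ".blob.core.windows.net"
    )

good = "https://myaccount.blob.core.windows.net/container/file?sig=abc"
evil = "https://blob.core.windows.net.attacker.example/file?sig=abc"

print("blob.core.windows.net" in evil)  # True: the substring check is fooled
print(is_azure_blob_host(good))         # True
print(is_azure_blob_host(evil))         # False: the parsed hostname check holds
```

A plain `in` test matches the attacker-controlled domain because the trusted name appears as a subdomain prefix; parsing the hostname and anchoring the comparison at the domain boundary closes that gap.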
Comment on lines +146 to +148
```python
    pass
```
Check failure

Code scanning / CodeQL

Use of weak cryptographic key High test

Creation of an RSA key uses 1024 bits, which is below 2048 and considered breakable.

Copilot Autofix

AI 3 days ago

To fix this vulnerability, update the rsa.generate_private_key() call to use a key_size of at least 2048 bits in place of the current 1024. This change will ensure the generated RSA key meets current security best practices and cryptographic standards. You do not need to change any logic or functionality; simply replace key_size=1024 with key_size=2048 or greater in the affected function. No new imports or definitions are required, and the fix should be made in test/unit/test_ocsp.py within the create_x509_cert function.


Suggested changeset 1 — test/unit/test_ocsp.py

Autofix patch. Run the following command in your local git repository to apply this patch:

```shell
cat << 'EOF' | git apply
diff --git a/test/unit/test_ocsp.py b/test/unit/test_ocsp.py
--- a/test/unit/test_ocsp.py
+++ b/test/unit/test_ocsp.py
@@ -149,7 +149,7 @@
 def create_x509_cert(hash_algorithm):
     # Generate a private key
     private_key = rsa.generate_private_key(
-        public_exponent=65537, key_size=1024, backend=default_backend()
+        public_exponent=65537, key_size=2048, backend=default_backend()
     )
 
     # Generate a public key
EOF
```
sfc-gh-pcyrek and others added 14 commits October 28, 2025 14:37
Port PR: snowflakedb/Stored-Proc-Python-Connector#225

I noticed that get_results_from_sfqid assumes that fetchall returns a tuple when that's not necessarily the case. Fixing it here + adding a test that fails without the fix.

```
E                   AssertionError: assert [{'multiple s...ccessfully.'}] == [{'1': 1}]
E                     At index 0 diff: {'multiple statement execution': 'Multiple statements executed successfully.'} != {'1': 1}
E                     Full diff:
E                     - [{'1': 1}]
E                     + [{'multiple statement execution': 'Multiple statements executed successfully.'}]
```
Co-authored-by: Berkay Öztürk <[email protected]>

(cherry picked from commit 87c623e)
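A minimal sketch of the failure mode the commit describes — the function and data shapes below are illustrative, not the connector's actual internals; the point is only that tuple-specific operations on a fetchall() result break when a list comes back:

```python
def first_statement_rows_buggy(fetched):
    # Assumes `fetched` is a tuple; tuple concatenation fails for a list.
    return ("header",) + fetched

def first_statement_rows_fixed(fetched):
    # Normalize first, so both list and tuple fetchall() results work.
    return ("header",) + tuple(fetched)

# Works regardless of whether the driver hands back a list or a tuple.
print(first_statement_rows_fixed([{"1": 1}]))
# ('header', {'1': 1})
```

Passing a list to the buggy variant raises `TypeError: can only concatenate tuple (not "list") to tuple`, which is the kind of type assumption the fix removes.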