[DO NOT MERGE] Add support for async io #2608
Conversation
….17.1 (#2483) Co-authored-by: Jenkins User <900904> Co-authored-by: github-actions <[email protected]>
(cherry picked from commit 6f48c9e)
        
          
`.github/workflows/build_test.yml` (Outdated)
```diff
-  name: Test FIPS linux-3.8-${{ matrix.cloud-provider }}
+  name: Test FIPS linux-3.9-${{ matrix.cloud-provider }}
   needs: build
   runs-on: ubuntu-latest
   strategy:
     fail-fast: false
     matrix:
       cloud-provider: [aws]
   steps:
     - uses: actions/checkout@v4
     - name: Setup parameters file
       shell: bash
       env:
         PARAMETERS_SECRET: ${{ secrets.PARAMETERS_SECRET }}
       run: |
         gpg --quiet --batch --yes --decrypt --passphrase="$PARAMETERS_SECRET" \
         .github/workflows/parameters/public/parameters_${{ matrix.cloud-provider }}.py.gpg > test/parameters.py
     - name: Setup private key file
       shell: bash
       env:
         PYTHON_PRIVATE_KEY_SECRET: ${{ secrets.PYTHON_PRIVATE_KEY_SECRET }}
       run: |
         gpg --quiet --batch --yes --decrypt --passphrase="$PYTHON_PRIVATE_KEY_SECRET" \
         .github/workflows/parameters/public/rsa_keys/rsa_key_python_${{ matrix.cloud-provider }}.p8.gpg > test/rsa_key_python_${{ matrix.cloud-provider }}.p8
     - name: Download wheel(s)
       uses: actions/download-artifact@v4
       with:
-        name: manylinux_x86_64_py3.8
+        name: manylinux_x86_64_py3.9
         path: dist
     - name: Show wheels downloaded
       run: ls -lh dist
       shell: bash
     - name: Run tests
       run: ./ci/test_fips_docker.sh
       env:
-        PYTHON_VERSION: 3.8
+        PYTHON_VERSION: 3.9
         cloud_provider: ${{ matrix.cloud-provider }}
         PYTEST_ADDOPTS: --color=yes --tb=short
         TOX_PARALLEL_NO_SPINNER: 1
       shell: bash
     - uses: actions/upload-artifact@v4
       with:
         include-hidden-files: true
-        name: coverage_linux-fips-3.8-${{ matrix.cloud-provider }}
+        name: coverage_linux-fips-3.9-${{ matrix.cloud-provider }}
         path: |
           .coverage
           coverage.xml
     - uses: actions/upload-artifact@v4
       with:
         include-hidden-files: true
         name: junit_linux-fips-3.9-${{ matrix.cloud-provider }}
         path: |
           junit.*.xml
 test-lambda:
   name: Test Lambda linux-${{ matrix.python-version }}-${{ matrix.cloud-provider }}
   needs: build
   runs-on: ubuntu-latest
   strategy:
     fail-fast: false
     matrix:
-      python-version: ["3.8", "3.9", "3.10", "3.11", "3.12"]
       # TODO: temporarily reduce number of jobs: SNOW-2311643
       # python-version: ["3.9", "3.10", "3.11", "3.12", "3.13"]
```
Check warning — Code scanning / CodeQL: Workflow does not contain permissions (Medium).
```diff
+      python-version: ["3.13"]
       cloud-provider: [aws, azure, gcp]
   steps:
     - uses: actions/checkout@v4
     - name: Set up Python
       uses: actions/setup-python@v4
       with:
         python-version: ${{ matrix.python-version }}
     - name: Display Python version
       run: python -c "import sys; print(sys.version)"
     - name: Set up Java
       uses: actions/setup-java@v4 # for wiremock
       with:
         java-version: 11
         distribution: 'temurin'
         java-package: 'jre'
     - name: Fetch Wiremock
       shell: bash
       run: curl https://repo1.maven.org/maven2/org/wiremock/wiremock-standalone/3.11.0/wiremock-standalone-3.11.0.jar --output .wiremock/wiremock-standalone.jar
     - name: Setup parameters file
       shell: bash
       env:
         PARAMETERS_SECRET: ${{ secrets.PARAMETERS_SECRET }}
       run: |
         gpg --quiet --batch --yes --decrypt --passphrase="$PARAMETERS_SECRET" \
         .github/workflows/parameters/public/parameters_${{ matrix.cloud-provider }}.py.gpg > test/parameters.py
     - name: Setup private key file
       shell: bash
       env:
         PYTHON_PRIVATE_KEY_SECRET: ${{ secrets.PYTHON_PRIVATE_KEY_SECRET }}
       run: |
         gpg --quiet --batch --yes --decrypt --passphrase="$PYTHON_PRIVATE_KEY_SECRET" \
         .github/workflows/parameters/public/rsa_keys/rsa_key_python_${{ matrix.cloud-provider }}.p8.gpg > test/rsa_key_python_${{ matrix.cloud-provider }}.p8
     - name: Download wheel(s)
       uses: actions/download-artifact@v4
       with:
         name: ${{ matrix.os.download_name }}_py${{ matrix.python-version }}
         path: dist
     - name: Show wheels downloaded
       run: ls -lh dist
       shell: bash
     - name: Upgrade setuptools, pip and wheel
       run: python -m pip install -U setuptools pip wheel
     - name: Install tox
       run: python -m pip install tox>=4
     - name: Run tests
       run: python -m tox run -e aio
       env:
         PYTHON_VERSION: ${{ matrix.python-version }}
         cloud_provider: ${{ matrix.cloud-provider }}
         PYTEST_ADDOPTS: --color=yes --tb=short
         TOX_PARALLEL_NO_SPINNER: 1
       shell: bash
     - name: Combine coverages
       run: python -m tox run -e coverage --skip-missing-interpreters false
       shell: bash
     - uses: actions/upload-artifact@v4
       with:
         name: coverage_aio_${{ matrix.os.download_name }}-${{ matrix.python-version }}-${{ matrix.cloud-provider }}
         path: |
           .tox/.coverage
           .tox/coverage.xml
 test-unsupporeted-aio:
   name: Test unsupported asyncio ${{ matrix.os.download_name }}-${{ matrix.python-version }}
   runs-on: ${{ matrix.os.image_name }}
   strategy:
     fail-fast: false
     matrix:
       os:
         - image_name: ubuntu-latest
           download_name: manylinux_x86_64
       python-version: [ "3.9", ]
   steps:
     - uses: actions/checkout@v4
     - name: Set up Python
       uses: actions/setup-python@v4
       with:
         python-version: ${{ matrix.python-version }}
```
Check warning — Code scanning / CodeQL: Workflow does not contain permissions (Medium).
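The recurring CodeQL warning above is typically resolved by declaring an explicit least-privilege `permissions` block at the workflow or job level; a minimal sketch (the exact scopes depend on what the jobs actually need):

```yaml
# Minimal sketch: grant the GITHUB_TOKEN only read access to repository contents.
permissions:
  contents: read
```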
```python
{
    k: v if k in AUTHENTICATION_REQUEST_KEY_WHITELIST else "******"
    for (k, v) in body["data"].items()
},
```
Check failure — Code scanning / CodeQL: Clear-text logging of sensitive information (High). Several expressions here log sensitive data (password).

Copilot Autofix (AI, 3 days ago):
To fix the problem, we must ensure that sensitive data such as passwords, passcodes, authentication tokens, and related credentials are never logged, regardless of whitelist configuration. The best way to do this is to explicitly identify and mask all such keys in the data before logging, overriding any whitelist logic. We should define a set of sensitive keys ({"PASSCODE", ...}) and always redact these in any log output. The edit is localized to the debug log statement, so add a hardcoded redaction for sensitive keys regardless of the whitelist, e.g., always replacing the value of keys like "PASSCODE" with "******".
In src/snowflake/connector/aio/auth/_auth.py:
- Add a definition near the log statement for a set of sensitive keys.
- Change the dictionary comprehension for logging so that if a key is in the set of sensitive keys (case-insensitive), the value is always masked as "******", regardless of the whitelist. For non-sensitive keys, continue with the previous logic.
- No new dependencies are required.
Only code in the shown snippet (lines 141-152) needs modification.
```diff
@@ -143,10 +143,14 @@
     if session_parameters:
         body["data"]["SESSION_PARAMETERS"] = session_parameters

+    # Always redact sensitive keys (e.g. PASSCODE), even if they're in the whitelist
+    _SENSITIVE_AUTH_KEYS = {"PASSCODE", "PASSWORD", "TOKEN", "SECRET"}
     logger.debug(
         "body['data']: %s",
         {
-            k: v if k in AUTHENTICATION_REQUEST_KEY_WHITELIST else "******"
+            k: "******"
+            if k.upper() in _SENSITIVE_AUTH_KEYS
+            else (v if k in AUTHENTICATION_REQUEST_KEY_WHITELIST else "******")
             for (k, v) in body["data"].items()
         },
     )
```
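As a standalone illustration of the proposed masking logic (the whitelist contents below are hypothetical stand-ins; the real constant lives in the connector's auth module):

```python
# Hypothetical whitelist for illustration only; not the connector's real one.
AUTHENTICATION_REQUEST_KEY_WHITELIST = {"ACCOUNT_NAME", "LOGIN_NAME", "CLIENT_APP_ID"}
# Keys that must be masked unconditionally, even if whitelisted.
_SENSITIVE_AUTH_KEYS = {"PASSCODE", "PASSWORD", "TOKEN", "SECRET"}

def mask_auth_body(data: dict) -> dict:
    """Redact sensitive keys first, then fall back to the whitelist rule."""
    return {
        k: "******"
        if k.upper() in _SENSITIVE_AUTH_KEYS
        else (v if k in AUTHENTICATION_REQUEST_KEY_WHITELIST else "******")
        for (k, v) in data.items()
    }

masked = mask_auth_body({"LOGIN_NAME": "alice", "PASSWORD": "hunter2"})
# "PASSWORD" stays masked even if someone later adds it to the whitelist.
```

The key point of the fix is ordering: the sensitive-key check wins over the whitelist check, so a misconfigured whitelist cannot leak credentials into debug logs.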
```python
    "please open the URL manually then paste the "
    "URL you are redirected to into the terminal:\n"
    f"{authorization_request}"
)
```
Check failure — Code scanning / CodeQL: Clear-text logging of sensitive information (High). Several expressions here log sensitive data (password, secret).

Copilot Autofix (AI, about 19 hours ago):
To address this issue, avoid printing the entire authorization request URL containing sensitive OAuth parameters (especially client_id) directly to STDOUT. Instead, redact sensitive fields such as client_id from the printed URL. The best method is to parse and reconstruct the URL, replacing the value of any sensitive query parameters (e.g., client_id) with a placeholder or asterisks before displaying it to the user.
How to fix:
- In `AuthByOauthCode._ask_authorization_callback_from_user`, before printing the `authorization_request` string, parse its query parameters, identify any that are considered sensitive (at minimum `client_id`; extendable to others such as `client_secret` if present), and replace their values with a masked version.
- Reconstruct the URL with redacted values for display.
Where to fix:
- The main change is in the body of `_ask_authorization_callback_from_user` in src/snowflake/connector/auth/oauth_code.py, where `authorization_request` is printed.
- Introduce a helper function (within the class or file) named `redact_query_params_from_url(url, sensitive_keys) -> str` which redacts the specified keys from the query string.
Details:
- The helper can use `urllib.parse` to split and process the URL.
- After redacting, use the masked URL in the print statement.
Imports/Requirements:
- No new external dependencies are needed; only the standard `urllib.parse` (already imported).
```diff
@@ -37,6 +37,18 @@
 BUF_SIZE = 16384


+def redact_query_params_from_url(url: str, sensitive_keys: set[str]) -> str:
+    """Redacts sensitive query params from a URL."""
+    import urllib.parse
+    parsed_url = urllib.parse.urlparse(url)
+    query_params = urllib.parse.parse_qs(parsed_url.query, keep_blank_values=True)
+    for key in sensitive_keys:
+        if key in query_params:
+            query_params[key] = ["*****"]
+    redacted_query = urllib.parse.urlencode(query_params, doseq=True)
+    redacted_url = parsed_url._replace(query=redacted_query).geturl()
+    return redacted_url
+
 def _get_query_params(
     url: str,
 ) -> dict[str, list[str]]:
@@ -363,11 +375,15 @@
     connection: SnowflakeConnection,
 ) -> (str | None, str | None):
     logger.debug("requesting authorization redirected url from user")
+    redacted_url = redact_query_params_from_url(
+        authorization_request,
+        sensitive_keys={"client_id", "client_secret"}
+    )
     print(
         "We were unable to open a browser window for you, "
         "please open the URL manually then paste the "
         "URL you are redirected to into the terminal:\n"
-        f"{authorization_request}"
+        f"{redacted_url}"
     )
     received_redirected_request = input(
         "Enter the URL the OAuth flow redirected you to: "
```
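A self-contained version of the proposed helper that can be run outside the connector (the URL below is a made-up example, not a real endpoint):

```python
import urllib.parse

def redact_query_params_from_url(url: str, sensitive_keys: set) -> str:
    """Replace the values of the given query parameters with asterisks."""
    parsed = urllib.parse.urlparse(url)
    params = urllib.parse.parse_qs(parsed.query, keep_blank_values=True)
    for key in sensitive_keys:
        if key in params:
            params[key] = ["*****"]
    redacted_query = urllib.parse.urlencode(params, doseq=True)
    return parsed._replace(query=redacted_query).geturl()

url = "https://auth.example.com/authorize?client_id=abc123&state=xyz"
print(redact_query_params_from_url(url, {"client_id", "client_secret"}))
# client_id is masked; non-sensitive parameters such as state are left intact.
```

Note that `urlencode` percent-encodes the asterisks, so the masked value appears as `%2A%2A%2A%2A%2A` in the output; that is cosmetic and does not affect the redaction.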
```python
c[0][1].startswith(
    "https://test-account.snowflakecomputing.com:443"
)
```
Check failure — Code scanning / CodeQL: Incomplete URL substring sanitization (High, test): "https://test-account.snowflakecomputing.com:443".

Copilot Autofix (AI, about 19 hours ago):
To fix the issue, we should parse the URL before making assertions about its structure. Instead of checking whether the URL string starts with "https://test-account.snowflakecomputing.com:443", the code should use `urllib.parse.urlparse` to break the URL into its components and then check that the scheme, hostname, and port are all precisely what we expect. This guarantees that the domain check cannot be bypassed by clever string-prefix tricks.
In the context of `test_dashed_url_account_name`, we need to:
- Import `urllib.parse.urlparse` if not already present.
- In the assertion, parse each URL from the mocked calls and assert that the scheme, hostname, and port equal what is expected ("https", "test-account.snowflakecomputing.com", and 443).
Modify only the relevant region (lines 717-724), leaving the rest of the code unchanged.
```diff
@@ -714,10 +714,18 @@
     cnx.commit = cnx.rollback = lambda: asyncio.sleep(
         0
     )  # Skip tear down, there's only a mocked rest api
+    from urllib.parse import urlparse
     assert any(
         [
-            c[0][1].startswith(
-                "https://test-account.snowflakecomputing.com:443"
+            (
+                urlparse(c[0][1]).scheme == "https"
+                and urlparse(c[0][1]).hostname == "test-account.snowflakecomputing.com"
+                and (
+                    urlparse(c[0][1]).port == 443
+                    or (
+                        urlparse(c[0][1]).port is None and ":443" in urlparse(c[0][1]).netloc
+                    )
+                )
             )
             for c in mocked_fetch.call_args_list
         ]
```
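To see why the original `startswith` check is flagged: a URL can embed the expected prefix in its userinfo component, so the prefix test passes while the actual host is attacker-controlled. A minimal demonstration (evil.example is a placeholder domain):

```python
from urllib.parse import urlparse

expected_prefix = "https://test-account.snowflakecomputing.com:443"
# Everything before '@' in the authority is userinfo, so the real host
# of this URL is evil.example, not the Snowflake account host.
tricky = "https://test-account.snowflakecomputing.com:443@evil.example/steal"

assert tricky.startswith(expected_prefix)           # the prefix check is fooled
assert urlparse(tricky).hostname == "evil.example"  # parsing is not
```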
```python
# azure_request_present = False
expected_token_prefix = "sig="
for line in caplog.text.splitlines():
    if "blob.core.windows.net" in line and expected_token_prefix in line:
```
Check failure — Code scanning / CodeQL: Incomplete URL substring sanitization (High, test): "blob.core.windows.net".

Copilot Autofix (AI, 3 days ago):
To correctly check whether the log line includes an Azure blob storage URL (i.e., whether the hostname is actually blob.core.windows.net or a subdomain such as *.blob.core.windows.net), parse any URLs found in the log line and check whether any of their parsed hostnames are exactly blob.core.windows.net or end with .blob.core.windows.net.
Since log lines may contain more than just URLs, search each line for substrings that look like URLs (e.g., using `re.findall` with a URL pattern), and then use `urllib.parse.urlparse` to extract and analyze the hostname.
Required changes:
- Add imports for `re` and `urlparse` (from urllib.parse import urlparse).
- Replace the condition at line 97 so that, for each line in `caplog.text.splitlines()`, all URLs within the line are parsed and their hostnames checked to ensure the hostname is exactly blob.core.windows.net or ends with .blob.core.windows.net.
- If any such URL is found and the line contains the expected token, trigger the assertion.
```diff
@@ -15,6 +15,9 @@

 import pytest

+import re
+from urllib.parse import urlparse
+
 from snowflake.connector.constants import UTF8
 from snowflake.connector.file_transfer_agent import (
     SnowflakeAzureProgressPercentage,
@@ -94,7 +97,15 @@
     # azure_request_present = False
     expected_token_prefix = "sig="
     for line in caplog.text.splitlines():
-        if "blob.core.windows.net" in line and expected_token_prefix in line:
+        urls = re.findall(r'(https?://[^\s\'"\\]+)', line)
+        has_azure_blob_url = any(
+            (hostname := urlparse(url).hostname)
+            and (
+                hostname == "blob.core.windows.net" or hostname.endswith(".blob.core.windows.net")
+            )
+            for url in urls
+        )
+        if has_azure_blob_url and expected_token_prefix in line:
             # azure_request_present = True
             # getattr is used to stay compatible with old driver - before SECRET_STARRED_MASK_STR was added
             assert (
```
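The core of the hostname check in the proposed fix, reduced to a standalone sketch (the function name here is ours, not from the PR). A plain substring test would match attacker-controlled domains such as blob.core.windows.net.evil.example, while an exact-or-suffix test on the parsed hostname does not:

```python
from urllib.parse import urlparse

def is_azure_blob_url(url: str) -> bool:
    """True only when the parsed hostname is blob.core.windows.net or a subdomain of it."""
    host = urlparse(url).hostname or ""
    return host == "blob.core.windows.net" or host.endswith(".blob.core.windows.net")

assert is_azure_blob_url("https://acct.blob.core.windows.net/container/blob?sig=abc")
# Substring check would wrongly match this; the hostname check rejects it:
assert not is_azure_blob_url("https://blob.core.windows.net.evil.example/x")
```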
Check failure — Code scanning / CodeQL: Use of weak cryptographic key (High, test): 1024.

Copilot Autofix (AI, 3 days ago):
To fix this vulnerability, update the rsa.generate_private_key() call to use a key_size of at least 2048 bits in place of the current 1024. This change will ensure the generated RSA key meets current security best practices and cryptographic standards. You do not need to change any logic or functionality; simply replace key_size=1024 with key_size=2048 or greater in the affected function. No new imports or definitions are required, and the fix should be made in test/unit/test_ocsp.py within the create_x509_cert function.
```diff
@@ -149,7 +149,7 @@
 def create_x509_cert(hash_algorithm):
     # Generate a private key
     private_key = rsa.generate_private_key(
-        public_exponent=65537, key_size=1024, backend=default_backend()
+        public_exponent=65537, key_size=2048, backend=default_backend()
     )

     # Generate a public key
```
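For reference, the fixed call in isolation (this requires the third-party cryptography package, which the test suite already uses; 2048 bits is the generally accepted minimum RSA size today):

```python
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives.asymmetric import rsa

# Generate a private key at the recommended minimum size.
private_key = rsa.generate_private_key(
    public_exponent=65537, key_size=2048, backend=default_backend()
)
assert private_key.key_size == 2048
```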
…g env variable (#2488) Co-authored-by: Maxim Mishchenko <[email protected]>
Port PR: snowflakedb/Stored-Proc-Python-Connector#225. I noticed that get_results_from_sfqid assumes that fetchall returns a tuple when that's not necessarily the case. Fixing it here + adding a test that fails without the fix.

```
E       AssertionError: assert [{'multiple s...ccessfully.'}] == [{'1': 1}]
E         At index 0 diff: {'multiple statement execution': 'Multiple statements executed successfully.'} != {'1': 1}
E         Full diff:
E         - [{'1': 1}]
E         + [{'multiple statement execution': 'Multiple statements executed successfully.'}]
```
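The underlying pitfall can be sketched without the connector (hypothetical helper; the actual fix lives inside get_results_from_sfqid): cursor fetchall implementations may return either a list or a tuple, so code should normalize rather than assume one concrete type.

```python
# Hypothetical sketch: normalize fetchall() results so downstream code
# never depends on whether the cursor returned a list or a tuple.
def rows_as_list(rows):
    # Works for both list and tuple results without type assumptions.
    return list(rows)

assert rows_as_list(({"1": 1},)) == [{"1": 1}]   # tuple result
assert rows_as_list([{"1": 1}]) == [{"1": 1}]    # list result
assert rows_as_list([]) == []                     # empty result
```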
Co-authored-by: Berkay Öztürk <[email protected]> (cherry picked from commit 87c623e)
…er for async with SessionWithProxy
Please answer these questions before submitting your pull request. Thanks!

1. What GitHub issue is this PR addressing? Make sure that there is an accompanying issue to your PR.

   Fixes #NNNN

2. Fill out the following pre-review checklist:

3. Please describe how your code solves the related issue.

   Please write a short description of how your code change solves the related issue.

4. (Optional) PR for stored-proc connector: