-
Notifications
You must be signed in to change notification settings - Fork 255
feat: Trino connection support #2576
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
paulteehan
wants to merge
24
commits into
main
Choose a base branch
from
platl-333-trino-connection-support
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+830
−27
Open
Changes from all commits
Commits
Show all changes
24 commits
Select commit
Hold shift + click to select a range
aa53622
Feat: Trino connection support
paulteehan 61a98db
Add soda-trino to reqs
paulteehan 62e37f4
Feat: Trino connections
paulteehan 2d9f233
Generate and test JWT auth
paulteehan 3bcdc85
Fix commenting; add last tested date
paulteehan 08ef1ce
Update uv.lock
paulteehan 8863d37
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] d63c324
Cleanup
paulteehan 21f23df
Formatting
paulteehan 8250bff
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 54367b6
Merge branch 'main' into platl-333-trino-connection-support
paulteehan 6658a44
Correct versions
paulteehan 1cdb9cb
Remove trino from CI
paulteehan 73fb082
Do not release Trino yet
paulteehan 4c48c7c
Cleanup data types
paulteehan 1cea257
Remove unnecessary file
paulteehan 3a7c2a9
Ignore keys in git
paulteehan ad61418
Correct to 3.10
paulteehan 39f04ba
Handle 'None' properly
paulteehan ecbd8ac
Formatting
paulteehan a87afa6
Cleanup
paulteehan e821fc7
Reset config to prevent premature installs
paulteehan 0d0bf89
Add SecretStr
paulteehan 33b081c
Add soda-trino to config
paulteehan File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -153,3 +153,7 @@ reports/* | |
| *.lock | ||
| !uv.lock | ||
| .local/ | ||
|
|
||
| # keys | ||
| *.pem | ||
| *.jks | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -10,4 +10,5 @@ soda-snowflake | |
| soda-sqlserver | ||
| soda-synapse | ||
| soda-tests | ||
| soda-sparkdf | ||
| soda-sparkdf | ||
| soda-trino | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,3 +1,3 @@ | ||
| #!/bin/bash | ||
|
|
||
| printf "["; find . -maxdepth 1 -mindepth 1 -type d -name "soda*" -printf '"%f",' | sed 's/,$//'; printf "]" | ||
| printf "["; find . -maxdepth 1 -mindepth 1 -type d -name "soda*" -not -name "soda-trino" -printf '"%f",' | sed 's/,$//'; printf "]" |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,3 +1,3 @@ | ||
| #!/bin/bash | ||
|
|
||
| printf "[";find . -maxdepth 1 -mindepth 1 -type d -name "soda*" -not -name "soda-core" -not -name "soda-tests" -printf '"%f",' | sed 's/,$//;s/soda-//g';printf "]" | ||
| printf "[";find . -maxdepth 1 -mindepth 1 -type d -name "soda*" -not -name "soda-core" -not -name "soda-tests" -not -name "soda-trino" -printf '"%f",' | sed 's/,$//;s/soda-//g';printf "]" |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,56 @@ | ||
| # Trino JWT Testing | ||
|
|
||
| Use the `docker-compose.yml` and associated config files to launch a local Trino instance configured for JWT authentication. | ||
|
|
||
| To generate the keys, run from within the `local_instance` directory: | ||
|
|
||
| ``` | ||
| openssl genrsa -out jwt-private.pem 2048 | ||
| openssl rsa -in jwt-private.pem -pubout -out jwt-public.pem | ||
| keytool -genkeypair -alias trino -keyalg RSA -keystore trino-config/keystore.jks \ | ||
| -storepass changeit -keypass changeit -dname "CN=localhost" -validity 365 | ||
| cp jwt-public.pem trino-config/ | ||
| ``` | ||
|
|
||
| To start Trino: | ||
|
|
||
| ``` | ||
| docker compose up | ||
| ``` | ||
| If the following runs without error, your instance is up | ||
| ``` | ||
| curl -k https://localhost:8443/v1/info " | ||
| ``` | ||
|
|
||
|
|
||
|
|
||
|
|
||
| To generate a JWT token: | ||
|
|
||
| ``` | ||
| jwt -sign - \ | ||
| -alg RS256 \ | ||
| -key jwt-private.pem <<'EOF' | ||
| { | ||
| "sub": "test-user", | ||
| "iss": "local", | ||
| "aud": "trino" | ||
| } | ||
| EOF | ||
| ``` | ||
|
|
||
| If the following runs without error, your token is valid: | ||
| ``` | ||
| curl -k https://localhost:8443/v1/query -H "Authorization: Bearer {token}" | ||
| ``` | ||
|
|
||
|
|
||
| Copy the token into your .env file along with these env vars | ||
| ``` | ||
| TRINO_HOST="localhost" | ||
| TRINO_PORT=8443 | ||
| TRINO_CATALOG="system" | ||
| TRINO_JWT_TOKEN="{token}" | ||
| ``` | ||
|
|
||
| Uncomment and run the `real_jwt_token` test in `test_trino.py`. The test should pass. | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| services: | ||
| trino: | ||
| image: trinodb/trino:latest | ||
| ports: | ||
| - "8443:8443" | ||
| volumes: | ||
| - ./trino-config:/etc/trino |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| coordinator=true | ||
| node-scheduler.include-coordinator=true | ||
| http-server.http.enabled=false | ||
| http-server.https.enabled=true | ||
| http-server.https.port=8443 | ||
| http-server.https.keystore.path=/etc/trino/keystore.jks | ||
| http-server.https.keystore.key=changeit | ||
| http-server.authentication.type=jwt | ||
| http-server.authentication.jwt.key-file=/etc/trino/jwt-public.pem | ||
| discovery.uri=https://localhost:8443 | ||
| internal-communication.shared-secret=dev-secret | ||
| internal-communication.https.required=true | ||
| http-server.process-forwarded=true |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,4 @@ | ||
| -server | ||
| -Xmx1G | ||
| -XX:+UseG1GC | ||
| -XX:+ExitOnOutOfMemoryError |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,2 @@ | ||
| node.environment=local | ||
| node.data-dir=/data/trino |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,26 @@ | ||
| [project] | ||
| name = "soda-trino" | ||
| version = "4.0.7rc0" | ||
| description = "Soda Trino V4" | ||
| requires-python = ">=3.10" | ||
| license = {text = "Proprietary"} | ||
| authors = [ | ||
| {name = "Soda Data N.V.", email = "info@soda.io"} | ||
| ] | ||
| dependencies = [ | ||
| "soda-core==4.0.7rc0", | ||
| "trino>=0.336.0" | ||
| ] | ||
|
|
||
| [project.entry-points."soda.plugins.data_source.trino"] | ||
| TrinoDataSourceImpl = "soda_trino.common.data_sources.trino_data_source:TrinoDataSourceImpl" | ||
|
|
||
| [tool.uv.sources] | ||
| soda-core = { workspace = true } | ||
|
|
||
| [build-system] | ||
| requires = ["setuptools>=45", "wheel"] | ||
| build-backend = "setuptools.build_meta" | ||
|
|
||
| [tool.setuptools] | ||
| package-dir = {"" = "src"} |
35 changes: 35 additions & 0 deletions
35
soda-trino/src/soda_trino/common/data_sources/trino_data_source.py
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,35 @@ | ||
| import logging | ||
| from typing import Optional | ||
|
|
||
| from soda_core.common.data_source_connection import DataSourceConnection | ||
| from soda_core.common.data_source_impl import DataSourceImpl | ||
| from soda_core.common.logging_constants import soda_logger | ||
| from soda_core.common.sql_dialect import SqlDialect | ||
| from soda_trino.common.data_sources.trino_data_source_connection import ( | ||
| TrinoDataSource as TrinoDataSourceModel, | ||
| ) | ||
| from soda_trino.common.data_sources.trino_data_source_connection import ( | ||
| TrinoDataSourceConnection, | ||
| ) | ||
|
|
||
| logger: logging.Logger = soda_logger | ||
|
|
||
|
|
||
| # placeholder file | ||
|
|
||
|
|
||
| class TrinoDataSourceImpl(DataSourceImpl, model_class=TrinoDataSourceModel): | ||
| def __init__(self, data_source_model: TrinoDataSourceModel, connection: Optional[DataSourceConnection] = None): | ||
| super().__init__(data_source_model=data_source_model, connection=connection) | ||
|
|
||
| def _create_sql_dialect(self) -> SqlDialect: | ||
| return TrinoSqlDialect(data_source_impl=self) | ||
|
|
||
| def _create_data_source_connection(self) -> DataSourceConnection: | ||
| return TrinoDataSourceConnection( | ||
| name=self.data_source_model.name, connection_properties=self.data_source_model.connection_properties | ||
| ) | ||
|
|
||
|
|
||
| class TrinoSqlDialect(SqlDialect): | ||
| pass |
141 changes: 141 additions & 0 deletions
141
soda-trino/src/soda_trino/common/data_sources/trino_data_source_connection.py
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,141 @@ | ||
| from __future__ import annotations | ||
|
|
||
| import logging | ||
| from abc import ABC | ||
| from typing import Literal, Optional, Union | ||
|
|
||
| import requests | ||
| import trino | ||
| from pydantic import BaseModel, Field, IPvAnyAddress, SecretStr | ||
| from soda_core.common.logging_constants import soda_logger | ||
|
|
||
| logger: logging.Logger = soda_logger | ||
|
|
||
|
|
||
| from soda_core.common.data_source_connection import DataSourceConnection | ||
| from soda_core.model.data_source.data_source import DataSourceBase | ||
| from soda_core.model.data_source.data_source_connection_properties import ( | ||
| DataSourceConnectionProperties, | ||
| ) | ||
|
|
||
|
|
||
| class TrinoConnectionProperties(DataSourceConnectionProperties): | ||
| host: Union[str, IPvAnyAddress] = Field(..., description="Database host (hostname or IP address)") | ||
| catalog: str = Field(..., description="Database catalog") | ||
| port: int = Field(5432, description="Database port (1-65535)", ge=1, le=65535) | ||
| http_scheme: Literal["https", "http"] = Field("https", description="HTTP scheme") | ||
| http_headers: Optional[dict[str, str]] = Field(None, description="HTTP headers") | ||
| source: str = Field("soda-core", description="Trino-internal label for this connection") | ||
| client_tags: Optional[list[str]] = Field(None, description="Trino-internal tags as list of strings.") | ||
| verify: Optional[bool] = Field(True, description="Verify SSL certificate") | ||
|
|
||
|
|
||
| class TrinoUserPasswordConnectionProperties(TrinoConnectionProperties): | ||
| # Default if authType not specified | ||
| auth_type: Optional[Literal["BasicAuthentication"]] = Field( | ||
| "BasicAuthentication", description="Authentication type" | ||
| ) | ||
| user: str = Field(..., description="Database username") | ||
| password: SecretStr = Field(..., description="Database password") | ||
|
|
||
|
|
||
| class TrinoJWTConnectionProperties(TrinoConnectionProperties): | ||
| auth_type: Literal["JWTAuthentication"] = Field(description="Authentication type") | ||
| access_token: SecretStr = Field(..., description="JWT access token") | ||
| user: Optional[str] = Field(None, description="Database username") | ||
|
|
||
|
|
||
| class TrinoOauthPayload(BaseModel): | ||
| token_url: str = Field(..., description="Token URL") | ||
| client_id: str = Field(..., description="Client ID") | ||
| client_secret: SecretStr = Field(..., description="Client secret") | ||
| scope: Optional[str] = Field(None, description="Scope") | ||
| grant_type: Optional[str] = Field("client_credentials", description="Grant type") | ||
|
|
||
|
|
||
| class TrinoOauthConnectionProperties(TrinoConnectionProperties): | ||
| auth_type: Literal["OAuth2ClientCredentialsAuthentication"] = Field(description="Authentication type") | ||
| oauth: TrinoOauthPayload = Field(..., description="OAuth configuration") | ||
| user: Optional[str] = Field(None, description="Database username") | ||
|
|
||
|
|
||
| class TrinoNoAuthenticationConnectionProperties(TrinoConnectionProperties): | ||
| auth_type: Literal["NoAuthentication"] = Field(description="Authentication type") | ||
paulteehan marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
|
||
|
|
||
| class TrinoDataSource(DataSourceBase, ABC): | ||
| type: Literal["trino"] = Field("trino") | ||
|
|
||
| connection_properties: Union[ | ||
| TrinoUserPasswordConnectionProperties, | ||
| TrinoJWTConnectionProperties, | ||
| TrinoOauthConnectionProperties, | ||
| TrinoNoAuthenticationConnectionProperties, | ||
| ] = Field(..., alias="connection", description="Trino connection configuration") | ||
|
|
||
|
|
||
| class TrinoDataSourceConnection(DataSourceConnection): | ||
| def __init__(self, name: str, connection_properties: DataSourceConnectionProperties): | ||
| super().__init__(name, connection_properties) | ||
|
|
||
| def _create_connection( | ||
| self, | ||
| config: TrinoConnectionProperties, | ||
| ): | ||
| if isinstance(config, TrinoUserPasswordConnectionProperties): | ||
| self.auth = trino.auth.BasicAuthentication(config.user, config.password.get_secret_value()) | ||
| elif isinstance(config, TrinoJWTConnectionProperties): | ||
| self.auth = trino.auth.JWTAuthentication(token=config.access_token.get_secret_value()) | ||
| elif isinstance(config, TrinoOauthConnectionProperties): | ||
| # Use OAuth to get a JWT access token | ||
| # Note, this is a JWTAuthentication flow, not to be confused with OAuth2Authentication which launches a web browser | ||
| token = self._exchange_oauth_for_access_token(config.oauth) | ||
| self.auth = trino.auth.JWTAuthentication(token=token) | ||
| elif isinstance(config, TrinoNoAuthenticationConnectionProperties): | ||
| self.auth = None | ||
| else: | ||
| raise ValueError(f"Unrecognized Trino authentication type: {config.authType}") | ||
|
|
||
| connect_kwargs = { | ||
| "host": config.host, | ||
| "port": config.port, | ||
| "catalog": config.catalog, | ||
| "http_scheme": config.http_scheme, | ||
| "auth": self.auth, | ||
| "http_headers": config.http_headers, | ||
| "source": config.source, | ||
| "client_tags": config.client_tags, | ||
| "verify": config.verify, | ||
| } | ||
|
|
||
| if getattr(config, "user", None): | ||
| connect_kwargs["user"] = config.user | ||
| return trino.dbapi.connect(**connect_kwargs) | ||
|
|
||
| def _exchange_oauth_for_access_token(self, oauth: TrinoOauthPayload) -> str: | ||
| if not oauth: | ||
| raise ValueError("OAuth configuration is required for OAuth2ClientCredentialsAuthentication") | ||
|
|
||
| token_url = oauth.token_url | ||
| client_id = oauth.client_id | ||
| client_secret = oauth.client_secret.get_secret_value() | ||
| scope = oauth.scope | ||
| grant_type = oauth.grant_type | ||
|
|
||
| # OAuth credentials | ||
| payload = {"client_id": client_id, "client_secret": client_secret, "grant_type": grant_type} | ||
| if scope: | ||
| payload["scope"] = scope | ||
| response = requests.post(token_url, data=payload) | ||
| if response.status_code != 200: | ||
| raise ValueError(f"OAuth request failed: {response.status_code} {response.text}") | ||
|
|
||
| response_json = response.json() | ||
| expires_in = response_json.get("expires_in", 0) | ||
| scope = response_json.get("scope", "") | ||
| access_token = response_json["access_token"] | ||
| if access_token: | ||
| logger.info(f"Obtained OAuth access token, expires in '{expires_in}' seconds, granted scopes: '{scope}'") | ||
| return access_token | ||
| else: | ||
| raise ValueError(f"OAuth request did not return an access token: {response.status_code} {response.text}") | ||
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.