Commit 707d742

Merge pull request #195 from ZAK1504/feature/bigquery-adapter
Add BigQuery adapter implementation (Fixes #117)
2 parents 2157bc0 + 0b8eb90 commit 707d742

9 files changed: +804 -2 lines changed

Lines changed: 109 additions & 0 deletions
@@ -0,0 +1,109 @@
+---
+sidebar_position: 7
+---
+
+# BigQuery
+
+`intugle` integrates with Google BigQuery, allowing you to read data from your datasets for profiling, analysis, and data product generation.
+
+## Installation
+
+To use `intugle` with BigQuery, you must install the optional dependencies:
+
+```bash
+pip install "intugle[bigquery]"
+```
+
+This installs the `google-cloud-bigquery` library.
+
+## Configuration
+
+To connect to your BigQuery project, you must provide connection credentials in a `profiles.yml` file at the root of your project. The adapter looks for a top-level `bigquery:` key.
+
+**Example `profiles.yml`:**
+
+```yaml
+bigquery:
+  name: my_bigquery_source
+  project_id: <your_gcp_project_id>
+  dataset: <your_dataset_name>
+  location: US # Optional, defaults to US
+  credentials_path: /path/to/service-account-credentials.json # Optional
+```
+
+### Authentication Options
+
+1. **Service Account JSON File** (Recommended for production):
+   - Set `credentials_path` to your service account JSON file.
+   - The service account needs **BigQuery Data Viewer** and **BigQuery Job User** roles.
+
+2. **Application Default Credentials** (For development):
+   - Omit `credentials_path`.
+   - Uses `gcloud auth application-default login`.
+   - Or uses environment variable `GOOGLE_APPLICATION_CREDENTIALS`.
+
+## Usage
+
+### Reading Data from BigQuery
+
+To include a BigQuery table or view in your `SemanticModel`, define it in your input dictionary with `type: "bigquery"` and use the `identifier` key to specify the table name.
+
+:::caution Important
+The dictionary key for your dataset (e.g., `"my_table"`) must exactly match the table name specified in the `identifier`.
+:::
+
+```python
+from intugle import SemanticModel
+
+datasets = {
+    "my_table": {
+        "identifier": "my_table", # Must match the key above
+        "type": "bigquery"
+    },
+    "another_view": {
+        "identifier": "another_view",
+        "type": "bigquery"
+    }
+}
+
+# Initialize the semantic model
+sm = SemanticModel(datasets, domain="Analytics")
+
+# Build the model as usual
+sm.build()
+```
+
+### Materializing Data Products
+
+When you use the `DataProduct` class with a BigQuery connection, the resulting data product can be materialized as a new **table** or **view** directly within your target dataset.
+
+:::caution
+**Beta Feature:** The DataProduct feature for BigQuery is currently in beta. If you encounter any issues, please raise them on our [GitHub issues page](https://github.com/Intugle/data-tools/issues).
+:::
+
+```python
+from intugle import DataProduct
+
+etl_model = {
+    "name": "top_users",
+    "fields": [
+        {"id": "users.id", "name": "user_id"},
+        {"id": "users.name", "name": "user_name"},
+    ]
+}
+
+dp = DataProduct()
+
+# Materialize as a view (default)
+dp.build(etl_model, materialize="view")
+
+# Materialize as a table
+dp.build(etl_model, materialize="table")
+```
+
+:::info Required Permissions
+To successfully materialise data products, the Service Account or User must have the following privileges:
+* `roles/bigquery.dataViewer` - Read table data
+* `roles/bigquery.jobUser` - Run queries
+* `roles/bigquery.dataEditor` - Create tables and views
+:::
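
Reviewer note: the new docs page states two rules that are easy to get wrong: `location` defaults to `US` and `credentials_path` may be omitted, and each dataset key must equal its `identifier`. The following is a minimal sketch of how those rules could be checked up front. `BigQueryProfile`, `parse_profile`, and `mismatched_identifiers` are hypothetical names, not part of `intugle`. Treating `name`, `project_id`, and `dataset` as required is an assumption based on the page marking only the other two keys as optional.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class BigQueryProfile:
    """Hypothetical mirror of the documented `bigquery:` profile keys."""

    name: str
    project_id: str
    dataset: str
    location: str = "US"  # documented default
    credentials_path: Optional[str] = None  # None: fall back to Application Default Credentials


def parse_profile(profiles: dict[str, Any]) -> BigQueryProfile:
    """Build a profile from the already-parsed contents of profiles.yml."""
    if "bigquery" not in profiles:
        raise ValueError("profiles.yml has no top-level 'bigquery' key")
    section = profiles["bigquery"]
    # Assumption: these three keys are mandatory.
    missing = [key for key in ("name", "project_id", "dataset") if key not in section]
    if missing:
        raise ValueError(f"missing required keys: {missing}")
    return BigQueryProfile(**section)


def mismatched_identifiers(datasets: dict[str, dict[str, Any]]) -> list[str]:
    """Return the dataset keys whose `identifier` does not equal the key."""
    return [key for key, spec in datasets.items() if spec.get("identifier") != key]
```

Running `mismatched_identifiers(datasets)` before constructing a `SemanticModel` would surface a key/identifier mismatch with a clear message instead of a lookup failure later on.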

docsite/docs/connectors/implementing-a-connector.md

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 ---
-sidebar_position: 7
+sidebar_position: 8
 ---
 
 # Implementing a Connector

pyproject.toml

Lines changed: 4 additions & 0 deletions
@@ -69,6 +69,10 @@ postgres = [
     "asyncpg>=0.30.0",
     "sqlglot>=27.20.0",
 ]
+bigquery = [
+    "google-cloud-bigquery>=3.11.0",
+    "sqlglot>=27.20.0",
+]
 sqlserver = [
     "mssql-python>=0.13.1",
     "sqlglot>=27.20.0",
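
Reviewer note: because `google-cloud-bigquery` is now an optional extra, code that touches the adapter may want to fail with an install hint rather than a bare `ImportError`. The sketch below shows one way to probe for the dependency using only the standard library. `has_module` is a hypothetical helper, not something this commit adds.

```python
import importlib.util


def has_module(dotted_name: str) -> bool:
    """Report whether a module can be located without importing the module itself."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (for example `google`) is itself missing.
        return False


if __name__ == "__main__":
    if not has_module("google.cloud.bigquery"):
        print('BigQuery support is not installed; run: pip install "intugle[bigquery]"')
```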

src/intugle/adapters/factory.py

Lines changed: 1 addition & 0 deletions
@@ -36,6 +36,7 @@ def is_safe_plugin_name(plugin_name: str) -> bool:
     "intugle.adapters.types.mariadb.mariadb",
     "intugle.adapters.types.sqlserver.sqlserver",
     "intugle.adapters.types.sqlite.sqlite",
+    "intugle.adapters.types.bigquery.bigquery",
     "intugle.adapters.types.oracle.oracle",
 ]
 
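
Reviewer note: this hunk only shows the new entry in a list of adapter module paths. How `factory.py` consumes that list is not visible here. For readers unfamiliar with the general pattern, the following is a generic sketch of checking a name against an allowlist before a dynamic import. `ALLOWED_PLUGINS`, `is_allowed`, and `load_plugin` are hypothetical, and the standard-library module names stand in for the adapter paths so the example runs anywhere.

```python
import importlib
from types import ModuleType

# Stand-ins for adapter module paths such as "intugle.adapters.types.bigquery.bigquery".
ALLOWED_PLUGINS = frozenset({"json", "csv"})


def is_allowed(plugin_name: str) -> bool:
    """Accept only names that appear verbatim in the allowlist."""
    return plugin_name in ALLOWED_PLUGINS


def load_plugin(plugin_name: str) -> ModuleType:
    """Import a plugin module, refusing anything outside the allowlist."""
    if not is_allowed(plugin_name):
        raise ValueError(f"plugin {plugin_name!r} is not in the allowlist")
    return importlib.import_module(plugin_name)
```

The point of the pattern is that an adapter which is installed but not listed would never be imported, which is why a new adapter needs a list entry like the one added here.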

src/intugle/adapters/models.py

Lines changed: 2 additions & 1 deletion
@@ -19,6 +19,7 @@ def get_dataset_data_type() -> type:
 if TYPE_CHECKING:
     import pandas as pd
 
+    from intugle.adapters.types.bigquery.models import BigQueryConfig
     from intugle.adapters.types.databricks.models import DatabricksConfig
     from intugle.adapters.types.duckdb.models import DuckdbConfig
     from intugle.adapters.types.mariadb.models import MariaDBConfig
@@ -28,7 +29,7 @@ def get_dataset_data_type() -> type:
     from intugle.adapters.types.sqlite.models import SqliteConfig
     from intugle.adapters.types.sqlserver.models import SQLServerConfig
 
-    DataSetData = pd.DataFrame | DuckdbConfig | SnowflakeConfig | DatabricksConfig | PostgresConfig | SQLServerConfig | SqliteConfig | OracleConfig | MariaDBConfig
+    DataSetData = pd.DataFrame | DuckdbConfig | SnowflakeConfig | DatabricksConfig | PostgresConfig | SQLServerConfig | SqliteConfig | OracleConfig | MariaDBConfig | BigQueryConfig
 else:
     # At runtime, this is dynamically determined
     DataSetData = Any
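
Reviewer note: the change above extends a type alias that exists only for static analysis. At runtime the alias is `Any`, so the optional adapter packages are never imported just for typing. Below is a minimal, self-contained sketch of the same pattern. `Number` and `describe` are hypothetical, and standard-library types stand in for the adapter config classes.

```python
from typing import TYPE_CHECKING, Any

if TYPE_CHECKING:
    # Seen only by type checkers; never executed at runtime.
    from decimal import Decimal
    from fractions import Fraction

    Number = Decimal | Fraction
else:
    # At runtime the alias degrades to Any, so nothing above is imported.
    Number = Any


def describe(value: "Number") -> str:
    """Return the runtime type name of a value annotated with the alias."""
    return type(value).__name__
```

One consequence worth keeping in mind during review: adding `BigQueryConfig` to the union improves editor and type-checker feedback but has no effect on runtime validation.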

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+# BigQuery adapter for Intugle
