|
| 1 | +--- |
| 2 | +sidebar_position: 7 |
| 3 | +--- |
| 4 | + |
| 5 | +# BigQuery |
| 6 | + |
| 7 | +`intugle` integrates with Google BigQuery, allowing you to read data from your datasets for profiling, analysis, and data product generation. |
| 8 | + |
| 9 | +## Installation |
| 10 | + |
| 11 | +To use `intugle` with BigQuery, you must install the optional dependencies: |
| 12 | + |
| 13 | +```bash |
| 14 | +pip install "intugle[bigquery]" |
| 15 | +``` |
| 16 | + |
| 17 | +This installs the `google-cloud-bigquery` library. |
| 18 | + |
| 19 | +## Configuration |
| 20 | + |
| 21 | +To connect to your BigQuery project, you must provide connection credentials in a `profiles.yml` file at the root of your project. The adapter looks for a top-level `bigquery:` key. |
| 22 | + |
| 23 | +**Example `profiles.yml`:** |
| 24 | + |
| 25 | +```yaml |
| 26 | +bigquery: |
| 27 | + name: my_bigquery_source |
| 28 | + project_id: <your_gcp_project_id> |
| 29 | + dataset: <your_dataset_name> |
| 30 | + location: US # Optional, defaults to US |
| 31 | + credentials_path: /path/to/service-account-credentials.json # Optional |
| 32 | +``` |
| 33 | +
|
| 34 | +### Authentication Options |
| 35 | +
|
| 36 | +1. **Service Account JSON File** (Recommended for production): |
| 37 | + - Set `credentials_path` to your service account JSON file. |
| 38 | + - The service account needs the **BigQuery Data Viewer** and **BigQuery Job User** roles. |
| 39 | + |
| 40 | +2. **Application Default Credentials** (For development): |
| 41 | + - Omit `credentials_path`. |
| 42 | + - Uses the credentials created by `gcloud auth application-default login`. |
| 43 | + - Alternatively, uses the key file that the `GOOGLE_APPLICATION_CREDENTIALS` environment variable points to. |
| 44 | + |
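As a rough illustration of the configuration rules above, the following sketch checks a parsed `profiles.yml` mapping before any connection is attempted. The helper `normalize_bigquery_profile` is hypothetical and not part of `intugle`. Treating `name`, `project_id`, and `dataset` as required is an assumption, based only on the example marking the other two keys as optional.

```python
from typing import Any, Dict

# Assumption: these keys are required because the example profile
# marks only `location` and `credentials_path` as optional.
REQUIRED_KEYS = ("name", "project_id", "dataset")


def normalize_bigquery_profile(profiles: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: validate the `bigquery:` block and fill in defaults."""
    if "bigquery" not in profiles:
        raise KeyError("profiles.yml has no top-level 'bigquery' key")

    # Copy so the caller's mapping is not mutated.
    profile = dict(profiles["bigquery"])

    missing = [key for key in REQUIRED_KEYS if not profile.get(key)]
    if missing:
        raise ValueError(f"bigquery profile is missing: {', '.join(missing)}")

    # `location` defaults to US, as stated in the example profile.
    profile.setdefault("location", "US")
    # No `credentials_path` means Application Default Credentials are used.
    profile.setdefault("credentials_path", None)
    return profile
```

Passing the result of `yaml.safe_load` on `profiles.yml` to this function would surface a missing key early, rather than as a connection error later.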
| 45 | +## Usage |
| 46 | + |
| 47 | +### Reading Data from BigQuery |
| 48 | + |
| 49 | +To include a BigQuery table or view in your `SemanticModel`, define it in your input dictionary with `type: "bigquery"` and use the `identifier` key to specify the table name. |
| 50 | + |
| 51 | +:::caution Important |
| 52 | +The dictionary key for your dataset (e.g., `"my_table"`) must exactly match the table name specified in the `identifier`. |
| 53 | +::: |
| 54 | + |
| 55 | +```python |
| 56 | +from intugle import SemanticModel |
| 57 | +
|
| 58 | +datasets = { |
| 59 | + "my_table": { |
| 60 | + "identifier": "my_table", # Must match the key above |
| 61 | + "type": "bigquery" |
| 62 | + }, |
| 63 | + "another_view": { |
| 64 | + "identifier": "another_view", |
| 65 | + "type": "bigquery" |
| 66 | + } |
| 67 | +} |
| 68 | +
|
| 69 | +# Initialize the semantic model |
| 70 | +sm = SemanticModel(datasets, domain="Analytics") |
| 71 | +
|
| 72 | +# Build the model as usual |
| 73 | +sm.build() |
| 74 | +``` |
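The key/identifier rule from the caution above can be checked before the model is constructed. This is a minimal sketch. The function `mismatched_identifiers` is a hypothetical helper, not an `intugle` API, and it assumes the input dictionary has the shape shown in the example.

```python
from typing import Any, Dict, List


def mismatched_identifiers(datasets: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return the keys of BigQuery entries whose `identifier` differs from the key."""
    return [
        key
        for key, spec in datasets.items()
        if spec.get("type") == "bigquery" and spec.get("identifier") != key
    ]


datasets = {
    "my_table": {"identifier": "my_table", "type": "bigquery"},
    "another_view": {"identifier": "anotherview", "type": "bigquery"},  # typo on purpose
}

bad_keys = mismatched_identifiers(datasets)
if bad_keys:
    print(f"Fix these entries before building the model: {bad_keys}")
```

Running such a check before `SemanticModel(...)` keeps a mistyped identifier from surfacing only once the build is under way.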
| 75 | + |
| 76 | +### Materializing Data Products |
| 77 | + |
| 78 | +When you use the `DataProduct` class with a BigQuery connection, the resulting data product can be materialized as a new **table** or **view** directly within your target dataset. |
| 79 | + |
| 80 | +:::caution |
| 81 | +**Beta Feature:** The DataProduct feature for BigQuery is currently in beta. If you encounter any issues, please raise them on our [GitHub issues page](https://github.com/Intugle/data-tools/issues). |
| 82 | +::: |
| 83 | + |
| 84 | +```python |
| 85 | +from intugle import DataProduct |
| 86 | +
|
| 87 | +etl_model = { |
| 88 | + "name": "top_users", |
| 89 | + "fields": [ |
| 90 | + {"id": "users.id", "name": "user_id"}, |
| 91 | + {"id": "users.name", "name": "user_name"}, |
| 92 | + ] |
| 93 | +} |
| 94 | +
|
| 95 | +dp = DataProduct() |
| 96 | +
|
| 97 | +# Materialize as a view (default) |
| 98 | +dp.build(etl_model, materialize="view") |
| 99 | +
|
| 100 | +# Materialize as a table |
| 101 | +dp.build(etl_model, materialize="table") |
| 102 | +``` |
| 103 | + |
| 104 | +:::info Required Permissions |
| 105 | +To materialize data products, the service account or user must have the following roles: |
| 106 | +* `roles/bigquery.dataViewer` - Read table data |
| 107 | +* `roles/bigquery.jobUser` - Run queries |
| 108 | +* `roles/bigquery.dataEditor` - Create tables and views |
| 109 | +::: |