`intugle` integrates with Snowflake, allowing you to read data from Snowflake tables and deploy your `SemanticModel` as a **Semantic View** in your Snowflake account.
## Installation
To use `intugle` with Snowflake, you must install the optional dependencies:
```bash
pip install "intugle[snowflake]"
```
This installs the `snowflake-snowpark-python` library.
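If you want to confirm the optional dependency installed correctly, a quick import check works (a minimal sketch):

```python
# Verify that the Snowpark dependency pulled in by intugle[snowflake] is importable
import snowflake.snowpark

print("snowflake-snowpark-python is available")
```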
## Configuration
The Snowflake adapter can connect using credentials from a `profiles.yml` file or automatically use an active session when running inside a Snowflake Notebook.
### Connecting from an External Environment
When running `intugle` outside of a Snowflake Notebook, you must provide connection credentials in a `profiles.yml` file at the root of your project. The adapter looks for a top-level `snowflake:` key.
**Example `profiles.yml`:**
```yaml
snowflake:
  type: snowflake
  account: <your_snowflake_account>
  user: <your_username>
  password: <your_password>
  role: <your_role>
  warehouse: <your_warehouse>
  database: <your_database>
  schema: <your_schema>
```
### Connecting from a Snowflake Notebook
When your code is executed within a Snowflake Notebook, the adapter automatically detects and uses the notebook's active Snowpark session. **No configuration is required.**
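Conceptually, the detection works like Snowpark's own active-session lookup. The sketch below illustrates that mechanism; it is not the adapter's actual implementation:

```python
from snowflake.snowpark.context import get_active_session

# Inside a Snowflake Notebook this returns the notebook's session;
# outside a notebook it raises, and the adapter falls back to profiles.yml.
session = get_active_session()
print(session.get_current_database())
```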
## Usage
### Reading Data from Snowflake
To include a Snowflake table in your `SemanticModel`, define it in your input dictionary with `type: "snowflake"` and use the `identifier` key to specify the table name.
:::caution Important
The dictionary key for your dataset (e.g., `"CUSTOMERS"`) must exactly match the table name specified in the `identifier`.
:::
```python
from intugle import SemanticModel

datasets = {
    "CUSTOMERS": {
        "identifier": "CUSTOMERS",  # Must match the key above
        "type": "snowflake"
    },
    "ORDERS": {
        "identifier": "ORDERS",  # Must match the key above
        "type": "snowflake"
    }
}

# Initialize the semantic model
sm = SemanticModel(datasets, domain="E-commerce")

# Build the model as usual
sm.build()
```
### Materializing Data Products
When you use the `DataProduct` class with a Snowflake connection, the resulting data product will be materialized as a new table directly within your Snowflake schema.
### Deploying the Semantic Model
Once your semantic model is built, you can deploy it to Snowflake using the `deploy()` method. This process performs two actions:
1. **Syncs Metadata:** It updates the comments on your physical Snowflake tables and columns with the business glossaries from your `intugle` model.
2. **Creates Semantic View:** It constructs and executes a `CREATE OR REPLACE SEMANTIC VIEW` statement in your target database and schema.
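A minimal sketch of the deployment call (assuming `deploy()` needs no arguments beyond the configured connection; check the API reference for connection-specific options):

```python
# Build the semantic model, then push it to Snowflake as a Semantic View
sm.build()
sm.deploy()
```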
:::note Required Permissions
To successfully deploy a semantic model, the Snowflake role you are using must have the following privileges:
* `USAGE` on the target database and schema.
* `CREATE SEMANTIC VIEW` on the target schema.
* `ALTER TABLE` permissions on the source tables to update their comments.
:::
:::tip Next Steps: Chat with your Data using Cortex Analyst
Now that you have deployed a Semantic View, you can use **Snowflake Cortex Analyst** to interact with your data using natural language. Cortex Analyst leverages the relationships and context defined in your Semantic View to answer questions without requiring you to write SQL.
To get started, navigate to **AI & ML -> Cortex Analyst** in the Snowflake UI and select your newly created view.
:::
---

*From `docsite/docs/core-concepts/data-product/index.md`:*

## Usage Example
Once the `SemanticModel` is built, you can use the `DataProduct` class to generate unified data products from the semantic layer. This allows you to select fields from across different tables, and `intugle` will automatically handle the joins and generate the final, unified dataset.
## Building a Data Product
To build a data product, you define a product specification that lists the fields you want, any transformations, and filters.
```python
from intugle import DataProduct

# ... (the product specification and build steps are elided here) ...

print(data_product.to_df())
print(data_product.sql_query)
```
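Since the full specification is elided above, here is an illustrative sketch only: the `name`/`fields` keys and the `table.column` field ids are assumptions, and it assumes `DataProduct().build(spec)` returns the built product, while the `to_df()`/`sql_query` usage follows the surviving lines:

```python
from intugle import DataProduct

# Hypothetical specification: select fields from two tables and let
# intugle generate the joins (spec keys are illustrative, not confirmed API).
spec = {
    "name": "customer_orders",
    "fields": [
        {"id": "customers.name", "name": "customer_name"},
        {"id": "orders.total", "name": "order_total"},
    ],
}

data_product = DataProduct().build(spec)

print(data_product.to_df())    # unified result as a DataFrame
print(data_product.sql_query)  # the SQL intugle generated
```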
:::info Materialization with Connectors
When using a database connector like **[Snowflake](../../connectors/snowflake)**, the `build()` method will materialize the data product as a new table directly within your connected database schema. For file-based sources, it is materialized as a view in an in-memory DuckDB database.
:::
The `DataProduct` class provides a powerful way to query your connected data without writing complex SQL manually. For more detailed operations, see the other guides in this section:
* **[Basic Operations](./basic-operations.md)**: Learn how to select, alias, and limit fields.
* **[Sorting](./sorting.md)**: See how to order your data products.
---

*From `docsite/docs/core-concepts/semantic-intelligence/semantic-model.md`:*
# Semantic Model
The `SemanticModel` is the core class in `intugle`. It orchestrates the entire process of profiling, link prediction, and glossary generation to build a unified semantic layer over your data.
## Initialization
You can initialize the `SemanticModel` in two ways, depending on your use case.
### Method 1: From a Dictionary (Recommended)
This is the simplest and most common method. You provide a dictionary where each key is a unique name for a dataset, and the value contains its configuration (like path and type).
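The full dictionary example is elided here; below is a minimal sketch for file-based sources (file names are illustrative; per the original text, `csv`, `parquet`, and `excel` paths can be local or remote URLs):

```python
from intugle import SemanticModel

# Each key names a dataset; each value points at its source file.
data_sources = {
    "customers": {"path": "data/customers.csv", "type": "csv"},
    "orders": {"path": "data/orders.csv", "type": "csv"},
}

sm = SemanticModel(data_input=data_sources, domain="e-commerce")
```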
:::info Connecting to Data Sources
While these examples use local CSV files, `intugle` can connect to various data sources. See our **[Connectors documentation](../../connectors/snowflake)** for details on specific integrations like Snowflake.
:::
### Method 2: From a List of DataSet Objects
For more advanced scenarios, you can initialize the `SemanticModel` with a list of pre-configured `DataSet` objects. This is useful if you have already instantiated `DataSet` objects for other purposes.
```python
from intugle import SemanticModel
from intugle.analysis.models import DataSet

# ... (creation of the DataSet objects is elided here) ...

# Initialize the SemanticModel with the list of objects
sm = SemanticModel([dataset_allergies, dataset_patients], domain="Healthcare")
```
The `domain` parameter is an optional but highly recommended string that gives context to the underlying AI models, helping them generate more relevant business glossary terms.