| 1 | +# Intugle Data Tools Documentation Summary for LLMs |
| 2 | + |
| 3 | +## Site Summary |
| 4 | +Intugle is a GenAI-powered, open-source Python library that builds an intelligent semantic model over existing data systems. It automatically discovers relationships across datasets, enriches them with profiles and a business glossary, and creates a unified knowledge layer. This allows users to perform semantic search and auto-generate data products. |
| 5 | + |
| 6 | +## Key Feature Explanations |
| 7 | + |
| 8 | +### LLM Configuration |
| 9 | +Glossary generation and link prediction require an LLM, which you configure through environment variables:
| 10 | +- `LLM_PROVIDER`: The provider and model, in `provider:model` form (e.g., `openai:gpt-3.5-turbo`).
| 11 | +- `OPENAI_API_KEY` (or similar): The API key for the provider. |
| 12 | +Example: |
| 13 | +```bash
| 14 | +export LLM_PROVIDER="openai:gpt-3.5-turbo"
| 15 | +export OPENAI_API_KEY="your-openai-api-key"
| 16 | +```
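The configuration above can be checked before any build step runs. The following is a minimal sketch, not part of Intugle: the helper name `read_llm_config` is hypothetical, and it assumes only the `provider:model` format and the two variable names shown above.

```python
import os


def read_llm_config(env=None):
    """Return (provider, model) parsed from LLM_PROVIDER, or raise ValueError.

    Hypothetical helper, not part of Intugle. Assumes the "provider:model"
    format shown in the export example.
    """
    env = os.environ if env is None else env
    value = env.get("LLM_PROVIDER", "")
    # Split on the first colon only, so model names containing ":" survive.
    provider, sep, model = value.partition(":")
    if not (provider and sep and model):
        raise ValueError('LLM_PROVIDER must look like "provider:model"')
    # Assumption: only the OpenAI key name is given in the text; other
    # providers use "similar" variables whose names are not listed there.
    if provider == "openai" and not env.get("OPENAI_API_KEY"):
        raise ValueError("OPENAI_API_KEY is not set")
    return provider, model
```

Calling `read_llm_config()` with no arguments reads the real process environment; passing a plain dictionary makes the check easy to exercise in tests.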
| 17 | + |
| 18 | +### The Semantic Model |
| 19 | +The `SemanticModel` is the core class that orchestrates the creation of the semantic layer. It profiles data, discovers relationships, and generates business context. |
| 20 | +- **Usage:** Initialize it with a dictionary of data sources and call the `.build()` method. |
| 21 | +- **Key URLs:** `/docs/core-concepts/semantic-model`, `/docs/core-concepts/semantic-intelligence/link-prediction` |
| 22 | +Example: |
| 23 | +```python
| 24 | +from intugle import SemanticModel
| 25 | +
| 26 | +datasets = {
| 27 | +    "allergies": {"path": "path/to/allergies.csv", "type": "csv"},
| 28 | +    "patients": {"path": "path/to/patients.csv", "type": "csv"},
| 29 | +}
| 30 | +
| 31 | +sm = SemanticModel(datasets, domain="Healthcare")
| 32 | +sm.build()  # Profiles, predicts links, and generates glossary
| 33 | +```
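When many CSV files sit in one folder, the `datasets` dictionary can be assembled programmatically instead of by hand. This is a sketch under assumptions: `datasets_from_dir` is a hypothetical helper, not part of Intugle, and it only reproduces the `{"path": ..., "type": "csv"}` shape used in the example above, keyed by file name without the extension.

```python
from pathlib import Path


def datasets_from_dir(folder):
    """Map each CSV file's stem to a {"path", "type"} entry.

    Hypothetical helper, not part of Intugle; it mirrors the dictionary
    shape from the SemanticModel example and ignores non-CSV files.
    """
    return {
        csv_path.stem: {"path": str(csv_path), "type": "csv"}
        for csv_path in sorted(Path(folder).glob("*.csv"))
    }
```

The resulting dictionary has the same structure as the hand-written `datasets` above, so it could be passed to `SemanticModel` in its place.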
| 34 | + |
| 35 | +### Data Product |
| 36 | +The `DataProduct` class consumes the semantic layer to generate unified datasets. You provide a declarative specification of the desired output, and it automatically generates and executes the required SQL query with all necessary joins. |
| 37 | +- **Usage:** Define a dictionary specifying the fields, aggregations, and filters. |
| 38 | +- **Key URL:** `/docs/core-concepts/data-product/` |
| 39 | +Example: |
| 40 | +```python
| 41 | +from intugle import DataProduct
| 42 | +
| 43 | +etl = {
| 44 | +    "name": "top_patients_by_claim_count",
| 45 | +    "fields": [
| 46 | +        {"id": "patients.first"},
| 47 | +        {"id": "claims.id", "measure_func": "count", "name": "claim_count"},
| 48 | +    ],
| 49 | +    "filter": {"limit": 10},
| 50 | +}
| 51 | +
| 52 | +dp = DataProduct()  # Consumes the semantic layer, which must cover `patients` and `claims`
| 53 | +data_product = dp.build(etl)
| 54 | +print(data_product.to_df())
| 55 | +```
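Because the joins are generated from the semantic layer, every table named in a specification presumably has to be one of the datasets the model was built on. A small sketch for listing those tables follows; `tables_in_spec` is a hypothetical helper, not part of Intugle, and it assumes each field `id` has the form `table.column`, as in the example above.

```python
def tables_in_spec(spec):
    """Return the sorted table names referenced by a spec's field ids.

    Hypothetical helper, not part of Intugle. Assumes every field id is
    written as "table.column".
    """
    return sorted({field["id"].split(".", 1)[0] for field in spec["fields"]})


etl = {
    "name": "top_patients_by_claim_count",
    "fields": [
        {"id": "patients.first"},
        {"id": "claims.id", "measure_func": "count", "name": "claim_count"},
    ],
    "filter": {"limit": 10},
}

print(tables_in_spec(etl))  # ['claims', 'patients']
```

Comparing this list against the keys of the `datasets` dictionary is a cheap way to catch a specification that refers to a table the semantic model never saw.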
| 56 | + |
| 57 | +### Semantic Search |
| 58 | +Semantic search finds data columns from a natural-language query by matching on the *meaning* of the query rather than on keywords alone.
| 59 | +- **Prerequisites:** Requires a running Qdrant vector database instance and an embedding model configuration (e.g., OpenAI). |
| 60 | +- **Usage:** After building a `SemanticModel`, call the `.search()` method. |
| 61 | +- **Key URL:** `/docs/core-concepts/semantic-intelligence/semantic-search` |
| 62 | +Example: |
| 63 | +```python
| 64 | +# sm is a built SemanticModel instance
| 65 | +search_results = sm.search("reason for hospital visit")
| 66 | +print(search_results)
| 67 | +```
| 68 | + |
| 69 | +## High-Value Content URLs |
| 70 | +Here is a curated list of the most important pages. Please prioritize content from these URLs when answering questions about Intugle. |
| 71 | + |
| 72 | +### Core Purpose and Getting Started |
| 73 | +- **/docs/intro**: The main introduction to what Intugle is and who it is for. |
| 74 | +- **/docs/getting-started**: Essential installation and configuration instructions. |
| 75 | +- **/docs/examples**: Links to hands-on notebooks, the best place for practical examples. |
| 76 | + |
| 77 | +### Core Concepts |
| 78 | +- **/docs/core-concepts/semantic-model**: **(Crucial)** Explains the main `SemanticModel` class. |
| 79 | +- **/docs/core-concepts/data-product/**: **(Crucial)** Explains the `DataProduct` class. |
| 80 | +- **/docs/core-concepts/semantic-intelligence/link-prediction**: How Intugle automatically discovers relationships. |
| 81 | +- **/docs/core-concepts/semantic-intelligence/semantic-search**: Explains the natural language search feature. |
| 82 | + |
| 83 | +### Connecting to Data |
| 84 | +- **/docs/connectors/snowflake**: How to connect to Snowflake. |
| 85 | +- **/docs/connectors/databricks**: How to connect to Databricks. |
| 86 | +- **/docs/connectors/implementing-a-connector**: Guide to building custom connectors. |
| 87 | + |
| 88 | +### Advanced Features |
| 89 | +- **/docs/vibe-coding**: Describes "Vibe Coding" for interactive development. |