| 1 | +## Data Loader Module |
| 2 | + |
| 3 | +This module provides a framework for loading data from various external sources into DuckDB. It follows an abstract base class pattern to ensure consistent implementation across different data sources. |
| 4 | + |
| 5 | +### Building a New Data Loader |
| 6 | + |
| 7 | +The abstract class `ExternalDataLoader` defines the data loader interface. Each concrete implementation (e.g., `KustoDataLoader`, `MySQLDataLoader`) handles specific data source connections and data ingestion. |
| 8 | + |
| 9 | +To create a new data loader: |
| 10 | + |
| 11 | +1. Create a new class that inherits from `ExternalDataLoader` |
| 12 | +2. Implement the required abstract methods: |
| 13 | + - `list_params()`: Define required connection parameters |
| 14 | + - `__init__()`: Initialize connection to data source |
| 15 | + - `list_tables()`: List available tables/views |
| 16 | + - `ingest_data()`: Load data from source |
| 17 | + - `view_query_sample()`: Preview query results |
| 18 | + - `ingest_data_from_query()`: Load data from custom query |
| 19 | +3. Register the new class in `__init__.py` so that the front end can discover it automatically. |
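
The steps above can be sketched as follows. This is a minimal illustration, not the module's actual code: the `ExternalDataLoader` class shown here is a stand-in with assumed method signatures, the `sink` dict stands in for a DuckDB connection, and `InMemoryDataLoader` and `DATA_LOADERS` are hypothetical names. Check the real base class and `__init__.py` for the exact signatures and registration mechanism.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Any


class ExternalDataLoader(ABC):
    """Stand-in for the module's base class; real signatures may differ."""

    @staticmethod
    @abstractmethod
    def list_params() -> list[dict[str, Any]]: ...

    @abstractmethod
    def __init__(self, params: dict[str, Any], sink: dict[str, list[dict]]): ...

    @abstractmethod
    def list_tables(self) -> list[dict[str, Any]]: ...

    @abstractmethod
    def ingest_data(self, table_name: str, name_as: str | None = None, size: int = 1000) -> None: ...

    @abstractmethod
    def view_query_sample(self, query: str) -> list[dict]: ...

    @abstractmethod
    def ingest_data_from_query(self, query: str, name_as: str) -> None: ...


class InMemoryDataLoader(ExternalDataLoader):
    """Toy loader whose 'external source' is a dict of table name -> rows."""

    @staticmethod
    def list_params() -> list[dict[str, Any]]:
        # Describe the connection parameters the UI should ask for.
        return [{"name": "tables", "type": "dict", "required": True}]

    def __init__(self, params: dict[str, Any], sink: dict[str, list[dict]]):
        if "tables" not in params:
            raise ValueError("missing required parameter: tables")
        self.tables = params["tables"]
        self.sink = sink  # assumption: stands in for a DuckDB connection

    def list_tables(self) -> list[dict[str, Any]]:
        return [{"name": n, "row_count": len(r)} for n, r in self.tables.items()]

    def ingest_data(self, table_name, name_as=None, size=1000) -> None:
        # Copy at most `size` rows into the target under the chosen name.
        self.sink[name_as or table_name] = self.tables[table_name][:size]

    def view_query_sample(self, query: str) -> list[dict]:
        # The "query language" of this toy source is just a table name.
        return self.tables[query.strip()][:10]

    def ingest_data_from_query(self, query: str, name_as: str) -> None:
        self.sink[name_as] = list(self.tables[query.strip()])


# Hypothetical registry; the real registration in __init__.py may look different.
DATA_LOADERS = {"in_memory": InMemoryDataLoader}

sink: dict[str, list[dict]] = {}
loader = DATA_LOADERS["in_memory"](
    {"tables": {"users": [{"id": 1}, {"id": 2}, {"id": 3}]}}, sink
)
loader.ingest_data("users", size=2)
loader.ingest_data_from_query("users", name_as="all_users")
```

A real loader would open a connection to the source in `__init__()` and write into DuckDB rather than a dict. The shape of the class stays the same.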
| 20 | + |
| 21 | +The UI automatically provides a query completion option that helps users generate queries for the given data loader, either from natural language or from partial queries. |
| 22 | + |
| 23 | +### Example Implementations |
| 24 | + |
| 25 | +- `KustoDataLoader`: Azure Data Explorer (Kusto) integration |
| 26 | +- `MySQLDataLoader`: MySQL database integration |
| 27 | + |
| 28 | +### Testing |
| 29 | + |
| 30 | +Ensure your implementation: |
| 31 | +- Handles connection errors gracefully |
| 32 | +- Properly sanitizes table names |
| 33 | +- Respects size limits for data ingestion |
| 34 | +- Returns consistent metadata format |
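
Two of the checks above, table-name sanitization and size limits, can be sketched as small helpers. This is one possible approach under assumed rules, not the module's actual behavior: the function names, the 63-character cap, and the 1,000,000-row ceiling are all hypothetical.

```python
import re


def sanitize_table_name(name: str, max_length: int = 63) -> str:
    """Reduce an arbitrary string to a conservative SQL identifier."""
    # Replace anything outside [A-Za-z0-9_] with an underscore.
    cleaned = re.sub(r"\W", "_", name.strip(), flags=re.ASCII)
    # Collapse runs of underscores and trim them from both ends.
    cleaned = re.sub(r"_+", "_", cleaned).strip("_")
    if not cleaned:
        cleaned = "table"  # fallback for empty or all-symbol input
    if cleaned[0].isdigit():
        cleaned = "_" + cleaned  # identifiers should not start with a digit
    return cleaned[:max_length].lower()


def clamp_row_limit(requested: int, max_rows: int = 1_000_000) -> int:
    """Keep a requested row count within [0, max_rows]."""
    return max(0, min(int(requested), max_rows))
```

Calling these helpers at the start of `ingest_data()` and `ingest_data_from_query()` keeps the checks in one place, so they are easy to unit-test before trying the loader in the front end.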
| 35 | + |
| 36 | +Launch the front-end and test the data loader. |