README.md: 32 additions & 0 deletions
@@ -15,6 +15,38 @@ By combining these features, Data Tools helps you move from a collection of sepa
- **Extensible Pipeline Architecture**: Easily add custom analysis steps to the pipeline.
- **DataFrame Agnostic**: Uses a factory pattern to seamlessly handle different dataframe types (e.g., pandas).

## Installation and Setup

### Installation

To install the library and its dependencies, run the following command:

```bash
pip install data_tools
```
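After installing, one way to confirm that the package is visible to the current interpreter is to query its metadata. This is a minimal sketch that assumes the distribution name matches the `data_tools` used in the `pip` command; `installed_version` is a hypothetical helper and not part of the library.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


# Assumption: the distribution is published under the name used with pip.
print(installed_version("data_tools"))
```

A version string indicates the install succeeded in this environment; `None` suggests the package went into a different interpreter or virtual environment.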
### LLM Configuration

This library uses LLMs for features such as Business Glossary Generation. It supports any LLM provider compatible with LangChain's `init_chat_model` function. To configure your provider, set the environment variables described below.

Set the `LLM_CONFIG` environment variable to the model you want to use, optionally prefixed with the provider, in the format `provider:model_name`. If the provider is omitted, the library attempts to infer it.

**For OpenAI:**

```bash
# Provider is optional
export LLM_CONFIG="gpt-4"
export OPENAI_API_KEY="your-super-secret-key"
```

**For Google GenAI:**

```bash
export LLM_CONFIG="google_genai:gemini-pro"
export GOOGLE_API_KEY="your-google-api-key"
```
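The configuration described in this section can be sanity-checked before running anything that calls an LLM. The following is a sketch only: `parse_llm_config`, `missing_settings`, and the `API_KEY_VARS` mapping are hypothetical helpers built from the two examples in this section, and they do not reflect how Data Tools itself reads these variables. The `openai` provider name is an assumption based on LangChain's naming.

```python
import os
from typing import List, Mapping, Optional, Tuple

# Hypothetical mapping, derived only from the two examples in this README.
API_KEY_VARS = {
    "openai": "OPENAI_API_KEY",
    "google_genai": "GOOGLE_API_KEY",
}


def parse_llm_config(value: str) -> Tuple[Optional[str], str]:
    """Split 'provider:model_name' into (provider, model_name).

    The provider is None when the value contains no colon.
    """
    provider, sep, model = value.partition(":")
    if not sep:
        return None, value.strip()
    return provider.strip(), model.strip()


def missing_settings(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of environment variables that appear to be unset."""
    config = env.get("LLM_CONFIG", "").strip()
    if not config:
        return ["LLM_CONFIG"]
    provider, _model = parse_llm_config(config)
    # Without an explicit provider, skip the key check instead of guessing.
    key_var = API_KEY_VARS.get(provider) if provider else None
    if key_var and not env.get(key_var):
        return [key_var]
    return []
```

Calling `missing_settings()` with no arguments checks the current process environment; an empty list means the variables this sketch knows about are set. When the provider is omitted (as in `gpt-4`), the sketch skips the API key check rather than guessing the provider.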
## Usage Examples

### Example 1: Automated Link Prediction (Primary Use Case)