|
| 1 | +# LLM-Powered Sentiment Analysis |
| 2 | + |
| 3 | +QuantTradeAI can attach numeric sentiment scores to any dataset that includes a text column. The feature uses [LiteLLM](https://github.com/BerriAI/litellm) so you can swap providers and models without changing code. |
| 4 | + |
| 5 | +## Setup |
| 6 | + |
| 7 | +1. **Install QuantTradeAI (if not already installed)** |
| 8 | + ```bash |
| 9 | + poetry install |
| 10 | + ``` |
| 11 | +2. **Configure `features_config.yaml`** |
| 12 | + ```yaml |
| 13 | + sentiment: |
| 14 | + enabled: true |
| 15 | + provider: openai # e.g. openai, anthropic, huggingface, ollama |
| 16 | + model: gpt-3.5-turbo # model name for the chosen provider |
| 17 | + api_key_env_var: OPENAI_API_KEY |
| 18 | + extra: {} # optional LiteLLM params |
| 19 | + ``` |
| 20 | +3. **Set the API key** |
| 21 | + ```bash |
| 22 | + export OPENAI_API_KEY="sk-..." |
| 23 | + ``` |
| 24 | + |
| 25 | +### Switching Providers |
| 26 | +Change `provider`, `model`, and `api_key_env_var` in the YAML config. Example for Anthropic: |
| 27 | +```yaml |
| 28 | +sentiment: |
| 29 | + enabled: true |
| 30 | + provider: anthropic |
| 31 | + model: claude-instant-1 |
| 32 | + api_key_env_var: ANTHROPIC_API_KEY |
| 33 | +``` |
| 34 | +No code changes are required—update the YAML and corresponding environment variable. |
| 35 | +
|
| 36 | +## Usage |
| 37 | +
|
| 38 | +### Command Line |
| 39 | +```bash |
| 40 | +# with sentiment enabled in features_config.yaml |
| 41 | +export OPENAI_API_KEY="sk-..." |
| 42 | +poetry run quanttradeai train |
| 43 | +``` |
| 44 | +The training pipeline loads sentiment settings automatically and adds a `sentiment_score` column. |
| 45 | + |
| 46 | +### Python |
| 47 | +```python |
| 48 | +import pandas as pd |
| 49 | +from quanttradeai.data import DataProcessor |
| 50 | + |
| 51 | +# DataFrame must contain a 'text' column |
| 52 | +raw = pd.DataFrame({"text": ["Stock surges on earnings", "Lawsuit worries investors"]}) |
| 53 | +processor = DataProcessor("config/features_config.yaml") |
| 54 | +features = processor.process_data(raw) |
| 55 | +print(features[["text", "sentiment_score"]]) |
| 56 | +``` |
| 57 | + |
| 58 | +## Input & Output |
| 59 | +- **Input**: DataFrame with a `text` column |
| 60 | +- **Output**: New `sentiment_score` column with values in `[-1, 1]` |
| 61 | +- Added via the `generate_sentiment` step in the feature pipeline |
| 62 | + |
| 63 | +## Troubleshooting |
| 64 | +| Problem | Fix | |
| 65 | +|--------|-----| |
| 66 | +| `API key environment variable ... is not set` | Ensure the variable from `api_key_env_var` is exported | |
| 67 | +| Unsupported provider/model | Verify the provider/model pair is supported by LiteLLM | |
| 68 | +| `sentiment enabled but 'text' column not found` | Provide a `text` column in your input DataFrame | |
| 69 | +| Network/API errors | Check connectivity, credentials, or provider status | |
0 commit comments