Skip to content

Commit 0749243

Browse files
author
GOESTERN-1035771
committed
docs: add LLM demos documentation; feat(examples): add runnable mock-safe demos
1 parent 41e2c08 commit 0749243

File tree

6 files changed

+386
-0
lines changed

6 files changed

+386
-0
lines changed

docs/LLM_DEMOS.md

Lines changed: 92 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,92 @@
1+
LLM Demos and Cache Documentation
2+
3+
Overview
4+
5+
This document describes the lightweight runnable demos added under `examples/`.
6+
They are intentionally safe to run in minimal environments: each demo tries to
7+
use the real library classes (OpenAIClassifier, FusionEnsemble, AutoFusion, etc.)
8+
and falls back to simple mocks when those classes or credentials are not
9+
available.
10+
11+
Files of interest (new/renamed)
12+
13+
- `examples/cache_usage_demo.py` — Demonstrates writing and discovering simple
14+
LLM cache JSON files. Writes a small mock cache under `cache/demo_llm_cache/`.
15+
16+
- `examples/ensemble_cache_interrupt_demo.py` — Creates small val/test sets and
17+
writes simple mock cache files under `cache/mock_ensemble/val` and
18+
`cache/mock_ensemble/test`.
19+
20+
- `examples/llm_cache_mock.py` — Existing example that exercises the
21+
`LLMPredictionCache` implementation. If the real `prediction_cache` module is
22+
available it will use it; otherwise it will raise at import-time. This demo
23+
is left as an integration-oriented example.
24+
25+
- `examples/minimal_precache_demo.py` — Minimal precache flow that tries to use
26+
a real LLM (OpenAI/DeepSeek) and falls back to `MockLLM`. It uses Fusion helper
27+
`_save_cached_llm_predictions` to write canonical cache JSON files.
28+
29+
- `examples/test_multilabel_autofusion.py` — Runnable multi-label AutoFusion
30+
demo; falls back to `MockAutoFusion` if `AutoFusionClassifier` is unavailable.
31+
32+
- `examples/test_singlelabel_ml.py` — Runnable single-label ML-only demo; uses
33+
`RoBERTaClassifier` if available, otherwise `MockML`.
34+
35+
- `examples/test_singlelabel_autofusion.py` — Runnable single-label AutoFusion
36+
demo with a safe fallback.
37+
38+
Why these demos exist
39+
40+
- Provide quick examples for contributors to run locally without needing
41+
expensive GPU access or API credentials.
42+
- Demonstrate cache file formats and helper functions for saving and
43+
discovering cached LLM predictions.
44+
- Provide reproducible scripted flows for CI smoke checks (syntax + import).
45+
46+
How to run (quick)
47+
48+
Run a single demo with Python:
49+
50+
```bash
51+
python examples/test_singlelabel_ml.py
52+
python examples/cache_usage_demo.py
53+
python examples/test_multilabel_autofusion.py
54+
```
55+
56+
Running under a minimal environment will activate mock fallbacks — this
57+
ensures the demos are useful even without model weights or API keys.
58+
59+
Cache helper summary
60+
61+
- `LLMPredictionCache` (in `textclassify/llm/prediction_cache.py`) provides
62+
low-level operations to store, find, and load cached predictions.
63+
- Fusion helpers (e.g. `_save_cached_llm_predictions`) save canonical JSON files
64+
used by the ensemble utilities. The demo scripts call these helpers when
65+
available and otherwise write simple JSON files with a `predictions` list.
66+
67+
Notes for maintainers
68+
69+
- Keep the demos import-safe: avoid executing heavy logic at module import time.
70+
Use `if __name__ == '__main__'` guards (already applied in the demos).
71+
- The demos intentionally keep output minimal and use `random` to generate
72+
deterministic-like behavior for quick inspection.
73+
- If you want a CI job that verifies the demos, add a basic step that runs
74+
`python -m py_compile examples/*.py` and optionally executes a small subset
75+
with `python -c 'import runpy; runpy.run_path("examples/test_singlelabel_ml.py")'`.
76+
77+
FAQ
78+
79+
Q: Do the demos require API keys?
80+
A: No — they fall back to mock implementations unless you configure real
81+
credentials and install optional dependencies.
82+
83+
Q: Where are cache files written?
84+
A: `cache/` under the repository root. Demo scripts use subfolders like
85+
`cache/demo_llm_cache` and `cache/mock_ensemble`.
86+
87+
Q: Should we commit cache files?
88+
A: No — cache files are runtime artifacts and should remain untracked. Add
89+
them to `.gitignore` if they are not already ignored.
90+
91+
If you'd like, I can also add a short CI job (GitHub Actions) that runs the
92+
syntax checks and executes one demo with mocks. Say the word and I'll add it.

examples/cache_usage_demo.py

Lines changed: 77 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,77 @@
1+
"""Cache Usage Demo
2+
3+
Lightweight, runnable demo that shows how cached LLM predictions
4+
can be discovered, inspected, and used by the fusion helpers.
5+
This demo uses safe mocks when optional dependencies (OpenAI/DeepSeek)
6+
are not available so it can run in minimal environments.
7+
"""
8+
9+
import os
10+
import sys
11+
from pathlib import Path
12+
import pandas as pd
13+
import json
14+
import random
15+
16+
project_root = Path(__file__).resolve().parent.parent
17+
sys.path.insert(0, str(project_root))
18+
19+
# Minimal Mock LLM and Fusion helpers for demo purposes
20+
class MockLLM:
21+
def __init__(self):
22+
self.provider = 'mock'
23+
24+
def predict(self, train_df=None, test_df=None, **kwargs):
25+
texts = list(test_df['text']) if test_df is not None else []
26+
preds = [[random.choice([0,1])] for _ in texts]
27+
# return object with .predictions and .metadata to match real classifiers
28+
return type('R', (), {'predictions': preds, 'metadata': {'provider':'mock'}})()
29+
30+
31+
def demo_cache_usage():
32+
print('Cache usage demo (mock)')
33+
# Tiny datasets
34+
train_df = pd.DataFrame({'text':['train a', 'train b'], 'label':[1,0]})
35+
val_df = pd.DataFrame({'text':['val a','val b'], 'label':[1,0]})
36+
test_df = pd.DataFrame({'text':['test a','test b'], 'label':[1,0]})
37+
38+
# Try to import FusionEnsemble and LLMPredictionCache; fall back to mocks
39+
try:
40+
from textclassify.ensemble.fusion import FusionEnsemble
41+
from textclassify.llm.prediction_cache import LLMPredictionCache
42+
print('Loaded real FusionEnsemble and LLMPredictionCache')
43+
except Exception:
44+
FusionEnsemble = None
45+
LLMPredictionCache = None
46+
print('Using mock behavior (FusionEnsemble not available)')
47+
48+
# Use mock LLM to produce predictions and save simple JSON cache files
49+
llm = MockLLM()
50+
val_res = llm.predict(train_df=train_df, test_df=val_df)
51+
test_res = llm.predict(train_df=train_df, test_df=test_df)
52+
53+
cache_dir = project_root / 'cache' / 'demo_llm_cache'
54+
os.makedirs(cache_dir, exist_ok=True)
55+
val_file = cache_dir / 'validation_predictions.json'
56+
test_file = cache_dir / 'test_predictions.json'
57+
58+
with open(val_file, 'w') as f:
59+
json.dump({'predictions': val_res.predictions, 'provider': llm.provider}, f)
60+
with open(test_file, 'w') as f:
61+
json.dump({'predictions': test_res.predictions, 'provider': llm.provider}, f)
62+
63+
print('Wrote demo cache files:')
64+
print(' ', val_file)
65+
print(' ', test_file)
66+
67+
# If FusionEnsemble is available we could show how to load these files;
68+
# otherwise, just print discovery info.
69+
if LLMPredictionCache is not None:
70+
cache = LLMPredictionCache(cache_dir=str(cache_dir), verbose=False)
71+
print('Cache stats:', cache.get_cache_stats())
72+
else:
73+
print('Cache discovery (mock):', [str(p) for p in cache_dir.glob('*.json')])
74+
75+
76+
if __name__ == '__main__':
77+
demo_cache_usage()
Lines changed: 61 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,61 @@
1+
"""Ensemble Cache Interrupt Demo
2+
3+
Creates small val/test sets and demonstrates saving LLM predictions to
4+
cache using Fusion utilities when available. Falls back to safe mocks.
5+
"""
6+
7+
import os
8+
import sys
9+
from pathlib import Path
10+
import pandas as pd
11+
import random
12+
13+
project_root = Path(__file__).resolve().parent.parent
14+
sys.path.insert(0, str(project_root))
15+
16+
class MockLLM:
17+
def __init__(self):
18+
self.provider = 'mock'
19+
20+
def predict(self, train_df=None, test_df=None, **kwargs):
21+
texts = list(test_df['text']) if test_df is not None else []
22+
preds = [[random.choice([0,1])] for _ in texts]
23+
return type('R', (), {'predictions': preds, 'metadata': {}})()
24+
25+
class MockML:
26+
def predict(self, df):
27+
return [0 for _ in range(len(df))]
28+
29+
30+
def main():
31+
print('Ensemble cache interrupt demo (mock)')
32+
df_val = pd.DataFrame({'text':[f'val {i}' for i in range(5)]})
33+
df_test = pd.DataFrame({'text':[f'test {i}' for i in range(5)]})
34+
35+
try:
36+
from textclassify.ensemble.fusion import FusionEnsemble
37+
fusion_available = True
38+
print('FusionEnsemble available')
39+
except Exception:
40+
fusion_available = False
41+
print('FusionEnsemble not available; using mocks')
42+
43+
llm = MockLLM()
44+
val_res = llm.predict(train_df=None, test_df=df_val)
45+
test_res = llm.predict(train_df=None, test_df=df_test)
46+
47+
cache_dir = project_root / 'cache' / 'mock_ensemble'
48+
os.makedirs(cache_dir / 'val', exist_ok=True)
49+
os.makedirs(cache_dir / 'test', exist_ok=True)
50+
51+
# Save simple JSON caches
52+
import json
53+
with open(cache_dir / 'val' / 'preds.json', 'w') as f:
54+
json.dump({'predictions': val_res.predictions, 'provider': 'mock'}, f)
55+
with open(cache_dir / 'test' / 'preds.json', 'w') as f:
56+
json.dump({'predictions': test_res.predictions, 'provider': 'mock'}, f)
57+
58+
print('Saved mock cache files under', cache_dir)
59+
60+
if __name__ == '__main__':
61+
main()
Lines changed: 54 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,54 @@
1+
"""Runnable demo for multi-label AutoFusion (safe fallback)
2+
3+
This demo attempts to construct a minimal multi-label AutoFusion pipeline.
4+
If the real `AutoFusionClassifier` is unavailable it simulates the flow with
5+
simple mocks so the script can run in minimal environments.
6+
"""
7+
8+
import os
9+
import sys
10+
from pathlib import Path
11+
import pandas as pd
12+
import random
13+
14+
project_root = Path(__file__).resolve().parent.parent
15+
sys.path.insert(0, str(project_root))
16+
17+
class MockAutoFusion:
18+
def __init__(self, config):
19+
self.config = config
20+
self.label_columns = config.get('label_columns', [])
21+
self.multi_label = config.get('multi_label', True)
22+
23+
def fit(self, df):
24+
print('MockAutoFusion.fit() called with', len(df), 'rows')
25+
26+
def predict(self, df):
27+
preds = [[random.choice([0,1]) for _ in self.label_columns] for _ in range(len(df))]
28+
return type('R', (), {'predictions': preds, 'metadata': {}})()
29+
30+
31+
def main():
32+
print('Multi-label AutoFusion demo')
33+
34+
# tiny sample multi-label dataset
35+
df = pd.DataFrame({'text':[f'sample {i}' for i in range(10)], 'labelA':[1,0,0,1,0,1,0,0,1,0], 'labelB':[0,1,0,0,1,0,0,1,0,1]})
36+
train_df = df.sample(n=6, random_state=42).reset_index(drop=True)
37+
test_df = df.drop(train_df.index).reset_index(drop=True)
38+
39+
config = {'label_columns': ['labelA','labelB'], 'multi_label': True}
40+
41+
try:
42+
from textclassify.ensemble.auto_fusion import AutoFusionClassifier
43+
print('Using real AutoFusionClassifier')
44+
clf = AutoFusionClassifier(config=config)
45+
except Exception:
46+
print('AutoFusionClassifier not available; using MockAutoFusion')
47+
clf = MockAutoFusion(config)
48+
49+
clf.fit(train_df)
50+
res = clf.predict(test_df)
51+
print('Predictions sample:', res.predictions[:3])
52+
53+
if __name__ == '__main__':
54+
main()
Lines changed: 52 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,52 @@
1+
"""Runnable demo for single-label AutoFusion (safe fallback)
2+
3+
Creates a tiny single-label dataset and demonstrates AutoFusion flow. Falls back
4+
to MockAutoFusion if the real class is not importable.
5+
"""
6+
7+
import os
8+
import sys
9+
from pathlib import Path
10+
import pandas as pd
11+
import random
12+
13+
project_root = Path(__file__).resolve().parent.parent
14+
sys.path.insert(0, str(project_root))
15+
16+
class MockAutoFusion:
17+
def __init__(self, config):
18+
self.config = config
19+
self.label_columns = config.get('label_columns', [])
20+
self.multi_label = False
21+
22+
def fit(self, df):
23+
print('MockAutoFusion.fit() called with', len(df), 'rows')
24+
25+
def predict(self, df):
26+
preds = [[random.choice([0,1]) for _ in self.label_columns] for _ in range(len(df))]
27+
return type('R', (), {'predictions': preds, 'metadata': {}})()
28+
29+
30+
def main():
31+
print('Single-label AutoFusion demo')
32+
33+
df = pd.DataFrame({'text':[f'sample {i}' for i in range(12)], 'label':[random.choice([0,1]) for _ in range(12)]})
34+
train_df = df.sample(n=8, random_state=42).reset_index(drop=True)
35+
test_df = df.drop(train_df.index).reset_index(drop=True)
36+
37+
config = {'label_columns': ['label'], 'multi_label': False}
38+
39+
try:
40+
from textclassify.ensemble.auto_fusion import AutoFusionClassifier
41+
print('Using real AutoFusionClassifier')
42+
clf = AutoFusionClassifier(config=config)
43+
except Exception:
44+
print('AutoFusionClassifier not available; using MockAutoFusion')
45+
clf = MockAutoFusion(config)
46+
47+
clf.fit(train_df)
48+
res = clf.predict(test_df)
49+
print('Predictions sample:', res.predictions[:5])
50+
51+
if __name__ == '__main__':
52+
main()

examples/test_singlelabel_ml.py

Lines changed: 50 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,50 @@
1+
"""Runnable demo for single-label ML-only classifier (safe fallback)
2+
3+
This demo trains a tiny RoBERTa-based ML classifier if available, otherwise
4+
it uses a MockML to simulate training and prediction.
5+
"""
6+
7+
import os
8+
import sys
9+
from pathlib import Path
10+
import pandas as pd
11+
import random
12+
13+
project_root = Path(__file__).resolve().parent.parent
14+
sys.path.insert(0, str(project_root))
15+
16+
class MockML:
17+
def __init__(self):
18+
self.model_name = 'mock'
19+
self.label_columns = ['label']
20+
21+
def fit(self, df):
22+
print('MockML.fit() called with', len(df), 'rows')
23+
24+
def predict(self, df):
25+
return type('R', (), {'predictions': [[random.choice([0,1])] for _ in range(len(df))], 'metadata': {}})()
26+
27+
28+
def main():
29+
print('Single-label ML demo')
30+
# tiny dataset
31+
df = pd.DataFrame({'text':[f'sample {i}' for i in range(30)], 'label':[random.choice([0,1]) for _ in range(30)]})
32+
train_df = df.sample(n=20, random_state=42).reset_index(drop=True)
33+
test_df = df.drop(train_df.index).reset_index(drop=True)
34+
35+
try:
36+
from textclassify.ml.roberta_classifier import RoBERTaClassifier
37+
from textclassify.core.types import ModelConfig, ModelType
38+
print('Using real RoBERTaClassifier')
39+
cfg = ModelConfig(model_name='roberta-base', model_type=ModelType.TRADITIONAL_ML, parameters={})
40+
clf = RoBERTaClassifier(config=cfg, text_column='text', label_columns=['label'], multi_label=False, auto_save_results=False)
41+
except Exception:
42+
print('RoBERTaClassifier not available; using MockML')
43+
clf = MockML()
44+
45+
clf.fit(train_df)
46+
res = clf.predict(test_df)
47+
print('Sample predictions:', res.predictions[:5])
48+
49+
if __name__ == '__main__':
50+
main()

0 commit comments

Comments
 (0)