
Commit 6618a49

Phase 2
1 parent 08acf69 commit 6618a49

File tree

18 files changed: +820 −9 lines


alert_historian/.env.example

Lines changed: 7 additions & 0 deletions
````diff
@@ -20,3 +20,10 @@ ALERT_HISTORIAN_FINDFIRST_USERNAME=jsmith
 ALERT_HISTORIAN_FINDFIRST_PASSWORD=test
 ALERT_HISTORIAN_SYNC_BATCH_SIZE=100
 ALERT_HISTORIAN_USE_DOMAIN_TAGS=true
+
+# Narrative engine (Phase 2). If ALERT_HISTORIAN_OPENAI_API_KEY is unset, narrative is skipped.
+ALERT_HISTORIAN_CHROMA_PATH=./artifacts/chroma
+ALERT_HISTORIAN_EMBEDDING_MODEL=text-embedding-3-small
+ALERT_HISTORIAN_OPENAI_API_KEY=
+ALERT_HISTORIAN_LLM_MODEL=gpt-4o-mini
+ALERT_HISTORIAN_CHRONICLE_PATH=./artifacts/chronicle.md
````

alert_historian/README.md

Lines changed: 11 additions & 0 deletions
````diff
@@ -10,6 +10,13 @@
 - Retry and classification policy for transient/permanent failures
 - Daily markdown report output
 
+## Phase 2: Narrative Engine
+
+- ChromaDB vector store for semantic retrieval of alert items
+- Evolving Chronicle (markdown timeline) maintained by LLM
+- Narrative Delta: links today's alerts to historical context in daily reports
+- Use `ALERT_HISTORIAN_OPENAI_API_KEY` to enable; `--no-narrative` to skip
+
 ## Quick start
 
 ```bash
@@ -28,6 +35,7 @@ python -m alert_historian ingest
 python -m alert_historian sync
 python -m alert_historian report
 python -m alert_historian run-once
+python -m alert_historian run-once --no-narrative  # skip narrative engine
 ```
 
 ## SonarQube local prep
@@ -47,12 +55,15 @@ Copy `.env.example` to `.env` and set:
 - `ALERT_HISTORIAN_FINDFIRST_BASE_URL`
 - `ALERT_HISTORIAN_FINDFIRST_USERNAME`
 - `ALERT_HISTORIAN_FINDFIRST_PASSWORD`
+- `ALERT_HISTORIAN_OPENAI_API_KEY` (optional) for narrative engine; when set, run-once produces enriched reports with Narrative Delta
 
 ## Output locations
 
 - Canonical artifacts: `./artifacts/canonical-<run_id>.json`
 - State DB: `./state/alert_historian.db`
 - Daily reports: `./reports/daily/YYYY-MM-DD.md`
+- Chronicle: `./artifacts/chronicle.md` (when narrative enabled)
+- ChromaDB: `./artifacts/chroma/` (when narrative enabled)
 
 ## Smoke test against local FindFirst
 
````

alert_historian/pyproject.toml

Lines changed: 3 additions & 0 deletions
````diff
@@ -9,9 +9,12 @@ description = "Google Alerts narrative and FindFirst sync engine"
 readme = "README.md"
 requires-python = ">=3.11"
 dependencies = [
+    "chromadb>=0.4.0",
+    "openai>=1.0.0",
     "pydantic>=2.7.0",
     "pydantic-settings>=2.2.1",
     "requests>=2.32.0",
+    "tiktoken>=0.5.0",
 ]
 
 [project.optional-dependencies]
````
alert_historian/src/alert_historian.egg-info/PKG-INFO

Lines changed: 83 additions & 0 deletions

````diff
@@ -0,0 +1,83 @@
+Metadata-Version: 2.4
+Name: alert-historian
+Version: 0.1.0
+Summary: Google Alerts narrative and FindFirst sync engine
+Requires-Python: >=3.11
+Description-Content-Type: text/markdown
+Requires-Dist: chromadb>=0.4.0
+Requires-Dist: openai>=1.0.0
+Requires-Dist: pydantic>=2.7.0
+Requires-Dist: pydantic-settings>=2.2.1
+Requires-Dist: requests>=2.32.0
+Requires-Dist: tiktoken>=0.5.0
+Provides-Extra: dev
+Requires-Dist: pytest>=8.2.0; extra == "dev"
+Requires-Dist: pytest-cov>=5.0.0; extra == "dev"
+
+# alert_historian
+
+`alert_historian` ingests Google Alerts, normalizes and deduplicates events, syncs links into FindFirst as bookmarks, and produces daily narrative reports.
+
+## MVP v0.1
+
+- Canonical payload schema for IMAP or JSON export ingestion
+- SQLite-backed checkpoints and sync attempt tracking
+- FindFirst sync client using existing auth, tag, and bookmark APIs
+- Retry and classification policy for transient/permanent failures
+- Daily markdown report output
+
+## Quick start
+
+```bash
+cd alert_historian
+python -m venv .venv
+source .venv/bin/activate
+pip install -e ".[dev]"
+cp .env.example .env
+python -m alert_historian run-once
+```
+
+## Commands
+
+```bash
+python -m alert_historian ingest
+python -m alert_historian sync
+python -m alert_historian report
+python -m alert_historian run-once
+```
+
+## SonarQube local prep
+
+Generate the coverage report used by SonarQube:
+
+```bash
+pytest --cov=src/alert_historian --cov-report=xml:coverage.xml
+```
+
+## Configuration
+
+Copy `.env.example` to `.env` and set:
+
+- `ALERT_HISTORIAN_INPUT_MODE=json|imap`
+- `ALERT_HISTORIAN_JSON_INPUT` when using JSON mode
+- `ALERT_HISTORIAN_FINDFIRST_BASE_URL`
+- `ALERT_HISTORIAN_FINDFIRST_USERNAME`
+- `ALERT_HISTORIAN_FINDFIRST_PASSWORD`
+
+## Output locations
+
+- Canonical artifacts: `./artifacts/canonical-<run_id>.json`
+- State DB: `./state/alert_historian.db`
+- Daily reports: `./reports/daily/YYYY-MM-DD.md`
+
+## Smoke test against local FindFirst
+
+1. Start FindFirst stack and ensure server is reachable.
+2. Set `.env` credentials to the local test user.
+3. Run:
+
+```bash
+python -m alert_historian run-once
+```
+
+If sync succeeds, bookmarks and tags appear in FindFirst, and report output is written to `reports/daily`.
````
alert_historian/src/alert_historian.egg-info/SOURCES.txt

Lines changed: 32 additions & 0 deletions

````diff
@@ -0,0 +1,32 @@
+README.md
+pyproject.toml
+src/alert_historian/__init__.py
+src/alert_historian/__main__.py
+src/alert_historian.egg-info/PKG-INFO
+src/alert_historian.egg-info/SOURCES.txt
+src/alert_historian.egg-info/dependency_links.txt
+src/alert_historian.egg-info/requires.txt
+src/alert_historian.egg-info/top_level.txt
+src/alert_historian/cli/__init__.py
+src/alert_historian/cli/main.py
+src/alert_historian/config/__init__.py
+src/alert_historian/config/settings.py
+src/alert_historian/ingestion/__init__.py
+src/alert_historian/ingestion/imap_adapter.py
+src/alert_historian/ingestion/json_export_adapter.py
+src/alert_historian/ingestion/normalize.py
+src/alert_historian/ingestion/pipeline.py
+src/alert_historian/ingestion/schema.py
+src/alert_historian/narrative/__init__.py
+src/alert_historian/narrative/chronicle.py
+src/alert_historian/narrative/delta.py
+src/alert_historian/narrative/vector_store.py
+src/alert_historian/reporting/__init__.py
+src/alert_historian/reporting/daily_report.py
+src/alert_historian/state/__init__.py
+src/alert_historian/state/store.py
+src/alert_historian/sync/__init__.py
+src/alert_historian/sync/engine.py
+src/alert_historian/sync/findfirst_client.py
+src/alert_historian/sync/mappers.py
+src/alert_historian/sync/retry.py
````
alert_historian/src/alert_historian.egg-info/dependency_links.txt

Lines changed: 1 addition & 0 deletions

````diff
@@ -0,0 +1 @@
+
````
alert_historian/src/alert_historian.egg-info/requires.txt

Lines changed: 10 additions & 0 deletions

````diff
@@ -0,0 +1,10 @@
+chromadb>=0.4.0
+openai>=1.0.0
+pydantic>=2.7.0
+pydantic-settings>=2.2.1
+requests>=2.32.0
+tiktoken>=0.5.0
+
+[dev]
+pytest>=8.2.0
+pytest-cov>=5.0.0
````
alert_historian/src/alert_historian.egg-info/top_level.txt

Lines changed: 1 addition & 0 deletions

````diff
@@ -0,0 +1 @@
+alert_historian
````

alert_historian/src/alert_historian/cli/main.py

Lines changed: 93 additions & 6 deletions
````diff
@@ -2,7 +2,18 @@
 from datetime import datetime
 
 from alert_historian.config.settings import get_settings
+from alert_historian.ingestion.pipeline import (
+    load_canonical_from_artifact,
+    payloads_to_pending_items,
+)
 from alert_historian.ingestion.pipeline import ingest
+from alert_historian.narrative.chronicle import (
+    create_openai_llm_client,
+    load_chronicle,
+    update_chronicle,
+)
+from alert_historian.narrative.delta import generate_delta
+from alert_historian.narrative.vector_store import AlertVectorStore
 from alert_historian.reporting.daily_report import build_daily_report
 from alert_historian.state.store import StateStore
 from alert_historian.sync.engine import sync_pending_items
@@ -31,21 +42,91 @@ def run_sync(run_id: str | None = None) -> dict[str, int]:
         store.close()
 
 
-def run_report(run_id: str, inserted_count: int, sync_stats: dict[str, int]) -> str:
+def run_report(
+    run_id: str,
+    inserted_count: int,
+    sync_stats: dict[str, int],
+    narrative_delta: str | None = None,
+) -> str:
     settings = get_settings()
     store = StateStore(settings.state_db)
     try:
-        path = build_daily_report(store, settings.reports_dir, run_id, inserted_count, sync_stats)
+        path = build_daily_report(
+            store,
+            settings.reports_dir,
+            run_id,
+            inserted_count,
+            sync_stats,
+            narrative_delta=narrative_delta,
+        )
         print(f"[report] path={path}")
         return str(path)
     finally:
         store.close()
 
 
-def run_once() -> int:
+def _run_narrative_pipeline(
+    settings,
+    run_id: str,
+    today_items: list,
+) -> str:
+    """Run Chronicle update and Narrative Delta generation. Returns delta markdown."""
+    artifact_path = settings.artifacts_dir / f"canonical-{run_id}.json"
+    if not artifact_path.exists():
+        return ""
+
+    vector_store = AlertVectorStore(
+        persist_path=settings.chroma_path,
+        api_key=settings.openai_api_key,
+        embedding_model=settings.embedding_model,
+    )
+    vector_store.upsert_items(today_items)
+
+    query_text = " ".join(
+        f"{item.title} {item.snippet}" for item in today_items[:10]
+    ).strip() or "recent alerts"
+    past_context = vector_store.query(query_text, n_results=10)
+
+    chronicle_path = settings.chronicle_path
+    llm_client = create_openai_llm_client(
+        api_key=settings.openai_api_key,
+        model=settings.llm_model,
+    )
+
+    new_context = "\n\n".join(
+        f"[{item.day}] {item.topic}: {item.title}\n{item.snippet[:200]}"
+        for item in today_items[:15]
+    )
+    update_chronicle(chronicle_path, new_context, llm_client)
+    chronicle_content = load_chronicle(chronicle_path)
+
+    return generate_delta(
+        today_items,
+        past_context,
+        chronicle_content,
+        llm_client,
+    )
+
+
+def run_once(no_narrative: bool = False) -> int:
     run_id, inserted = run_ingest()
     stats = run_sync(run_id)
-    run_report(run_id, inserted, stats)
+
+    narrative_delta: str | None = None
+    if not no_narrative:
+        settings = get_settings()
+        if settings.openai_api_key:
+            artifact_path = settings.artifacts_dir / f"canonical-{run_id}.json"
+            if artifact_path.exists():
+                payloads = load_canonical_from_artifact(artifact_path)
+                today_items = payloads_to_pending_items(payloads)
+                if today_items:
+                    try:
+                        narrative_delta = _run_narrative_pipeline(settings, run_id, today_items)
+                    except Exception as e:
+                        print(f"[narrative] skipped: {e}")
+
+    run_report(run_id, inserted, stats, narrative_delta=narrative_delta)
     return 0
 
 
@@ -55,7 +136,12 @@ def main() -> int:
     sub.add_parser("ingest")
     sub.add_parser("sync")
     sub.add_parser("report")
-    sub.add_parser("run-once")
+    run_once_parser = sub.add_parser("run-once")
+    run_once_parser.add_argument(
+        "--no-narrative",
+        action="store_true",
+        help="Skip narrative engine (Chronicle, Delta) even when API key is set",
+    )
     args = parser.parse_args()
 
     if args.command == "ingest":
@@ -69,5 +155,6 @@ def main() -> int:
         run_report(run_id, inserted_count=0, sync_stats={})
         return 0
     if args.command in ("run-once", None):
-        return run_once()
+        no_narrative = getattr(args, "no_narrative", False)
+        return run_once(no_narrative=no_narrative)
     return 0
````

alert_historian/src/alert_historian/config/settings.py

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -36,6 +36,12 @@ class Settings(BaseSettings):
3636
sync_batch_size: int = Field(default=100, alias="ALERT_HISTORIAN_SYNC_BATCH_SIZE")
3737
use_domain_tags: bool = Field(default=True, alias="ALERT_HISTORIAN_USE_DOMAIN_TAGS")
3838

39+
chroma_path: Path = Field(default=Path("./artifacts/chroma"), alias="ALERT_HISTORIAN_CHROMA_PATH")
40+
embedding_model: str = Field(default="text-embedding-3-small", alias="ALERT_HISTORIAN_EMBEDDING_MODEL")
41+
openai_api_key: str = Field(default="", alias="ALERT_HISTORIAN_OPENAI_API_KEY")
42+
llm_model: str = Field(default="gpt-4o-mini", alias="ALERT_HISTORIAN_LLM_MODEL")
43+
chronicle_path: Path = Field(default=Path("./artifacts/chronicle.md"), alias="ALERT_HISTORIAN_CHRONICLE_PATH")
44+
3945

4046
@lru_cache
4147
def get_settings() -> Settings:
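Each new field resolves its `ALERT_HISTORIAN_*` alias from the environment and falls back to the declared default. For illustration, that resolution can be approximated without pydantic-settings; `narrative_settings` below is a hypothetical helper, not part of the project:

```python
from pathlib import Path


def narrative_settings(env: dict[str, str]) -> dict:
    """Approximate the Field(alias=...) lookups for the Phase 2 settings.

    Stdlib-only sketch: pass os.environ (or any mapping) and get back the
    resolved values with the same defaults the Settings class declares.
    """
    def get(alias: str, default: str) -> str:
        return env.get(alias, default)

    return {
        "chroma_path": Path(get("ALERT_HISTORIAN_CHROMA_PATH", "./artifacts/chroma")),
        "embedding_model": get("ALERT_HISTORIAN_EMBEDDING_MODEL", "text-embedding-3-small"),
        "openai_api_key": get("ALERT_HISTORIAN_OPENAI_API_KEY", ""),
        "llm_model": get("ALERT_HISTORIAN_LLM_MODEL", "gpt-4o-mini"),
        "chronicle_path": Path(get("ALERT_HISTORIAN_CHRONICLE_PATH", "./artifacts/chronicle.md")),
    }
```

Note the empty-string default for the API key: it is what lets `run_once` use plain truthiness (`if settings.openai_api_key:`) to decide whether narrative runs at all.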
