TL;DR: A prototype demonstrating how Graph-enhanced RAG outperforms vector RAG for product/journey analytics by leveraging temporal user behavior graphs.
Product teams ask questions like:
- "Why do users churn after viewing certain products?"
- "What's the typical conversion path for high-LTV customers?"
- "Which product categories do users explore together?"
Traditional analytics (SQL, BI tools) are rigid; vector RAG lacks temporal/sequential understanding.
Customer Journey GraphRAG combines:
- Temporal behavior graph (users → sessions → events → products)
- Path-aware retrieval for journey pattern extraction
- LLM reasoning over journey context for product insights
┌─────────────────────────────────────────────────────────────────────────────┐
│ Customer Journey GraphRAG │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────┐ ┌─────────────────┐ ┌──────────────────────────┐ │
│ │ User Query │───▶│ Query Router │────▶│ Retrieval Engine │ │
│ │ │ │ (Intent Parse) │ │ │ │
│ └──────────────┘ └─────────────────┘ │ ┌────────────────────┐ │ │
│ │ │ GraphRAG │ │ │
│ │ │ • Path extraction │ │ │
│ │ │ • Pattern analysis │ │ │
│ │ │ • Cohort compare │ │ │
│ │ └────────────────────┘ │ │
│ │ ┌────────────────────┐ │ │
│ │ │ Naive RAG │ │ │
│ │ │ • FAISS search │ │ │
│ │ │ • Top-K docs │ │ │
│ │ └────────────────────┘ │ │
│ └──────────────────────────┘ │
│ │ │
│ ┌──────────────────────────────────────────────────────────────────────┐ │
│ │ Journey Graph (NetworkX) │ │
│ │ │ │
│ │ [User] ──STARTED──▶ [Session] ──CONTAINS──▶ [Event] ──NEXT──▶ ...│ │
│ │ │ │ │ │ │
│ │ segment time ──INVOLVES──▶[Product]│ │
│ │ ltv events │ category │ │
│ │ churned event_type price │ │
│ │ │ │
│ └──────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌────────────▼─────────────┐ │
│ │ LLM (Llama 3.1 8B) │ │
│ │ via Groq API │ │
│ └────────────┬─────────────┘ │
│ │ │
│ ┌────────────────────────────────────────────────────────▼─────────────┐ │
│ │ Product Insight Response │ │
│ │ "82% of churned users followed: home → 2-3 product views → exit..." │ │
│ └──────────────────────────────────────────────────────────────────────┘ │
│ │
├─────────────────────────────────────────────────────────────────────────────┤
│ FastAPI Backend (port 8000) │ Streamlit UI (port 8501) │
└─────────────────────────────────────────────────────────────────────────────┘
| Component | Technology |
|---|---|
| Graph | NetworkX (in-memory DiGraph) |
| Embeddings | sentence-transformers (all-MiniLM-L6-v2) |
| Vector Store | FAISS (IndexFlatIP) |
| LLM | Llama 3.1 8B via Groq API |
| Backend | FastAPI + uvicorn |
| Frontend | Streamlit |
| Package Manager | uv |
# Clone repository
git clone <repo-url>
cd "Graph-RAG for Customer Journey Intelligence"
# Install uv if needed
pip install uv
# Install dependencies
uv sync# Copy example env file
cp .env.example .env
# Edit .env and add your Groq API key
# Get one at: https://console.groq.com/# Generate synthetic clickstream data
uv run python -m src.generate_data
# Build the journey graph
uv run python -m src.build_graph# Build vector embeddings index
uv run python -m src.naive_rag# Terminal 1: Start API server
uv run uvicorn src.api:app --reload --port 8000
# Terminal 2: Start Streamlit UI
uv run streamlit run src/app.pyOpen http://localhost:8501 in your browser.
| File | Rows | Description |
|---|---|---|
data/users.csv |
~5,000 | User profiles with segments and LTV |
data/products.csv |
~800 | Product catalog across 6 categories |
data/events.csv |
85,529 | Clickstream events across ~20,000 sessions |
| Segment | % Users | Avg LTV | Churn Rate | Behavior |
|---|---|---|---|---|
| high_value | 15% | $500 | 10% | 5-12 events/session, 60% purchase prob |
| medium | 50% | $150 | 30% | 3-7 events/session, 30% purchase prob |
| low | 35% | $30 | 60% | 1-4 events/session, 10% purchase prob |
Production Note: This demo uses synthetic data. For real-world deployment, replace with actual clickstream data from Snowplow, Segment, or Amplitude via Airbyte/Fivetran ETL pipelines.
Tested on synthetic e-commerce data: 5,000 users, 85,529 events across 20,000 sessions
Naive RAG:
Churned users exhibit a high frequency of search actions. Out of the 10 sessions, all users performed a search action... The similarity scores among these sessions are relatively high (0.403-0.411), indicating similar behavior patterns.
GraphRAG:
The typical journey of churned users appears to be a simple exit after either searching or viewing a product page:
search → exit(9 occurrences, 9.0%) andpage_view → exit(9 occurrences, 9.0%). These patterns account for 18% of all churned user journeys each.
Winner: GraphRAG ✅
Key difference: GraphRAG identified search → exit and page_view → exit each accounting for 18% of churned journeys (9 of 50 each) — explicit path frequencies Naive RAG never surfaces.
Naive RAG:
Before purchasing electronics, users tend to view: Beauty (4/10 sessions), Books (2/10), Fashion (2/10). The similarity metric (0.390-0.353) suggests sessions are similar but not identical.
GraphRAG:
The most common last event type before exit is page_view (40%), followed by click (21%), and add_to_cart (12%). Average events before exit is 4.74, indicating moderate exploration before decision.
Winner: GraphRAG ✅
Key difference: GraphRAG provides cohort-level funnel structure: 40% page_view, 21% click, 12% add_to_cart with 4.74 avg events before exit — grounded in graph aggregates rather than 10-session samples.
Naive RAG:
High-LTV users exhibit increased engagement with multiple categories, deeper product exploration, and higher click-through rates... Since the provided data only includes high-LTV users, we cannot directly compare the two cohorts.
GraphRAG:
High-LTV users have significantly higher average events per session (5.72 vs 2.31), higher conversion rate (8.0% vs 1.9%), and more purchase events (4 vs 1). They engage in more complex navigation flows with multiple page views, clicks, and add-to-cart events.
Winner: GraphRAG ✅
Key difference: GraphRAG provides direct comparative stats: 5.72 vs 2.31 events/session, 8.0% vs 1.9% conversion, 4 vs 1 purchases — quantifying the behavioral gap between cohorts.
Naive RAG:
Users who view fashion products tend to exit after viewing multiple times without adding to cart... The provided data only includes 10 sessions, which may not be representative of the entire user base.
GraphRAG:
The most common exit patterns are
search → exit(9 occurrences, 9.0%) andpage_view → exit(9 occurrences, 9.0%). These patterns indicate users are dropping off at the earliest stages. Page_view is a common pre-exit event (9 occurrences), and search appears before exit in 7.0% of journeys.
Winner: GraphRAG ✅
Key difference: GraphRAG is explicit about data limits and provides global exit patterns (search → exit and page_view → exit each at 9.0%) rather than over-fitting a small sample.
Naive RAG:
100% of users start with page_view. 60% click on a category. 40% add to cart. 20% complete a purchase. The funnel drops significantly at "Add to Cart" stage.
GraphRAG:
60% of exiting users drop off immediately after a search.
search → exitaccounts for 18% of exits,search → page_viewfor 6%. Only 2% of exiting users click on a product after searching.
Winner: GraphRAG ✅
Key difference: GraphRAG derives drop-off from aggregated path frequencies, showing 60% drop immediately after search and pinpointing search as the dominant funnel leak.
| Metric | Naive RAG | GraphRAG | Notes |
|---|---|---|---|
| Avg response time | ~1.16s | ~1.28s | Comparable performance |
| Specific statistics | 0–1 per query (10-session samples) | 3–5 per query (cohort-level) | Percentages, counts, averages |
| Pattern identification | High-level stage descriptions | Explicit sequences with frequencies | search → exit, page_view → exit |
| Cohort comparison | Qualitative, incomplete | Quantitative with metrics | Events/session, conversion rates |
| Business value | Low | High | Actionable product insights |
- Temporal reasoning: Understands sequences ("before", "after", "leads to")
- Pattern extraction: Identifies common paths with frequencies (e.g., 18% of churned users follow
search → exit) - Quantitative insights: Provides percentages, counts, averages across full cohorts
- Cohort comparison: Statistical comparison (5.72 vs 2.31 events/session)
- Multi-hop queries: Handles "X → Y → Z" reasoning impossible for text chunks
| Method | Endpoint | Description |
|---|---|---|
| GET | /health |
Health check |
| GET | /stats |
Graph statistics |
| GET | /presets |
Preset queries list |
| POST | /query/graphrag |
GraphRAG query |
| POST | /query/naive |
Naive RAG query |
| POST | /query/compare |
Side-by-side comparison |
graphrag-customer-journey/
├── pyproject.toml # uv dependencies
├── requirements.txt # pip dependencies (for Docker)
├── Dockerfile # HF Spaces / local Docker
├── start.sh # Container startup script
├── .env.example # API key template
├── .gitignore # Excludes data/, graph/*.pkl, .env
├── README_HFSpaces.md # HF Spaces version (with YAML frontmatter)
├── README.md # This file (full documentation)
├── README_DEPLOYMENT.md # Docker & HF Spaces deployment guide
├── docs/
│ └── query_results.md # Full GraphRAG vs Naive RAG comparison
├── data/ # Generated data (gitignored)
│ ├── users.csv
│ ├── products.csv
│ └── events.csv
├── graph/ # Artifacts (gitignored)
│ ├── journey_graph.pkl
│ └── naive_rag_index.pkl
└── src/
├── __init__.py
├── generate_data.py # Synthetic data generator
├── build_graph.py # Graph construction
├── retrieval.py # GraphRAG retrieval engine
├── naive_rag.py # Vector RAG baseline
├── llm.py # Groq/Llama integration
├── api.py # FastAPI endpoints
└── app.py # Streamlit UI
Note:
data/*.csvandgraph/*.pklare gitignored. Rungenerate_data.pyandbuild_graph.pylocally to create them.
See docs/query_results.md for complete GraphRAG vs Naive RAG comparison across all 5 preset queries with full LLM responses.
- Graph paths enable temporal reasoning impossible for chunk-based RAG
- Journey pattern extraction reveals product opportunities (cross-sell, funnel optimization)
- Cohort comparison with statistics vs generic descriptions
- Applicable to any product with user behavioral data (e-commerce, SaaS, fintech)
- Growth/Retention: Identify churn signals, optimize onboarding flows
- Personalization: Journey-based recommendations
- Product Analytics: Understand feature adoption paths, drop-off points
- Experimentation: Analyze journey changes across A/B test variants
See README_DEPLOYMENT.md for HuggingFace Spaces and local Docker deployment instructions.
HuggingFace Spaces Note: When deploying to HF Spaces, use
README_HFSpaces.md(contains required YAML frontmatter). This file (README.md) is the canonical GitHub documentation.
-
Scale & infra: This prototype runs on NetworkX with ~5k users / 85k events to keep it inspectable. A production system at 10M+ users / billions of edges would use a graph database (e.g., Neo4j, TigerGraph, DGraph) or a distributed graph engine, plus sharded storage and caching for hot subgraphs.
-
Evaluation: The comparison is qualitative and based on real query outputs. A next step would be to define a held-out query set and measure retrieval precision@k and human-rated answer usefulness for GraphRAG vs vector RAG.
-
Latency & monitoring: Latencies here are acceptable for an interactive demo (~1–2s including LLM), but the prototype does not include explicit
/metricsor production monitoring. In a real system, I'd log per-stage timings (graph traversal, embedding, LLM) and expose them via a metrics endpoint.
MIT