
Undertaker-Ai

Undertaker-Ai is an interface for querying a Microsoft GraphRAG knowledge index. The project is configured to analyze and explore the narrative of the light novel series "86 - Eighty Six". By leveraging Graph Retrieval-Augmented Generation (GraphRAG), the application lets users run complex, multi-hop queries over structured data extracted from unstructured text.

Graph Preview

Key Features

1. Advanced Search Capabilities

The application supports two distinct modes of inquiry to analyze the knowledge base:

  • Global Search (Map-Reduce):

    • Designed for broad questions that require aggregating information from across the entire dataset.
    • Mechanism: Uses a map-reduce approach to query community summaries, generating a comprehensive answer that synthesizes themes and widespread facts.
    • Use Case: "What are the major themes of the Eighty Six series?" or "How does the war affect the San Magnolia Republic?"
  • Local Search (Neighborhood):

    • Optimized for specific questions about distinct entities (characters, locations, organizations).
    • Mechanism: Navigates to a specific entity's node and explores its immediate neighbors (connected relationships and text units) to provide granular details.
    • Use Case: "Who is Shinei Nouzen?" or "Describe the Juggernaut mecha."
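The difference between the two modes can be illustrated with a toy sketch. This is plain Python for illustration only, not graphrag's actual internals; the summaries, graph, and keyword-overlap scoring below are all made up:

```python
# Toy illustration of the two search strategies (not graphrag's real code).

def _words(text):
    """Lowercase word set with trailing punctuation stripped."""
    return {w.strip(".,?!").lower() for w in text.split()}

def global_search(question, community_summaries):
    # Map: score each community summary for relevance (here: keyword overlap).
    keywords = _words(question)
    mapped = [(s, len(keywords & _words(s))) for s in community_summaries]
    # Reduce: keep relevant summaries and synthesize one answer (here: join them).
    relevant = [s for s, score in mapped if score > 1]
    return " ".join(relevant)

def local_search(entity, graph):
    # Start at one entity's node and return its immediate neighborhood.
    return {entity: graph.get(entity, [])}

summaries = ["The war shapes every community.", "Cooking recipes of the realm."]
graph = {"Shinei Nouzen": ["Spearhead Squadron", "Juggernaut"]}

print(global_search("How does the war affect communities?", summaries))
print(local_search("Shinei Nouzen", graph))
```

In the real pipeline, the "map" step asks the LLM to rate each community report's relevance and the "reduce" step asks it to synthesize the top-rated partial answers; local search likewise ranks neighboring entities, relationships, and text units by semantic similarity rather than keyword overlap.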

2. Interactive Knowledge Graph

Visualize the underlying data structure using PyVis:

  • Dynamic filtering: Adjust the minimum edge weight to filter out weak connections and focus on strong relationships.
  • Node limitation: Control the maximum number of nodes displayed to prevent visual clutter and ensure performance.
  • Physics engine: Nodes automatically arrange themselves using a force-directed layout for optimal readability.
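The filtering behind the first two controls amounts to something like the following sketch (plain Python for illustration; the app itself renders with networkx and PyVis, and these edge tuples are hypothetical):

```python
# Illustrative graph-filtering sketch (the real app renders via networkx + PyVis).
edges = [
    ("Shinei Nouzen", "Spearhead Squadron", 9.0),
    ("Shinei Nouzen", "Juggernaut", 7.5),
    ("San Magnolia", "Eighty-Six District", 3.0),
    ("Minor Character", "Background Event", 0.5),
]

def filter_graph(edges, min_weight=1.0, max_nodes=50):
    # Drop weak connections below the minimum edge weight.
    strong = [(u, v, w) for u, v, w in edges if w >= min_weight]
    # Cap the number of distinct nodes, keeping the heaviest edges first.
    strong.sort(key=lambda e: e[2], reverse=True)
    kept, nodes = [], set()
    for u, v, w in strong:
        if len(nodes | {u, v}) <= max_nodes:
            kept.append((u, v, w))
            nodes |= {u, v}
    return kept

print(filter_graph(edges, min_weight=1.0, max_nodes=4))
```

Sorting by weight before applying the node cap means that when the cap is hit, it is the weakest relationships that drop out first, which matches the "focus on strong relationships" behavior described above.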

3. Transparent Sourcing

Every answer generated by the system includes:

  • Context Data: The specific text chunks and community reports used by the LLM.
  • Traceability: Allows users to verify the information against the source material.

Technical Architecture & Configuration

This project is built on Python 3.10+ and integrates several powerful libraries.

Core Dependencies

  • graphrag: Microsoft's library for structured GraphRAG pipelines.
  • streamlit: The web framework powering the user interface.
  • pandas: For efficient data manipulation of entities and relationships.
  • networkx & pyvis: For graph modeling and interactive rendering.

Configuration (settings.yaml)

The project uses a single configuration file, settings.yaml, to manage the GraphRAG pipeline. Key settings include:

  • LLM & Embeddings: Configured to use OpenAI-compatible endpoints (e.g., OpenRouter).
    • default_chat_model: Handles answer generation and graph extraction.
    • default_embedding_model: Generates vector embeddings for text units (text-embedding-3-small).
  • Data Ingestion:
    • Input: Text files located in input/.
    • Chunking: Text is split into 1200-token chunks with 100-token overlap to maintain context.
  • Storage: Uses LanceDB for vector storage and the local file system for artifacts.
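An illustrative excerpt of what such a configuration might look like is shown below. This is a hedged sketch, not the repository's exact file; key names vary between graphrag versions, so check the actual settings.yaml in the repository:

```yaml
# Illustrative settings.yaml excerpt -- key names vary by graphrag version.
models:
  default_chat_model:
    type: openai_chat
    api_key: ${GRAPHRAG_API_KEY}       # resolved from .env
    model: ${GRAPHRAG_CHAT_MODEL}
  default_embedding_model:
    type: openai_embedding
    api_key: ${GRAPHRAG_API_KEY}
    model: text-embedding-3-small

input:
  file_type: text
  base_dir: input

chunks:
  size: 1200        # tokens per chunk
  overlap: 100      # token overlap to preserve context across boundaries

vector_store:
  default_vector_store:
    type: lancedb
```

The ${...} placeholders are expanded from environment variables, which is how the .env file described under "Installation & Setup" keeps credentials out of the configuration file.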

Installation & Setup

Prerequisites

  • Python 3.10 or higher.
  • Git.
  • An API Key for an OpenAI-compatible LLM provider (e.g., OpenRouter, OpenAI).

Step 1: Clone the Repository

git clone https://github.com/PhucHuwu/Undertaker-Ai.git
cd Undertaker-Ai

Step 2: Install Dependencies

pip install -r requirements.txt

Step 3: Configure Environment

Create a .env file in the root directory to store your credentials. This avoids hardcoding sensitive keys in settings.yaml.

# .env file
GRAPHRAG_API_KEY=your_actual_api_key
GRAPHRAG_CHAT_MODEL=your_preferred_model_name
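A minimal sketch of how an application can pick these values up at runtime is shown below. This is a stdlib-only illustration; real projects typically use the python-dotenv package instead, and graphrag's own CLI reads a .env file from the project root automatically:

```python
import os

# Minimal .env loader sketch (real apps usually use the python-dotenv package).
def load_env(path=".env"):
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Skip blanks and comments; split KEY=VALUE on the first "=".
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                # setdefault: values already in the environment win.
                os.environ.setdefault(key.strip(), value.strip())

# After loading, the pipeline can read its credentials, e.g.:
# api_key = os.environ["GRAPHRAG_API_KEY"]
```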

Operations Guide

1. Data Indexing (if needed)

If you have raw text files in the input/ folder but no index in output/, you must run the indexing pipeline first:

python -m graphrag.index --root .

This process extracts entities, relationships, communities, and claims; it can take considerable time depending on the dataset size. (Newer graphrag releases expose the same pipeline as "graphrag index --root .".)

2. Launching the App

Start the Streamlit interface:

streamlit run app.py

3. Usage

  • Dashboard: The sidebar shows the status of the index loading.
  • Chat Interface: Select "Global" or "Local" search, type your query, and view the AI-generated response along with context.
  • Visualization: Switch tabs to view the node-link diagram of the characters and events.

Troubleshooting

Common Issues

  • "Output directory not found":

    • Cause: The GraphRAG indexing pipeline has not been run, or did not complete successfully.
    • Solution: Run the indexing command mentioned in the "Data Indexing" section.
  • API Errors / Authentication Failures:

    • Cause: Incorrect API Key or Model Name in .env.
    • Solution: Verify your .env file matches the variable names expected by settings.yaml. Check your API provider's dashboard for quota limits.
  • Graph Visualization is Empty:

    • Cause: The "Minimum Edge Weight" filter might be too high.
    • Solution: Lower the slider in the visualization tab to reveal weaker connections.

Directory Structure

  • app.py: The main application entry point.
  • settings.yaml: The master configuration file for the GraphRAG pipeline.
  • input/: Directory for raw source text files (*.txt).
  • output/: Directory where the indexed artifacts (Parquet files, LanceDB) are stored.
  • prompts/: Custom prompt templates used to guide the LLM during extraction and search.
  • cache/: Local cache to speed up subsequent runs and reduce API costs.
