Prophet: AI-driven Knowledge Graph System

Prophet is a modular framework for building and experimenting with knowledge graph-based RAG (Retrieval-Augmented Generation) systems. It enables structured knowledge extraction, retrieval, and reasoning using graph-based techniques.

Modules Overview

1. Bodhi - Knowledge Graph Extraction

Bodhi is an LLM-based knowledge graph extraction pipeline that constructs knowledge graphs from unstructured documents. It supports various formats, including:

PDFs
Text files (.txt, .md, .csv, etc.)

Bodhi extracts entities and relationships and assembles them into knowledge graphs in NetworkX format, which serve as the foundation for downstream retrieval and inference.
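
As a rough illustration of what a graph in NetworkX format can look like, the sketch below builds a small directed graph from hand-written triples. The triples, the `relation` attribute name, and the `build_graph` helper are assumptions made for this example; they are not taken from Bodhi's code, where the triples would come from the LLM.

```python
import networkx as nx

# Hypothetical (subject, relation, object) triples, written by hand for illustration.
TRIPLES = [
    ("Bodhi", "produces", "knowledge graph"),
    ("Odysseus", "indexes", "knowledge graph"),
]


def build_graph(triples):
    """Build a directed graph with one node per entity and one edge per relationship."""
    graph = nx.DiGraph()
    for subject, relation, obj in triples:
        graph.add_node(subject)
        graph.add_node(obj)
        # The relation label is stored as an edge attribute (assumed name: "relation").
        graph.add_edge(subject, obj, relation=relation)
    return graph


graph = build_graph(TRIPLES)
for source, target, data in graph.edges(data=True):
    print(f"{source} -[{data['relation']}]-> {target}")
```

Storing the relation as an edge attribute keeps the graph easy to serialize and to filter by relation type later.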

2. Odysseus - Static Graph Retrieval Engine

Odysseus is a plugin-based graph retrieval engine designed for preprocessing and indexing knowledge graphs. It currently supports:

GraphRAG: Global Search - Summarizes entire knowledge graphs for broad query coverage.
GraphRAG: Local Search - Focuses on retrieving highly relevant graph substructures for detailed analysis.

Odysseus precomputes retrieval structures, optimizing runtime efficiency for downstream queries.

3. Alchemist - Runtime Graph Retrieval Engine

Alchemist is the dynamic counterpart to Odysseus and executes retrieval queries in real time. Like Odysseus, it supports:

Global Search - Retrieves broad contextual information.
Local Search - Extracts specific, high-relevance subgraphs.

Alchemist ensures flexible and adaptive retrieval, integrating various retrieval strategies to enhance response accuracy.
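
As a rough picture of what extracting a specific subgraph can involve, the sketch below collects every entity within a fixed number of hops of a seed entity using a breadth-first walk. This is a simplified stand-in, not Alchemist's retrieval logic and not the GraphRAG local search algorithm; the adjacency data and the `k_hop_neighbors` helper are hypothetical.

```python
from collections import deque

# Hypothetical adjacency list standing in for an extracted knowledge graph.
GRAPH = {
    "checkpointer": ["state", "thread"],
    "state": ["node"],
    "thread": [],
    "node": ["edge"],
    "edge": [],
}


def k_hop_neighbors(graph, seed, max_hops):
    """Return every entity reachable from `seed` within `max_hops` edges, seed included."""
    seen = {seed}
    queue = deque([(seed, 0)])
    while queue:
        entity, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(entity, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))
    return seen


print(sorted(k_hop_neighbors(GRAPH, "checkpointer", max_hops=1)))
```

A small hop limit keeps the retrieved context focused; raising it trades precision for broader coverage.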

4. Sanchayam - Storage Management Module

Sanchayam is a plugin-based storage manager enabling seamless integration of both object storage and file system-based storage solutions. It acts as the central data store for Prophet, ensuring efficient access and persistence of extracted knowledge graphs and retrieval artifacts.
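
One way to picture a plugin-based storage manager is a small abstract interface with interchangeable backends. The sketch below is an assumption-laden illustration: the `StorageBackend` and `FileSystemBackend` names and their `save`/`load` methods are hypothetical and may not match Sanchayam's actual interface.

```python
import os
import tempfile
from abc import ABC, abstractmethod


class StorageBackend(ABC):
    """Hypothetical plugin interface; Sanchayam's real interface may differ."""

    @abstractmethod
    def save(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def load(self, key: str) -> bytes: ...


class FileSystemBackend(StorageBackend):
    """Stores each artifact as a file under a root directory."""

    def __init__(self, root: str) -> None:
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, key: str) -> str:
        return os.path.join(self.root, key)

    def save(self, key: str, data: bytes) -> None:
        path = self._path(key)
        # Create intermediate directories so keys like "graphs/demo.json" work.
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as handle:
            handle.write(data)

    def load(self, key: str) -> bytes:
        with open(self._path(key), "rb") as handle:
            return handle.read()


with tempfile.TemporaryDirectory() as root:
    store = FileSystemBackend(root)
    store.save("graphs/demo.json", b'{"nodes": []}')
    print(store.load("graphs/demo.json"))
```

With this shape, an object storage backend would only need to implement the same two methods, and callers would not need to change.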

Environment setup

This project is built with Anaconda, so an Anaconda installation is required. To replicate the Conda environment, run the following command:

conda env create -f environment.yml

After activating the Conda environment, install the latest PyTorch with CUDA 12.4 manually:

pip install torch --index-url https://download.pytorch.org/whl/cu124

Basic configurations

config.yml file

The config.yml file currently supports the following settings:

LLM to be used in the pipeline
Maximum size limit for each text unit extracted from the source document
Storage backend: Prophet is designed for both object storage and file system storage.
- Defaults to local file system storage, implemented with Python's os library.
- The object storage backend is under development.
Storage directory: Custom storage directory path for saving the pipeline artifacts
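
To show how a maximum text-unit size can shape the pipeline's input, the sketch below greedily packs words into units that never exceed a given limit. It assumes the limit is measured in characters, which the source does not state (it could equally be tokens), and the `split_into_units` helper is hypothetical rather than Prophet's own splitter.

```python
def split_into_units(text: str, max_chars: int) -> list[str]:
    """Greedily pack whitespace-separated words into units of at most `max_chars` characters."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    units, current = [], ""
    for word in text.split():
        # A single word longer than the limit is cut into fixed-size pieces.
        while len(word) > max_chars:
            if current:
                units.append(current)
                current = ""
            units.append(word[:max_chars])
            word = word[max_chars:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The word does not fit: close the current unit and start a new one.
            units.append(current)
            current = word
    if current:
        units.append(current)
    return units


print(split_into_units("knowledge graphs link entities", max_chars=16))
```

Splitting on word boundaries avoids cutting entity names in half, which would otherwise make them harder to extract.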

Steps to run the system

Document processing

To build a knowledge graph from a new document, place the document in the infra/data/ directory.
Update the main section of Prophet.py with the new file name, for example:

    init_state = {"source_path": "<langgraph_application_structure.md>"}  # Source name with extension
    ...
    odysseus = Odysseus(sources=["langgraph_application_structure"])  # Source name without extension

Then run Prophet.py:

python Prophet.py

This prepares the knowledge graph and the related vector databases for the given document. Once the static processing pipeline has finished, a ZeroMQ-based server starts and listens on port 5555. The Alchemist engine connects to this server at runtime.

Document querying

Make sure the Odysseus server is up.
Update Alchemist/engine.py with the query to be answered:

...
state = {"query":"Explain concept of checkpointers?"} 
...

Then run the Alchemist engine (it must be run as a Python module):

python -m Alchemist.engine

Roadmap

Support for multiple retrieval strategies beyond GraphRAG
MinIO storage backend integrations within Sanchayam
Implement additional knowledge graph extraction techniques

License

Prophet is open source and licensed under the MIT License.
