A Knowledge Graph-Powered Conversational AI built to answer legal and policy-related queries from the eBay User Agreement using Neo4j, LLMs (Meta LLaMA 3B), and Memory via FAISS.
Website - DEMO
Reading long user agreements is painful. This project creates an intelligent chatbot that:
- Understands natural language queries
- Retrieves facts from a Neo4j-based Knowledge Graph
- Enhances responses using memory of past conversations
- Uses open-source LLMs to generate grounded, concise, and transparent answers
- User submits a query via Streamlit UI.
- Named Entities are extracted using SpaCy + RE.
- Matching triples are fetched from Neo4j KG.
- Memory module (FAISS) adds past Q&A context.
- Prompt is dynamically injected and sent to LLaMA-3B.
- Response is streamed and displayed to the user.
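The flow above can be sketched in miniature as follows. The helper functions here are simplified stand-ins for the real logic in `Src/retriever.py`, `Src/memory.py`, and `Src/prompt_injector.py`, not the project's actual API:

```python
# Toy end-to-end sketch of the pipeline above. Each helper is a simplified
# stand-in for the real modules under Src/.

KNOWN_TERMS = {"terminate", "agreement", "restrict", "access"}
KG = [("User", "may_terminate", "agreement"),
      ("eBay", "may_restrict", "access")]

def extract_entities(query: str) -> list[str]:
    # Real code: SpaCy NER plus rule-based relation extraction.
    return [w.strip("?.") for w in query.lower().split()
            if w.strip("?.") in KNOWN_TERMS]

def fetch_triples(entities: list[str]) -> list[tuple[str, str, str]]:
    # Real code: Cypher queries against Neo4j.
    return [t for t in KG if any(e in " ".join(t).lower() for e in entities)]

def build_prompt(triples, memory: str, question: str) -> str:
    facts = "\n".join(f"- {s} {r} {o}" for s, r, o in triples)
    return f"Context:\n{facts}\nMemory:\n{memory}\nQuestion: {question}\nAnswer:"

question = "Can I terminate the agreement?"
prompt = build_prompt(fetch_triples(extract_entities(question)), "(none)", question)
```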
- Text source: eBay User Agreement PDF
- Preprocessing: cleaned and tokenized using SpaCy
- NER & RE: Custom rules + pre-trained SpaCy models
- Triplets: Extracted using pattern matching and OpenIE-style RE
- Storage: JSON + CSV → loaded into Neo4j (local or Aura Free)
- Tools: `graph_builder.py`, `KG_creation.ipynb`
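A toy version of the pattern-matching step (the real extraction in `graph_builder.py` combines SpaCy parses with custom rules; this single regex is only an illustration):

```python
import re

# Simplified OpenIE-style pattern: "<Subject> may/must/can <verb> <object>".
# The real pipeline uses SpaCy dependency parses plus richer rules.
PATTERN = re.compile(r"^(eBay|Users?)\s+(may|must|can)\s+(\w+)\s+(.+?)\.?$", re.I)

def extract_triples(sentences):
    triples = []
    for sent in sentences:
        m = PATTERN.match(sent.strip())
        if m:
            subj, modal, verb, obj = m.groups()
            triples.append((subj, f"{modal}_{verb}", obj))
    return triples

triples = extract_triples([
    "Users may terminate the agreement with 30 days notice.",
    "eBay may restrict access for violation.",
])
```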
- Input query is processed for Named Entities.
- Synonyms are expanded using Sentence Transformers.
- KG is queried using Cypher to retrieve matching triplets.
- Top-k results ranked based on entity similarity & relevance.
- Implemented in `retriever.py` using a `MATCH (s)-[r]->(o)` Cypher pattern.
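The KG lookup can be sketched like this. This is a minimal sketch rather than the exact code in `retriever.py`: the `WHERE` clause assumes nodes carry a `name` property, which should be adapted to the schema actually loaded by `graph_builder.py`:

```python
# Sketch of a KG triple lookup in the spirit of retriever.py.
# Assumes nodes with a `name` property; adjust to the real schema.
CYPHER = """
MATCH (s)-[r]->(o)
WHERE toLower(s.name) CONTAINS toLower($entity)
   OR toLower(o.name) CONTAINS toLower($entity)
RETURN s.name AS subject, type(r) AS relation, o.name AS object
LIMIT $k
"""

def retrieve_triples(uri: str, user: str, password: str, entity: str, k: int = 10):
    from neo4j import GraphDatabase  # lazy import: requires `pip install neo4j`
    with GraphDatabase.driver(uri, auth=(user, password)) as driver:
        with driver.session() as session:
            result = session.run(CYPHER, entity=entity, k=k)
            return [(rec["subject"], rec["relation"], rec["object"]) for rec in result]
```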
- Format: [Triples] + [Memory] → Context Window
- Model: Meta LLaMA-3B (Instruct-tuned)
- Sent via HF endpoint with streaming
Example
```
System: You are a legal assistant for the eBay User Agreement.
Context:
* [User] may terminate the agreement with 30 days notice.
* [eBay] may restrict access for violation.
Memory:
* Q: What if I break the policy? A: Your access may be restricted.
Question: Can I end the agreement anytime?
Answer:
eBay allows termination with 30 days' notice. However, immediate termination may depend on specific conditions outlined in Section X.
```
- Install dependencies:

  ```
  pip install -r requirements.txt
  ```

- Add your Hugging Face token in the UI sidebar.
- Add Neo4j credentials in `.streamlit/secrets.toml`.
- Run:

  ```
  streamlit run app.py
  ```
- Model: Meta LLaMA-3B-Instruct (via HuggingFace)
- Endpoint: HuggingFace Inference Endpoint (stream=True)
- Temperature: 0 to 0.2 for factual output
- Streaming: enabled to simulate real-time responses using `requests` with `stream=True`
✅ Knowledge Graph-based reasoning
✅ Memory-augmented retrieval (FAISS)
✅ Legal/Policy Q&A grounded in real documents
✅ Streamlit-powered UI with chat history and controls
✅ Chat save/load functionality
✅ Real-time LLM responses using HuggingFace inference endpoint
```
.
├── app.py                          # Main Streamlit app
├── requirements.txt                # Dependencies
├── create_code.py                  # Code generation helper
├── chat_history.json               # Sample chat history
│
├── Src/                            # Core logic modules
│   ├── memory.py                   # Persistent memory using Chroma
│   ├── retriever.py                # Entity extractor & KG triple retriever
│   ├── prompt_injector.py          # Prompt builder & LLM streaming query
│   └── graph_builder.py            # For KG construction
│
├── Triples/                        # Triplets extracted from the source doc
│   ├── graphrag_triplets.csv/json
│   ├── triples_raw.json
│   ├── triples_structured.json
│   └── knowledge_graph_triplets.json
│
├── KG/                             # Visuals & summaries
│   ├── knowledge_graph_image.png
│   └── summary.json
│
├── NER/                            # Extracted named entities
│   └── ner_entities.json
│
├── Data/
│   ├── Ebay_user_agreement.pdf
│   └── cleaned_ebay_user_agreement.txt
│
├── Notebooks/                      # Jupyter notebooks for exploration
│   ├── KG_creation.ipynb
│   ├── preprocessing.ipynb
│   └── graphrag-quering-kg-and-llm-prompting.ipynb
│
├── .streamlit/
│   └── secrets.toml                # API keys & credentials
├── .gitignore
└── README.md
```

- Clone the repository

  ```
  git clone https://github.com/MohitGupta0123/GraphRAG-Ebay-User-Aggrement-Chatbot.git
  cd GraphRAG-Ebay-User-Aggrement-Chatbot
  ```

- Install dependencies
  ```
  pip install -r requirements.txt
  ```

- Token Configuration
This app prompts for your Hugging Face token (HF_TOKEN) securely at runtime in the sidebar.
  You no longer need to store the token in `secrets.toml`.
  However, Neo4j credentials are still required in `.streamlit/secrets.toml`:
  ```toml
  NEO4J_URI = "bolt://your_neo4j_uri"
  NEO4J_USERNAME = "neo4j"
  NEO4J_PASSWORD = "your_password"
  NEO4J_DATABASE = "neo4j"
  ```

- Launch the app
  ```
  streamlit run app.py
  ```

You ask a question like:
"Can I terminate the agreement anytime?"
1. Entities like `terminate` and `agreement` are extracted.
2. Relevant triples are retrieved from Neo4j.
3. Similar past Q&A pairs are pulled from persistent memory (FAISS).
4. Triples + memory form a context that is sent to LLaMA-3B via the Hugging Face API.
5. The LLM answers based only on the retrieved facts, with no hallucination.

You can download your chat as a `.json` file and re-upload it to continue your session.
All retrieved triples and memory are retained across sessions!
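The memory behavior can be illustrated with a toy store. The real `Src/memory.py` embeds Q&A pairs with SentenceTransformers and indexes them with FAISS; this stdlib-only stand-in just shows the same store-and-search flow using bag-of-words cosine similarity:

```python
import math
from collections import Counter

# Toy stand-in for Src/memory.py: real code uses SentenceTransformers
# embeddings + a FAISS index; here, bag-of-words vectors + cosine similarity.
class QAMemory:
    def __init__(self):
        self.entries = []  # (question, answer, vector)

    @staticmethod
    def _vec(text):
        return Counter(text.lower().split())

    @staticmethod
    def _cosine(a, b):
        dot = sum(a[t] * b[t] for t in a)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    def add(self, question, answer):
        self.entries.append((question, answer, self._vec(question)))

    def search(self, query, k=3):
        qv = self._vec(query)
        ranked = sorted(self.entries, key=lambda e: self._cosine(qv, e[2]), reverse=True)
        return [(q, a) for q, a, _ in ranked[:k]]

mem = QAMemory()
mem.add("What if I break the policy?", "Your access may be restricted.")
mem.add("Can I sell vehicles?", "Vehicle listings follow separate rules.")
hits = mem.search("What happens if I violate the policy?", k=1)
```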
- Frontend: Streamlit
- LLM: Meta LLaMA-3B-Instruct via HuggingFace
- Graph: Neo4j (Aura Free or Local)
- Embeddings: SentenceTransformers
- Memory Store: FAISS
- Triplet Extraction: SpaCy / RE Pipelines
- NER: Custom + pre-trained models
- Currently optimized for the eBay User Agreement
- Requires manual graph building from text
- Needs HuggingFace token (streaming)
For suggestions or collaboration:
- 📧 mgmohit1111@gmail.com
- 💼 LinkedIn
- GraphRAG research from Meta AI
- Neo4j Knowledge Graphs
- LangChain Memory Chains