NSHG‑RAG is not a generic Retrieval‑Augmented Generation (RAG) demo. It is an experimental, research‑grade RAG system designed to study and implement retrieval policies, hybrid knowledge representations, and evaluation‑driven reliability in large‑language‑model systems.
Rather than treating retrieval as a single vector search call, NSHG‑RAG models retrieval as a decision‑making process governed by query decomposition, conditional routing, symbolic constraints, and retriever‑specific trust weighting. Large Language Models (LLMs) are treated as components, not authorities.
The system is built as a neuro‑symbolic RAG testbed that allows controlled experimentation across dense, sparse, symbolic, and clustered retrieval paradigms — with first‑class support for ablation, evaluation, and architectural introspection.
NSHG‑RAG is guided by the following principles:
- Retrieval is a policy, not a function call — Retrieval decisions are explicitly reasoned about, decomposed, filtered, and weighted.
- Knowledge is structured, fallible, and contextual — Text chunks are treated as indexing artifacts, not ground truth. Symbolic graphs, metadata constraints, and clustering provide alternative views over the same knowledge.
- LLMs synthesize, they do not decide — LLMs are used for query decomposition, weighting, and synthesis, while control logic remains external and inspectable.
- Evaluation precedes optimization — Ablation studies, retriever comparisons, and controlled experiments are central to development.
- Failure modes are first-class — The system is designed to expose retrieval bias, routing errors, and relevance trade-offs rather than hide them behind fluent answers.
NSHG-RAG differs from typical RAG systems in several important ways:
- Policy-driven retrieval — Retrieval decisions are externalized, auditable, and dynamically weighted across heterogeneous retrievers.
- Hybrid neuro-symbolic knowledge — Combines dense, sparse, clustering, and symbolic graph retrieval rather than relying on a single retriever.
- Planner-centric orchestration — LLMs are used for query decomposition and synthesis, while retrieval control remains external and inspectable.
- Chunking as first-class concern — Semantic, structure-aware chunks ensure high-quality retrieval and reduce hallucination risk.
- Evaluation-first design — Built-in ablation and comparative evaluation to understand retrieval quality, weighting effects, and system robustness.
- Not a chatbot framework — Optimized for research, analysis, and controlled experiments rather than user-facing conversation.
Important architectural note: In NSHG-RAG, each retriever has a distinct epistemic role. Overlap in functionality is intentional and resolved at the policy and weighting level, not inside individual retrievers.
- Query decomposition into atomic, independent subqueries
- Conditional retrieval based on:
  - Query intent
  - Scope constraints (file, folder, extension)
  - Precision vs recall requirements
- LLM-assisted retriever weighting
- Symbolic, semantic, sparse, and dense hybridization
NSHG-RAG employs multiple retrievers, each optimized for a different notion of relevance. These retrievers are not interchangeable.
- Dense vector retrieval (FAISS) — Chunk-level semantic proximity. Optimized for paraphrasing, implicit references, and fine-grained relevance.
- Sparse lexical retrieval (BM25) — Literal keyword and factoid matching. Optimized for exact terms, identifiers, and surface-form precision.
- Semantic clustering retrieval — Concept-level routing. Optimized for identifying which conceptual region of a document is relevant, not which exact sentence.
- Symbolic graph-based retrieval — Structure- and author-intent-driven access via titles, sections, file paths, and explicit document organization.
Hybrid re-ranking is performed through explicit score-space ensembling under planner-controlled weights, rather than implicit cascades or opaque rerankers.
- Dense vector retrieval (FAISS)
- Sparse lexical retrieval (BM25)
- Semantic clustering retrieval
- Symbolic graph‑based retrieval
- Hybrid re‑ranking under explicit weight control
- Chunk‑level indexing (fixed + semantic)
- Metadata‑aware filtering
- Symbolic graph views over document structure
- Section‑ and title‑aware retrieval
- Explicit planning stage before retrieval
- Retrieval decisions externalized from prompting
- Multi‑step execution with intermediate state
- Robust LLM output parsing and fallback logic
- Retriever ablation planning
- Side‑by‑side retriever comparison
- Precision / recall / F1 tracking
- Regression‑friendly experimental CLI
- Score‑space analysis across retrievers
- Inspection of weight sensitivity and retrieval agreement
User Query
↓
Planner (Decompose → Filter → Weight)
↓
Hybrid Retriever (Symbolic | Cluster | BM25 | FAISS)
↓
Retrieved Evidence (Deduplicated)
↓
LLM Synthesizer
↓
Final Answer
The Planner is the defining component of NSHG‑RAG and the primary reason it qualifies as an advanced RAG system.
- Query Decomposition — Converts a complex user query into atomic, declarative subqueries designed solely for high-confidence retrieval, not reasoning.
- Scope & Constraint Extraction — Identifies explicit filters such as:
  - File names
  - Folder paths
  - File extensions

  Enforces strict scoping when required.
- Retriever Weight Assignment — Dynamically assigns trust weights across retrievers:
  - Symbolic
  - Semantic clustering
  - BM25
  - FAISS

  Weights are chosen based on query intent, scope strictness, and retrieval semantics, not heuristics alone.
- Conditional Retrieval Execution — Each subquery is retrieved independently with its own filters and weights, enabling:
  - Multi-hop retrieval
  - Recall-diverse evidence gathering
  - Bias mitigation across retrievers
- Evidence Aggregation — Retrieved chunks are deduplicated and merged before synthesis.
This design:
- Prevents early hallucination
- Decouples reasoning from retrieval
- Makes retrieval decisions auditable
- Enables ablation at the policy level
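The conditional execution and aggregation steps above can be sketched as a simple loop. The `retrieve` callable and the dict shapes are hypothetical stand-ins invented for illustration, not NSHG-RAG's actual interfaces:

```python
# Illustrative planner execution loop: each subquery runs with its own
# filters and weights, and evidence is deduplicated by chunk id before
# synthesis. All names here are hypothetical.

def execute_plan(subqueries, retrieve):
    evidence, seen = [], set()
    for sq in subqueries:
        chunks = retrieve(sq["query"], filters=sq["filters"], weights=sq["weights"])
        for chunk in chunks:
            if chunk["id"] not in seen:  # aggregate without double-counting
                seen.add(chunk["id"])
                evidence.append(chunk)
    return evidence

# Toy retriever returning overlapping evidence for two subqueries.
def fake_retrieve(query, filters, weights):
    if query == "sub1":
        return [{"id": "a"}, {"id": "b"}]
    return [{"id": "b"}, {"id": "c"}]

plan = [
    {"query": "sub1", "filters": {}, "weights": {}},
    {"query": "sub2", "filters": {}, "weights": {}},
]
evidence = execute_plan(plan, fake_retrieve)  # chunk "b" appears only once
```

Keeping this loop outside the LLM is what makes each retrieval decision auditable: the plan is plain data that can be logged, replayed, and ablated.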
Retrieval quality in NSHG-RAG is fundamentally bounded by chunk quality. Instead of relying on fixed-size or naive recursive splitting, NSHG-RAG employs a semantic chunking strategy designed to preserve conceptual coherence while remaining retriever-friendly.
The SemanticChunker operates as a content-aware preprocessing layer that normalizes heterogeneous documents into retrieval-stable units.
Key characteristics:
- Multi-format parsing — Supports Markdown, PDF, DOCX, PPTX, notebooks, spreadsheets, source code, configuration files, and plain text through specialized parsers.
- Structure-preserving preprocessing — Parsed elements retain metadata such as section titles and document hierarchy, enabling downstream symbolic and scoped retrieval.
- Semantic-aware splitting — Large textual elements are split using embedding similarity rather than fixed token counts. Splits occur only when semantic drift exceeds a configurable threshold.
- Token-aware constraints — Each chunk is bounded by minimum and maximum token limits to ensure compatibility with both sparse and dense retrievers.
- Context-preserving overlap — Adjacent chunks include limited sentence overlap to prevent boundary-induced information loss.
- Retriever-aligned output — Chunks are designed to be equally consumable by BM25, FAISS, clustering, and symbolic retrievers without format-specific bias.
Most RAG failures attributed to retrieval or ranking are, in practice, chunking failures. NSHG-RAG treats chunking as a first-class architectural concern:
- Reduces semantic fragmentation
- Improves retriever agreement
- Enables more stable score-space ensembling
- Strengthens symbolic and section-aware filtering
Chunking in NSHG-RAG is not a preprocessing utility — it is an implicit retrieval policy.
Semantic, structure-aware document parsing and chunk generation (fixed + semantic splitting).
Independent implementations of:
- BM25
- FAISS
- Hybrid retrieval
- Semantic clustering
- Symbolic graph traversal
Explicit retrieval‑reasoning orchestration layer.
Evaluation, ablation, and benchmarking utilities.
Embedding and metadata persistence (SQLite + vector stores).
LLM adapters (Gemini, Ollama).
Abstract contracts enforcing modularity and interchangeability.
NSHG‑RAG treats evaluation as a first‑order concern:
- Retrieval quality is measured independently of generation
- Each retriever can be isolated and stress‑tested
- Ablation studies reveal architectural dependencies
- Performance is tracked across configuration changes
Most RAG systems fail here — NSHG‑RAG is designed to expose those failures.
NSHG-RAG comes with an evaluation CLI that demonstrates its capabilities in comparative and ablation studies.
python eval_cli.py --eval_type comparative --topk 5 --chunker_type semantic

- `--eval_type` — Evaluation mode:
  - `comparative` — Compare all retrievers side-by-side
  - `ablation_weights` — Test dynamic vs fixed planner-assigned weights
  - `ablation_retriever` — Isolate retrievers to measure individual impact
- `--topk` — Number of retrieved chunks per retriever
- `--chunker_type` — Choose `semantic` (structure-aware) or `file` chunking
- `--human_eval` — Force manual evaluation scoring
- `--use_dynamic_weights` — For weight ablation experiments
This workflow demonstrates how NSHG-RAG separates retrieval policy from LLM synthesis, enabling fine-grained experimentation and analysis.
NSHG‑RAG explicitly distinguishes score normalization from retrieval confidence.
All retrievers emit scores normalized to a common [0, 1] range to enable safe combination. However, normalized scores are not treated as calibrated probabilities. Confidence is instead approximated implicitly through:
- Agreement across heterogeneous retrievers
- Planner‑assigned trust weights based on query intent
- Redundancy of evidence across independent subqueries
This design avoids over‑interpreting any single retriever’s score while still enabling principled score‑space ensembling. Explicit probabilistic calibration is left as a future research direction.
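A common way to obtain a shared [0, 1] score range is min-max scaling. The sketch below is an assumption about the general approach, not NSHG-RAG's exact normalization code, and its handling of constant score lists is an arbitrary choice:

```python
# Hypothetical min-max normalization onto [0, 1].
# Mapping constant score lists to 0.0 is an arbitrary choice for this sketch.

def minmax_normalize(scores):
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

normalized = minmax_normalize([2, 4, 6])  # raw BM25-like scores, for example
```

Note that this mapping is rank-preserving but not calibrated: a normalized 0.9 from one retriever is not comparable, as a probability, to a 0.9 from another, which is exactly why the design above leans on agreement and trust weights instead.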
Semantic clusters in NSHG-RAG are conceptual groupings, not reranking mechanisms.
Key design choice:
- Cluster-level similarity scores are intentionally not refined at the chunk level.
Rationale:
- The role of the cluster retriever is to identify which conceptual region of a document is relevant.
- Fine-grained chunk-to-query proximity is delegated to FAISS and BM25, which are explicitly designed for that purpose.
- Cluster retrieval therefore prioritizes high-recall concept discovery over local precision.
All chunks within a relevant cluster are surfaced and subsequently reweighted through the hybrid ensemble. This separation of concerns avoids double-counting semantic similarity and preserves retriever epistemic clarity.
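The concept-level recall behavior can be sketched as follows. The cluster representation, similarity function, and chunk ids are illustrative assumptions rather than the system's actual data model:

```python
# Toy cluster retrieval: rank clusters by centroid similarity, then surface
# every chunk in the top cluster(s) without chunk-level re-scoring.
# Fine-grained ranking is delegated to FAISS/BM25 downstream.

def cluster_retrieve(query_vec, clusters, similarity, top_clusters=1):
    """clusters: list of {"centroid": vector, "chunk_ids": [...]}."""
    ranked = sorted(
        clusters,
        key=lambda c: similarity(query_vec, c["centroid"]),
        reverse=True,
    )
    chunk_ids = []
    for cluster in ranked[:top_clusters]:
        chunk_ids.extend(cluster["chunk_ids"])  # whole cluster is surfaced
    return chunk_ids

dot = lambda a, b: sum(x * y for x, y in zip(a, b))
clusters = [
    {"centroid": [1.0, 0.0], "chunk_ids": ["a", "b"]},
    {"centroid": [0.0, 1.0], "chunk_ids": ["c"]},
]
ids = cluster_retrieve([0.0, 1.0], clusters, dot)
```

Deliberately returning unranked cluster members is what prevents the ensemble from counting semantic similarity twice: the cluster signal says "look here", and FAISS/BM25 say "this sentence".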
The Symbolic Retriever operates over a directory hierarchy knowledge graph (folder → file → section → chunk) and is intentionally not a primary relevance scorer. Its core role is search-space restriction based on query intent, as decided by the planner.
Unlike dense (FAISS) or sparse (BM25) retrievers, the symbolic retriever:
- Does not attempt to rank chunks by semantic relevance
- Acts as a structural filter that reduces the candidate set
- Encodes human-authored organization (folders, files, extensions, section titles)
This mirrors how experts navigate large codebases or document collections: "first narrow the scope, then search deeply".
The symbolic knowledge graph is a MultiDiGraph with the following hierarchy:
Folder → File → Section → Chunk
- Folder nodes encode directory structure
- File nodes encode filenames, extensions, normalized paths
- Section nodes encode semantic headings / document structure
- Chunk nodes store the final retrievable content
The planner may infer query intent such as:
- Target folder(s)
- Specific file names
- File extensions (e.g. `.py`, `.md`)
These constraints are passed to the symbolic retriever, which:
- Identifies candidate files using graph indices
- Collects all descendant section + chunk nodes
- Returns a restricted chunk set to downstream retrievers
This filtered candidate pool is then consumed by:
- BM25 (lexical precision)
- FAISS (dense proximity)
Both retrievers operate only within this planner-approved subset.
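The scope-restriction step can be sketched over a toy hierarchy. A nested dict stands in for the real MultiDiGraph here, and the folder, file, and chunk names are invented for illustration:

```python
# Toy symbolic scope restriction: walk folder -> file -> section -> chunk
# and keep only chunks under files matching the planner's constraints.
# All names in this sketch are hypothetical.

def restrict_scope(hierarchy, extensions=None, file_names=None):
    candidate_chunks = []
    for folder, files in hierarchy.items():
        for file_name, sections in files.items():
            if extensions and not any(file_name.endswith(ext) for ext in extensions):
                continue  # extension constraint from the planner
            if file_names and file_name not in file_names:
                continue  # explicit file-name constraint
            for section, chunk_ids in sections.items():
                candidate_chunks.extend(chunk_ids)  # all descendant chunks
    return candidate_chunks

hierarchy = {
    "src": {
        "main.py": {"Overview": ["c1"], "Usage": ["c2"]},
        "notes.md": {"Intro": ["c3"]},
    },
}
py_chunks = restrict_scope(hierarchy, extensions=[".py"])
```

The returned chunk ids would then define the candidate pool that BM25 and FAISS score, which is how a dense search over unrelated folders is avoided entirely.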
Semantic scoring inside the symbolic retriever is lightweight and structural:
- Uses section titles, not chunk text
- Helps rank chunks within the already-filtered space
- Never competes with FAISS or BM25 as a primary scorer
This separation avoids:
- Global dense search over irrelevant folders
- Semantic score dilution across unrelated domains
- Redundant computation already handled by FAISS
In this architecture:
- Symbolic retrieval = intent-aligned filtering
- Cluster retrieval = concept-level recall
- FAISS / BM25 = fine-grained relevance
This makes retrieval:
- Interpretable
- Planner-controllable
- Efficient at scale
- Aligned with human information-seeking behavior
The symbolic retriever encodes explicit human structure:
- Section titles
- Document hierarchy
- File and folder organization
Symbolic signals are treated as high-precision but low-recall. They are never assumed to be semantically complete and are always combined with semantic retrievers under planner control.
NSHG‑RAG is intentionally open‑ended. Planned and exploratory directions include:
- Temporal and versioned knowledge
- Retrieval‑answer consistency verification
- Contradiction detection across evidence
- Confidence estimation and abstention
- Learned retrieval policies
- Adversarial and injection‑resistant retrieval
NSHG‑RAG is built for:
- Researchers studying RAG reliability
- Engineers designing knowledge systems
- Practitioners moving beyond demo‑level RAG
If your goal is a chatbot, this is overkill. If your goal is trustworthy knowledge synthesis, this is the right level of abstraction.
MIT License
NSHG‑RAG treats RAG not as retrieval‑augmented text generation, but as externalized cognition — where memory, control, and reasoning are explicit, inspectable, and improvable.