NLP & LLM Methods Documentation

Comprehensive documentation for advanced Natural Language Processing and Large Language Model techniques implemented in the Nexus framework.

Overview

This directory contains detailed documentation for state-of-the-art NLP methods across multiple categories:

Reasoning: Advanced reasoning techniques for complex problem-solving
RAG (Retrieval-Augmented Generation): Methods for knowledge-grounded generation
PEFT (Parameter-Efficient Fine-Tuning): Efficient model adaptation techniques
Quantization: Model compression methods
Embeddings: Dense representation learning
Structured Generation: Constrained decoding techniques
Tokenization: Advanced tokenization methods

Quick Start

New to this documentation?

Quick Reference: Start with QUICK_REFERENCE.md for method selection cheat sheets
Overview: Read SUMMARY.md for comprehensive overview
Deep Dive: Explore individual method documentation

Looking for a specific capability?

Reasoning & Problem-Solving → reasoning/

Chain-of-Thought, Tree of Thoughts, Graph of Thoughts, Self-Consistency, ReAct

Knowledge Retrieval → rag/

Self-RAG, CRAG, GraphRAG, RAPTOR

Efficient Training → peft/

LoRA, QLoRA, DoRA

Model Compression → quantization/

GPTQ, AWQ

Text Representation → embeddings/

BGE-M3, Matryoshka Embeddings

Controlled Generation → structured_generation/

JSON Schema Decoding, Grammar-Constrained Decoding

Tokenization → tokenization/

Byte Latent Transformer, MambaByte

Documentation Structure

10_nlp_llm/
├── README.md                        # This file
├── SUMMARY.md                       # Comprehensive overview
├── QUICK_REFERENCE.md               # Cheat sheets and decision trees
│
├── reasoning/                       # Advanced Reasoning Methods
│   ├── README.md                    # Reasoning landscape overview
│   ├── chain_of_thought.md          # Sequential step-by-step reasoning
│   ├── tree_of_thoughts.md          # Search-based reasoning with BFS/DFS
│   ├── graph_of_thoughts.md         # Graph-structured reasoning
│   ├── self_consistency.md          # Ensemble reasoning via voting
│   └── react.md                     # Tool-augmented reasoning
│
├── rag/                             # Retrieval-Augmented Generation
│   ├── README.md                    # RAG landscape overview
│   ├── self_rag.md                  # Adaptive retrieval with self-reflection
│   ├── crag.md                      # Corrective retrieval with quality assessment
│   ├── graph_rag.md                 # Knowledge graph-based retrieval
│   └── raptor.md                    # Hierarchical tree-based retrieval
│
├── peft/                            # Parameter-Efficient Fine-Tuning
│   ├── README.md
│   ├── lora.md                      # Low-Rank Adaptation
│   ├── qlora.md                     # Quantized LoRA
│   └── dora.md                      # Weight-Decomposed LoRA
│
├── quantization/                    # Model Compression
│   ├── README.md
│   └── gptq.md                      # Post-training quantization
│
├── embeddings/                      # Dense Representations
│   ├── README.md
│   ├── bge_m3.md                    # Multi-lingual, Multi-granular embeddings
│   └── matryoshka_representation_learning.md  # Flexible-dimension embeddings
│
├── structured_generation/           # Constrained Decoding
│   ├── README.md
│   ├── json_schema_decoder.md       # JSON-constrained generation
│   └── grammar_constrained_decoding.md  # Grammar-based constraints
│
└── tokenization/                    # Advanced Tokenization
    ├── README.md
    └── byte_latent_transformer.md   # Byte-level tokenization

Key Features

Comprehensive Coverage

Each method documentation includes:

✅ Overview & Motivation: Why the method exists and when to use it ✅ Theoretical Background: Mathematical formulations and key insights ✅ High-Level Intuition: Flow diagrams and conceptual explanations ✅ Implementation Details: Practical guidance and code patterns ✅ Code Walkthrough: Real examples referencing Nexus implementations ✅ Optimization Tricks: Performance improvements and best practices ✅ Experiments & Results: Benchmark performance from papers ✅ Common Pitfalls: Known issues and how to avoid them ✅ References: Links to original papers and related work

Implementation References

All documentation references actual code in the Nexus framework:

Reasoning: Nexus/nexus/models/nlp/reasoning/
RAG: Nexus/nexus/models/nlp/rag/
Other methods: Nexus/nexus/models/nlp/

Decision Support

Built-in decision trees and comparison matrices help you:

Choose the right method for your task
Understand trade-offs (quality vs. cost)
Compare alternatives side-by-side

Statistics

📊 Total Documentation: ~10,000 lines across 27 markdown files 📚 Methods Covered: 9 reasoning methods, 8 RAG variations, plus PEFT, quantization, embeddings, and more 🔗 Code References: Direct links to 30+ implementation files 📈 Benchmark Results: Performance data from 20+ research papers

Recently Added (New!)

Reasoning Methods

✨ Chain-of-Thought: Complete guide with neural architecture details
✨ Tree of Thoughts: BFS/DFS search with thought evaluation
✨ Graph of Thoughts: Generate-Aggregate-Refine-Score operations
✨ Self-Consistency: Majority voting over diverse reasoning paths
✨ ReAct: Tool-augmented reasoning with Thought-Action-Observation loops

RAG Methods

✨ Self-RAG: Adaptive retrieval with [Retrieve], [IsRelevant], [IsSupported], [IsUseful] tokens
✨ CRAG: Corrective retrieval with Correct/Ambiguous/Incorrect decisions
✨ GraphRAG: Knowledge graph construction and community-based retrieval
✨ RAPTOR: Recursive clustering and multi-level tree retrieval

Usage Examples

Reasoning

# Chain-of-Thought
from nexus.models.nlp.chain_of_thoughts import ChainOfThoughtModule
cot = ChainOfThoughtModule(config)
output = cot(hidden_states)

# Tree of Thoughts
from nexus.models.nlp.reasoning.tree_of_thoughts import TreeOfThoughts
tot = TreeOfThoughts({"max_depth": 3, "search_method": "bfs"})
result = tot.solve("Problem statement")

# Self-Consistency
from nexus.models.nlp.reasoning.self_consistency import SelfConsistency
sc = SelfConsistency({"num_samples": 40})
answer = sc.solve("Question")

RAG

# Self-RAG
from nexus.models.nlp.rag.self_rag import SelfRAGModel
model = SelfRAGModel(config)
outputs = model(input_ids, document_embeddings)

# CRAG
from nexus.models.nlp.rag.crag import CRAGPipeline
crag = CRAGPipeline(config)
outputs = crag(query_embedding, document_embeddings)

# GraphRAG
from nexus.models.nlp.rag.graph_rag import GraphRAGPipeline
graph_rag = GraphRAGPipeline(config)
outputs = graph_rag(query_embedding, community_summaries)

Performance Highlights

Reasoning Methods (GSM8K Math)

Method	Accuracy	Gain over Baseline
Baseline	17.9%	-
Chain-of-Thought	58.1%	+40.2%
Self-Consistency	72.0%	+54.1%
Tree of Thoughts	74.0%	+56.1%

RAG Methods (PopQA)

Method	Accuracy	Gain over Baseline
Baseline LLM	20%	-
Standard RAG	35.2%	+15.2%
Self-RAG	56.1%	+36.1%
CRAG	63.5%	+43.5%

Contributing

To add new method documentation:

Follow the established structure (see any existing method doc)
Include all 9 required sections (Overview, Theory, Intuition, etc.)
Reference actual Nexus implementations
Add benchmark results from papers
Document optimization tricks and pitfalls
Update README files with new method

Support

Code Issues: See Nexus/nexus/models/nlp/
Documentation Issues: Check individual method docs
Quick Help: Use QUICK_REFERENCE.md

References

Foundational Papers

Reasoning:

Wei et al. (2022) - Chain-of-Thought Prompting
Wang et al. (2023) - Self-Consistency
Yao et al. (2023) - Tree of Thoughts
Besta et al. (2024) - Graph of Thoughts
Yao et al. (2023) - ReAct

RAG:

Lewis et al. (2020) - RAG: Retrieval-Augmented Generation
Asai et al. (2023) - Self-RAG
Yan et al. (2024) - CRAG
Edge et al. (2024) - GraphRAG
Sarthi et al. (2024) - RAPTOR

Survey Papers

Liu et al. (2023) - "Beyond Chain-of-Thought: A Survey of Chain-of-X"
Gao et al. (2024) - "Retrieval-Augmented Generation for Large Language Models: A Survey"

License

Documentation follows the same license as the Nexus framework.

Last Updated: February 2026 Version: 1.0 Total Methods Documented: 25+ Lines of Documentation: ~10,000

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

NLP & LLM Methods Documentation

Overview

Quick Start

New to this documentation?

Looking for a specific capability?

Documentation Structure

Key Features

Comprehensive Coverage

Implementation References

Decision Support

Statistics

Recently Added (New!)

Reasoning Methods

RAG Methods

Usage Examples

Reasoning

RAG

Performance Highlights

Reasoning Methods (GSM8K Math)

RAG Methods (PopQA)

Contributing

Support

References

Foundational Papers

Survey Papers

License

FilesExpand file tree

README.md

Latest commit

History

README.md

File metadata and controls

NLP & LLM Methods Documentation

Overview

Quick Start

New to this documentation?

Looking for a specific capability?

Documentation Structure

Key Features

Comprehensive Coverage

Implementation References

Decision Support

Statistics

Recently Added (New!)

Reasoning Methods

RAG Methods

Usage Examples

Reasoning

RAG

Performance Highlights

Reasoning Methods (GSM8K Math)

RAG Methods (PopQA)

Contributing

Support

References

Foundational Papers

Survey Papers

License