A retrieval-augmented generation (RAG) system designed for academic research papers. It combines intelligent text chunking, vector search, and a local LLM to provide context-aware responses to research queries.
- Intelligent Text Chunking: Section-aware paper splitting using research paper structure detection
- Vector Search: Pinecone integration for semantic similarity search
- Local LLM Integration: Ollama-powered language model with RAG augmentation
- Modern Chat Interface: Full-screen black & white UI with RAG toggle
- ArXiv Integration: Direct paper search and PDF processing
- Smart Context Management: Automatic relevance scoring and chunk selection
pipeline/
├── ragpipe/ # Core RAG pipeline
│ ├── services/ # Business logic services
│ │ ├── arxiv_service.py # ArXiv API integration
│ │ ├── pdf_service.py # PDF processing & text extraction
│ │ ├── rag_service.py # RAG orchestration
│ │ ├── section_chunker.py # Intelligent text chunking
│ │ └── vector_service.py # Pinecone vector operations
│ ├── models/ # Data models
│ │ └── paper.py # Academic paper representation
│ ├── config/ # Configuration management
│ │ └── settings.py # App-wide settings
│ └── utils/ # Utility functions
│ └── text_cleaner.py # Text preprocessing
├── llm_integration/ # LLM service layer
│ ├── services/ # LLM services
│ │ ├── llm_service.py # Ollama integration
│ │ ├── rag_orchestrator.py # RAG + LLM coordination
│ │ └── prompt_builder.py # Context-aware prompts
│ ├── models/ # Conversation models
│ ├── config/ # LLM configuration
│ └── main.py # Main application entry
├── chat_app.py # Flask web interface
├── templates/ # HTML templates
└── requirements.txt # Python dependencies
- Python 3.8+
- Ollama with llama2:7b model
- Pinecone account and API key
- ArXiv API access
- Clone the repository

  ```bash
  git clone <repository-url>
  cd pipeline
  ```

- Create and activate a conda environment

  ```bash
  conda create -n pipeline python=3.9
  conda activate pipeline
  ```

- Install dependencies

  ```bash
  pip install -r requirements.txt
  ```

- Install Ollama and pull the model

  ```bash
  # Install Ollama (macOS)
  curl -fsSL https://ollama.ai/install.sh | sh

  # Pull the required model
  ollama pull llama2:7b
  ```

- Set up environment variables

  ```bash
  cp .env.example .env
  # Edit .env with your API keys
  ```
Create a .env file in the project root:
```env
# Pinecone Configuration
PINECONE_API_KEY=your_pinecone_api_key
ENVIRONMENT=your_pinecone_environment
INDEX=your_index_name

# LLM Configuration
LLM_MODEL=llama2:7b
LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_TIMEOUT=120

# RAG Settings
MAX_CONTEXT_LENGTH=4000
RAG_SIMILARITY_THRESHOLD=0.7
DEFAULT_SIMILARITY_THRESHOLD=0.4
```

- `OLLAMA_TIMEOUT`: increased to 120 s to handle RAG context processing
- `DEFAULT_SIMILARITY_THRESHOLD`: set to 0.4 for academic paper relevance
- `MAX_CONTEXT_LENGTH`: limits context size to prevent LLM timeouts
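As a minimal sketch, `ragpipe/config/settings.py` might read these variables along the following lines. The `Settings` dataclass and `load_settings` helper are illustrative names, not the project's actual API; the defaults mirror the values above:

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class Settings:
    """Illustrative settings container; fields mirror the .env keys above."""
    pinecone_api_key: str
    pinecone_environment: str = ""
    index_name: str = ""
    llm_model: str = "llama2:7b"
    ollama_base_url: str = "http://localhost:11434"
    ollama_timeout: int = 120
    max_context_length: int = 4000
    rag_similarity_threshold: float = 0.7
    default_similarity_threshold: float = 0.4

def load_settings() -> Settings:
    """Read configuration from the environment, falling back to the defaults above."""
    env = os.environ
    return Settings(
        pinecone_api_key=env["PINECONE_API_KEY"],  # required, no default
        pinecone_environment=env.get("ENVIRONMENT", ""),
        index_name=env.get("INDEX", ""),
        llm_model=env.get("LLM_MODEL", "llama2:7b"),
        ollama_base_url=env.get("OLLAMA_BASE_URL", "http://localhost:11434"),
        ollama_timeout=int(env.get("OLLAMA_TIMEOUT", "120")),
        max_context_length=int(env.get("MAX_CONTEXT_LENGTH", "4000")),
        rag_similarity_threshold=float(env.get("RAG_SIMILARITY_THRESHOLD", "0.7")),
        default_similarity_threshold=float(env.get("DEFAULT_SIMILARITY_THRESHOLD", "0.4")),
    )
```

Failing fast on a missing `PINECONE_API_KEY` (via `env[...]` rather than `.get`) surfaces misconfiguration at startup instead of at the first vector query.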
Start the services:

```bash
ollama serve        # in one terminal
python chat_app.py  # in another terminal
```

The application will start on http://localhost:5001.
- RAG Mode ON: Queries search your paper database and provide context-augmented responses
- RAG Mode OFF: Direct LLM responses without paper context
- Toggle: Use the RAG toggle switch in the chat header
```bash
# Test the RAG pipeline
python -m ragpipe.main

# Test section chunking
python test_section_chunker.py

# Test chunked RAG integration
python test_chunked_rag.py
```

Intelligent text splitting based on academic paper structure:
- Detects numbered sections (1. Introduction, 2.1 Background)
- Identifies common academic headers (Abstract, Methods, Results)
- Filters out tiny chunks (< 50 characters)
- Configurable chunk sizes (default: 2000 chars max, 50 chars min)
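As a rough sketch, the header detection above can be implemented with a regular expression for numbered sections plus a set of common academic headers. The function below is illustrative (it covers detection and the minimum-size filter, not the 2000-char maximum); the real `section_chunker.py` may differ:

```python
import re

# "1. Introduction", "2.1 Background", etc. (pattern is illustrative)
NUMBERED = re.compile(r"^\d+(\.\d+)*\.?\s+\S")
# Common academic headers (list is illustrative, not exhaustive)
HEADERS = {"abstract", "introduction", "methods", "results",
           "discussion", "conclusion", "references"}

def split_sections(text: str, min_chars: int = 50) -> list[str]:
    """Split paper text at detected section headers, dropping tiny chunks."""
    chunks, current = [], []
    for line in text.splitlines():
        stripped = line.strip()
        is_header = bool(NUMBERED.match(stripped)) or stripped.lower() in HEADERS
        if is_header and current:
            chunks.append("\n".join(current))  # close the previous section
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current))
    return [c for c in chunks if len(c) >= min_chars]  # filter tiny chunks
```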
Coordinates the complete retrieval pipeline:
- Vector database search with quality assessment
- ArXiv paper discovery when needed
- PDF processing and text extraction
- Intelligent chunk selection and relevance scoring
Local language model with RAG capabilities:
- Ollama integration for privacy and performance
- Context-aware prompt building
- Conversation history management
- Automatic RAG/LLM routing based on query type
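A context-aware prompt in the spirit of `prompt_builder.py` might be assembled as below. The template wording, separator, and function name are assumptions, not the project's actual implementation; the length budget corresponds to `MAX_CONTEXT_LENGTH`:

```python
def build_rag_prompt(query: str, chunks: list[str],
                     max_context_length: int = 4000) -> str:
    """Join retrieved chunks into a context block, truncated to the length budget."""
    context, used = [], 0
    for chunk in chunks:
        if used + len(chunk) > max_context_length:
            break  # stay under the context budget to avoid LLM timeouts
        context.append(chunk)
        used += len(chunk)
    context_block = "\n\n---\n\n".join(context)
    return (
        "You are a research assistant. Answer using only the excerpts below.\n\n"
        f"Excerpts:\n{context_block}\n\n"
        f"Question: {query}\nAnswer:"
    )
```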
- User Query → Chat interface
- RAG Processing → Vector search + paper retrieval
- Text Chunking → Section-aware splitting
- Relevance Scoring → Semantic similarity + header matching
- Context Building → Top-N relevant chunks
- LLM Generation → Context-augmented response
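The relevance-scoring step above (semantic similarity plus header matching) could be sketched as follows. The cosine-similarity formulation is standard, but the header bonus and its weight are assumptions about how the two signals might be combined:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def score_chunk(query_vec: list[float], chunk_vec: list[float],
                chunk_header: str, query: str,
                header_bonus: float = 0.1) -> float:
    """Semantic similarity, plus a small bonus when a query term hits the header."""
    score = cosine(query_vec, chunk_vec)
    header = chunk_header.lower()
    if any(term in header for term in query.lower().split()):
        score += header_bonus  # illustrative weight, not the project's value
    return score
```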
The project includes test scripts:
- `test_section_chunker.py`: validates text chunking quality
- `test_chunked_rag.py`: tests RAG pipeline integration
- `test_pdf_content.py`: verifies PDF text extraction
- `ragpipe/main.py`: end-to-end pipeline testing
- Ollama Connection Failed
  - Ensure `ollama serve` is running
  - Check `OLLAMA_BASE_URL` in configuration
- Pinecone Connection Error
  - Verify API key and environment settings
  - Check index name configuration
- PDF Processing Issues
  - Ensure `PyPDF2` is installed
  - Check download directory permissions
- Timeout Errors
  - Increase `OLLAMA_TIMEOUT` for complex queries
  - Reduce `max_chunks` in RAG service
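To narrow down the first failure mode, a quick connectivity check against Ollama's model-listing endpoint (`GET /api/tags`, part of Ollama's HTTP API) can confirm the server is up before debugging the pipeline itself. The helper below is illustrative:

```python
import json
import urllib.error
import urllib.request

def ollama_reachable(base_url: str = "http://localhost:11434",
                     timeout: float = 3.0) -> bool:
    """Return True if the Ollama server answers on its model-listing endpoint."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            models = json.load(resp).get("models", [])
            print(f"Ollama is up; {len(models)} model(s) installed.")
            return True
    except (urllib.error.URLError, OSError):
        return False  # server not running, wrong URL, or network issue
```

If this returns `False`, fix `ollama serve` / `OLLAMA_BASE_URL` first; if it returns `True` but the pulled model is missing from the listing, run `ollama pull llama2:7b`.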