Exploring GenAI and Analytics use-cases with Small Language Models and Retrieval Models.
Small Giants is a curated collection of practical demonstrations showing how small language models (SLMs) and retrieval models can power real-world AI applications. We focus on models in the 350M-3B parameter range, emphasizing efficiency, local inference, and structured outputs over raw scale.
- Small doesn't mean simple: modern small foundation models (such as Liquid AI's LFMs) achieve remarkable performance in specialized domains
- Local-first approach: Run inference on your hardware with Ollama
- Cost-effective: Reduce API costs and improve privacy
- Practical: Built on proven architectures (DSPy, structured extraction, agent orchestration)
- Research-friendly: Playground for exploring scaling laws, Retrieval models, and few-shot learning
📄 Invoice Parser — Document Extraction
Multimodal document processing with structured extraction
Extract utility billing information (amount, currency, type) from invoice images using a two-stage pipeline:
- Stage 1: Vision-language model (Liquid AI LFM2-VL-3B) extracts text from images
- Stage 2: Compact extraction model (LFM2-1.2B-Extract) parses structured data
| Attribute | Value |
|---|---|
| Type | Document Extraction |
| Architecture | DSPy Multi-stage |
| Models | LFM2-VL-3B, LFM2-1.2B-Extract |
| UI | Streamlit |
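The two-stage flow can be sketched with stub model calls. In the real pipeline, DSPy invokes LFM2-VL-3B for Stage 1 and LFM2-1.2B-Extract for Stage 2; the hard-coded OCR output and rule-based parser below are illustrative stand-ins only:

```python
from dataclasses import dataclass

@dataclass
class InvoiceFields:
    amount: float
    currency: str
    bill_type: str

def ocr_stage(image_bytes: bytes) -> str:
    # Stage 1 stand-in: a vision-language model (LFM2-VL-3B via Ollama)
    # would turn the invoice image into raw text. Stubbed for illustration.
    return "Electricity bill. Total due: 42.50 EUR"

def extract_stage(raw_text: str) -> InvoiceFields:
    # Stage 2 stand-in: the extraction model (LFM2-1.2B-Extract) would parse
    # structured fields from the OCR text. A naive rule-based stub:
    tokens = raw_text.split()
    amount, currency = 0.0, "?"
    for i, tok in enumerate(tokens):
        if tok.replace(".", "", 1).isdigit():
            amount = float(tok)
            if i + 1 < len(tokens):
                currency = tokens[i + 1]
    return InvoiceFields(amount=amount, currency=currency,
                         bill_type=tokens[0].lower())

def parse_invoice(image_bytes: bytes) -> InvoiceFields:
    # The agent chains the two stages: image -> text -> structured fields.
    return extract_stage(ocr_stage(image_bytes))
```

Swapping the stubs for actual DSPy modules keeps the same shape: each stage is an isolated callable, so either model can be replaced without touching the other.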
🤖 Granite Coder — Coding Agent
Token-efficient coding agent using IBM Granite 4
A lightweight coding assistant built on the "Greedy" Recursive Language Model (RLM) architecture for token-efficient code assistance. It runs locally via Ollama, with MCP server support for IDE integration.
| Attribute | Value |
|---|---|
| Type | Coding Agent |
| Architecture | RLM (Recursive Language Model) |
| Models | IBM Granite 4 |
| Interface | CLI + MCP Server |
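The recursive idea behind the RLM approach (split a context that exceeds the token budget, answer each piece, then combine the partial answers) can be sketched with a stub model. The stub and the character budget below are illustrative stand-ins for Granite 4 calls via Ollama:

```python
def stub_lm(prompt: str) -> str:
    # Stand-in for a local Granite 4 call; it just "summarizes" its input
    # by keeping everything up to the first period.
    return prompt.split(".")[0] + "."

def recursive_answer(context: str, question: str, max_chars: int = 200) -> str:
    # If the context fits the budget, answer directly in one model call.
    if len(context) <= max_chars:
        return stub_lm(f"{context} Question: {question}")
    # Otherwise split the context, recurse on each half, and combine the
    # partial answers in a final call, keeping every prompt under budget.
    mid = len(context) // 2
    left = recursive_answer(context[:mid], question, max_chars)
    right = recursive_answer(context[mid:], question, max_chars)
    return stub_lm(f"{left} {right} Question: {question}")
```

The token efficiency comes from the budget invariant: no single model call ever sees the full context, only a bounded slice or bounded partial answers.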
Usage:

```bash
cd granite-coder
uv sync

# CLI mode
granite-coder solve "Write a hello world function"

# Interactive chat
granite-coder chat

# MCP server mode
granite-coder mcp
```

🔍 LangChain RAG — Retrieval-Augmented Generation
Local RAG pipeline with Qdrant vector search and RAGAS evaluation
A complete RAG benchmark demonstrating semantic retrieval with local models. Uses Qdrant for vector storage, Ollama for embeddings and generation, and RAGAS for automated evaluation of RAG metrics.
| Attribute | Value |
|---|---|
| Type | RAG Pipeline |
| Architecture | LangChain + Qdrant + RAGAS |
| Models | nomic-embed-text, gpt-oss:20b-cloud |
| Evaluation | RAGAS (faithfulness, relevancy, recall, precision) |
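As a rough intuition for what RAGAS measures, context recall can be approximated as the fraction of ground-truth statements attributable to the retrieved context. The substring check below is a deliberate simplification; RAGAS itself uses an LLM judge for attribution:

```python
def context_recall(ground_truth_facts: list[str], retrieved_context: str) -> float:
    # Fraction of ground-truth statements found in the retrieved context.
    # A toy proxy for the RAGAS metric: real attribution is done by an
    # LLM judge, not by substring matching.
    if not ground_truth_facts:
        return 0.0
    hits = sum(1 for fact in ground_truth_facts
               if fact.lower() in retrieved_context.lower())
    return hits / len(ground_truth_facts)
```

Faithfulness, relevancy, and precision follow the same pattern: each is a ratio of judged claims, so scores stay comparable across pipelines.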
Usage:

```bash
cd langchain-qdrant-ollama-rag
make setup           # Install deps, pull models
make run-baseline    # Run RAG pipeline
make run-ragas-eval  # Run evaluation
```

Planned use-cases exploring advanced GenAI + Analytics concepts:
- Financial Analytics Pipeline: Multi-document reasoning across invoices/receipts/statements with time-series forecasting
- Retrieval Model Framework: Develop evaluators for extraction accuracy, confidence estimation, and model comparison
- Entity Linking System: Connect extracted data to knowledge bases for enriched analysis
- Active Learning Loop: Identify high-uncertainty predictions for human review and model improvement
- Benchmark Suite: Comparative evaluation of small models vs. larger alternatives on document understanding tasks
```bash
# Clone repository
git clone https://github.com/olanigan/small-giants.git
cd small-giants

# Install project dependencies
cd dspy-liquid-agent && uv pip install -e . && cd ..
cd granite-coder && uv sync && cd ..
cd langchain-qdrant-ollama-rag && poetry install && cd ..
```

Invoice Parser:
```bash
cd dspy-liquid-agent
make download-samples  # Optional: create sample invoices
make run               # Launch Streamlit app at http://localhost:8501
```

Granite Coder:
```bash
cd granite-coder
granite-coder solve "What is 2+2?"
```

LangChain RAG:
```bash
cd langchain-qdrant-ollama-rag
# Prerequisites: Qdrant running (docker run -p 6333:6333 qdrant/qdrant)
make setup           # Install deps, pull Ollama models
make run-ragas-eval  # Run RAG pipeline with evaluation
```

Agent-based orchestration using DSPy's modular framework:
- Separates concerns: model inference, data validation, UI
- Supports swappable model providers
- Structured output enforcement via Pydantic schemas
- Extensible for chain-of-thought and optimization
User Input → Agent → Stage 1 (Vision) → Stage 2 (Extraction) → Structured Output
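Structured output enforcement works by validating every model response against a schema before it reaches downstream code. The project does this with Pydantic models inside DSPy signatures; a stdlib-only sketch of the same idea, with hypothetical field names and currency whitelist:

```python
from dataclasses import dataclass

# Hypothetical whitelist for illustration only.
ALLOWED_CURRENCIES = {"USD", "EUR", "GBP"}

@dataclass
class UtilityBill:
    amount: float
    currency: str
    bill_type: str

    def __post_init__(self):
        # Mirrors what a Pydantic validator does: coerce and check each
        # field before the value reaches the rest of the pipeline.
        self.amount = float(self.amount)
        if self.amount < 0:
            raise ValueError("amount must be non-negative")
        self.currency = self.currency.upper()
        if self.currency not in ALLOWED_CURRENCIES:
            raise ValueError(f"unknown currency: {self.currency}")

def parse_model_output(raw: dict) -> UtilityBill:
    # With Pydantic this would be UtilityBill.model_validate(raw);
    # malformed model output fails loudly here instead of downstream.
    return UtilityBill(**raw)
```

Failing at the schema boundary is what makes the stages swappable: any model whose output validates can be dropped in.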
RAG Pipeline using LangChain components:
- Semantic search with Qdrant vector database
- Local inference with Ollama embeddings and generation
- Automated evaluation with RAGAS metrics
User Query → Embedding → Qdrant Search → Context → LLM → Answer → RAGAS Eval
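The retrieval step of this pipeline can be illustrated with a toy nearest-neighbour search. Bag-of-words counts stand in for the dense nomic-embed-text vectors, and a full sort stands in for Qdrant's approximate index:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; the real pipeline uses dense
    # nomic-embed-text vectors served by Ollama.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Standard cosine similarity over sparse term counts.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Qdrant performs this nearest-neighbour search at scale (HNSW index);
    # the top-k passages become the context passed to the generation model.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]
```

Everything after retrieval is prompt assembly: the top-k passages are concatenated into the context that the LLM answers from, which is exactly what RAGAS then scores.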
| Component | Tools |
|---|---|
| Framework | DSPy, LangChain |
| Vector DB | Qdrant |
| Models | Liquid AI LFMs, IBM Granite |
| Inference | Ollama |
| UI | Streamlit |
| Validation | Pydantic |
| Evaluation | RAGAS |
| Dev Tools | Black, isort, mypy, pytest |
We welcome contributions! Whether you're adding new use-cases, improving retrieval models, or enhancing existing pipelines:
- Fork the repository
- Create a feature branch
- Follow existing code style (see Makefile linting commands)
- Submit a pull request
See each use-case's directory for specific contribution guidelines.
If you use Small Giants in your research, please cite:
```bibtex
@software{small_giants_2025,
  author = {Olanigan, Ibrahim},
  title = {Small Giants: GenAI and Analytics with Small Language Models},
  url = {https://github.com/olanigan/small-giants},
  year = {2025}
}
```

- DSPy Documentation
- LangChain Documentation
- Qdrant Documentation
- RAGAS Evaluation
- Liquid AI Models
- Ollama Getting Started
- LLM Efficiency Benchmarks
Questions or ideas? Open an issue or start a discussion in the repository.