

@Kunal-Darekar

## Overview

This PR implements support for multiple OCR/LLM models beyond OpenAI, addressing issue #4. The implementation allows users to choose between different model providers (OpenAI, Anthropic, Hugging Face, and local LLMs via llama.cpp) for document processing and semantic chunking.

## Changes Made

1. Model Provider Infrastructure:
   - Added a new `models` package with provider implementations:
     - `base.py`: abstract base classes and a factory for model providers
     - `openai_provider.py`: OpenAI implementation
     - `anthropic_provider.py`: Anthropic Claude implementation
     - `huggingface_provider.py`: Hugging Face models implementation
     - `local_provider.py`: local LLM implementation using llama.cpp
   - Added `config.py` for centralized configuration management
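The abstract-base-class-plus-factory arrangement in `base.py` can be sketched roughly as follows; the class and function names here (`ModelProvider`, `get_provider`, `complete`) are illustrative assumptions, not the actual Unsiloed API:

```python
from abc import ABC, abstractmethod


class ModelProvider(ABC):
    """Common interface every provider implementation must satisfy."""

    @abstractmethod
    def complete(self, prompt: str) -> str:
        """Send a prompt to the underlying model and return its text reply."""


class OpenAIProvider(ModelProvider):
    def complete(self, prompt: str) -> str:
        # A real implementation would call the OpenAI API here.
        return f"[openai] {prompt}"


# Registry-based factory: each provider registers under a name, and callers
# look implementations up by the modelProvider string.
_PROVIDERS = {"openai": OpenAIProvider}


def get_provider(name: str) -> ModelProvider:
    try:
        return _PROVIDERS[name]()
    except KeyError:
        raise ValueError(f"Unknown model provider: {name}")
```

New providers then only need a subclass plus a registry entry, which keeps the calling code unchanged when a provider is added.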

2. Model-Agnostic Utilities:
   - Added `model_utils.py` with model-agnostic implementations of:
     - semantic chunking
     - document text extraction
     - image processing
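As one hedged illustration of what "model-agnostic" means here, the packing step of a semantic chunker does not depend on any particular provider; the function name and size limit below are assumptions for the sketch, not the `model_utils.py` API:

```python
def pack_paragraphs(text: str, max_chars: int = 2000) -> list[str]:
    """Greedily pack blank-line-separated paragraphs into chunks of at most max_chars."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            # Adding this paragraph would overflow the chunk: flush and start anew.
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

The provider only enters the picture upstream, when topic boundaries are scored; the packing logic stays identical across OpenAI, Anthropic, Hugging Face, and local models.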

3. Main Process Updates:
   - Updated the main process function in `Unsiloed/__init__.py` to support different model providers and their credentials
   - Added a new `modelProvider` parameter to specify which provider to use
   - Enhanced credential handling to support different API keys for different providers

4. Chunking Service Updates:
   - Modified `Unsiloed/services/chunking.py` to use the model-agnostic functions
   - Added a `provider_name` parameter to pass the selected model provider through
   - Updated the result metadata to include the model provider that was used
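The metadata change might look roughly like this; the field names (`model_provider`, `chunk_count`) are assumptions rather than the actual result schema:

```python
def build_result(chunks: list, strategy: str, provider_name: str) -> dict:
    """Assemble a chunking result whose metadata records the provider used."""
    return {
        "chunks": chunks,
        "metadata": {
            "strategy": strategy,
            "model_provider": provider_name,  # new: which provider produced the chunks
            "chunk_count": len(chunks),
        },
    }
```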

5. Chunking Utilities:
   - Updated `Unsiloed/utils/chunking.py` to use the model-agnostic semantic chunking
   - Modified the semantic chunking function to support multiple model providers

6. Dependency Management:
   - Updated `setup.py` to add optional dependencies for the different model providers:
     - `anthropic` for Claude models
     - `huggingface_hub` for Hugging Face models
     - `llama-cpp-python` for local LLM models
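In `setup.py` terms, the optional dependencies above suggest an `extras_require` mapping along these lines (the exact grouping and the unpinned versions are assumptions):

```python
# Optional dependency groups, keyed by the extra name used at install time.
extras = {
    "anthropic": ["anthropic"],
    "huggingface": ["huggingface_hub"],
    "local": ["llama-cpp-python"],
}
# "all" is the union of every group, so `pip install unsiloed[all]`
# pulls in every provider at once.
extras["all"] = sorted({dep for deps in extras.values() for dep in deps})
```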

7. Documentation:
   - Updated README.md with comprehensive documentation for using the different model providers
   - Added examples showing how to use each model provider
   - Added information about the environment variables for the different providers
   - Updated the installation instructions to include the optional dependencies

## How to Use

Users can now specify which model provider to use:

```python
import os

import Unsiloed

# Using OpenAI
result = Unsiloed.process_sync({
    "filePath": "./document.pdf",
    "credentials": {
        "apiKey": os.environ.get("OPENAI_API_KEY")
    },
    "modelProvider": "openai",
    "strategy": "semantic"
})

# Using Anthropic Claude
result = Unsiloed.process_sync({
    "filePath": "./document.pdf",
    "credentials": {
        "anthropicApiKey": os.environ.get("ANTHROPIC_API_KEY")
    },
    "modelProvider": "anthropic",
    "strategy": "semantic"
})

# Using Hugging Face
result = Unsiloed.process_sync({
    "filePath": "./document.pdf",
    "credentials": {
        "huggingfaceApiKey": os.environ.get("HUGGINGFACE_API_KEY")
    },
    "modelProvider": "huggingface",
    "strategy": "semantic"
})

# Using a local LLM
result = Unsiloed.process_sync({
    "filePath": "./document.pdf",
    "credentials": {
        "localModelPath": os.environ.get("LOCAL_MODEL_PATH")
    },
    "modelProvider": "local",
    "strategy": "semantic"
})
```

## Environment Variables

Added support for the following environment variables:

- `OPENAI_API_KEY`: OpenAI API key
- `ANTHROPIC_API_KEY`: Anthropic API key
- `HUGGINGFACE_API_KEY`: Hugging Face API key
- `LOCAL_MODEL_PATH`: path to the local LLM model file
- `UNSILOED_MODEL_PROVIDER`: default model provider
## Installation Options

Users can now install with specific model providers:

```bash
# Basic installation with OpenAI support
pip install unsiloed

# Install with all model providers
pip install "unsiloed[all]"

# Install with specific model providers
pip install "unsiloed[anthropic]"    # Anthropic Claude support
pip install "unsiloed[huggingface]"  # Hugging Face models support
pip install "unsiloed[local]"        # local LLM support via llama.cpp
```

## Testing

The implementation has been tested with:

- OpenAI GPT-4o for semantic chunking
- the different chunking strategies (fixed, semantic, paragraph, heading, page)
- various document types (PDF, DOCX, PPTX)

Closes #4
/claim #4

@Kunal-Darekar
Author

@mubashir-oss Can you please review my PR and tell me if any changes are required?

@mubashir-oss
Contributor

@Kunal-Darekar please share a short recording

