

@Kunal-Darekar

## Overview

This PR implements support for multiple OCR/LLM models beyond OpenAI, addressing issue #4. The implementation allows users to choose between different model providers (OpenAI, Anthropic, Hugging Face, and local LLMs via llama.cpp) for document processing and semantic chunking.

## Changes Made

1. Model Provider Infrastructure:
   - Added a new `models` package with provider implementations:
     - `base.py`: abstract base classes and a factory for model providers
     - `openai_provider.py`: OpenAI implementation
     - `anthropic_provider.py`: Anthropic Claude implementation
     - `huggingface_provider.py`: Hugging Face models implementation
     - `local_provider.py`: local LLM implementation using llama.cpp
   - Added `config.py` for centralized configuration management
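The abstract-base-class-plus-factory arrangement in `base.py` can be sketched roughly as follows; the class and function names here (`ModelProvider`, `get_provider`, `complete`) are illustrative assumptions, not the actual Unsiloed API:

```python
from abc import ABC, abstractmethod


class ModelProvider(ABC):
    """Common interface every provider implementation must satisfy."""

    @abstractmethod
    def complete(self, prompt: str) -> str:
        """Send a prompt to the underlying model and return its text reply."""


class OpenAIProvider(ModelProvider):
    def complete(self, prompt: str) -> str:
        # A real implementation would call the OpenAI API here.
        return f"[openai] {prompt}"


# Registry-based factory: each provider registers under a name, and callers
# look implementations up by the modelProvider string.
_PROVIDERS = {"openai": OpenAIProvider}


def get_provider(name: str) -> ModelProvider:
    try:
        return _PROVIDERS[name]()
    except KeyError:
        raise ValueError(f"Unknown model provider: {name}")
```

New providers then only need a subclass plus a registry entry, which keeps the calling code unchanged when a provider is added.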

2. Model-Agnostic Utilities:
   - Added `model_utils.py` with model-agnostic implementations of:
     - semantic chunking
     - document text extraction
     - image processing
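As one hedged illustration of what "model-agnostic" means here, the packing step of a semantic chunker does not depend on any particular provider; the function name and size limit below are assumptions for the sketch, not the `model_utils.py` API:

```python
def pack_paragraphs(text: str, max_chars: int = 2000) -> list[str]:
    """Greedily pack blank-line-separated paragraphs into chunks of at most max_chars."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            # Adding this paragraph would overflow the chunk: flush and start anew.
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

The provider only enters the picture upstream, when topic boundaries are scored; the packing logic stays identical across OpenAI, Anthropic, Hugging Face, and local models.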

3. Main Process Updates:
   - Updated the main process function in `Unsiloed/__init__.py` to support different model providers and their credentials
   - Added a new `modelProvider` parameter to specify which provider to use
   - Enhanced credential handling to support different API keys for different providers

4. Chunking Service Updates:
   - Modified `Unsiloed/services/chunking.py` to use the model-agnostic functions
   - Added a `provider_name` parameter to pass the selected model provider through
   - Updated the result metadata to include the model provider that was used
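The metadata change might look roughly like this; the field names (`model_provider`, `chunk_count`) are assumptions rather than the actual result schema:

```python
def build_result(chunks: list, strategy: str, provider_name: str) -> dict:
    """Assemble a chunking result whose metadata records the provider used."""
    return {
        "chunks": chunks,
        "metadata": {
            "strategy": strategy,
            "model_provider": provider_name,  # new: which provider produced the chunks
            "chunk_count": len(chunks),
        },
    }
```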

5. Chunking Utilities:
   - Updated `Unsiloed/utils/chunking.py` to use the model-agnostic semantic chunking
   - Modified the semantic chunking function to support multiple model providers

6. Dependency Management:
   - Updated `setup.py` to add optional dependencies for the different model providers:
     - `anthropic` for Claude models
     - `huggingface_hub` for Hugging Face models
     - `llama-cpp-python` for local LLM models
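In `setup.py` terms, the optional dependencies above suggest an `extras_require` mapping along these lines (the exact grouping and the unpinned versions are assumptions):

```python
# Optional dependency groups, keyed by the extra name used at install time.
extras = {
    "anthropic": ["anthropic"],
    "huggingface": ["huggingface_hub"],
    "local": ["llama-cpp-python"],
}
# "all" is the union of every group, so `pip install unsiloed[all]`
# pulls in every provider at once.
extras["all"] = sorted({dep for deps in extras.values() for dep in deps})
```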

7. Documentation:
   - Updated README.md with comprehensive documentation for using the different model providers
   - Added examples showing how to use each model provider
   - Added information about the environment variables for the different providers
   - Updated the installation instructions to include the optional dependencies

## How to Use

Users can now specify which model provider to use:

```python
import os

import Unsiloed

# Using OpenAI
result = Unsiloed.process_sync({
    "filePath": "./document.pdf",
    "credentials": {
        "apiKey": os.environ.get("OPENAI_API_KEY")
    },
    "modelProvider": "openai",
    "strategy": "semantic"
})

# Using Anthropic Claude
result = Unsiloed.process_sync({
    "filePath": "./document.pdf",
    "credentials": {
        "anthropicApiKey": os.environ.get("ANTHROPIC_API_KEY")
    },
    "modelProvider": "anthropic",
    "strategy": "semantic"
})

# Using Hugging Face
result = Unsiloed.process_sync({
    "filePath": "./document.pdf",
    "credentials": {
        "huggingfaceApiKey": os.environ.get("HUGGINGFACE_API_KEY")
    },
    "modelProvider": "huggingface",
    "strategy": "semantic"
})

# Using a local LLM
result = Unsiloed.process_sync({
    "filePath": "./document.pdf",
    "credentials": {
        "localModelPath": os.environ.get("LOCAL_MODEL_PATH")
    },
    "modelProvider": "local",
    "strategy": "semantic"
})
```

## Environment Variables

Added support for the following environment variables:

- `OPENAI_API_KEY`: OpenAI API key
- `ANTHROPIC_API_KEY`: Anthropic API key
- `HUGGINGFACE_API_KEY`: Hugging Face API key
- `LOCAL_MODEL_PATH`: path to the local LLM model file
- `UNSILOED_MODEL_PROVIDER`: default model provider
## Installation Options

Users can now install with specific model providers:

```bash
# Basic installation with OpenAI support
pip install unsiloed

# Install with all model providers
pip install "unsiloed[all]"

# Install with specific model providers
pip install "unsiloed[anthropic]"    # Anthropic Claude support
pip install "unsiloed[huggingface]"  # Hugging Face models support
pip install "unsiloed[local]"        # local LLM support via llama.cpp
```

## Testing

The implementation has been tested with:

- OpenAI GPT-4o for semantic chunking
- the different chunking strategies (fixed, semantic, paragraph, heading, page)
- various document types (PDF, DOCX, PPTX)

Closes #4
/claim #4

@Kunal-Darekar
Author

@mubashir-oss Can you please review my PR and tell me if any changes are required?

@mubashir-oss
Contributor

@Kunal-Darekar please share a short recording

