This project is a powerful AI-powered assistant designed to streamline the creation of detailed and practical Standard Operating Procedures (SOPs). It uses a Retrieval-Augmented Generation (RAG) approach to provide highly contextual and accurate outputs.
By combining an internal knowledge base of your organization's existing SOPs with up-to-date information from the internet, the assistant generates comprehensive documents tailored to your specific needs. This ensures your SOPs are both consistent with your internal policies and current with the latest technology and best practices.
- Local LLM Integration: Powered by Ollama, the application uses a locally-run, quantized LLM (Mistral) for private, efficient, and cost-free text generation.
- Internal Knowledge Base: Existing SOPs (PDFs, DOCX) are ingested, chunked, and stored in a local ChromaDB vector database. This allows the assistant to pull from your organization's unique policies and procedures.
- Real-time Web Search: The application integrates with SerpAPI to perform real-time web searches, ensuring the generated SOPs include the latest information, security updates, and technical details.
- FastAPI Backend: A robust and scalable RESTful API provides a clean interface for generating SOPs, making it easy to integrate with other applications.
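To illustrate the RAG flow described above, here is a minimal, hypothetical sketch of how retrieved SOP chunks and web snippets might be combined into a single prompt. The function name `build_sop_prompt` and the prompt wording are assumptions for illustration, not the project's actual code (see `src/llm_service.py` for the real logic):

```python
def build_sop_prompt(topic, kb_chunks, web_snippets):
    """Combine internal SOP chunks and web search snippets into one prompt.

    All names and the template text below are illustrative; the real
    prompt construction may differ.
    """
    kb_context = "\n\n".join(kb_chunks) if kb_chunks else "None available."
    web_context = "\n\n".join(web_snippets) if web_snippets else "None available."
    return (
        f"Write a detailed Standard Operating Procedure for: {topic}\n\n"
        f"Internal policy context:\n{kb_context}\n\n"
        f"Up-to-date web context:\n{web_context}\n\n"
        "Follow the internal policies; use the web context for current details."
    )


prompt = build_sop_prompt(
    "Employee laptop provisioning",
    ["All laptops must use full-disk encryption."],
    ["Vendor firmware updates are published quarterly."],
)
```

The combined prompt is then sent to the local Mistral model via Ollama.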
```
SOPhi/
├── README.md                 - This file
├── .env.example              - Example file for environment variables
├── .gitignore                - Files to be ignored by Git
├── requirements.txt          - Python dependencies
├── config.py                 - Configuration settings
├── scripts/
│   └── ingest_sops.py        - Script to process and ingest documents
├── src/
│   ├── main.py               - The core FastAPI application
│   ├── llm_service.py        - Handles communication with the local LLM
│   ├── vector_db_service.py  - Manages the ChromaDB vector store
│   └── web_search_service.py - Connects to SerpAPI for web search
├── data/
│   ├── raw_sops/             - Place your raw SOPs (PDF, DOCX) here
│   └── embeddings/           - Local vector store data (generated)
└── examples/
    └── example_usage.py      - A script to demonstrate API usage
```
Before you begin, ensure you have the following installed on your system:
- Python 3.9+
- Ollama (installed and available from your terminal)
- A SerpAPI key
Clone the repository:

```
git clone https://github.com/shussin245/SOPhi.git
cd SOPhi
```

Create and activate a virtual environment, then install the dependencies:

```
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

Create a `.env` file in the project root:

```
SERPAPI_API_KEY="your-serpapi-key-here"
```

Alternatively, edit the provided `.env.example` file and save it as `.env`.
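As a rough illustration of how `config.py` might read this file (the project may well use a library such as python-dotenv instead; this stdlib-only parser is an assumption):

```python
import os


def load_env(path=".env"):
    """Minimal .env parser: loads KEY="value" lines into os.environ.

    Illustrative only -- python-dotenv additionally handles exports,
    inline comments, and multiline values.
    """
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ[key.strip()] = value.strip().strip('"').strip("'")
```

After `load_env()` runs, the key is available as `os.environ["SERPAPI_API_KEY"]`.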
Pull the required models:

```
# Pull the generative LLM
ollama pull mistral

# Pull the embedding model
ollama pull nomic-embed-text
```

Start the Ollama server:

```
ollama serve
```

Add your SOPs to `data/raw_sops/` and run:

```
python -m scripts.ingest_sops
```

This creates the embeddings in `data/embeddings/`.
Start the FastAPI server:

```
uvicorn src.main:app --reload
```

The server runs on http://127.0.0.1:8000.
In a new terminal with your virtual environment activated, run:
```
python examples/example_usage.py
```

Approximate runtimes:

- Ingesting SOPs: ~10 minutes
- Generating SOPs: ~20 minutes
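The example script presumably POSTs a request to the running API. A hypothetical sketch of such a call, using only the standard library (the `/generate` path and the `topic` field are assumptions; check `src/main.py` and `examples/example_usage.py` for the real route and schema):

```python
import json
from urllib import request

API_URL = "http://127.0.0.1:8000/generate"  # hypothetical route


def build_request(topic):
    """Build the JSON request body; the field name is illustrative."""
    return {"topic": topic}


def generate_sop(topic):
    """POST to the local FastAPI server (it must already be running).

    The call blocks until the SOP is generated, which can take minutes.
    """
    data = json.dumps(build_request(topic)).encode("utf-8")
    req = request.Request(
        API_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())


payload = build_request("Onboarding a new remote employee")
```

With the server running, `generate_sop("Onboarding a new remote employee")` would return the generated document.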
Temperature controls the randomness of the generated text:

- 0 → deterministic
- 0.3–0.7 → balanced (recommended)
- 0.9+ → highly creative, risk of hallucinations
Default: 0.5 for SOPs
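Temperature is passed to Ollama through the `options` object of its `/api/generate` endpoint. A sketch assuming direct use of that endpoint rather than the project's own `llm_service.py` wrapper:

```python
import json
from urllib import request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default port


def build_ollama_payload(prompt, temperature=0.5):
    """Request body for Ollama's /api/generate; 0.5 matches the SOP default."""
    return {
        "model": "mistral",
        "prompt": prompt,
        "stream": False,  # return one complete response instead of a stream
        "options": {"temperature": temperature},
    }


def generate(prompt, temperature=0.5):
    """Send the request to a locally running Ollama server."""
    data = json.dumps(build_ollama_payload(prompt, temperature)).encode("utf-8")
    req = request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Lowering `temperature` toward 0 makes repeated generations more reproducible, at the cost of more formulaic output.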
Notes on the embedding model:

- Deterministic vectors for text
- No seed needed for inference
- Seeds only matter for fine-tuning/sampling tasks
Chunk size is a trade-off:

- Smaller chunks = better semantic precision, worse context
- Larger chunks = better context, worse precision
Recommended: 500 chars + 100 overlap
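A minimal sketch of fixed-size chunking with overlap at the recommended settings (the actual ingestion script may instead split on sentence or paragraph boundaries):

```python
def chunk_text(text, chunk_size=500, overlap=100):
    """Split text into fixed-size chunks, each overlapping the previous
    one by `overlap` characters so context isn't lost at the boundaries."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks
```

With the defaults, a 1,000-character document yields three chunks, and the last 100 characters of each chunk repeat at the start of the next.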
Planned improvements:

- Optimize ingestion/generation speed
- Web UI for non-technical users
- Fine-tuned models for domain-specific SOPs
Pull requests and issues are welcome.