The library for building libraries - By NuhaTech
Maktaba, from the Arabic word for "library", is a modern retrieval-augmented generation (RAG) infrastructure for building intelligent knowledge systems in any language.
- 🔌 Provider-agnostic: Works with OpenAI, Cohere, Azure, and more
- 🚀 Production-ready: Built for scale with async-first design
- 🧩 Modular: Use only what you need (see the sketch after this list)
- 🌍 Multilingual: Optimized for Arabic and international languages
- 📊 Type-safe: Full type hints and Pydantic validation
- 🧪 Well-tested: Comprehensive test coverage
- 🔍 Deep research: Built-in iterative planning for long-form reports
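Each component stands alone, so you can pull in a single piece without wiring up a full pipeline. A minimal sketch of standalone embedder use, assuming a hypothetical `embed()` coroutine (see docs/Providers.md for the actual embedder interface):

```python
import asyncio

from maktaba.embedding import OpenAIEmbedder


async def main() -> None:
    # Use the embedder on its own, with no vector store or pipeline.
    # NOTE: embed() is a hypothetical method name for illustration;
    # the real interface is documented in docs/Providers.md.
    embedder = OpenAIEmbedder(api_key="...")
    vectors = await embedder.embed(["What is Tawhid?"])
    print(len(vectors[0]))  # embedding dimensionality


asyncio.run(main())
```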
Install with uv:

```bash
# Install uv if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Add maktaba to your project
uv add maktaba

# With OpenAI + Qdrant
uv add "maktaba[openai,qdrant]"

# With all providers
uv add "maktaba[all]"
```

Or with pip:

```bash
# Basic installation
pip install maktaba

# With OpenAI + Qdrant
pip install "maktaba[openai,qdrant]"

# With all providers
pip install "maktaba[all]"
```

```python
from maktaba.pipeline import QueryPipeline
from maktaba.embedding import OpenAIEmbedder
from maktaba.storage import QdrantStore
from maktaba.reranking import CohereReranker

# Create pipeline
pipeline = QueryPipeline(
    embedder=OpenAIEmbedder(api_key="..."),
    vector_store=QdrantStore(url="http://localhost:6333", collection_name="docs"),
    reranker=CohereReranker(api_key="...")
)

# Search with automatic reranking and citation formatting
result = await pipeline.search(
    query="What is Tawhid?",
    top_k=10,
    rerank=True
)

# Use in your LLM prompt
print(result["formatted_context"])  # [1]: content... [2]: content...
print(result["citations"])          # [{id: 1, source: "...", score: 0.95}, ...]
```

Learn how to customise the default prompts via maktaba_templates.md.
```python
from maktaba.pipeline import create_deep_research_pipeline
from maktaba.embedding import OpenAIEmbedder
from maktaba.storage import QdrantStore
from maktaba.llm import OpenAILLM

pipeline = create_deep_research_pipeline(
    embedder=OpenAIEmbedder(api_key="..."),
    store=QdrantStore(url="http://localhost:6333", collection_name="docs"),
    llm=OpenAILLM(api_key="...", model="gpt-4o-mini"),
)

result = await pipeline.run_research("Impacts of lunar dust on spacecraft design")
chunks = [chunk async for chunk in result.stream]
print("".join(chunks))        # Final long-form report
print(result.queries_used)    # Queries issued during research
print(result.source_indices)  # 1-based indices of retained sources
```

For a full walkthrough (configuration knobs, streaming, stage overrides), see docs/DeepResearch.md and examples/deep_research_pipeline.py.
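If you would rather display the report as it is written, the same stream can be consumed incrementally instead of collected into a list. This sketch assumes `result.stream` yields text chunks in order and, like most async generators, can only be iterated once:

```python
# Print the report incrementally rather than collecting it first.
# Skip the list comprehension above if you use this form: the stream
# can only be consumed once.
async for chunk in result.stream:
    print(chunk, end="", flush=True)
```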
Before pushing to the remote repository, run all quality checks.

Linux/Mac/Git Bash:

```bash
./scripts/check.sh
```

Windows CMD:

```bat
scripts\check.bat
```

This will run:
- Ruff linting
- MyPy type checking
- Pytest tests
All checks must pass before pushing.
- Overview: docs/Overview.md
- Quickstart: docs/Quickstart.md
- Pipelines: docs/Pipelines.md
- Providers: docs/Providers.md
- Examples: docs/Examples.md
- Troubleshooting: docs/Troubleshooting.md
Website (coming soon): maktaba.nuhatech.com
MIT License - see LICENSE
Built by NuhaTech - creators of Kutub and Muqabia.