This is a living document. We update it as priorities shift based on community feedback and production learnings. If something here excites you, open an issue or PR — we move fast on contributions.
- Python SDK (`inferedge-moss`) — async-first, type-safe
- TypeScript SDK (`@inferedge/moss`) — full feature parity with Python
- Built-in embedding models (`moss-minilm`)
- Custom embedding support (bring your own OpenAI, Cohere, etc.)
- Metadata filtering (`$eq`, `$and`, `$in`, `$near`)
- Document management (add, upsert, get, delete)
- LangChain integration
- DSPy integration
- Pipecat voice agent integration
- LiveKit voice agent integration
- Next.js example app
- VitePress search plugin
- Docker deployment examples (ECS/K8s patterns)
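The filter operators listed above follow a Mongo-style syntax. As a rough illustration of the semantics only — not the actual Moss implementation — here is how a `$eq`/`$in`/`$and` filter might evaluate against a document's metadata. The `match_filter` helper and its exact behavior are assumptions for this sketch, and `$near` is omitted since it depends on distance computation:

```python
def match_filter(metadata: dict, filt: dict) -> bool:
    """Evaluate a Mongo-style filter against one document's metadata.

    Hypothetical sketch covering $eq, $in, and $and only; the real
    operator set (including $near) is defined by the Moss SDK.
    """
    for key, cond in filt.items():
        if key == "$and":
            # Every sub-filter in the list must match.
            if not all(match_filter(metadata, sub) for sub in cond):
                return False
        elif isinstance(cond, dict):
            if "$eq" in cond and metadata.get(key) != cond["$eq"]:
                return False
            if "$in" in cond and metadata.get(key) not in cond["$in"]:
                return False
        else:
            # A bare value is shorthand for $eq.
            if metadata.get(key) != cond:
                return False
    return True


doc = {"lang": "en", "year": 2024, "tag": "sdk"}
print(match_filter(doc, {"lang": {"$eq": "en"}}))   # True
print(match_filter(doc, {"$and": [{"year": 2024}, {"tag": {"$in": ["sdk", "cli"]}}]}))  # True
print(match_filter(doc, {"lang": "fr"}))            # False
```

The nesting mirrors how such filters compose: `$and` recurses over sub-filters, while leaf operators compare a single metadata field.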
- WebAssembly runtime — client-side semantic search in the browser, no server required
- Benchmarks directory — reproducible latency/throughput scripts comparing Moss vs Pinecone, Qdrant, and Chroma on standardized datasets
- MCP server — expose Moss as a Model Context Protocol server so any MCP-compatible AI tool (Claude, Cursor, Windsurf) can do semantic search
- npm/PyPI package rename — consolidating package names under the Moss brand
- Vercel AI SDK integration — retrieval provider for the Vercel AI SDK
These are well-scoped and ready for contributors. Each one has (or will have) a corresponding GitHub issue with detailed instructions.
- Swift bindings — for iOS/macOS apps with on-device retrieval (`good first issue`)
- Go bindings — for backend services and CLI tools (`good first issue`)
- Elixir bindings — for Phoenix/LiveView apps (`good first issue`)
- Rust bindings — for performance-critical pipelines (`good first issue`)
- CrewAI — Moss as a retrieval tool for CrewAI agents (`good first issue`)
- Haystack — document store / retriever integration
- AutoGen — retrieval-augmented tool for AutoGen agents
- LlamaIndex — retriever and query engine integration
- Semantic Kernel — .NET/Python retrieval plugin
- Reranking support — plug in cross-encoder rerankers (Cohere Rerank, bge-reranker, etc.) as a post-retrieval step
- Hybrid search — combine semantic search with BM25 keyword matching
- Multi-vector retrieval — support ColBERT-style late interaction models
- Doc-parsing connectors — ingest PDF, DOCX, HTML, and Markdown files directly into Moss indexes
- Chunking strategies — built-in text splitters (sentence, paragraph, recursive, semantic)
- Web crawling — crawl a URL and index the content
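On hybrid search: a common way to combine a semantic ranking with a BM25 keyword ranking is reciprocal rank fusion (RRF). The sketch below shows the fusion idea only — the function name, the `k = 60` constant, and the doc IDs are generic illustration, not the planned Moss API:

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: score each doc by sum(1 / (k + rank)).

    `rankings` holds doc IDs ordered best-first, one list per
    retriever (e.g. one from vector search, one from BM25).
    """
    scores: dict[str, float] = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)


semantic = ["d3", "d1", "d2"]   # vector-similarity order
keyword = ["d1", "d4", "d3"]    # BM25 order
print(rrf_fuse([semantic, keyword]))  # ['d1', 'd3', 'd4', 'd2']
```

Documents that appear in both lists (`d1`, `d3`) accumulate score from each and rise to the top, which is the behavior hybrid search is after.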
These are bigger bets we're exploring. They're directional, not committed — community input will shape what gets built.
- vLLM-based local inference + local search — a fully local pipeline: your model, your embeddings, your search, your hardware. No API calls. This is a natural fit for the privacy-first voice AI use case and can meaningfully cut latency for on-premise deployments.
- Ollama + Moss + Pipecat reference architecture — an end-to-end fully local voice agent: Ollama for LLM inference, Moss for retrieval, Pipecat for real-time audio. A single `docker compose up` to run the entire stack.
- LLM-as-a-judge evaluation framework — automated retrieval quality scoring using LLM judges. We want to lay the foundation and let the community decide the direction — what metrics matter, which judges to support, how to benchmark fairly.
- Retrieval quality dashboard — visualize query performance, relevance scores, and failure modes over time
- Edge runtime support — run Moss in Cloudflare Workers, Deno Deploy, and Vercel Edge Functions
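The judge framework could start as little more than a scoring loop around a pluggable judge callable. A minimal sketch of that shape — the judge below is a stand-in stub (a real setup would prompt an LLM), and the metric design is exactly what's open for community input:

```python
from typing import Callable


def evaluate_retrieval(
    runs: list[tuple[str, list[str]]],
    judge: Callable[[str, str], float],
) -> float:
    """Score each (query, retrieved_passages) pair with a judge that
    returns relevance in [0.0, 1.0], then average over the run."""
    per_run = []
    for query, passages in runs:
        per_query = [judge(query, passage) for passage in passages]
        per_run.append(sum(per_query) / len(per_query))
    return sum(per_run) / len(per_run)


# Stub judge for demonstration only; a real judge would call an LLM.
def keyword_judge(query: str, passage: str) -> float:
    return 1.0 if query.lower() in passage.lower() else 0.0


runs = [("vector search", ["Moss does vector search.", "Unrelated text."])]
print(evaluate_retrieval(runs, keyword_judge))  # 0.5
```

Keeping the judge a plain callable is one way to leave the hard questions — which judges, which metrics, how to benchmark fairly — to the community, as the item above intends.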
- Pick something from "Next Up" — these are ready for PRs
- Check the issues — look for `good first issue` and `help wanted` labels
- Propose something new — open an issue describing what you want to build. We're open to ideas that aren't on this list.
- Read the Contributing Guide — fork, branch from `main`, PR
If you're unsure where to start, drop a message in Discord and we'll point you in the right direction.