v0.2.1
Release date: 2025.10.22
## Highlights
- Comprehensive Multimodal Upgrade: Both the Retriever and Generation Servers now support multimodal inputs, enabling a complete end-to-end multimodal workflow from retrieval to generation.
- Corpus Parsing and Chunking Redesign: The Corpus Server adds multi-format file parsing with deep MinerU integration, supporting token-level, sentence-level, and customizable chunking strategies to flexibly adapt to diverse corpus structures.
- Unified Deployment and Efficient Inference: The Retriever and Generation Servers integrate with standardized serving frameworks such as vLLM, supporting offline inference and multi-engine adaptation to accelerate experimentation.
- Enhanced Evaluation and Experimentation Workflow: New TREC-based retrieval evaluation and significance-testing modules, together with parallel experiment execution and multimodal result visualization, streamline research assessment.
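The sentence-level chunking highlighted above can be sketched as a greedy packer that fills each chunk up to a size budget. This is an illustrative sketch only; `sentence_chunks` and the `max_chars` parameter are hypothetical names, not the Corpus Server's actual API:

```python
import re

def sentence_chunks(text, max_chars=200):
    """Greedily pack sentences into chunks of at most max_chars characters.

    Illustrative sketch: a single sentence longer than max_chars still
    becomes its own (oversized) chunk rather than being split mid-sentence.
    """
    # Naive sentence split: break after ., !, or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sent in sentences:
        if current and len(current) + 1 + len(sent) > max_chars:
            chunks.append(current)
            current = sent
        else:
            current = f"{current} {sent}".strip() if current else sent
    if current:
        chunks.append(current)
    return chunks
```

Token-level (word/character) chunking follows the same greedy pattern with a token budget in place of the character budget.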
## What's Changed
- Corpus Server supports plain text extraction from .txt, .md, .pdf, .xps, .oxps, .epub, .mobi, and .fb2 files. @mssssss123
- Corpus Server adds simple per-page image conversion for .pdf files. @mssssss123
- Corpus Server integrates MinerU for high-precision PDF parsing. @mssssss123
- Corpus Server introduces a new chunking strategy supporting token-level (word/character) segmentation. @mssssss123
- Corpus Server supports sentence-level chunking. @mssssss123
- Corpus Server supports customizable chunking rules (default rule recognizes Markdown sections; other rules can be extended via config files). @mssssss123
- Retriever Server supports three retrieval engines: Infinity, Sentence-Transformers, and OpenAI. @mssssss123
- Retriever Server supports multimodal retrieval. @mssssss123
- Retriever Server adds BM25 sparse retrieval. @xhd0728
- Retriever Server supports hybrid retrieval (dense + sparse). @mssssss123
- Retriever Server provides standardized deployment based on vLLM, unified under the OpenAI-compatible API. @xhd0728
- Retriever Server supports online retrieval via Exa, Tavily, and ZhipuAI. @xhd0728
- Reranker Server supports Infinity, Sentence-Transformers, and OpenAI ranking engines. @xhd0728
- Generation Server supports multimodal inference. @mssssss123
- Generation Server introduces vLLM offline inference, significantly improving experimental efficiency. @mssssss123
- Generation Server supports Hugging Face inference for local debugging. @xhd0728
- Evaluation Server supports TREC retrieval evaluation. @xhd0728
- Evaluation Server supports TREC significance testing. @xhd0728
- VisRAG Pipeline enables an end-to-end workflow from local PDF ingestion to multimodal retrieval and generation. @mssssss123
- RAG Client supports running multiple experiments in parallel under the same pipeline through custom parameter files. @mssssss123
- UltraRAG Benchmark adds six new VQA datasets, including wiki2024 and corresponding VQA corpora. @mssssss123 @xhd0728 @hm1229
- Case Study UI adds multimodal result visualization support. @mssssss123
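Hybrid retrieval as listed above combines dense and sparse (BM25) scores. A common recipe, shown here as a minimal sketch rather than the Retriever Server's actual fusion logic, is to min-max normalize each score set per query and take a weighted sum; `hybrid_fuse` and `alpha` are hypothetical names:

```python
def hybrid_fuse(dense_scores, sparse_scores, alpha=0.5):
    """Fuse dense and sparse retrieval scores into one ranking.

    dense_scores / sparse_scores: dicts mapping doc id -> raw score.
    alpha weights the normalized dense side; (1 - alpha) the sparse side.
    Returns doc ids sorted by fused score, highest first.
    """
    def normalize(scores):
        # Min-max normalize so both score sets live on [0, 1].
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {d: (s - lo) / span for d, s in scores.items()}

    dense, sparse = normalize(dense_scores), normalize(sparse_scores)
    fused = {
        d: alpha * dense.get(d, 0.0) + (1 - alpha) * sparse.get(d, 0.0)
        for d in set(dense) | set(sparse)
    }
    return sorted(fused, key=fused.get, reverse=True)
```

Sweeping `alpha` between 0 (pure sparse) and 1 (pure dense) is a cheap way to tune the dense/sparse balance per corpus.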
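The TREC significance testing mentioned above typically compares two runs on per-query metric values. One standard choice in IR evaluation is an exact paired permutation (sign-flip) test; the sketch below illustrates the idea under assumed names (`paired_permutation_test` is not the Evaluation Server's actual API):

```python
from itertools import product

def paired_permutation_test(scores_a, scores_b):
    """Exact two-sided paired permutation (sign-flip) test.

    scores_a / scores_b: per-query metric values (e.g. AP or nDCG)
    for two runs over the same query set. Returns the p-value for
    the observed mean difference. Exhaustive enumeration of the
    2^n sign assignments, so suitable only for small query sets;
    larger sets would sample sign flips instead.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    count = total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed:
            count += 1
    return count / total
```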