feat: Simple ReRanker local models. #190
Force push was due to black linting. All done! ✨ 🍰 ✨
1af1023 to b0fbc78
Pull request overview
This PR adds a new /rerank endpoint to enable document reranking using open source models via the rerankers library. The implementation allows users to submit a query and a list of documents to be reranked based on relevance, with optional control over the number of top results returned.
Key Changes:
- Added `rerankers` library dependencies with transformers and flashrank support
- Implemented `/rerank` endpoint that accepts queries and documents for reranking
- Configured Docker Compose with NVIDIA runtime and HuggingFace cache volume for model support
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| requirements.txt | Added rerankers library with transformers and flashrank extras for document reranking functionality |
| docker-compose.yaml | Added NVIDIA runtime support and HuggingFace cache volume mount to support GPU-accelerated model inference |
| app/routes/document_routes.py | Implemented reranker instance initialization and /rerank endpoint handler with document processing logic |
| app/models.py | Added QueryMultipleDocs Pydantic model to define request schema for the rerank endpoint |
This is a PR as per the suggestion from danny-avila/LibreChat#9102
This adds a `/rerank` endpoint that uses open source models to rerank documents. The endpoint needs a query to rerank against and the documents to rank. It also accepts the number of results to return, `k`, and a configuration that sets the model and keys needed to run the operation. All available configuration options can be found at https://github.com/AnswerDotAI/rerankers, which this endpoint is a thin wrapper over.
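To illustrate the request flow described above (a query, a list of documents, and an optional `k`), here is a minimal standalone sketch. It is not the code from this PR: `RerankRequest`, `overlap_score`, and `rerank` are hypothetical names, and the word-overlap scorer is only a stand-in for a real model loaded through the rerankers library.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class RerankRequest:
    # Hypothetical request shape for illustration; the PR defines its own
    # schema in the QueryMultipleDocs Pydantic model.
    query: str
    docs: List[str]
    k: Optional[int] = None  # number of top results to return; None means all


def overlap_score(query: str, doc: str) -> float:
    """Toy relevance score: fraction of query words that also appear in the doc.

    Stand-in for a real reranking model; not part of the rerankers library.
    """
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & doc_words) / len(query_words)


def rerank(
    request: RerankRequest,
    score: Callable[[str, str], float] = overlap_score,
) -> List[Dict[str, object]]:
    """Score every document against the query and return the best k."""
    scored = [
        {"index": i, "text": doc, "score": score(request.query, doc)}
        for i, doc in enumerate(request.docs)
    ]
    # Highest score first; Python's sort is stable, so ties keep input order.
    scored.sort(key=lambda item: item["score"], reverse=True)
    if request.k is not None:
        scored = scored[: max(request.k, 0)]
    return scored
```

Keeping the scorer as a parameter means the ranking and top-`k` logic can be exercised without downloading a model, and a model-backed scorer can be passed in later.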
Test call
Expected response:
I realized that sending the model with each call is not the right approach: the model should be loaded once to improve performance, so it is now configured through the environment of the rag_api repository.
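The load-once idea can be sketched as follows, assuming the configuration comes from environment variables. The variable names `RERANKER_MODEL_TYPE` and `RERANKER_MODEL_NAME`, their placeholder defaults, and the `DummyReranker` class are all hypothetical; the names actually used by rag_api may differ.

```python
import os
from functools import lru_cache


class DummyReranker:
    """Placeholder for a real model object; it only records its configuration."""

    def __init__(self, model_type: str, model_name: str) -> None:
        self.model_type = model_type
        self.model_name = model_name


@lru_cache(maxsize=1)
def get_reranker() -> DummyReranker:
    """Build the reranker on first use and reuse it for every later call."""
    # Hypothetical variable names and placeholder defaults, for illustration only.
    model_type = os.environ.get("RERANKER_MODEL_TYPE", "example-type")
    model_name = os.environ.get("RERANKER_MODEL_NAME", "example-model")
    return DummyReranker(model_type, model_name)
```

With this shape, a request handler calls `get_reranker()` on every request, but the constructor only runs once per process. Changing the model then means changing the environment and restarting the service.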