Add compatibility w/ JinaAI Reranker API and Cohere Rerank API #797

Open
alvarobartt wants to merge 13 commits into main from
add-jinaai-and-cohere-rerank-apis

Member

@alvarobartt alvarobartt commented Jan 7, 2026

What does this PR do?

This PR adds re-ranking routes compatible with the JinaAI Reranker API (v1) at /v1/rerank and the Cohere Rerank API (v2) at /v2/rerank.

This PR also adds uuid with --features v4 as a dependency. It is required to generate the IDs that appear when using either the JinaAI Reranker API (included in the error messages) or the Cohere Rerank API (included in both the response and the errors).
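As a rough illustration of the role these IDs play, the sketch below attaches a random version-4 UUID to a response-like dictionary. It is written in Python rather than Rust, and the helper name `with_request_id` and the `id` key are assumptions made for illustration, not the code in this PR.

```python
import uuid


def with_request_id(payload: dict) -> dict:
    """Return a copy of `payload` with a random UUIDv4 under a hypothetical `id` key."""
    return {"id": str(uuid.uuid4()), **payload}


# The ID is random, so it differs on every call.
response = with_request_id({"results": []})
print(response["id"])
```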

Warning

The implementations are aligned with their counterparts on both JinaAI and Cohere, but there are some considerations:

  • JinaAI errors cannot be reproduced 1:1 in Text Embeddings Inference, so the error specification matches, but the content might differ.
  • Cohere applies per-document truncation, which requires us to tokenize each document in advance, truncate it to the target length, convert it back to a string, and then run the inference (tokenization + forward pass) on each of the query-document pairs. This is a bit tricky, and we might be able to do it better.
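The per-document truncation steps described above can be sketched roughly as follows. This is a minimal Python illustration, not the Rust code in this PR: whitespace splitting stands in for the model tokenizer, and the names `truncate_document`, `build_pairs`, and `max_tokens` are hypothetical.

```python
def truncate_document(document: str, max_tokens: int) -> str:
    """Tokenize, truncate to `max_tokens`, and convert back to a string.

    Assumption: whitespace splitting is a stand-in for the real tokenizer.
    """
    tokens = document.split()      # 1. tokenize the document in advance
    tokens = tokens[:max_tokens]   # 2. truncate to the target length
    return " ".join(tokens)        # 3. convert the tokens back to a string


def build_pairs(query: str, documents: list[str], max_tokens: int) -> list[tuple[str, str]]:
    """Pair the query with each truncated document, ready for inference."""
    return [(query, truncate_document(doc, max_tokens)) for doc in documents]


pairs = build_pairs(
    "capital of the US",
    ["Washington, D.C. is the capital of the United States.", "Carson City"],
    max_tokens=4,
)
print(pairs)
```

Each pair would then be tokenized again and passed through the model, which is the double tokenization cost mentioned above.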

JinaAI Reranker API

Run any text-ranking model from https://huggingface.co/models?pipeline_tag=text-ranking&sort=trending as:

cargo run --release --features ort,http --no-default-features -- --model-id onnx-community/gte-multilingual-reranker-base --dtype float32

Then run the inference via requests in Python (or your preferred alternative):

import requests
import json

url = "http://localhost:3000/v1/rerank"
headers = {"Content-Type": "application/json"}
data = {
    "model": "jina-reranker-v3",
    "query": "Organic skincare products for sensitive skin",
    "top_n": 3,
    "documents": [
        "Organic skincare for sensitive skin with aloe vera and chamomile: Imagine the soothing embrace of nature with our organic skincare range, crafted specifically for sensitive skin. Infused with the calming properties of aloe vera and chamomile, each product provides gentle nourishment and protection. Say goodbye to irritation and hello to a glowing, healthy complexion.",
        "New makeup trends focus on bold colors and innovative techniques: Step into the world of cutting-edge beauty with this seasons makeup trends. Bold, vibrant colors and groundbreaking techniques are redefining the art of makeup. From neon eyeliners to holographic highlighters, unleash your creativity and make a statement with every look.",
        "Bio-Hautpflege für empfindliche Haut mit Aloe Vera und Kamille: Erleben Sie die wohltuende Wirkung unserer Bio-Hautpflege, speziell für empfindliche Haut entwickelt. Mit den beruhigenden Eigenschaften von Aloe Vera und Kamille pflegen und schützen unsere Produkte Ihre Haut auf natürliche Weise. Verabschieden Sie sich von Hautirritationen und genießen Sie einen strahlenden Teint.",
        "Neue Make-up-Trends setzen auf kräftige Farben und innovative Techniken: Tauchen Sie ein in die Welt der modernen Schönheit mit den neuesten Make-up-Trends. Kräftige, lebendige Farben und innovative Techniken setzen neue Maßstäbe. Von auffälligen Eyelinern bis hin zu holografischen Highlightern – lassen Sie Ihrer Kreativität freien Lauf und setzen Sie jedes Mal ein Statement.",
        "Cuidado de la piel orgánico para piel sensible con aloe vera y manzanilla: Descubre el poder de la naturaleza con nuestra línea de cuidado de la piel orgánico, diseñada especialmente para pieles sensibles. Enriquecidos con aloe vera y manzanilla, estos productos ofrecen una hidratación y protección suave. Despídete de las irritaciones y saluda a una piel radiante y saludable.",
        "Las nuevas tendencias de maquillaje se centran en colores vivos y técnicas innovadoras: Entra en el fascinante mundo del maquillaje con las tendencias más actuales. Colores vivos y técnicas innovadoras están revolucionando el arte del maquillaje. Desde delineadores neón hasta iluminadores holográficos, desata tu creatividad y destaca en cada look.",
        "针对敏感肌专门设计的天然有机护肤产品:体验由芦荟和洋甘菊提取物带来的自然呵护。我们的护肤产品特别为敏感肌设计,温和滋润,保护您的肌肤不受刺激。让您的肌肤告别不适,迎来健康光彩。",
        "新的化妆趋势注重鲜艳的颜色和创新的技巧:进入化妆艺术的新纪元,本季的化妆趋势以大胆的颜色和创新的技巧为主。无论是霓虹眼线还是全息高光,每一款妆容都能让您脱颖而出,展现独特魅力。",
        "敏感肌のために特別に設計された天然有機スキンケア製品: アロエベラとカモミールのやさしい力で、自然の抱擁を感じてください。敏感肌用に特別に設計された私たちのスキンケア製品は、肌に優しく栄養を与え、保護します。肌トラブルにさようなら、輝く健康な肌にこんにちは。",
        "新しいメイクのトレンドは鮮やかな色と革新的な技術に焦点を当てています: 今シーズンのメイクアップトレンドは、大胆な色彩と革新的な技術に注目しています。ネオンアイライナーからホログラフィックハイライターまで、クリエイティビティを解き放ち、毎回ユニークなルックを演出しましょう。"
    ],
    "return_documents": False
}

response = requests.post(url, headers=headers, data=json.dumps(data))
print(response.json())

The Python snippet above has been copied from https://jina.ai/reranker/.
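Since the request sets return_documents to False, the caller has to map results back to the input list. The sketch below assumes the response carries a `results` list whose entries have `index` and `relevance_score` fields, as in the JinaAI Reranker API; the helper name `attach_documents` and the sample payload are illustrative only and are not real server output.

```python
def attach_documents(response_json: dict, documents: list[str]) -> list[tuple[float, str]]:
    """Return (score, document) pairs in the order given by the response."""
    return [
        (result["relevance_score"], documents[result["index"]])
        for result in response_json["results"]
    ]


# Hand-written payload for illustration; the field names are an assumption.
example = {
    "results": [
        {"index": 2, "relevance_score": 0.9},
        {"index": 0, "relevance_score": 0.4},
    ]
}
ranked = attach_documents(example, ["doc a", "doc b", "doc c"])
print(ranked)
```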

Cohere Rerank API

Run any text-ranking model from https://huggingface.co/models?pipeline_tag=text-ranking&sort=trending as:

cargo run --release --features ort,http --no-default-features -- --model-id onnx-community/gte-multilingual-reranker-base --dtype float32

Then run the inference via Python with Cohere Python SDK as:

import cohere

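# Assumption: point the SDK at the local server; the API key is a placeholder.
co = cohere.ClientV2(api_key="unused", base_url="http://localhost:3000")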

docs = [
    "Carson City is the capital city of the American state of Nevada.",
    "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan.",
    "Capitalization or capitalisation in English grammar is the use of a capital letter at the start of a word. English usage varies from capitalization in other languages.",
    "Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district.",
    "Capital punishment has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states.",
]

response = co.rerank(
    model="",
    query="What is the capital of the United States?",
    documents=docs,
    top_n=3,
)
print(response)

The Python snippet above has been copied from https://docs.cohere.com/reference/rerank.
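To make the effect of top_n concrete, here is a small sketch of how a list of per-document scores could be turned into a ranked result list. It assumes results are sorted by descending score and cut to top_n entries; the function name `top_n_results` and the scores are made up for illustration and do not come from the PR.

```python
def top_n_results(scores: list[float], top_n: int) -> list[dict]:
    """Sort documents by descending score and keep the first `top_n`."""
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return [{"index": i, "relevance_score": s} for i, s in ranked[:top_n]]


# Made-up scores, one per input document.
results = top_n_results([0.1, 0.2, 0.05, 0.9, 0.3], top_n=3)
print(results)
```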

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines.
  • Did you write any new necessary tests? If applicable, did you include or update the insta snapshots?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@alvarobartt alvarobartt linked an issue Jan 9, 2026 that may be closed by this pull request
@alvarobartt alvarobartt marked this pull request as ready for review January 9, 2026 18:21
@alvarobartt alvarobartt changed the title from "Add endpoints for compatibility w/ JinaAI Reranker API and Cohere Rerank API (WIP)" to "Add compatibility w/ JinaAI Reranker API and Cohere Rerank API" Jan 9, 2026
Member Author

alvarobartt commented Jan 9, 2026

Hey guys @vrdn-23, @rowan-fan, as you were both interested in the "OpenAI" compatible routes for text ranking models, namely JinaAI Reranker API and Cohere Rerank API, I invite you both to review + test this PR to see if the current implementation fits your needs 🤗

The PR implements both routes, under v1/rerank for JinaAI and v2/rerank for Cohere. See the PR description above for an example of how to deploy for testing and run inference with Python.

For an easier review please check https://github.com/huggingface/text-embeddings-inference/pull/797/files/d25c565d502720eb95767883cb9ee56743ecab57..1190ce2e3d6808110d55f49087cd4ad028b8748b which only contains the changes related to the implementation of both routes, without the formatting fixes applied by the pre-commit.

@alvarobartt alvarobartt added this to the v1.9.0 milestone Jan 9, 2026
Contributor

vrdn-23 commented Jan 9, 2026

Awesome! Thanks a lot @alvarobartt . I'll be sure to take a look at this early next week! <3

pub(crate) struct RerankResponse(pub Vec<Rank>);

#[derive(Deserialize, ToSchema)]
pub(crate) struct JinaAIRerankRequest {
Contributor

It would be great if users could add their prompt_template etc. to the query. Do you think we could fit it into one of these templates?

Option: String<{query}{document}> that can be formatted with the stringfmt crate. What do you think?
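The template idea in this comment could look roughly like the sketch below. It is written in Python with `str.format` instead of a Rust formatting crate, and the function name `render_pair` and the example template are hypothetical; only the {query} and {document} placeholders come from the suggestion above.

```python
def render_pair(template: str, query: str, document: str) -> str:
    """Fill the {query} and {document} placeholders of a user-supplied template."""
    return template.format(query=query, document=document)


# Hypothetical template a user might pass alongside the request.
template = "Query: {query}\nDocument: {document}"
rendered = render_pair(template, "capital of the US", "Washington, D.C.")
print(rendered)
```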

@alvarobartt alvarobartt modified the milestones: v1.9.0, v1.10.0 Feb 17, 2026

Development

Successfully merging this pull request may close these issues.

Openai compatiblity for Rerank models
