[Feature] Qwen3 Reranker #695

Open: wants to merge 3 commits into main

sigridjineth commented Aug 10, 2025

What does this PR do?

This PR adds support for Qwen3 reranker models to text-embeddings-inference. (Issue: #643)

These models act as binary classifiers over a (query, document) pair. They output a single relevance probability, which makes them well suited to re-ranking search results.

Key Changes

  • Introduced a new ListwiseReranker model type to properly distinguish these models from standard cross-encoder models.
  • Enabled reranker functionality for both the standard and Flash Qwen3 models. This involved:
    • Implementing a predict method to extract the logits for "yes" and "no" tokens.
    • Calculating the final score based on the probability of the "yes" token.
  • The router now automatically identifies reranker models by checking if "reranker" is in the model name or if the is_reranker flag is set in the model's config.
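The detection rule in the last bullet can be sketched as follows. This is a minimal illustration in Python, not the router code from this PR: the function name `looks_like_reranker` is hypothetical, and it assumes the name check is case-insensitive and that `is_reranker` is a top-level boolean in the model config.

```python
def looks_like_reranker(model_id: str, config: dict) -> bool:
    """Return True if a model should be treated as a reranker (illustrative only)."""
    # Assumption: the name check ignores case; the PR text does not specify this.
    name_match = "reranker" in model_id.lower()
    # Assumption: is_reranker is a top-level boolean key in the model config.
    flag_set = bool(config.get("is_reranker", False))
    return name_match or flag_set


print(looks_like_reranker("Qwen/Qwen3-Reranker-0.6B", {}))
print(looks_like_reranker("my-org/custom-model", {"is_reranker": True}))
print(looks_like_reranker("Qwen/Qwen3-Embedding-0.6B", {}))
```

A name-based check is convenient but can misfire on models that merely contain "reranker" in their name, so the explicit config flag gives a way to opt in without relying on naming.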

Technical Details

  • The models use the hidden state of the final token to predict relevance.
  • Scores are calculated from the logits for the "yes" (ID: 9693) and "no" (ID: 2152) tokens.
  • The final output is a probability score between 0 and 1, representing how relevant a document is to the query.
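The scoring step described above can be sketched like this. It is a minimal Python illustration rather than the implementation in this PR: `relevance_score` is a hypothetical helper, the logits are toy values, and it assumes the probability is normalized over only the "yes" and "no" logits (a two-way softmax), which the description does not state explicitly.

```python
import math

YES_TOKEN_ID = 9693  # "yes" token ID, as given in the PR description
NO_TOKEN_ID = 2152   # "no" token ID, as given in the PR description


def relevance_score(last_token_logits) -> float:
    """Probability of "yes" under a softmax over the "yes" and "no" logits.

    `last_token_logits` is anything indexable by token ID that holds the
    vocabulary logits computed from the final token's hidden state.
    """
    yes = last_token_logits[YES_TOKEN_ID]
    no = last_token_logits[NO_TOKEN_ID]
    # Subtract the max before exponentiating for numerical stability.
    m = max(yes, no)
    e_yes = math.exp(yes - m)
    e_no = math.exp(no - m)
    return e_yes / (e_yes + e_no)


# Toy logits: only the two token IDs of interest are populated.
toy_logits = {YES_TOKEN_ID: 2.0, NO_TOKEN_ID: 0.0}
score = relevance_score(toy_logits)
print(round(score, 4))  # 0.8808
```

Because only two logits are compared, the result always lies strictly between 0 and 1, and equal logits give exactly 0.5.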

Who can review?

Anyone in the community is welcome to review the PR once the tests have passed. Feel free to tag anyone who might be interested.

@OlivierDehaene or @Narsil

sigridjineth (Author) commented

The tests pass on my local MacBook Pro.
[image]

@sigridjineth sigridjineth force-pushed the feat/qwen3-reranker branch 3 times, most recently from 004f0a4 to c16c0bd Compare August 10, 2025 19:11

vrdn-23 commented Aug 11, 2025

@sigridjineth Thanks for the great work! Excited to see if this will get merged
Quick question: I think you may have accidentally deleted all the files for the Python backend? Is that intentional?

vrdn-23 commented Aug 11, 2025

I wonder if it would be a simpler code change to support the model as a SequenceClassificationModel as mentioned in this discussion?
https://huggingface.co/Qwen/Qwen3-Reranker-0.6B/discussions/3
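
For illustration only (this sketch is not taken from the linked discussion): one reason a sequence-classification formulation can reproduce the same score is that a two-way softmax over the "yes" and "no" logits equals a sigmoid of their difference, so a single output vector equal to the "yes" weights minus the "no" weights yields the same probability. The weights and hidden state below are made-up toy values, and all names are hypothetical.

```python
import math


def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def softmax_yes(yes_logit: float, no_logit: float) -> float:
    """Probability of "yes" under a softmax over the two logits."""
    m = max(yes_logit, no_logit)
    e_yes = math.exp(yes_logit - m)
    e_no = math.exp(no_logit - m)
    return e_yes / (e_yes + e_no)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


# Toy values standing in for the final-token hidden state and the
# output-projection rows for the "yes" and "no" tokens.
hidden = [0.5, -1.0, 2.0]
w_yes = [0.2, 0.1, 0.4]
w_no = [-0.3, 0.5, 0.1]

# Route 1: compute both logits, then take the two-way softmax.
two_logit_score = softmax_yes(dot(w_yes, hidden), dot(w_no, hidden))

# Route 2: a single classifier vector (w_yes - w_no) followed by a sigmoid.
w_diff = [y - n for y, n in zip(w_yes, w_no)]
single_logit_score = sigmoid(dot(w_diff, hidden))

print(abs(two_logit_score - single_logit_score) < 1e-12)
```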
