[Feature] Qwen3 Reranker #695

sigridjineth · 2025-08-10T19:08:11Z

What does this PR do?

This PR adds support for Qwen3 reranker models to text-embeddings-inference. (Issue: #643)

These models act as binary classifiers that judge whether a document is relevant to a query. They output a single probability score, which makes them well suited to re-ranking search results.

Key Changes

- Introduced a new ListwiseReranker model type to distinguish these models from standard cross-encoder models.
- Enabled reranker functionality for both the standard and Flash Qwen3 models. This involved:
  - Implementing a predict method to extract the logits for the "yes" and "no" tokens.
  - Calculating the final score from the probability of the "yes" token.
- The router now automatically identifies reranker models by checking whether "reranker" appears in the model name or whether the is_reranker flag is set in the model's config.
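
As a rough illustration of the detection rule described above, the check could look like the following Python sketch. This is not the PR's code: the function name `is_reranker_model`, the dict-shaped config, and the case-insensitive name match are all assumptions made for the example.

```python
def is_reranker_model(model_name: str, config: dict) -> bool:
    """Hypothetical helper mirroring the routing rule in the PR description.

    A model is treated as a reranker if its config sets `is_reranker`
    or if its name contains "reranker".
    """
    # An explicit flag in the model config takes priority.
    if config.get("is_reranker") is True:
        return True
    # Assumption: the name match is case-insensitive, so that
    # "Qwen3-Reranker-0.6B" is detected as well.
    return "reranker" in model_name.lower()
```

For example, `is_reranker_model("Qwen/Qwen3-Reranker-0.6B", {})` is true under these assumptions, while an embedding model with no flag set is not.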

Technical Details

The models use the hidden state of the final token to predict relevance.
Scores are calculated from the logits for the "yes" (ID: 9693) and "no" (ID: 2152) tokens.
The final output is a probability score between 0 and 1, representing how relevant a document is to the query.
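
The scoring steps above can be sketched in a few lines of Python. This is a minimal illustration, not the PR's implementation: it assumes the score is a softmax restricted to the two logits (equivalent to a sigmoid of their difference), an unpadded sequence whose last row is the final token, and a hypothetical `lm_head_rows` mapping from token ID to output-projection weights. The token IDs are the ones quoted in the description.

```python
import math

YES_TOKEN_ID = 9693  # "yes" token ID as stated in the PR description
NO_TOKEN_ID = 2152   # "no" token ID as stated in the PR description


def yes_probability(yes_logit: float, no_logit: float) -> float:
    """Two-way softmax over the "yes"/"no" logits, i.e. sigmoid(yes - no)."""
    diff = yes_logit - no_logit
    # Numerically stable sigmoid: never exponentiate a large positive value.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


def rerank_score(hidden_states, lm_head_rows) -> float:
    """Score one query/document pair.

    hidden_states: list of per-token hidden vectors (assumed unpadded).
    lm_head_rows:  hypothetical dict of token ID -> output weight vector.
    """
    last = hidden_states[-1]  # hidden state of the final token

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    yes_logit = dot(last, lm_head_rows[YES_TOKEN_ID])
    no_logit = dot(last, lm_head_rows[NO_TOKEN_ID])
    return yes_probability(yes_logit, no_logit)
```

With equal logits the score is 0.5, and it approaches 1 as the "yes" logit grows relative to the "no" logit, which matches the 0-to-1 range described above.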

Who can review?

Anyone in the community is welcome to review the PR once the tests have passed. Feel free to tag anyone who might be interested.

@OlivierDehaene or @Narsil

sigridjineth · 2025-08-10T19:09:06Z

The tests pass locally on my MacBook Pro.

vrdn-23 · 2025-08-11T17:38:55Z

@sigridjineth Thanks for the great work! Excited to see if this will get merged
Quick question: I think you may have accidentally deleted all the files for the Python backend? Is that intentional?

vrdn-23 · 2025-08-11T17:53:30Z

I wonder if it would be a simpler code change to support the model as a SequenceClassificationModel as mentioned in this discussion?
https://huggingface.co/Qwen/Qwen3-Reranker-0.6B/discussions/3

feat: support qwen3 reranker

c30aebc

sigridjineth force-pushed the feat/qwen3-reranker branch 3 times, most recently from 004f0a4 to c16c0bd Compare August 10, 2025 19:11

refactor: support qwen3 reranker

5763cd3

sigridjineth force-pushed the feat/qwen3-reranker branch from c16c0bd to 5763cd3 Compare August 10, 2025 19:12

sigridjineth mentioned this pull request Aug 10, 2025

Feature Request: Add Support for Qwen3-Reranker Model #643

Open

sigridjineth force-pushed the feat/qwen3-reranker branch from aa3ac4b to 2f91a32 Compare August 11, 2025 14:20

fix: build issues

221323b

sigridjineth force-pushed the feat/qwen3-reranker branch from 2f91a32 to 221323b Compare August 11, 2025 16:16
