Commit 7f039ae

FEAT: [model] support jina-reranker-v3 (xorbitsai#4156)
1 parent 218e875 commit 7f039ae

2 files changed (+17, -3 lines)
  • doc/source/models/model_abilities
  • xinference/model/rerank/sentence_transformers

doc/source/models/model_abilities/embed.rst

Lines changed: 7 additions & 1 deletion
@@ -123,4 +123,10 @@ Does Embeddings API provides integration method for LangChain?
 -----------------------------------------------------------------------------------
 
 Yes, you can refer to the related sections in LangChain's respective official Xinference documentation.
-Here is the link: `Text Embedding Models: Xinference <https://python.langchain.com/docs/integrations/text_embedding/xinference>`_
+Here is the link: `Text Embedding Models: Xinference <https://python.langchain.com/docs/integrations/text_embedding/xinference>`_
+
+
+Does Embeddings API support hrbrid model?
+-----------------------------------------------------------------------------------
+
+Yes, you can use ``flag`` as the engine to deploy the model and call Embeddings API by setting the extra parameter ``return_parse=True`` which will return sparse vectors.

xinference/model/rerank/sentence_transformers/core.py

Lines changed: 10 additions & 2 deletions
@@ -81,6 +81,7 @@ def load(self):
         if (
             self.model_family.type == "normal"
             and "qwen3" not in self.model_family.model_name.lower()
+            and "jina-reranker-v3" not in self.model_family.model_name.lower()
         ):
             try:
                 import sentence_transformers
@@ -109,7 +110,10 @@ def load(self):
             )
             if self._use_fp16:
                 self._model.model.half()
-        elif "qwen3" in self.model_family.model_name.lower():
+        elif (
+            "qwen3" in self.model_family.model_name.lower()
+            or "jina-reranker-v3" in self.model_family.model_name.lower()
+        ):
             # qwen3-reranker
             # now we use transformers
             # TODO: support engines for rerank models
@@ -225,6 +229,7 @@ def rerank(
         if (
             self.model_family.type == "normal"
             and "qwen3" not in self.model_family.model_name.lower()
+            and "jina-reranker-v3" not in self.model_family.model_name.lower()
         ):
             logger.debug("Passing processed sentences: %s", sentence_combinations)
             similarity_scores = self._model.predict(
@@ -235,7 +240,10 @@ def rerank(
             ).cpu()
             if similarity_scores.dtype == torch.bfloat16:
                 similarity_scores = similarity_scores.float()
-        elif "qwen3" in self.model_family.model_name.lower():
+        elif (
+            "qwen3" in self.model_family.model_name.lower()
+            or "jina-reranker-v3" in self.model_family.model_name.lower()
+        ):
 
             def format_instruction(instruction, query, doc):
                 if instruction is None:
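
Review note: the same model-name check now appears four times across load() and rerank(). A minimal sketch of how the routing could be centralized is shown here, assuming only what the hunks show (a "normal" type routed to sentence_transformers unless the name contains "qwen3" or "jina-reranker-v3", which go to the transformers path). The names TRANSFORMERS_PATH_MARKERS, uses_transformers_path and select_backend are hypothetical and do not exist in xinference; the "other" return value stands in for branches this diff does not show.

```python
# Hypothetical helper, not part of this commit or of xinference.
# Substrings that the diff routes to the transformers-based code path.
TRANSFORMERS_PATH_MARKERS = ("qwen3", "jina-reranker-v3")


def uses_transformers_path(model_name: str) -> bool:
    """Return True if the lowercased name contains any marker checked in the diff."""
    lowered = model_name.lower()
    return any(marker in lowered for marker in TRANSFORMERS_PATH_MARKERS)


def select_backend(model_type: str, model_name: str) -> str:
    """Mirror the if/elif pair visible in load() and rerank()."""
    if model_type == "normal" and not uses_transformers_path(model_name):
        # Corresponds to the branch that imports sentence_transformers.
        return "sentence_transformers"
    if uses_transformers_path(model_name):
        # Corresponds to the "now we use transformers" branch.
        return "transformers"
    # Any remaining branches are outside the hunks shown here.
    return "other"


if __name__ == "__main__":
    for name in ("jina-reranker-v3", "Qwen3-Reranker-0.6B", "bge-reranker-base"):
        print(name, "->", select_backend("normal", name))
```

With a helper like this, supporting another transformers-only reranker would mean adding one string to the tuple instead of touching four conditions.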
