[Feature Request] New NVIDIA reranker module #1199

Open
hypoxisaurea wants to merge 3 commits into Marker-Inc-Korea:main from hypoxisaurea:main

@hypoxisaurea

Description

https://build.nvidia.com/nvidia/rerank-qa-mistral-4b

I added the model above.
The unit tests pass, and a test call using my personal API key succeeded.

Related Issue

close #430

@hypoxisaurea marked this pull request as draft January 25, 2026 13:56
@hypoxisaurea marked this pull request as ready for review January 25, 2026 13:56
@vkehfdl1 self-requested a review January 25, 2026 14:00
Contributor

@vkehfdl1 left a comment

Thank you for your amazing contribution!
There are a few things to change, but feel free to leave a comment if you disagree with any of them.
Also, please provide a link to the documentation for the NVIDIA rerank API.
It would also be much appreciated if you wrote proper documentation for this new reranker module. You can create a new markdown file in the docs/source/nodes/passage_reranker folder.

Comment on lines 25 to 27
self.api_key = (
    os.getenv("NVIDIA_API_KEY", None) if self.api_key is None else self.api_key
)
Contributor

Simplify it to this:

self.api_key = self.api_key or os.getenv("NVIDIA_API_KEY", None)
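One behavioural difference between the two forms is worth noting: the `or` version also falls back to the environment variable when `api_key` is an empty string, not only when it is `None`. A minimal sketch of that behaviour follows; `resolve_api_key` is a hypothetical helper written for illustration and is not part of the PR.

```python
import os


def resolve_api_key(api_key=None, env_var="NVIDIA_API_KEY"):
    """Return the explicit key if it is truthy, else the environment variable (or None)."""
    # Hypothetical helper: mirrors `self.api_key or os.getenv("NVIDIA_API_KEY", None)`.
    return api_key or os.getenv(env_var)


# Placeholder value so the demo does not depend on the real environment.
os.environ["NVIDIA_API_KEY"] = "key-from-env"

print(resolve_api_key("explicit-key"))  # explicit key wins
print(resolve_api_key(None))            # falls back to the environment
print(resolve_api_key(""))              # empty string also falls back
```

If an empty string should be treated as a deliberate value, the original `is None` check would be the form to keep.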

Author

Thanks for the detailed review! I've updated the code based on your comments. Please check it again.

response_body = await response.json()

rankings = response_body.get("rankings", [])

Contributor

You should check that the number of input documents matches the number of rankings returned by the NVIDIA API.
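A minimal sketch of such a check, assuming each ranking is a dict with an "index" key as in the mocked responses in this PR. `check_rankings_length` is a hypothetical name, and the index validation goes one step beyond what was requested.

```python
def check_rankings_length(documents, rankings):
    """Raise ValueError if the rankings do not line up with the input documents."""
    if len(rankings) != len(documents):
        raise ValueError(
            f"Expected {len(documents)} rankings, got {len(rankings)}"
        )
    # Optional extra (assumption): every document index appears exactly once.
    indices = sorted(item["index"] for item in rankings)
    if indices != list(range(len(documents))):
        raise ValueError(f"Unexpected ranking indices: {indices}")


documents = ["doc a", "doc b"]
rankings = [{"index": 1, "logit": 0.9}, {"index": 0, "logit": 0.2}]
check_rankings_length(documents, rankings)  # passes silently
```

Raising early keeps a short or malformed response from silently producing fewer results than `top_k` further down the pipeline.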

Author

Thanks for the detailed review! I've updated the code based on your comments. Please check it again.

Comment on lines 148 to 153
def _score(item):
    if item.get("logit") is not None:
        return float(item["logit"])
    if item.get("score") is not None:
        return float(item["score"])
    return 0.0
Contributor

Why are there two types of score? Can you explain what "logit" and "score" mean in the return value? I would like to see the official NVIDIA API documentation.

Author

Thanks for pointing this out.

I checked the official NVIDIA documentation and it states:

"Output Format: list of floats, each the probability score (or raw logits). The user can decide if a Sigmoid activation function is applied to the logits."

Since the API response might vary (either logit or score) depending on the user's configuration, I believe it is safer to keep the logic that checks both fields to ensure AutoRAG works robustly in all scenarios.

I have added a comment to the code to clarify this intent. Do you think this approach is acceptable?
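To make the trade-off concrete, here is a minimal sketch of the fallback discussed in this thread, together with the sigmoid mapping mentioned in the quoted documentation. `extract_score` and `sigmoid` are illustrative names and not the PR's code. The field names "logit" and "score" are taken from the snippet under review and have not been verified against the API.

```python
import math


def extract_score(item):
    """Prefer the "logit" field, fall back to "score", else return 0.0."""
    for key in ("logit", "score"):
        value = item.get(key)
        if value is not None:
            return float(value)
    return 0.0


def sigmoid(x):
    """Numerically stable logistic function mapping a logit to (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


items = [{"index": 0, "logit": -1.5}, {"index": 1, "logit": 2.0}]
ranked = sorted(items, key=extract_score, reverse=True)
print([item["index"] for item in ranked])
print([sigmoid(extract_score(item)) for item in ranked])
```

Because the sigmoid is monotonic, ranking by logit and ranking by sigmoid(logit) give the same order. A response that mixed raw logits and probabilities would not be comparable item to item, and this sketch does not guard against that case.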

Comment on lines 26 to 45
mock_response = {
    "rankings": [
        {"index": 1, "logit": 0.9},
        {"index": 0, "logit": 0.2},
    ]
}
m.post(NVIDIA_RERANK_URL, payload=mock_response)
async with aiohttp.ClientSession() as session:
    session.headers.update(
        {"Authorization": "Bearer mock_api_key", "Accept": "application/json"}
    )
    content_result, id_result, score_result = await nvidia_rerank_pure(
        session,
        NVIDIA_RERANK_URL,
        "nvidia/rerank-qa-mistral-4b",
        queries_example[0],
        contents_example[0],
        ids_example[0],
        top_k=2,
    )
Contributor

Since contents_example[0] contains three documents, the mocked NVIDIA API response should contain three rankings, not two.
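One way to keep the mock consistent with the input is to derive it from one logit per document. This is a sketch only: `build_mock_rankings` is a hypothetical helper, the payload shape follows the mock in the snippet under review, and returning rankings sorted by descending logit is an assumption about the real API.

```python
def build_mock_rankings(logits):
    """Build a mock payload with exactly one ranking per input document."""
    # Assumption: rankings come back ordered from most to least relevant.
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return {"rankings": [{"index": i, "logit": logits[i]} for i in order]}


# Three documents in, three rankings out.
mock_response = build_mock_rankings([0.2, 0.9, 0.5])
print(mock_response)
```

The test can still call the function with `top_k=2`; truncation then happens in the code under test rather than in the mock.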

Author

Thanks for the detailed review! I've updated the code based on your comments. Please check it again.

Comment on lines 56 to 71
async def mock_nvidia_rerank_pure(
    session, invoke_url, model, query, documents, ids, top_k, truncate=None
):
    if query == queries_example[0]:
        return (
            [documents[1], documents[2], documents[0]][:top_k],
            [ids[1], ids[2], ids[0]][:top_k],
            [0.8, 0.2, 0.1][:top_k],
        )
    if query == queries_example[1]:
        return (
            [documents[1], documents[0], documents[2]][:top_k],
            [ids[1], ids[0], ids[2]][:top_k],
            [0.8, 0.2, 0.1][:top_k],
        )
    raise ValueError(f"Unexpected query: {query}")
Contributor

Do not mock the pure function itself.
Mocking a whole function that is defined in the AutoRAG package is an anti-pattern.
Instead, mock the aiohttp response with a proper mock payload.
Also, to simplify the tests, try to use the same example and mock response throughout.
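The idea can be sketched as follows: the HTTP layer is replaced while the reranking logic itself runs for real. To stay dependency-free, this sketch uses small hand-written stand-ins for the session and response instead of patching aiohttp. `FakeSession`, `FakeResponse`, `rerank_sketch`, the URL, and the request body shape are all assumptions made for illustration; `rerank_sketch` is not the PR's `nvidia_rerank_pure`.

```python
import asyncio


class FakeResponse:
    """Stand-in for an HTTP response: async context manager with async json()."""

    def __init__(self, payload):
        self._payload = payload

    async def json(self):
        return self._payload

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        return False


class FakeSession:
    """Stand-in for a client session; only post() is provided."""

    def __init__(self, payload):
        self._payload = payload

    def post(self, url, json=None):
        return FakeResponse(self._payload)


async def rerank_sketch(session, url, query, documents, ids, top_k):
    """Hypothetical function under test: real logic, faked transport."""
    body = {"query": query, "passages": documents}  # assumed request shape
    async with session.post(url, json=body) as response:
        data = await response.json()
    rankings = sorted(data["rankings"], key=lambda r: r["logit"], reverse=True)
    rankings = rankings[:top_k]
    return (
        [documents[r["index"]] for r in rankings],
        [ids[r["index"]] for r in rankings],
        [float(r["logit"]) for r in rankings],
    )


# One shared mock payload, reusable across tests: three documents, three rankings.
MOCK_RESPONSE = {
    "rankings": [
        {"index": 1, "logit": 0.9},
        {"index": 2, "logit": 0.5},
        {"index": 0, "logit": 0.2},
    ]
}

contents, doc_ids, scores = asyncio.run(
    rerank_sketch(
        FakeSession(MOCK_RESPONSE),
        "https://example.invalid/rerank",  # placeholder URL
        "query",
        ["doc0", "doc1", "doc2"],
        ["id0", "id1", "id2"],
        top_k=2,
    )
)
print(contents, doc_ids, scores)
```

In the PR's own tests the same effect would come from registering the mock payload on the request URL, as the existing `m.post(NVIDIA_RERANK_URL, payload=mock_response)` call does, and letting the real function run against it.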

Author

Thanks for the detailed review! I've updated the code based on your comments. Please check it again.

Comment on lines 79 to 112
@patch(
    "autorag.nodes.passagereranker.nvidia.nvidia_rerank_pure",
    mock_nvidia_rerank_pure,
)
def test_nvidia_reranker(nvidia_reranker_instance):
    top_k = 3
    contents_result, id_result, score_result = nvidia_reranker_instance._pure(
        queries_example, contents_example, scores_example, ids_example, top_k
    )
    base_reranker_test(contents_result, id_result, score_result, top_k)


@patch(
    "autorag.nodes.passagereranker.nvidia.nvidia_rerank_pure",
    mock_nvidia_rerank_pure,
)
def test_nvidia_reranker_batch_one(nvidia_reranker_instance):
    top_k = 3
    batch = 1
    contents_result, id_result, score_result = nvidia_reranker_instance._pure(
        queries_example,
        contents_example,
        scores_example,
        ids_example,
        top_k,
        batch=batch,
    )
    base_reranker_test(contents_result, id_result, score_result, top_k)


@patch(
    "autorag.nodes.passagereranker.nvidia.nvidia_rerank_pure",
    mock_nvidia_rerank_pure,
)
Contributor

All of these patches have to be changed in line with the previous comment.

Author

Thanks for the detailed review! I've updated the code based on your comments. Please check it again.

@bwook00 self-requested a review January 25, 2026 14:27
Contributor

@bwook00 left a comment

Thank you for your contribution 🚀
Feel free to leave a comment if you disagree with any of the comments.

I've just added a few minor comments.

Contributor

@bwook00 left a comment

Thx for your update 👍 One more minor review

@hypoxisaurea requested a review from bwook00 February 10, 2026 15:09
Development

Successfully merging this pull request may close these issues.

Add NVIDIA reranker module

3 participants