[Feature Request] New NVIDA reranker module#1199
[Feature Request] New NVIDA reranker module#1199hypoxisaurea wants to merge 3 commits intoMarker-Inc-Korea:mainfrom
Conversation
vkehfdl1
left a comment
There was a problem hiding this comment.
Thank you for your amazing contribution!
There are few things to be changed, but feel free to leave a comment if you have another opinion about the comments.
Plus, I really want you to provide documentation about NVIDIA rerank API.
And it is always thankful to write a proper documentation about this new reranker module. You can make a new markdown file in docs/source/nodes/passage_reranker folder.
| self.api_key = ( | ||
| os.getenv("NVIDIA_API_KEY", None) if self.api_key is None else self.api_key | ||
| ) |
There was a problem hiding this comment.
Simplify to this
self.api_key = self.api_key or os.getenv("NVIDIA_API_KEY", None)
There was a problem hiding this comment.
Thanks for the detailed review! I've updated the code based on your comments. Please check it again.
| response_body = await response.json() | ||
|
|
||
| rankings = response_body.get("rankings", []) | ||
|
|
There was a problem hiding this comment.
You should check that the input documents lengths and rankings which is returned from the NVIDIA API are same length.
There was a problem hiding this comment.
Thanks for the detailed review! I've updated the code based on your comments. Please check it again.
| def _score(item): | ||
| if item.get("logit") is not None: | ||
| return float(item["logit"]) | ||
| if item.get("score") is not None: | ||
| return float(item["score"]) | ||
| return 0.0 |
There was a problem hiding this comment.
Why is there a two types of score? Can you provide more information what is "logit" and "score" in the return value? I want to see NVIDIA API official documentation.
There was a problem hiding this comment.
Thanks for pointing this out.
I checked the official NVIDIA documentation and it states:
"Output Format: list of floats, each the probability score (or raw logits). The user can decide if a Sigmoid activation function is applied to the logits."
Since the API response might vary (either logit or score) depending on the user's configuration, I believe it is safer to keep the logic that checks both fields to ensure AutoRAG works robustly in all scenarios.
I have added a comment to the code to clarify this intent. Do you think this approach is acceptable?
| mock_response = { | ||
| "rankings": [ | ||
| {"index": 1, "logit": 0.9}, | ||
| {"index": 0, "logit": 0.2}, | ||
| ] | ||
| } | ||
| m.post(NVIDIA_RERANK_URL, payload=mock_response) | ||
| async with aiohttp.ClientSession() as session: | ||
| session.headers.update( | ||
| {"Authorization": "Bearer mock_api_key", "Accept": "application/json"} | ||
| ) | ||
| content_result, id_result, score_result = await nvidia_rerank_pure( | ||
| session, | ||
| NVIDIA_RERANK_URL, | ||
| "nvidia/rerank-qa-mistral-4b", | ||
| queries_example[0], | ||
| contents_example[0], | ||
| ids_example[0], | ||
| top_k=2, | ||
| ) |
There was a problem hiding this comment.
Since the contents_example[0] has total three document contents, the NVIDIA API should return total three rankings, not two.
There was a problem hiding this comment.
Thanks for the detailed review! I've updated the code based on your comments. Please check it again.
| async def mock_nvidia_rerank_pure( | ||
| session, invoke_url, model, query, documents, ids, top_k, truncate=None | ||
| ): | ||
| if query == queries_example[0]: | ||
| return ( | ||
| [documents[1], documents[2], documents[0]][:top_k], | ||
| [ids[1], ids[2], ids[0]][:top_k], | ||
| [0.8, 0.2, 0.1][:top_k], | ||
| ) | ||
| if query == queries_example[1]: | ||
| return ( | ||
| [documents[1], documents[0], documents[2]][:top_k], | ||
| [ids[1], ids[0], ids[2]][:top_k], | ||
| [0.8, 0.2, 0.1][:top_k], | ||
| ) | ||
| raise ValueError(f"Unexpected query: {query}") |
There was a problem hiding this comment.
Do not mock the pure function itself.
Mocking a whole function that defines in the AutoRAG package is anti-pattern.
Instead, mock the aiohttp's response to proper mock response.
Plus, to simplify the test, try to use same example and mock response throughout the tests.
There was a problem hiding this comment.
Thanks for the detailed review! I've updated the code based on your comments. Please check it again.
| @patch( | ||
| "autorag.nodes.passagereranker.nvidia.nvidia_rerank_pure", | ||
| mock_nvidia_rerank_pure, | ||
| ) | ||
| def test_nvidia_reranker(nvidia_reranker_instance): | ||
| top_k = 3 | ||
| contents_result, id_result, score_result = nvidia_reranker_instance._pure( | ||
| queries_example, contents_example, scores_example, ids_example, top_k | ||
| ) | ||
| base_reranker_test(contents_result, id_result, score_result, top_k) | ||
|
|
||
|
|
||
| @patch( | ||
| "autorag.nodes.passagereranker.nvidia.nvidia_rerank_pure", | ||
| mock_nvidia_rerank_pure, | ||
| ) | ||
| def test_nvidia_reranker_batch_one(nvidia_reranker_instance): | ||
| top_k = 3 | ||
| batch = 1 | ||
| contents_result, id_result, score_result = nvidia_reranker_instance._pure( | ||
| queries_example, | ||
| contents_example, | ||
| scores_example, | ||
| ids_example, | ||
| top_k, | ||
| batch=batch, | ||
| ) | ||
| base_reranker_test(contents_result, id_result, score_result, top_k) | ||
|
|
||
|
|
||
| @patch( | ||
| "autorag.nodes.passagereranker.nvidia.nvidia_rerank_pure", | ||
| mock_nvidia_rerank_pure, | ||
| ) |
There was a problem hiding this comment.
All patches has to be changed due to the previous comment.
There was a problem hiding this comment.
Thanks for the detailed review! I've updated the code based on your comments. Please check it again.
bwook00
left a comment
There was a problem hiding this comment.
Thank you for your contribution 🚀
Feel free to leave a comment if you have another opinion about the comments.
I've just added a few minor comments.
bwook00
left a comment
There was a problem hiding this comment.
Thx for your update 👍 One more minor review
Description
https://build.nvidia.com/nvidia/rerank-qa-mistral-4b
위 모델을 추가하였습니다.
unit test와 개인 api key 호출 테스트에 성공하였습니다.
Related Issue
close #430