Hi!
I'm having trouble understanding the behavior of the Multimodal Contextual Precision metric. The documentation states that, to achieve a high score, relevant statements (or nodes) should be ranked higher than irrelevant ones.
In my case, I evaluated the retrieval step for a question (I'm building a multimodal RAG) and got a score of 1.0 (the maximum). I don't think that's right, since the relevant content only appears in the 3rd retrieved node, which is the one that contains the actual answer.
However, after checking how the metric works internally, I noticed that the model first generates a list of verdicts to determine whether each node is relevant. When these verdicts are generated, Node 3 is magically positioned at the top of the list, even though it wasn't the first retrieved node, as you can see in the image. (It is, of course, predicted as relevant.)
As I mentioned the score was a perfect 1.0 and this is the provided reason.
So I have two questions overall:
1. Does this metric use any kind of internal reranker that moves relevant nodes to the top?
2. Even if the relevant node ends up at the top, I retrieved 14 irrelevant, noisy nodes (which could distract the generator). Shouldn't that penalize or otherwise impact the final precision score?
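For reference, contextual precision is usually computed as a weighted cumulative precision over the ranked relevance verdicts, which would explain both observations. This is a minimal sketch of that standard formula (I'm assuming deepeval follows it; `contextual_precision` here is my own illustrative helper, not the library's API), showing that the position of the relevant node in the verdict list decides everything, while irrelevant nodes ranked *after* the last relevant one contribute nothing:

```python
def contextual_precision(verdicts: list[bool]) -> float:
    """Weighted cumulative precision over ranked relevance verdicts.

    verdicts: relevance of each node, in ranked order (True = relevant).
    Each relevant node at rank k contributes precision@k; irrelevant
    nodes only matter if they appear *above* a relevant node.
    """
    relevant_so_far = 0
    weighted_sum = 0.0
    for k, is_relevant in enumerate(verdicts, start=1):
        if is_relevant:
            relevant_so_far += 1
            weighted_sum += relevant_so_far / k  # precision@k at this rank
    return weighted_sum / relevant_so_far if relevant_so_far else 0.0

# Relevant node ranked 1st, 14 irrelevant after it -> perfect score:
print(contextual_precision([True] + [False] * 14))                # 1.0
# Same relevant node ranked 3rd -> penalized:
print(contextual_precision([False, False, True] + [False] * 12))  # 0.333...
```

So if the verdict list is built (or silently reordered) with Node 3 at the top, a 1.0 score follows mechanically from this formula, and the 14 trailing irrelevant nodes carry no penalty at all.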
Any help is welcome!
Thanks