This PR adds a fix for the issue mentioned in #1108
However, I have a few points to discuss @shahules786:
- I added `conciseness_score` to penalize long summaries, but I also
do not want to promote very short, skimpy summaries; we need to
find a middle ground.
- Is `averaging` a good way to combine `QA_score` and
`conciseness_score`?
- Ranking-based metrics to measure the quality of summarization (as
mentioned by shahul in the above issue)
Based on the conclusions we reach on these discussion points, I will
push more commits; let's keep this PR open till we resolve them.
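To make the averaging question concrete, here is a minimal sketch (a hypothetical helper for discussion, not code from this PR) of how a plain or weighted average behaves at the extremes: a verbatim copy of the context answers every QA question (QA score 1.0) but has conciseness 0, so a plain average caps it at 0.5, while a genuinely concise summary can score higher overall.

```python
def combined_score(qa_score: float, conciseness: float, weight: float = 0.5) -> float:
    """Weighted average of the two scores; weight=0.5 is a plain average.

    Hypothetical sketch for discussion, not this PR's implementation.
    """
    return weight * qa_score + (1 - weight) * conciseness

# A verbatim copy of the context: perfect QA score, zero conciseness -> 0.5.
copy_of_text = combined_score(1.0, 0.0)
# A good concise summary scores higher than the copy.
good_summary = combined_score(0.9, 0.8)
```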
---------
Co-authored-by: Shahules786 <[email protected]>
We also introduce an option to penalize longer summaries by providing a conciseness score. If this option is enabled, the final score is calculated as the weighted average of the summarization score and the conciseness score. The conciseness score ensures that summaries that are just copies of the text do not get a high score, because such copies will obviously answer all questions correctly. Also, we do not want summaries that are empty. We add a small value `1e-10` to the denominator to avoid division by zero.
```{math}
:label: conciseness-score

\text{conciseness score} = 1 - \frac{\min(\text{length of summary}, \text{length of context})}{\text{length of context} + \text{1e-10}}
```
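The formula above can be sketched in a few lines of Python (a hypothetical standalone helper using character length; the actual implementation may measure length differently):

```python
def conciseness_score(summary: str, context: str) -> float:
    # min() caps copies, and summaries longer than the context, at a score near 0;
    # the 1e-10 in the denominator guards against an empty context.
    return 1 - min(len(summary), len(context)) / (len(context) + 1e-10)
```

With this definition, a summary that merely copies the context scores almost exactly 0, while a summary one quarter the length of the context scores about 0.75.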
We also provide a coefficient `coeff` (default value 0.5) to control the relative weighting of the two scores.
The final summarization score is then calculated as:
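One plausible form of this combination, consistent with the weighted average and the `coeff` parameter described above (an assumption for illustration, not confirmed by this document):

```{math}
\text{summarization score} = \text{coeff} \times \text{QA score} + (1 - \text{coeff}) \times \text{conciseness score}
```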
## Example
```{code-block} python
from ragas.metrics import summarization_score
from ragas import evaluate
from datasets import Dataset

data_samples = {
    'contexts': [["A company is launching a new product, a smartphone app designed to help users track their fitness goals. The app allows users to set daily exercise targets, log their meals, and track their water intake. It also provides personalized workout recommendations and sends motivational reminders throughout the day."]],
    'summary': ['A company is launching a fitness tracking app that helps users set exercise goals, log meals, and track water intake, with personalized workout suggestions and motivational reminders.'],
}
dataset = Dataset.from_dict(data_samples)
results = evaluate(dataset, metrics=[summarization_score])
```