README.md (+33 -13)
@@ -87,32 +87,52 @@ Judges within YESciEval are defined as follows:
|`AutoJudge`| Base class for loading and running evaluation models with PEFT adapters. |
|`AskAutoJudge`| Multidisciplinary judge tuned on the ORKGSyn dataset from the Open Research Knowledge Graph. |
|`BioASQAutoJudge`| Biomedical domain judge tuned on the BioASQ dataset from the BioASQ challenge. |
-|`CustomAutoJudge`| Custom LLM that can be used as a judge within YESciEval rubrics|
+|`CustomAutoJudge`| Custom open-source LLM that can be used as a judge within YESciEval rubrics |
-A total of nine evaluation rubrics were defined as part of the YESciEval test framework and can be used via ``yescieval``. Following simple example shows how to import rubrics in your code:
+A total of **23** evaluation rubrics were defined as part of the YESciEval test framework and can be used via ``yescieval``. The following simple example shows how to import rubrics in your code:
```python
-from yescieval import Informativeness, Correctness, Completeness,
-Coherence, Relevancy, Integration,
-Cohesion, Readability, Conciseness
+from yescieval import Informativeness, Correctness, Completeness, Coherence, Relevancy, \
docs/source/quickstart.rst (+21 -1)
@@ -1,7 +1,7 @@
Quickstart
=================
-YESciEval is a library designed to evaluate the quality of synthesized scientific answers using predefined rubrics and advanced LLM-based judgment models. This guide walks you through how to evaluate answers based on **informativeness** using a pretrained judge and parse LLM output into structured JSON.
+YESciEval is a library designed to evaluate the quality of synthesized scientific answers using predefined rubrics and advanced LLM-based judgment models. This guide walks you through how to evaluate answers based on **informativeness** and **gap identification** using a pretrained judge and a custom judge, and how to parse LLM output into structured JSON.
**Example: Evaluating an Answer Using Informativeness + AskAutoJudge**
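A minimal sketch of what this example could look like end to end. The judge construction and the ``evaluate`` call below, as well as their arguments, are illustrative assumptions rather than the documented ``yescieval`` API; check the package reference for the exact signatures.

.. code-block:: python

   import json

   from yescieval import Informativeness, AskAutoJudge

   # Assumed loading step: the real constructor/loader may take different arguments.
   judge = AskAutoJudge(device="cuda")

   question = "What mechanisms drive antimicrobial resistance?"
   answer = "A synthesized answer produced by an LLM ..."

   # Assumed evaluation call: score the answer against the Informativeness rubric
   # and parse the raw LLM output into structured JSON.
   raw_output = judge.evaluate(rubric=Informativeness(), question=question, answer=answer)
   result = json.loads(raw_output)
   print(result)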
@@ -46,6 +46,26 @@ YESciEval is a library designed to evaluate the quality of synthesized scientifi
- Use ``device="cuda"`` if running on a GPU for better performance.
- Add more rubrics such as ``Informativeness``, ``Relevancy``, etc. for multi-criteria evaluation (see the sketch below).
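A compact sketch of the multi-criteria case, under the same assumed API as the sketch above; the judge construction and ``evaluate`` signature are assumptions, not the documented ``yescieval`` interface.

.. code-block:: python

   from yescieval import AskAutoJudge, Informativeness, Relevancy, Readability

   # Assumed API: construct one judge and reuse it across several rubrics.
   judge = AskAutoJudge(device="cuda")
   question = "What mechanisms drive antimicrobial resistance?"
   answer = "A synthesized answer produced by an LLM ..."

   scores = {}
   for rubric in (Informativeness(), Relevancy(), Readability()):
       # Assumed call signature; one raw LLM verdict per rubric.
       scores[type(rubric).__name__] = judge.evaluate(
           rubric=rubric, question=question, answer=answer
       )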
+
+**Example: Evaluating an Answer Using GapIdentification + CustomAutoJudge**
+
+.. code-block:: python
+
+   from yescieval import GapIdentification, CustomAutoJudge
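A sketch of how the gap-identification example might continue from this import. The model identifier, the ``CustomAutoJudge`` constructor arguments, and the ``evaluate`` call are illustrative assumptions, not the documented ``yescieval`` API.

.. code-block:: python

   import json

   from yescieval import GapIdentification, CustomAutoJudge

   # Assumed setup: point the custom judge at an open-source LLM of your choice.
   judge = CustomAutoJudge(model_name="meta-llama/Llama-3.1-8B-Instruct", device="cuda")

   question = "Which open problems remain in neural scaling laws?"
   answer = "A synthesized answer produced by an LLM ..."

   # Assumed evaluation call against the GapIdentification rubric,
   # with the raw LLM output parsed into structured JSON.
   raw_output = judge.evaluate(rubric=GapIdentification(), question=question, answer=answer)
   result = json.loads(raw_output)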