
Commit 1bca1fa

ragas: bump ragas version, pass old rubric in RubricsScore

Before ragas v0.2.11, RubricsScore.rubrics was not being applied properly, so this commit sets v0.2.11 as the minimum version for this library. v0.2.11 also changed the prompt used for domain-specific knowledge evaluation with reference; the new rubric is hardcoded here in case ragas changes its prompts again in the future.

Signed-off-by: Ali Maredia <[email protected]>

Parent: 03afb6c

File tree

2 files changed: +13 −7 lines changed

requirements.txt
Lines changed: 1 addition & 1 deletion

@@ -10,4 +10,4 @@ pandas
 pandas-stubs
 lm-eval>=0.4.4
 httpx
-ragas
+ragas>=0.2.11
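
With this pin, consumers of the library need ragas v0.2.11 or newer. An illustrative way to bring an existing environment in line with the new floor (not part of the commit itself):

    pip install --upgrade "ragas>=0.2.11"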

src/instructlab/eval/ragas.py
Lines changed: 12 additions & 6 deletions

@@ -11,17 +11,24 @@
 from pydantic import BaseModel, ConfigDict, Field
 from ragas.evaluation import EvaluationDataset, EvaluationResult, RunConfig, evaluate
 from ragas.metrics import Metric
-from ragas.metrics._domain_specific_rubrics import (  # the rubrics we must instantiate are located inside of a file marked as private
-    DEFAULT_WITH_REFERENCE_RUBRICS,
-    RubricsScore,
-)
+from ragas.metrics._domain_specific_rubrics import RubricsScore
 
 # Local
 from .evaluator import Evaluator
 from .logger_config import setup_logger
 
 logger = setup_logger(__name__)
 
+# DEFAULT_WITH_REFERENCE_RUBRICS from ragas v0.2.11.
+# This rubric is hardcoded in case ragas makes any changes to their DEFAULT_WITH_REFERENCE_RUBRICS in the future
+SCORING_RUBRICS = {
+    "score1_description": "The response is entirely incorrect, irrelevant, or does not align with the reference in any meaningful way.",
+    "score2_description": "The response partially matches the reference but contains major errors, significant omissions, or irrelevant information.",
+    "score3_description": "The response aligns with the reference overall but lacks sufficient detail, clarity, or contains minor inaccuracies.",
+    "score4_description": "The response is mostly accurate, aligns closely with the reference, and contains only minor issues or omissions.",
+    "score5_description": "The response is fully accurate, completely aligns with the reference, and is clear, thorough, and detailed.",
+}
+
 
 class Sample(TypedDict):
     """
@@ -256,9 +263,8 @@ def _generate_answers_from_model(
 
     @staticmethod
     def _get_metrics() -> List[Metric]:
-        # default set of metrics
         return [
             RubricsScore(
-                rubrics=DEFAULT_WITH_REFERENCE_RUBRICS,
+                rubrics=SCORING_RUBRICS,
            )
        ]
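
For reference, a minimal sketch of how the hardcoded rubric flows into a ragas evaluation, assuming ragas>=0.2.11. Only RubricsScore(rubrics=SCORING_RUBRICS) mirrors this commit; the sample data and the choice of judge model below are illustrative assumptions, not part of the change:

    # Usage sketch, assuming ragas>=0.2.11 and an OpenAI-backed judge model.
    # Only RubricsScore(rubrics=SCORING_RUBRICS) reflects this commit; the
    # dataset contents and judge model are hypothetical.
    from langchain_openai import ChatOpenAI
    from ragas.evaluation import EvaluationDataset, evaluate
    from ragas.llms import LangchainLLMWrapper
    from ragas.metrics._domain_specific_rubrics import RubricsScore

    from instructlab.eval.ragas import SCORING_RUBRICS  # the rubric hardcoded above

    # One hypothetical sample in the user_input/response/reference schema
    # that the with-reference rubric metric expects.
    dataset = EvaluationDataset.from_list(
        [
            {
                "user_input": "What is the capital of France?",
                "response": "The capital of France is Paris.",
                "reference": "Paris is the capital of France.",
            }
        ]
    )

    # The judge LLM scores each response against its reference on the 1-5 rubric.
    judge = LangchainLLMWrapper(ChatOpenAI(model="gpt-4o"))

    result = evaluate(
        dataset=dataset,
        metrics=[RubricsScore(rubrics=SCORING_RUBRICS)],
        llm=judge,
    )
    print(result.to_pandas())

Pinning the rubric text in this library, rather than importing DEFAULT_WITH_REFERENCE_RUBRICS from ragas's private module, keeps scores stable even if ragas rewrites its default prompts in a later release.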
