Commit 41c0c28

i-w-a and Shahules786 authored
[patch] Reduce NaN Occurrences by Simple Prompt Modification for JSON Output for context_precision (#581)
## Overview

During the calculation of `context_precision`, an issue was observed where increasing the amount of context led to a surge in NaN occurrences. Comparatively, `context_recall` does not exhibit this problem. An investigation into the cause of the difference uncovered that the issue stems from whether the prompts specify outputting in JSON format.

## Discovery

It was found that simply specifying JSON output for `context_precision`, as is already done for `context_recall`, significantly reduces the incidence of NaN. Using JSON mode appears to be crucial, as noted in the OpenAI text generation reference for JSON mode:

> "If you don't include an explicit instruction to generate JSON, the model may generate an unending stream of whitespace and the request may run continually until it reaches the token limit."

[OpenAI Text Generation JSON Mode Documentation](https://platform.openai.com/docs/guides/text-generation/json-mode)

A minimal sketch of this JSON-mode behavior at the API level is included after this message.

## Solution

To align with best practices and address the NaN generation issue, I propose updating the prompt for `context_precision` to explicitly instruct the model to produce output in JSON format. This small but impactful change brings `context_precision` in line with how `context_recall` operates and ensures more stable and predictable outcomes when handling larger context volumes.

## Impact

By making this explicit switch to JSON output, we not only follow the guideline provided by OpenAI but also prevent the uncontrolled behavior that can result in a flood of NaN values. This improvement should increase the reliability of calculations within our system and significantly decrease the time spent debugging NaN-related issues.

I look forward to your review and approval of this change, which will help us maintain robustness in our context precision calculations.

Best,
i-w-a

---------

Co-authored-by: Shahules786 <[email protected]>
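
Below is the sketch referenced above: a minimal illustration of the JSON-mode behavior the quoted documentation describes, using the OpenAI SDK directly rather than the code path ragas itself uses; the model name and prompt text are illustrative.

```python
from openai import OpenAI

client = OpenAI()

# JSON mode constrains the completion to valid JSON, but as the quoted docs
# note, the word "JSON" must also appear in the messages themselves --
# omitting the explicit instruction is the runaway-whitespace failure mode
# behind the NaN verdicts.
response = client.chat.completions.create(
    model="gpt-3.5-turbo-1106",  # illustrative; any JSON-mode-capable model
    response_format={"type": "json_object"},
    messages=[
        {
            "role": "user",
            "content": (
                "Verify if the context was useful in arriving at the answer. "
                'Give verdict as "1" if useful and "0" if not. '
                "Output in only valid JSON format."
            ),
        }
    ],
)
print(response.choices[0].message.content)  # e.g. {"verdict": "1"}
```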
1 parent 7dc3d50 commit 41c0c28

File tree

2 files changed: +7 -2 lines changed


src/ragas/llms/prompt.py

Lines changed: 6 additions & 1 deletion
```diff
@@ -91,7 +91,12 @@ def to_string(self) -> str:
         """
         Generate the prompt string from the variables.
         """
-        prompt_str = self.instruction + "\n"
+        added_json_instruction = (
+            "\nOutput in only valid JSON format."
+            if self.output_type.lower() == "json"
+            else ""
+        )
+        prompt_str = self.instruction + added_json_instruction + "\n"
 
         if self.examples:
             # Format the examples to match the Langchain prompt template
```
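
As a quick illustration of the new branch, here is a hypothetical `Prompt` rendered with `to_string()`; the field values are made up, and constructor arguments beyond those visible in the diffs are assumptions about the `Prompt` API.

```python
from ragas.llms.prompt import Prompt

# Hypothetical prompt; only `instruction`, `examples`, and `output_type`
# are visible in this commit -- the other fields are assumed.
demo = Prompt(
    name="demo",
    instruction="Classify the sentiment of the text.",
    examples=[],
    input_keys=["text"],
    output_key="sentiment",
    output_type="json",
)

# With output_type == "json", to_string() now begins with:
#   Classify the sentiment of the text.
#   Output in only valid JSON format.
print(demo.to_string())
```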

src/ragas/metrics/_context_precision.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -18,7 +18,7 @@
 
 CONTEXT_PRECISION = Prompt(
     name="context_precision",
-    instruction="""Given question, answer and context verify if the context was useful in arriving at the given answer. Give verdict as "1" if useful and "0" if not. """,
+    instruction="""Given question, answer and context verify if the context was useful in arriving at the given answer. Give verdict as "1" if useful and "0" if not with json output. """,
     examples=[
         {
             "question": """What can you tell me about albert Albert Einstein?""",
```
