Commit 1c17b20

Update README.md
1 parent b457bfc commit 1c17b20

1 file changed: +1 -1 lines changed

README.md

Lines changed: 1 addition & 1 deletion
@@ -115,7 +115,7 @@ Using JudgeIt framework is simple, just pick what is the task you want to evalua
 - [ ] Query-Rewrite support for More-than 2-turn
 - [ ] Specific support for multiple LLM generated text formats (Like Anthropic etc.)
 
-**Known-Limitation** :
+**Current-Release-Limitation** :
 1. Verbose vs Crisp RAG Answers - We found that the framework sometimes becomes extremely conservative when there is a large gap between the size of the golden text and the generated text. For RAG comparisons, if the golden text is one page long and the generated text is just a few lines, it tends to treat them as dissimilar. We are currently working on a more 'liberal' version of this judge, which will be released soon.
 2. Comparing outputs for two LLMs.- We found that some LLM's preface their answers with some text, which can throw Judge off. When comparing two LLM try to avoid standard text like "I am just an average Language Model...etc". We are working on next version which will do this automatically.
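
Until the release handles these two limitations automatically, inputs can be pre-processed by hand before they reach the judge. The sketch below is only an illustration and is not part of JudgeIt: `strip_preface`, `length_ratio`, and the single regex in `PREFACE_PATTERNS` are hypothetical names and values chosen for this example. It removes a leading boilerplate sentence of the kind quoted in limitation 2 and reports how far apart the golden and generated texts are in length, which is the situation described in limitation 1.

```python
import re

# Hypothetical pattern list (an assumption, not shipped with JudgeIt).
# It matches a leading sentence such as "I am just an average Language Model."
PREFACE_PATTERNS = [
    r"^\s*i am just an? [^.\n]*language model[^.\n]*[.\n]\s*",
]


def strip_preface(text: str) -> str:
    """Remove one leading boilerplate sentence if a known pattern matches."""
    for pattern in PREFACE_PATTERNS:
        text = re.sub(pattern, "", text, count=1, flags=re.IGNORECASE)
    return text.strip()


def length_ratio(golden: str, generated: str) -> float:
    """Word-count ratio of the shorter text to the longer one, in [0, 1]."""
    a, b = len(golden.split()), len(generated.split())
    if max(a, b) == 0:
        return 1.0  # two empty texts are trivially the same length
    return min(a, b) / max(a, b)


if __name__ == "__main__":
    raw = "I am just an average Language Model. Paris is the capital of France."
    print(strip_preface(raw))
    print(length_ratio("one two three four", "one two"))
```

A low `length_ratio` does not mean the judge will be wrong; it only flags pairs where the verbose-versus-crisp limitation is more likely to apply, so those results can be reviewed manually.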
