Conversation
|
Hello @BastienZim, please resolve the merge conflict. |
|
Hey @BastienZim, it seems there were some issues while resolving the conflicts, and your original typo fix is no longer in this PR. Could you please update this PR one more time to restore it? |
|
It seems the evaluation function has been moved to the src/evaluators/openai.py file. Any tips on how I could limit my PR to that file, or should I open a new PR? |
|
We could also discuss whether an LLM call is the best evaluation function for the needle task. As the task is quite straightforward, resorting to the few-shot learning capabilities of LLMs might not be the best solution. I would be interested in implementing something like a ROUGE-based evaluation or a hard-coded evaluation, such as checking whether the answer contains the exact expected string (a boolean indicating whether the needle is there or not). |
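To make the proposal concrete, here is a minimal sketch of what these two non-LLM checks could look like. The function names (`normalize`, `contains_needle`, `rouge_l_recall`) are hypothetical and not part of the repository, and the ROUGE-style score is a simplified, hand-rolled ROUGE-L recall over whitespace tokens rather than the output of any ROUGE library.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def contains_needle(response: str, needle: str) -> bool:
    """Hypothetical hard-coded check: is the needle present word-for-word?"""
    norm_needle = normalize(needle)
    if not norm_needle:
        return False
    # Pad with spaces so the match respects word boundaries.
    return f" {norm_needle} " in f" {normalize(response)} "


def rouge_l_recall(response: str, needle: str) -> float:
    """Simplified ROUGE-L recall: LCS length divided by needle length (in tokens)."""
    ref = normalize(needle).split()
    hyp = normalize(response).split()
    if not ref:
        return 0.0
    # Longest common subsequence via dynamic programming, one row at a time.
    prev = [0] * (len(hyp) + 1)
    for r in ref:
        curr = [0]
        for j, h in enumerate(hyp, start=1):
            if r == h:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1] / len(ref)
```

The boolean check is strict and cheap, while the recall score gives partial credit when the model paraphrases part of the needle; either could be thresholded to produce a pass/fail result.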
|
@BastienZim, these might help you resolve all the merge conflicts:
I like your idea of using other algorithms for evaluation, like ROUGE and cosine similarity. Would be awesome if you can add those features! |
|
I will come back soon with a base to build different metrics on!
As I was going through the code to understand what the evaluation was doing, I encountered this typo.
This affects the whole evaluation process and might bias the results.
This is why I would consider incorporating this change.