Fixed a typo in evaluation function by BastienZim · Pull Request #9 · gkamradt/LLMTest_NeedleInAHaystack

BastienZim · 2024-02-26T15:47:30Z

While going through the code to understand what the evaluation does, I came across this typo.
It affects the whole evaluation process and might bias the results.

For that reason, I think this change is worth incorporating.

kedarchandrayan · 2024-03-07T12:52:42Z

Hello @BastienZim, please resolve the conflict.

LazaroHurtado · 2024-03-09T06:50:24Z

Hey @BastienZim, it seems there were some issues while resolving the conflicts, and your original typo fix is no longer in this PR. Could you please update this PR one more time to resolve this issue?

BastienZim · 2024-03-11T15:50:32Z

It seems the evaluation function has been moved to the src/evaluators/openai.py file. Any tips on how I could update my PR so that my change applies only to this file, or should I open a new PR?

BastienZim · 2024-03-11T15:50:49Z

@LazaroHurtado

BastienZim · 2024-03-11T15:54:24Z

We could also discuss whether an LLM call is the best evaluation function for the needle task. As the task is quite straightforward, resorting to the few-shot learning capabilities of LLMs might not be the best solution. I would be interested in implementing something like a ROUGE-based evaluation, or a hard-coded evaluation such as checking whether the answer contains the exact expected string (a boolean indicating whether the needle is there or not).
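To make the exact-string idea concrete, here is a minimal sketch of what such a check could look like. The function names `normalize` and `contains_needle` are hypothetical and do not come from the repository; the normalization (lowercasing, collapsing whitespace) is an assumption about what counts as a match.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return re.sub(r"\s+", " ", text).strip().lower()


def contains_needle(response: str, needle: str) -> bool:
    """Return True if the normalized needle appears verbatim in the normalized response.

    Hypothetical helper for illustration only; not part of the repository's API.
    """
    return normalize(needle) in normalize(response)
```

A check like this is deterministic and free to run, but it would miss correct answers that paraphrase the needle, which is the trade-off against an LLM-based evaluator.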

LazaroHurtado · 2024-03-11T16:22:25Z

@BastienZim, these might help you resolve all the merge conflicts:

1. Go to your forked repo and sync it with the root repo; there should be a "sync" button.
2. In your terminal, check out the main branch: git checkout main
3. Pull the latest changes coming in from the sync: git pull
4. Now you should have the latest changes without any conflicts, so you can make your changes to the evaluators/openai.py file.
5. Commit and push.

I like your idea of using other algorithms for evaluation, like ROUGE and cosine similarity. Would be awesome if you can add those features!
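As a rough illustration of the metrics mentioned here, the sketch below computes ROUGE-1 recall and a bag-of-words cosine similarity using only the standard library. The function names and the simple regex tokenizer are assumptions made for this example; in particular, cosine similarity is computed over word counts rather than embeddings, which may differ from what was intended.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Split text into lowercase alphanumeric tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rouge1_recall(reference: str, candidate: str) -> float:
    """Fraction of reference unigrams that also occur in the candidate (clipped counts)."""
    ref = Counter(tokenize(reference))
    cand = Counter(tokenize(candidate))
    if not ref:
        return 0.0
    overlap = sum((ref & cand).values())  # multiset intersection clips repeated tokens
    return overlap / sum(ref.values())


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between the word-count vectors of two texts."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0
```

With the needle as the reference, a ROUGE-1 recall near 1 would suggest the response reproduces most of the needle's words, while a threshold on either score would still need to be chosen empirically.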

BastienZim · 2024-03-12T08:49:33Z

Thanks for the tips @lazaro. Your instructions were good. However, the sync button seems to have cancelled the PR, so I opened a new one: "Fixed a typo in evaluation function #29".

BastienZim · 2024-03-12T08:49:55Z

I will come back soon with a base for working on different metrics!

BastienZim closed this Mar 12, 2024

BastienZim force-pushed the main branch from 5887cf2 to 4e38dac March 12, 2024 08:43

BastienZim mentioned this pull request Mar 12, 2024

Fixed a typo in evaluation function #29

Open

BastienZim wants to merge 0 commits into gkamradt:main from BastienZim:main


3 participants
