Commit e4ca299
test LLM output for semantic similarity using vector embeddings (#61)
## Add an example of how to test LLM output for semantic similarity using
vector embeddings.
### Snapshot testing captures the embedding vector and lets us notice when
it changes.
This pull request updates the
`examples/team_recommender/tests/example_1_text_response` module to make
the embedding and similarity computations more reliable to test. The
main changes are new functions for embedding stabilization, a new
function for computing vector alignment, new test cases, and the removal
of outdated snapshot data.
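The snapshot idea described above can be sketched as follows. This is a minimal illustration, not the code added in this commit: the helper name `check_embedding_snapshot`, the JSON file layout, and the tolerance value are all assumptions made for the example.

```python
import json
import math
import tempfile
from pathlib import Path


def check_embedding_snapshot(
    snapshot_path: Path, embedding: list[float], tolerance: float = 1e-6
) -> bool:
    """Hypothetical helper: return True if `embedding` matches the stored snapshot.

    On the first run no snapshot exists, so the vector is saved and accepted.
    """
    if not snapshot_path.exists():
        snapshot_path.write_text(json.dumps(embedding))
        return True
    saved = json.loads(snapshot_path.read_text())
    # A changed length or any component outside the tolerance counts as a change.
    return len(saved) == len(embedding) and all(
        math.isclose(a, b, abs_tol=tolerance) for a, b in zip(saved, embedding)
    )


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "embedding.json"
    first_run = check_embedding_snapshot(path, [0.12, -0.5, 0.33])  # snapshot is created
    unchanged = check_embedding_snapshot(path, [0.12, -0.5, 0.33])  # same vector as saved
    drifted = check_embedding_snapshot(path, [0.12, -0.5, 0.90])    # last component changed
```

Here `first_run` and `unchanged` are true, while `drifted` is false because the vector no longer matches the saved snapshot.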
### Enhancements to embeddings and similarity computations:
* [`examples/team_recommender/tests/example_1_text_response/openai_embeddings.py`](diffhunk://#diff-7e124963dbad8becc0d3cf8af970ffcdb8d3a15f08ec530ac66648503fe787c1R30-R41):
  Added functions `stabilize_embedding`, `stabilize_embedding_object`, and
  `stabilize_float` to stabilize embeddings and floating-point numbers.
* [`examples/team_recommender/tests/example_1_text_response/cosine_similarity.py`](diffhunk://#diff-6d0324093a21552d9854f103e5761711492dd7fde2f4a5ceff44c86f69096647R6-R15):
  Added a new function `compute_alignment` to calculate the alignment
  vector between two lists.
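The bodies of these functions are not shown in this description, so the following is only a guess at what they might do. It assumes that "stabilizing" means rounding each float to a fixed number of decimal places, and that the "alignment vector" is the per-dimension contribution to cosine similarity. The `_sketch` names are hypothetical and deliberately differ from the real function names.

```python
import math


def stabilize_float_sketch(value: float, places: int = 3) -> float:
    """Assumed behavior: round away low-order digits that vary between API calls."""
    return round(value, places)


def stabilize_embedding_sketch(embedding: list[float], places: int = 3) -> list[float]:
    """Apply the same rounding to every component of an embedding."""
    return [stabilize_float_sketch(v, places) for v in embedding]


def alignment_sketch(a: list[float], b: list[float]) -> list[float]:
    """Assumed meaning of 'alignment': element-wise product of the unit vectors.

    The components of the result sum to the cosine similarity of `a` and `b`.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("alignment is undefined for a zero vector")
    return [(x / norm_a) * (y / norm_b) for x, y in zip(a, b)]


# Two responses whose embeddings differ only in low-order digits.
noisy_a = [0.1234567, -0.9876543]
noisy_b = [0.1234571, -0.9876539]
same_after_rounding = (
    stabilize_embedding_sketch(noisy_a) == stabilize_embedding_sketch(noisy_b)
)

contributions = alignment_sketch([3.0, 4.0], [3.0, 4.0])
```

Under these assumptions, rounding makes the two noisy vectors compare equal, which is what keeps a saved snapshot from changing on every run.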
### Updates to test cases:
* [`examples/team_recommender/tests/example_1_text_response/test_compute_alignment.py`](diffhunk://#diff-470d6ed6512f589fd0364c59fe23c9164e0ad16d30dac25d23b35399b46b3938R1-R30):
  Added a new test case `test_compute_alignment` to verify the
  functionality of the `compute_alignment` function.
* [`examples/team_recommender/tests/example_1_text_response/test_compute_cosine_similarity.py`](diffhunk://#diff-d2553c1d795d28735b05fbc5a923348089e71c795324b1fd6202b725058e3658R1-R104):
  Added multiple test cases to verify the correctness of cosine similarity
  computations, including tests for aligned vectors, random vectors, and
  saved responses.
* [`examples/team_recommender/tests/example_1_text_response/test_openai_embeddings.py`](diffhunk://#diff-c1433d81448f809ce1a324c642d4ef5a4b4d99f554e99d33464aaf3505a8a10cR1-R43):
  Added test cases to verify the stabilization functions, ensuring they
  work correctly with various inputs.
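The kinds of cosine similarity checks listed above might look like the sketch below. It uses the standard cosine similarity formula written from scratch; it is not the repository's implementation, and the test names, vector values, and vector length are chosen only for illustration.

```python
import math
import random


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Standard formula: dot(a, b) / (|a| * |b|)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def test_aligned_vectors_have_similarity_one():
    # Vectors pointing in the same direction score 1 regardless of length.
    assert math.isclose(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]), 1.0)


def test_orthogonal_vectors_have_similarity_zero():
    assert math.isclose(cosine_similarity([1.0, 0.0], [0.0, 1.0]), 0.0, abs_tol=1e-12)


def test_random_vectors_stay_in_range():
    rng = random.Random(42)  # fixed seed keeps the test deterministic
    a = [rng.uniform(-1, 1) for _ in range(64)]
    b = [rng.uniform(-1, 1) for _ in range(64)]
    assert -1.0 <= cosine_similarity(a, b) <= 1.0
```

With pytest these functions are collected automatically. A test against saved responses would load two stored embeddings and assert that their similarity exceeds a chosen threshold.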
### Removal of outdated test data:
* [`examples/team_recommender/tests/example_1_text_response/snapshots/test_good_fit_for_project/test_llm_will_hallucinate_given_no_data/hallucination_response.txt`](diffhunk://#diff-ffb88fc44f30433441477d3cfb2840d31d5aa7ebbc1cb3cc57a8bc36455dcd26L1-L21):
  Removed outdated test snapshot data.
* [`examples/team_recommender/tests/example_1_text_response/snapshots/test_good_fit_for_project/test_llm_will_hallucinate_given_no_data/please_provide_missing_information_response.txt`](diffhunk://#diff-45037acb82b4a9fca1c6eef4ece5eedc292a858b39fdd9b84dbbcef0530c2b7bL1):
  Removed outdated test snapshot data.
---------
Signed-off-by: Paul Zabelin <paulzabelin@artium.ai>
Co-authored-by: Carl Jackson <carl@realvr.ai>
Co-authored-by: Austin Putman <austin@rawfingertips.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
1 parent 42de8c8 commit e4ca299
File tree
16 files changed, +17332 -238 lines changed
- examples/team_recommender/tests
- example_1_text_response
- snapshots
- test_compute_alignment/test_compute_alignment
- test_compute_cosine_similarity/test_reproducing_the_same_text_embedding
- test_good_fit_for_project/test_llm_will_hallucinate_given_no_data
- fixtures