Limit engine correctness check by hjk1030 · Pull Request #178 · ddkang/aidb

hjk1030 · 2024-05-07T14:53:05Z

Resolves issue #54. I assume the problem is caused by floating-point errors introduced during serialization, so I added a function that checks the subset relation with a floating-point error tolerance.

* Add subset check with float error tolerance
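To illustrate the failure mode described above, here is a minimal sketch, not the code from this PR. It assumes result rows are tuples of numbers; the helper name `rows_close` and the tolerance `abs_tol=1e-6` are hypothetical choices made for this example.

```python
import math


def rows_close(row_a, row_b, abs_tol=1e-6):
    """Return True when both rows have the same length and every
    pair of values agrees within the absolute tolerance."""
    if len(row_a) != len(row_b):
        return False
    return all(
        math.isclose(a, b, rel_tol=0.0, abs_tol=abs_tol)
        for a, b in zip(row_a, row_b)
    )


# A ground-truth row and the "same" row after a lossy float round trip.
gt_row = (1, 0.3)
aidb_row = (1, 0.1 + 0.2)  # 0.30000000000000004 in IEEE 754 doubles

# Exact set difference treats the rows as different ...
exact_difference = {aidb_row} - {gt_row}
print(len(exact_difference))  # 1

# ... while the tolerant comparison treats them as equal.
print(rows_close(aidb_row, gt_row))  # True
```

An exact check such as `len(set(aidb_res) - set(gt_res)) == 0` therefore fails on rows that differ only by rounding, which is the situation a tolerance-based subset check is meant to handle.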

hjk1030 · 2024-05-07T15:54:07Z

@ttt-77 Do you mean to add a check like this?

ttt-77 · 2024-05-15T06:59:23Z

Yes. But can you refactor the code? Don't use three functions.

* Use direct code instead of functions

hjk1030 · 2024-05-15T17:08:52Z

@ttt-77 Is this better?

ttt-77 · 2024-05-16T02:41:55Z

tests/test_limit_engine.py

+
+    aidb_res = set(aidb_res)
+    gt_res = set(gt_res)
+    return all(any(


You can create a variable and make it more readable.

* Expand function into loops

hjk1030 · 2024-05-17T03:05:44Z

I expanded the outer two layers into loops to make it more readable, but I think retaining the innermost all() function won't affect readability much and simplifies the code.
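The structure described in this comment (two explicit outer loops with an inner `all()`) could look roughly like the sketch below. This is an assumption-based illustration rather than the actual `_subset_check` from `tests/test_limit_engine.py`: the function name `subset_check`, the `abs_tol` parameter, and the assumption that rows are tuples of numbers are all hypothetical.

```python
import math


def subset_check(aidb_res, gt_res, abs_tol=1e-6):
    """Return True if every row in aidb_res has a matching row in gt_res,
    where values are compared with a floating-point tolerance."""
    aidb_rows = set(aidb_res)
    gt_rows = set(gt_res)
    for aidb_row in aidb_rows:
        matched = False
        for gt_row in gt_rows:
            # Innermost comparison kept as a single all() expression.
            if len(aidb_row) == len(gt_row) and all(
                math.isclose(a, b, abs_tol=abs_tol)
                for a, b in zip(aidb_row, gt_row)
            ):
                matched = True
                break
        if not matched:
            return False
    return True
```

For example, `subset_check([(1, 0.1 + 0.2)], [(1, 0.3), (2, 0.5)])` evaluates to `True`, while `subset_check([(3, 0.9)], [(1, 0.3)])` evaluates to `False`. The nested scan is quadratic in the number of rows, which is usually acceptable for a limit query that returns a bounded result set.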

ttt-77 · 2024-05-18T21:53:34Z

tests/test_limit_engine.py

        logger.info(f'There are {len(aidb_res)} elements in limit engine results '
              f'and {len(gt_res)} elements in ground truth results')
-        assert len(set(aidb_res) - set(gt_res)) == 0
+        assert self._subset_check(aidb_res, gt_res)


Can you also add another check?
    if len(aidb_res) < 100:
        assert len(aidb_res) == len(gt_res)
    else:
        assert len(aidb_res) == 100

hjk1030 · 2024-05-19

Ok, added this.
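The requested length check can be expressed as a small helper, sketched here under assumptions. The function name `result_length_ok` and the `limit` parameter are hypothetical; the review comment hard-codes the value 100.

```python
def result_length_ok(aidb_res, gt_res, limit=100):
    """Mirror the reviewer's check: a limit query returns exactly `limit`
    rows, unless fewer rows were returned, in which case the result must
    be as large as the full ground truth."""
    if len(aidb_res) < limit:
        return len(aidb_res) == len(gt_res)
    return len(aidb_res) == limit
```

In a test this would be used as `assert result_length_ok(aidb_res, gt_res)`, alongside the subset check.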

* Check the aidb result length

* Check the usage of proxy score by doing limit query on random and accurate proxy scores

hjk1030 · 2024-05-26T12:09:57Z

@ttt-77 Could you please review this? I also added the test mentioned in issue #55 to this PR. Since no real index is provided for the jackson dataset, I compared the inference counts obtained with a random proxy score and with an accurate proxy score generated from the ground-truth results.
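The idea behind this comparison can be shown with a toy simulation. It is not AIDB's limit engine and does not use the jackson dataset: the function `count_inferences`, the synthetic `ground_truth` list, and the scan-in-descending-proxy-order strategy are all assumptions made for illustration.

```python
import random


def count_inferences(ground_truth, proxy_scores, limit):
    """Scan records in descending proxy-score order, running one
    (simulated) inference per record, until `limit` matches are found.
    Returns the number of inferences performed."""
    order = sorted(
        range(len(ground_truth)),
        key=lambda i: proxy_scores[i],
        reverse=True,
    )
    inferences = 0
    found = 0
    for i in order:
        if found >= limit:
            break
        inferences += 1  # one model call for this record
        if ground_truth[i]:
            found += 1
    return inferences


# Synthetic data: every tenth record satisfies the predicate (100 matches).
ground_truth = [i % 10 == 0 for i in range(1000)]
limit = 20

# "Accurate" proxy scores derived directly from the ground truth.
accurate_scores = [1.0 if match else 0.0 for match in ground_truth]

# Random proxy scores carry no information about the predicate.
rng = random.Random(0)
random_scores = [rng.random() for _ in ground_truth]

accurate_count = count_inferences(ground_truth, accurate_scores, limit)
random_count = count_inferences(ground_truth, random_scores, limit)
print(accurate_count, random_count)
```

With accurate scores every scanned record is a match, so the scan stops after exactly `limit` inferences. Random scores can never need fewer than that, so a test in this style would assert that the accurate-proxy run uses no more inferences than the random-proxy run.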

ttt-77 · 2024-05-27T02:36:38Z

Can you split the new commit into another PR?

This reverts commit d04001d.

Limit engine correctness check

4b70345

* Add subset check with float error tolerance

Refactor check

a498390

* Use direct code instead of functions

ttt-77 reviewed May 16, 2024

Refactor code to improve readability

d7067db

* Expand function into loops

ttt-77 reviewed May 18, 2024

hjk1030 added 2 commits May 19, 2024 10:41

Add check for result length

d5ae6f4

* Check the aidb result length

Add test for proxy score usage validation

d04001d

* Check the usage of proxy score by doing limit query on random and accurate proxy scores

Revert "Add test for proxy score usage validation"

7ed79ef

This reverts commit d04001d.

hjk1030 wants to merge 6 commits into ddkang:main from hjk1030:limit_engine_correctness_check
