When running knnPerfTest multiple times with different quantization levels, I noticed that the true nearest neighbors are recomputed and cached to a different file on each run, which is unnecessary. This happens because indexPath is one of the inputs to the hash key used to name the cache file for the true nearest neighbors.
I think this is redundant because the true nearest neighbors should be index-agnostic. Should we remove indexPath as a hash parameter?
luceneutil/src/main/knn/KnnGraphTester.java
Lines 1025 to 1029 in 3184d64
```java
private int[][] getExactNN(Path docPath, Path indexPath, Path queryPath, int queryStartIndex) throws IOException, InterruptedException {
  // look in working directory for cached nn file
  String hash = Integer.toString(Objects.hash(docPath, indexPath, queryPath, numDocs, numQueryVectors, topK, similarityFunction.ordinal(), parentJoin, queryStartIndex, prefilter ? selectivity : 1f, prefilter ? randomSeed : 0f), 36);
  String nnFileName = "nn-" + hash + ".bin";
  Path nnPath = Paths.get(nnFileName);
```
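To make the proposal concrete, here is a minimal sketch of what the cache key could look like with indexPath dropped. It is not the actual luceneutil code: the class `ExactNnCacheKey`, the helper `nnFileName`, and the explicit parameter list (including the assumed `long` type of `randomSeed`) are hypothetical stand-ins for the fields that `KnnGraphTester` reads directly. It also assumes the exact neighbors are computed independently of how the index was built, as suggested above.

```java
import java.nio.file.Path;
import java.util.Objects;

public class ExactNnCacheKey {
    // Hypothetical helper: hashes the same inputs as the snippet above,
    // except that indexPath is deliberately left out.
    static String nnFileName(Path docPath, Path queryPath, int numDocs, int numQueryVectors,
                             int topK, int similarityOrdinal, boolean parentJoin,
                             int queryStartIndex, boolean prefilter,
                             float selectivity, long randomSeed) {
        int hash = Objects.hash(docPath, queryPath, numDocs, numQueryVectors, topK,
                similarityOrdinal, parentJoin, queryStartIndex,
                prefilter ? selectivity : 1f,
                prefilter ? randomSeed : 0f);
        // Base-36 encoding, as in the existing file-naming scheme.
        return "nn-" + Integer.toString(hash, 36) + ".bin";
    }
}
```

With a key like this, two runs that differ only in the index they build (for example, different quantization levels written to different index directories) would resolve to the same `nn-*.bin` file, while a change to any remaining input such as topK would still produce a separate cache file.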