-
Notifications
You must be signed in to change notification settings - Fork 22
[MOD-11881] Fix HNSW undefined behavior in container access #806
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## backport-803-to-0.6 #806 +/- ##
=======================================================
+ Coverage 95.07% 95.08% +0.01%
=======================================================
Files 60 60
Lines 3451 3460 +9
=======================================================
+ Hits 3281 3290 +9
Misses 170 170 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
2d0233e to
0c3b4e5
Compare
alonre24
approved these changes
Oct 16, 2025
github-merge-queue bot
pushed a commit
that referenced
this pull request
Oct 16, 2025
* [0.8] Replace Valgrind with address-sanitizer [MOD-10409] (#803) * Replace Valgrind with address-sanitizer [MOD-10409] (#734) * add asan using augment - wip (code with svs cannot compile with llvm clang currently) * rename run-valgrind to asan * fix buffer overflow in test * fix bad testSizeEstimation in int8 and uint8 + cleanup * fix cmake install for macos * ran sanitizer as part of every flow + remove from non latest OSs + only run debian11 with gcc11 (alignment with search) * revert 18 from clang format * fix syntax errors * * rename dataSize -> storedDataSize *refine distinction between storedDataSize and inputBlobSize: inputBlobSize not necessarly dim *sizeof(type) * vecsimAbstract: move data members to private when possible and not too complicated * remove effective_input_size check from VecSimIndexAbstract<DataType, DistType>::preprocessQuery: That's logically wrong. The previous logic incorrectly used force_copy to determine blob size: - force_copy only indicates whether to copy the blob for memory management - It doesn't determine the actual size of the input blob In tiered indexes, when the backend creates a batch iterator, it receives the query blob from the frontend's batch iterator (already preprocessed). This preprocessed blob has the processed size (e.g., with cosine norm appended), not the original dim * sizeof(type) size. The old logic accidentally worked when inputBlobSize == dim * sizeof(type) because using dataSize (processed size) happened to match the preprocessed blob size from the frontend. However, this was a coincidence that broke when inputBlobSize was updated to reflect actual input blob sizes. Now preprocessQuery correctly uses inputBlobSize, which represents the actual size of the input blob being processed. BruteForce Factory Changes: • Templated NewAbstractInitParams to be reusable across different algorithm parameter types All factories changes: storedDataSize: Size after preprocessing (e.g., with cosine norm appended) inputBlobSize: Size of input blobs the index expects to receive (depends on whether vectors are pre-normalized) • Added inputBlobSize calculation based on normalization status: uses storedDataSize if normalized, otherwise dim * sizeof(type) *svs tests testSizeEstimation: using constexpr to eliminate compiler warnings about potential division by zero when quantBits could be 0 (VecSimSvsQuant_NONE) * fix merge * another fix * have leak * run first san * build with debug * upload san srtifacts * fix condition * fix path * clean up * run wuth san macos * real leak * fix pah * fix security issue * remove san from event pr run cov with san run mac with san * trigger again * unit test flow name * add space * split sanitizer in intel machine * fix cpu info * fix alpine pre deps * add to needs * unify codecov and sanitizer again * disable flow temp add cov and sanitize to nightly * revert unnecessary changes * remove bin before building sanitizer in codecov to use clanmg * install clang * install clang 18 * instsll clang 18 * remove tools * install llvm on cov * fix deps --------- Co-authored-by: meiravgri <[email protected]> (cherry picked from commit 2fcc669) * fix debian and nightly * remove slack notify on nightly * change cov machine * revert cov changes - no sanitizer * no pre deps * remove --------- Co-authored-by: alonre24 <[email protected]> (cherry picked from commit 44df9db) * [MOD-11881] Fix HNSW undefined behavior in container access (#806) fix invalid reads --------- Co-authored-by: alonre24 <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Problem description
Accessing
top()on empty priority queues is considered UB.Errors were detected by AddressSanitizer on MacOS in the following tests:
HNSWTest/0.resizeNAlignIndexandhnsw_delete_entry_point_Test.Fixes in this PR
removeExtraLinks:getNeighborsByHeuristic2might return fewer candidates than input (orig_candidates). In this fix we first iterate the modified candidates returned bygetNeighborsByHeuristic2, then handle the remainingorig_candidateselements rejected by heuristic.processCandidate: Validate thatcandidate_setis not empty when calling prefetch.