-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Pull out common subclasses for our custom flat & hnsw vector formats #132663
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Pinging @elastic/es-search-relevance (Team:Search Relevance) |
.../src/main/java/org/elasticsearch/index/codec/vectors/es816/ES816BinaryFlatVectorsScorer.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/codec/vectors/AbstractHnswVectorsFormat.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we are in a 🐔 & 🥚 scenario with this and moving directIO to a mapper index option :/
But it seems to me that the flat formats should provide an abstract API to return "useDirectIO" or not to make moving away from the experimental java opt setting easier
| public static final boolean USE_DIRECT_IO = getUseDirectIO(); | ||
|
|
||
| @SuppressForbidden( | ||
| reason = "TODO Deprecate any lenient usage of Boolean#parseBoolean https://github.com/elastic/elasticsearch/issues/128993" | ||
| ) | ||
| private static boolean getUseDirectIO() { | ||
| return Boolean.parseBoolean(System.getProperty("vector.rescoring.directio", "false")); | ||
| } | ||
|
|
||
| protected AbstractFlatVectorsFormat(String name) { | ||
| super(name); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like since we are moving to a mapper setting, direct IO can be a parameter or an abstract method (like you did with flat in HNSW) that is supplied in the ctor and sub-classes of the appropriate type will always provide their subsequent "true/false"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are kinda in the same code area - I would prefer to merge this in first (which doesn't change behaviour), then update the direct IO PR to use the abstract classes introduced here in its formats, removing the direct IO JVM option at the same time
Pull some common code into common subclasses. Whilst these are copied from Lucene, they share some duplicated functionality. We are likely to continue to have custom flat & HNSW formats for a good time yet, so we should aim to centralise our ES-specific settings and defaults into a single class.