@thecoop (Member) commented Aug 11, 2025

Pull some common code into shared abstract classes. Although these formats are copied from Lucene, they duplicate some functionality. We are likely to keep custom flat and HNSW formats for a good while yet, so we should centralise our ES-specific settings and defaults in a single class.

@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@elasticsearchmachine added the Team:Search Relevance and v9.2.0 labels Aug 11, 2025
@benwtrent (Member) left a comment

Maybe we are in a 🐔 & 🥚 scenario with this and moving directIO to a mapper index option :/

But it seems to me that the flat formats should provide an abstract API that returns whether to use direct IO ("useDirectIO"), to make moving away from the experimental Java option easier.

Comment on lines +20 to +31
    public static final boolean USE_DIRECT_IO = getUseDirectIO();

    @SuppressForbidden(
        reason = "TODO Deprecate any lenient usage of Boolean#parseBoolean https://github.com/elastic/elasticsearch/issues/128993"
    )
    private static boolean getUseDirectIO() {
        return Boolean.parseBoolean(System.getProperty("vector.rescoring.directio", "false"));
    }

    protected AbstractFlatVectorsFormat(String name) {
        super(name);
    }
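
For context on the "lenient" wording in the suppression reason above: `Boolean.parseBoolean` returns true only for the string "true" (ignoring case) and silently returns false for anything else, including typos and null. The sketch below contrasts that with a strict parser. `LenientVsStrict` and `parseStrict` are hypothetical names made up for illustration; they are not Elasticsearch APIs and not part of this PR.

```java
// Illustration only: LenientVsStrict and parseStrict are hypothetical names,
// not taken from Elasticsearch or from this pull request.
final class LenientVsStrict {

    // Strict alternative: accept exactly "true" or "false", reject everything else.
    static boolean parseStrict(String key, String value) {
        if ("true".equals(value)) {
            return true;
        }
        if ("false".equals(value)) {
            return false;
        }
        throw new IllegalArgumentException(
            "Invalid boolean value [" + value + "] for [" + key + "]");
    }

    public static void main(String[] args) {
        // Lenient: a typo such as "ture" is silently read as false.
        System.out.println(Boolean.parseBoolean("TRUE"));
        System.out.println(Boolean.parseBoolean("ture"));
        System.out.println(Boolean.parseBoolean(null));

        // Strict: the same typo fails loudly instead.
        try {
            parseStrict("vector.rescoring.directio", "ture");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the lenient call, a mistyped system property leaves direct IO disabled without any warning, which is presumably the kind of usage the linked issue wants to deprecate.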
Member

Since we are moving to a mapper setting, it seems like direct IO could be a constructor parameter or an abstract method (like you did with flat in HNSW), so that subclasses of the appropriate type always supply their own "true/false"?
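
A minimal sketch of the two shapes suggested here, so they can be compared side by side. Every class name below (`BaseFormat`, `CtorFlatFormat`, `MethodFlatFormat`, `MappedFlatFormat`, `AlwaysDirectFlatFormat`) is hypothetical, and `BaseFormat` is only a stand-in for the real Lucene base class, which is not reproduced. This is not the code that was merged.

```java
// Stand-in for the real Lucene base format class (assumption, for illustration).
abstract class BaseFormat {
    private final String name;

    protected BaseFormat(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }
}

// Option A: the direct IO choice is passed to the constructor,
// e.g. from a mapper index option.
abstract class CtorFlatFormat extends BaseFormat {
    private final boolean useDirectIO;

    protected CtorFlatFormat(String name, boolean useDirectIO) {
        super(name);
        this.useDirectIO = useDirectIO;
    }

    public final boolean useDirectIO() {
        return useDirectIO;
    }
}

// Option B: each subclass answers through an abstract method.
abstract class MethodFlatFormat extends BaseFormat {
    protected MethodFlatFormat(String name) {
        super(name);
    }

    public abstract boolean useDirectIO();
}

// Hypothetical subclass for option A: the value comes from the caller.
final class MappedFlatFormat extends CtorFlatFormat {
    MappedFlatFormat(boolean useDirectIO) {
        super("MappedFlatFormat", useDirectIO);
    }
}

// Hypothetical subclass for option B: the value is fixed by the type.
final class AlwaysDirectFlatFormat extends MethodFlatFormat {
    AlwaysDirectFlatFormat() {
        super("AlwaysDirectFlatFormat");
    }

    @Override
    public boolean useDirectIO() {
        return true;
    }
}

final class DirectIoSketch {
    public static void main(String[] args) {
        System.out.println(new MappedFlatFormat(false).useDirectIO());
        System.out.println(new AlwaysDirectFlatFormat().useDirectIO());
    }
}
```

Either shape removes the static `USE_DIRECT_IO` field read from a system property: option A suits a value that varies per index, while option B suits formats whose answer is fixed by their type.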

Member Author

These are kinda in the same code area. I would prefer to merge this in first (it doesn't change behaviour), then update the direct IO PR so that its formats use the abstract classes introduced here, removing the direct IO JVM option at the same time.

@thecoop merged commit 5912eab into elastic:main Aug 27, 2025
33 checks passed
@thecoop deleted the abstract-knn-format-classes branch August 27, 2025 13:30

Labels

>refactoring, :Search Relevance/Vectors, Team:Search Relevance, v9.2.0

3 participants