Elasticsearch vector store - Wrong error reported when missing index #1316

@l-trotta

Hello, I'm the author of parts of the implementation of the Elasticsearch Vector store (see #592).

I was testing the new release and noticed that this PR introduced a check that adds documents to the bulk request only if the index already exists.

This is problematic: if the user sets initializeSchema to false in the initial configuration and has not previously created the index (which Elasticsearch allows, since it will automatically configure and create the index), the user receives the following error:

Caused by: co.elastic.clients.util.MissingRequiredPropertyException: Missing required property 'BulkRequest.operations'
	at co.elastic.clients.util.ApiTypeHelper.requireNonNull(ApiTypeHelper.java:76) ~[elasticsearch-java-8.13.4.jar:na]
	at co.elastic.clients.util.ApiTypeHelper.unmodifiableRequired(ApiTypeHelper.java:141) ~[elasticsearch-java-8.13.4.jar:na]
	at co.elastic.clients.elasticsearch.core.BulkRequest.<init>(BulkRequest.java:122) ~[elasticsearch-java-8.13.4.jar:na]
	at co.elastic.clients.elasticsearch.core.BulkRequest.<init>(BulkRequest.java:77) ~[elasticsearch-java-8.13.4.jar:na]
	at co.elastic.clients.elasticsearch.core.BulkRequest$Builder.build(BulkRequest.java:518) ~[elasticsearch-java-8.13.4.jar:na]
	at org.springframework.ai.vectorstore.ElasticsearchVectorStore.doAdd(ElasticsearchVectorStore.java:130) ~[spring-ai-elasticsearch-store-1.0.0-20240904.212955-466.jar:1.0.0-SNAPSHOT]
	at com.example.demo.Service.init(Service.java:88) ~[classes/:na]
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103) ~[na:na]
	at java.base/java.lang.reflect.Method.invoke(Method.java:580) ~[na:na]
	at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleMethod.invoke(InitDestroyAnnotationBeanPostProcessor.java:457) ~[spring-beans-6.1.10.jar:6.1.10]
	at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleMetadata.invokeInitMethods(InitDestroyAnnotationBeanPostProcessor.java:401) ~[spring-beans-6.1.10.jar:6.1.10]
	at org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor.postProcessBeforeInitialization(InitDestroyAnnotationBeanPostProcessor.java:219) ~[spring-beans-6.1.10.jar:6.1.10]
	... 18 common frames omitted

This message doesn't really explain what is going on: the exception is thrown whenever a BulkRequest is built without its required properties. Here the operations are never set, because the check skips every document when the index is missing.
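To make the mechanism concrete, here is a minimal sketch of the flow described above. It does not use the real Elasticsearch client: `BulkSketch` and its `Builder` are hypothetical stand-ins that only imitate the one behaviour that matters here (building fails when no operation was ever added), and `doAdd` is a simplified guess at the shape of the guarded code, not the actual Spring AI method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the client's bulk request builder (an assumption,
// not the real API): build() fails if no operation was ever added.
class BulkSketch {
    static class Builder {
        private List<String> operations; // stays null until the first add()

        Builder add(String operation) {
            if (operations == null) {
                operations = new ArrayList<>();
            }
            operations.add(operation);
            return this;
        }

        List<String> build() {
            if (operations == null) {
                throw new IllegalStateException(
                        "Missing required property 'BulkRequest.operations'");
            }
            return List.copyOf(operations);
        }
    }

    // Simplified model of the guarded flow: documents are queued only when the
    // index exists, but build() is called unconditionally.
    static List<String> doAdd(List<String> documents, boolean indexExists) {
        Builder builder = new Builder();
        if (indexExists) {
            for (String document : documents) {
                builder.add(document);
            }
        }
        return builder.build(); // throws when the guard skipped every document
    }
}
```

With `indexExists` set to false the call fails inside `build()`, so the message names the missing operations rather than the missing index.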

I have 3 possible solutions for this:

  1. Remove the check and let Elasticsearch create the index automatically with the default configuration (so cosine is the only similarity function allowed).
  2. Keep the check, but throw an appropriate exception if the index does not exist.
  3. Keep the check, but throw the exception only if the user selected a similarity function other than cosine.

Let me know which one is the most appropriate and I will implement it.

Personally I would go with number 3: it keeps the benefit of autoconfiguration when possible, and users are told explicitly when the index needs to be configured. In any case, I'd like to add more information about index creation to both the code and the documentation.
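As a rough illustration of what option 3 could look like, here is a sketch of the guard as a standalone helper. All names (`IndexGuardSketch`, `Similarity`, `checkBeforeBulk`) are hypothetical and chosen for this example only; the actual implementation would live inside the vector store and use its own similarity type.

```java
// Hypothetical helper sketching option 3; none of these names come from Spring AI.
class IndexGuardSketch {
    enum Similarity { COSINE, L2_NORM, DOT_PRODUCT }

    // Throws only when the index is missing AND the requested similarity cannot
    // be satisfied by the index that Elasticsearch would create by default.
    static void checkBeforeBulk(boolean indexExists, Similarity similarity, String indexName) {
        if (indexExists) {
            return; // index already configured, nothing to verify
        }
        if (similarity == Similarity.COSINE) {
            return; // the automatically created index is assumed to use cosine
        }
        throw new IllegalStateException(
                "Index '" + indexName + "' does not exist and similarity " + similarity
                        + " requires an explicitly configured index; create it first"
                        + " or set initializeSchema to true");
    }
}
```

Calling `checkBeforeBulk(false, Similarity.COSINE, "my-index")` returns normally, while the same call with `Similarity.DOT_PRODUCT` fails with a message that names the missing index.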

Thank you for your time.
