Add support for Elasticsearch vector store #234
Closed
75 commits
7294f7d
Add ElasticsearchAiSearchFilterExpressionConverter
JM-Lab c069db8
Add support for Elasticsearch vector store
JM-Lab 8ff8b87
Add support for various vector functions provided by Elasticsearch
JM-Lab 5d41978
Add support for Elasticsearch 8 vector store using the elasticsearch-…
JM-Lab 011dfcb
Merge branch 'spring-projects:main' into elasticsearch-vector-store
JM-Lab 8701ccc
Improve VertexAiGeminiAutoConfiguration to allow function calling por…
tzolov db383f8
Update README.md
markpollack 7d04167
Adding support for OpenAI Audio transcriptions
michaellavelle 8ae506f
Improve the OpenAi Audio code structure
tzolov c0c9dd2
Upgrade Azure OpenAI from 1.0.0-beta.6 to 1.0.0-beta.7
tzolov 7cc2eab
Change flatten pom config
markpollack 7396597
Update to add latest gpt-3.5 turbo model number
youngmoneee 3938cc8
Remove setters from options interface
youngmoneee f6c57f6
rename directory spring-ai-stabilityai to stability-ai
bcb559a
add license header plugin and update all java files
markpollack 8ed2af4
Add Ollama enum with supported models and their ids
sblashuk 78f73d1
Exclude .antlr and aot.factories from license headers check
tzolov 1e3eaec
Refactor and centralize Retry logic:
tzolov 1d6d11c
Resolve OpenAiApi initialization bug
tzolov e535c0e
Adjust the retry properties names and documentation
tzolov ba94039
Use platform independent line separators
tzolov 4fa2c6c
Move the maven license plugin into a 'license' profile
tzolov 659f007
Update auto-configure classes to return the most specific class types
tzolov af957b9
Add ETL pipeline diagrams
tzolov 1ea8515
Fix spring-ai-retry BOM version
tzolov fef7303
Replace %n by System.lineSeparator()
tzolov 8a0ad1d
Disable outdated samples
tzolov 8a336fb
Update google could bom from 26.33.0 to 26.34.0
tzolov 680150e
Use MessageType.FUNCTION for FunctionMessage
yarisvt 7f60e03
Fix Bedrock Anthropic line separator handling on Windows
tzolov 8784a59
Add logging of page processing progress in PagePdfDocumentReader
markpollack 7b89341
Add shell.log to gitignore
markpollack 634e6d0
Add Azure Workshop sample app to getting started doc
markpollack f8f38d6
Add MistralAI links to getting started doc
ricken07 0fdc6aa
Add streaming Function Calling support of OpenAI and Mistral AI
tzolov c5d9ae3
Improve Milvus documentation
tzolov 7f1570d
Update javadoc for message package and docs for Gemini Multimodal sup…
markpollack 4209d5a
Make consistent sync/stream AssistantMessage properties for OpenAI an…
tzolov 777b79e
Fix RedisVectorStore not closing Jedis pipelines
Bragolgirith 1e98f82
Change Azure embedding and chat options 'model' property to 'deployme…
markpollack ed5eef4
fix failing test due to property name change of model->deployment-nam…
markpollack 4fd15b1
Add documentation for OpenAI Transcription
markpollack 7bfe312
Convinence StreamingChatClient stream default
tzolov 783d81b
add quotes around command in github action to upload javadocs
markpollack 490f3cd
Milestone Release 0.8.1
markpollack 4c617e1
Prepare next development iteration
markpollack 9e98962
Add additional ctor for OpenAiEmbeddingClient
markpollack 255e542
CI/CD configuration fixes
tzolov 7d44f14
Fix typo in model name, fixed #442
abel533 73f9cb8
Fixed syntax in reference
be7e291
Fixed typo in Bedrock Titan chat option exception
pradipkhomane ba3e94e
Fixed typo in 'reference'
bottlerocketjonny 152420f
Fix Typos and Grammatical Errors
mackey0225 7c6bbef
Allow more customization for Neo4j store (id and constraint).
meistermeier 6638efe
fix code formatting
tzolov 0e8b696
modified ImageMessage.java line 45: mage > Image
a8142b7
fixing GH document upload action
tzolov 4e3bba7
Fix GH documenttation-upload.yml
tzolov cf463c8
Improve the Deploy docs steps of Docs uplaod GH action
tzolov c453529
Final GH doc action fixes
tzolov 9330187
Improve auto-confg condigiton on class to prevent undesired activation
tzolov 49a1cc6
Update Mistral AI function calling docs and layout
tzolov 9c19dc1
Improve Antora documentation layout
tzolov 5f0123c
Add MongoDB Atlas Vector store
Kirbstomper db43fe3
Add PostgresMlAutoConfiguration to AutoConfiguration.imports
izeye c036931
Implemented Bedrock Jurassic 2 ChatClient
hemeda3 c5f3502
Remove withReuse in MongoDBAtlasContainer
eddumelendez c2b3659
Add ElasticsearchAiSearchFilterExpressionConverter
JM-Lab 04b25ff
Add support for Elasticsearch vector store
JM-Lab bdc413c
Add support for various vector functions provided by Elasticsearch
JM-Lab d0a185b
Add support for Elasticsearch 8 vector store using the elasticsearch-…
JM-Lab ba1728e
Merge branch 'elasticsearch-vector-store' of https://github.com/JM-La…
JM-Lab b38552c
Rebase, fix compilation and style errors add missing dependecies in p…
tzolov a28f245
Fix ElasticsearchVectorStoreIT FilterExpression with Date type requir…
JM-Lab 5cb6d92
Rename Elasticsearch8VectorStoreIT to ElasticsearchVectorStoreIT
JM-Lab
@@ -0,0 +1,78 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai</artifactId>
        <version>0.8.0-SNAPSHOT</version>
        <relativePath>../../pom.xml</relativePath>
    </parent>
    <artifactId>spring-ai-elasticsearch-store</artifactId>
    <packaging>jar</packaging>
    <name>Spring AI Vector Store - Elasticsearch</name>
    <description>Spring AI Elasticsearch Vector Store</description>
    <url>https://github.com/spring-projects/spring-ai</url>

    <scm>
        <url>https://github.com/spring-projects/spring-ai</url>
        <connection>git://github.com/spring-projects/spring-ai.git</connection>
        <developerConnection>git@github.com:spring-projects/spring-ai.git</developerConnection>
    </scm>

    <properties>
        <!-- testing -->
        <hikari-cp.version>4.0.3</hikari-cp.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-core</artifactId>
            <version>${parent.version}</version>
        </dependency>

        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-client</artifactId>
            <version>8.11.3</version>
        </dependency>

        <!-- TESTING -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-openai</artifactId>
            <version>${parent.version}</version>
            <scope>test</scope>
        </dependency>


        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-test</artifactId>
            <version>${parent.version}</version>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>org.testcontainers</groupId>
            <artifactId>elasticsearch</artifactId>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>org.testcontainers</groupId>
            <artifactId>junit-jupiter</artifactId>
            <version>${testcontainers.version}</version>
            <scope>test</scope>
        </dependency>

    </dependencies>

</project>
141 changes: 141 additions & 0 deletions
...va/org/springframework/ai/vectorstore/ElasticsearchAiSearchFilterExpressionConverter.java
@@ -0,0 +1,141 @@
/*
 * Copyright 2023-2023 the original author or authors.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 * https://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.springframework.ai.vectorstore;

import org.springframework.ai.vectorstore.filter.Filter;
import org.springframework.ai.vectorstore.filter.Filter.Expression;
import org.springframework.ai.vectorstore.filter.Filter.Key;
import org.springframework.ai.vectorstore.filter.converter.AbstractFilterExpressionConverter;

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.List;
import java.util.TimeZone;
import java.util.regex.Pattern;

public class ElasticsearchAiSearchFilterExpressionConverter extends AbstractFilterExpressionConverter {

    private static final Pattern DATE_FORMAT_PATTERN = Pattern.compile("\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}Z");

    private final SimpleDateFormat dateFormat;

    public ElasticsearchAiSearchFilterExpressionConverter() {
        this.dateFormat = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
        this.dateFormat.setTimeZone(TimeZone.getTimeZone("UTC"));
    }

    @Override
    protected void doExpression(Expression expression, StringBuilder context) {
        if (expression.type() == Filter.ExpressionType.IN || expression.type() == Filter.ExpressionType.NIN) {
            context.append(getOperationSymbol(expression));
            context.append("(");
            this.convertOperand(expression.left(), context);
            this.convertOperand(expression.right(), context);
            context.append(")");
        } else {
            this.convertOperand(expression.left(), context);
            context.append(getOperationSymbol(expression));
            this.convertOperand(expression.right(), context);
        }
    }

    @Override
    protected void doStartValueRange(Filter.Value listValue, StringBuilder context) {
    }

    @Override
    protected void doEndValueRange(Filter.Value listValue, StringBuilder context) {
    }

    @Override
    protected void doAddValueRangeSpitter(Filter.Value listValue, StringBuilder context) {
        context.append(" OR ");
    }

    private String getOperationSymbol(Expression exp) {
        return switch (exp.type()) {
            case AND -> " AND ";
            case OR -> " OR ";
            case EQ, IN -> "";
            case NE -> " NOT ";
            case LT -> "<";
            case LTE -> "<=";
            case GT -> ">";
            case GTE -> ">=";
            case NIN -> "NOT ";
            default -> throw new RuntimeException("Not supported expression type: " + exp.type());
        };
    }

    @Override
    public void doKey(Key key, StringBuilder context) {
        var identifier = hasOuterQuotes(key.key()) ? removeOuterQuotes(key.key()) : key.key();
        var prefixedIdentifier = withMetaPrefix(identifier);
        context.append(prefixedIdentifier.trim()).append(":");
    }

    public String withMetaPrefix(String identifier) {
        return "metadata." + identifier;
    }

    @Override
    protected void doValue(Filter.Value filterValue, StringBuilder context) {
        if (filterValue.value() instanceof List list) {
            int c = 0;
            for (Object v : list) {
                context.append(v);
                if (c++ < list.size() - 1) {
                    this.doAddValueRangeSpitter(filterValue, context);
                }
            }
        } else {
            this.doSingleValue(filterValue.value(), context);
        }
    }

    @Override
    protected void doSingleValue(Object value, StringBuilder context) {
        if (value instanceof Date date) {
            context.append(this.dateFormat.format(date));
        } else if (value instanceof String text) {
            if (DATE_FORMAT_PATTERN.matcher(text).matches()) {
                try {
                    Date date = this.dateFormat.parse(text);
                    context.append(this.dateFormat.format(date));
                } catch (ParseException e) {
                    throw new IllegalArgumentException("Invalid date type:" + text, e);
                }
            } else {
                context.append(text);
            }
        } else {
            context.append(value);
        }
    }

    @Override
    public void doStartGroup(Filter.Group group, StringBuilder context) {
        context.append("(");
    }

    @Override
    public void doEndGroup(Filter.Group group, StringBuilder context) {
        context.append(")");
    }

}
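For readers skimming the diff, here is a rough standalone sketch of the kind of query-string fragment the converter above appears to build for a simple comparison: a `metadata.`-prefixed key, a colon, the operator symbol, then the value, with dates normalized to UTC. It does not use the Spring AI filter classes. The class `FilterFragmentSketch` and its methods `key`, `value`, and `comparison` are hypothetical names introduced only for illustration, and the sketch covers just the key and single-value paths, not groups or `IN`/`NIN` lists.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;
import java.util.regex.Pattern;

// Hypothetical standalone sketch; it mirrors the key/value handling in the diff
// but is not part of the pull request.
final class FilterFragmentSketch {

    private static final Pattern DATE_FORMAT_PATTERN =
            Pattern.compile("\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}Z");

    // SimpleDateFormat is not thread-safe, so this sketch creates one per call.
    private static SimpleDateFormat utcFormat() {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format;
    }

    // Same shape as doKey: prefix with "metadata.", trim, append ':'.
    static String key(String identifier) {
        return ("metadata." + identifier).trim() + ":";
    }

    // Same shape as doSingleValue: dates and date-like strings are rendered in UTC.
    static String value(Object value) {
        if (value instanceof Date date) {
            return utcFormat().format(date);
        }
        if (value instanceof String text && DATE_FORMAT_PATTERN.matcher(text).matches()) {
            try {
                return utcFormat().format(utcFormat().parse(text));
            } catch (ParseException e) {
                throw new IllegalArgumentException("Invalid date type:" + text, e);
            }
        }
        return String.valueOf(value);
    }

    // Key, operator symbol ("" for equality), then value.
    static String comparison(String identifier, String symbol, Object value) {
        return key(identifier) + symbol + value(value);
    }

    public static void main(String[] args) {
        // prints: metadata.country:BG AND metadata.year:>=2020
        System.out.println(comparison("country", "", "BG") + " AND " + comparison("year", ">=", 2020));
        // prints: metadata.activationDate:>1970-01-01T00:00:00Z
        System.out.println(comparison("activationDate", ">", new Date(0L)));
    }
}
```

The sketch creates a fresh `SimpleDateFormat` per call because that class is not thread-safe; the converter in the diff keeps a single instance in a field, which may matter if one converter is shared across threads.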
Shouldn't we be using the elasticsearch-java client instead?
I chose the Java Low Level REST Client (https://www.elastic.co/guide/en/elasticsearch/client/java-api-client/8.12/java-rest-low.html), which has minimal dependencies, instead of the elasticsearch-java library's High Level Rest Client. The High Level Rest Client relies heavily on dependencies and is sensitive to the Elasticsearch server version: it requires an exact match with version 7 or 8, down to the minor release. By implementing only the data classes essential for the vector store, I was able to test compatibility with both version 7 and version 8 successfully.
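To make the trade-off in this reply concrete: with the low-level client, the caller assembles the search request JSON by hand instead of using typed request builders. The following is a minimal sketch, under stated assumptions, of a `script_score` request body that combines a `query_string` filter with Elasticsearch's `cosineSimilarity` script function. The field name `embedding`, the class `KnnRequestBodySketch`, and its helper methods are hypothetical and not taken from the pull request. The sketch uses only the JDK, so it runs without the Elasticsearch client on the classpath.

```java
import java.util.StringJoiner;

// Hypothetical sketch: builds a search request body as a plain JSON string.
final class KnnRequestBodySketch {

    // Renders a float array as a JSON array, e.g. [0.1,0.2].
    // Assumes finite values; NaN and Infinity are not valid JSON numbers.
    static String vectorJson(float[] vector) {
        StringJoiner joiner = new StringJoiner(",", "[", "]");
        for (float v : vector) {
            joiner.add(Float.toString(v));
        }
        return joiner.toString();
    }

    // Minimal JSON string escaping for backslashes and double quotes only.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // script_score query: the inner query_string restricts candidates,
    // the script scores them by cosine similarity against the query vector.
    // The field name 'embedding' is an assumption made for this sketch.
    static String body(float[] queryVector, String filter, int topK) {
        return "{\"size\":" + topK
                + ",\"query\":{\"script_score\":{"
                + "\"query\":{\"query_string\":{\"query\":\"" + escape(filter) + "\"}},"
                + "\"script\":{\"source\":\"cosineSimilarity(params.query_vector, 'embedding') + 1.0\","
                + "\"params\":{\"query_vector\":" + vectorJson(queryVector) + "}}}}}";
    }

    public static void main(String[] args) {
        System.out.println(body(new float[] {0.1f, 0.2f}, "metadata.country:BG", 5));
    }
}
```

With the low-level client, a body like this would typically be attached to a `Request` through `setJsonEntity` and sent with `RestClient.performRequest`. The index name and the response parsing are left out because the pull request's own choices for them are not visible in this excerpt.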