Adding SAP HanaDB as a vector store. #535
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,284 @@ | ||
# SpringAI using SAP HANA Cloud vector engine | ||
|
||
## How to set up a project that uses SAP HANA Cloud as the vector DB and leverages OpenAI to implement the RAG pattern | ||
|
||
1. Create a table `CRICKET_WORLD_CUP` in **SAP HANA DB**: | ||
``` sql | ||
CREATE TABLE CRICKET_WORLD_CUP ( | ||
_ID VARCHAR2(255) PRIMARY KEY, | ||
CONTENT CLOB, | ||
EMBEDDING REAL_VECTOR(1536) | ||
) | ||
``` | ||
|
||
2. Download information about the [Cricket World Cup](https://en.wikipedia.org/wiki/Cricket_World_Cup) from Wikipedia as a PDF file: | ||
|
||
 | ||
|
||
3. Add the following dependencies to your `pom.xml`. | ||
You may set the property **spring-ai-version** as `<spring-ai-version>1.0.0-SNAPSHOT</spring-ai-version>`: | ||
``` xml | ||
<dependency> | ||
<groupId>org.springframework.boot</groupId> | ||
<artifactId>spring-boot-starter-web</artifactId> | ||
</dependency> | ||
|
||
<dependency> | ||
<groupId>org.springframework.ai</groupId> | ||
<artifactId>spring-ai-pdf-document-reader</artifactId> | ||
<version>${spring-ai-version}</version> | ||
</dependency> | ||
|
||
<dependency> | ||
<groupId>org.springframework.ai</groupId> | ||
<artifactId>spring-ai-hanadb-store</artifactId> | ||
<version>${spring-ai-version}</version> | ||
</dependency> | ||
|
||
<dependency> | ||
<groupId>org.springframework.ai</groupId> | ||
<artifactId>spring-ai-openai-spring-boot-starter</artifactId> | ||
<version>${spring-ai-version}</version> | ||
</dependency> | ||
|
||
<dependency> | ||
<groupId>org.projectlombok</groupId> | ||
<artifactId>lombok</artifactId> | ||
<version>1.18.30</version> | ||
<scope>provided</scope> | ||
</dependency> | ||
``` | ||
|
||
4. Add the following properties to the `application.properties` file: | ||
``` properties | ||
spring.ai.openai.api-key=<YOUR_OPENAI_API_KEY> | ||
spring.ai.openai.embedding.options.model=text-embedding-ada-002 | ||
|
||
spring.datasource.driver-class-name=com.sap.db.jdbc.Driver | ||
spring.datasource.url=<YOUR_HANA_CLOUD_DB_URL> | ||
spring.datasource.username=<HANA_DB_USERNAME> | ||
spring.datasource.password=<HANA_DB_PASSWORD> | ||
|
||
hana.table.name=CRICKET_WORLD_CUP | ||
hana.similarity.search.topK=5 | ||
``` | ||
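The `EMBEDDING REAL_VECTOR(1536)` column from step 1 lines up with the 1536-dimensional vectors that `text-embedding-ada-002` produces. The repository in step 7 passes the embedding to `TO_REAL_VECTOR` as a string, so a minimal sketch of how such a string could be built, with a dimension check, might look like the following. The class and method names are hypothetical and not part of the PR, and the bracketed, comma-separated format is an assumption based on the usual `TO_REAL_VECTOR('[1,2,3]')` form.

```java
// Hypothetical helper, not part of this PR: builds the textual vector
// form that is assumed to be accepted by TO_REAL_VECTOR, e.g. "[0.5,-1.0,2.0]".
final class EmbeddingLiteralSketch {

    static String toVectorLiteral(float[] embedding, int expectedDimension) {
        // Fail early if the model output does not match the column dimension.
        if (embedding.length != expectedDimension) {
            throw new IllegalArgumentException(
                    "Expected " + expectedDimension + " dimensions but got " + embedding.length);
        }
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < embedding.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            // Uses Float.toString formatting; very small or very large values
            // are rendered in exponent notation.
            sb.append(embedding[i]);
        }
        return sb.append(']').toString();
    }
}
```

With the configuration above, `expectedDimension` would be 1536; a different embedding model would need a matching `REAL_VECTOR` column size.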
|
||
5. Add the **Cricket_World_Cup.pdf** to the `src/main/resources` directory: | ||
|
||
 | ||
|
||
6. Create an **Entity** class named `CricketWorldCup` that **extends** `HanaVectorEntity`: | ||
``` java | ||
package com.interviewpedia.spring.ai.openai.rag; | ||
|
||
import jakarta.persistence.Column; | ||
import jakarta.persistence.Entity; | ||
import jakarta.persistence.Table; | ||
import lombok.Data; | ||
import lombok.NoArgsConstructor; | ||
import lombok.experimental.SuperBuilder; | ||
import lombok.extern.jackson.Jacksonized; | ||
import org.springframework.ai.vectorstore.HanaVectorEntity; | ||
|
||
@Entity | ||
@Table(name = "CRICKET_WORLD_CUP") | ||
@Data | ||
@Jacksonized | ||
@SuperBuilder(toBuilder = true) | ||
@NoArgsConstructor | ||
public class CricketWorldCup extends HanaVectorEntity { | ||
@Column(name = "content") | ||
private String content; | ||
} | ||
``` | ||
|
||
7. Create a **Repository** named `CricketWorldCupRepository` that **implements** the `HanaVectorRepository` **interface**: | ||
``` java | ||
package com.interviewpedia.spring.ai.openai.rag; | ||
|
||
import jakarta.persistence.EntityManager; | ||
import jakarta.persistence.PersistenceContext; | ||
import jakarta.transaction.Transactional; | ||
import org.springframework.ai.vectorstore.HanaVectorRepository; | ||
import org.springframework.stereotype.Repository; | ||
|
||
import java.util.List; | ||
|
||
@Repository | ||
public class CricketWorldCupRepository implements HanaVectorRepository<CricketWorldCup> { | ||
@PersistenceContext | ||
private EntityManager entityManager; | ||
|
||
@Override | ||
@Transactional | ||
public void save(String tableName, String id, String embedding, String content) { | ||
String sql = String.format(""" | ||
INSERT INTO %s (_ID, EMBEDDING, CONTENT) | ||
VALUES(:_id, TO_REAL_VECTOR(:embedding), :content) | ||
""", tableName); | ||
|
||
entityManager.createNativeQuery(sql) | ||
.setParameter("_id", id) | ||
.setParameter("embedding", embedding) | ||
.setParameter("content", content) | ||
.executeUpdate(); | ||
} | ||
|
||
@Override | ||
@Transactional | ||
public int deleteEmbeddingsById(String tableName, List<String> idList) { | ||
String sql = String.format(""" | ||
DELETE FROM %s WHERE _ID IN (:ids) | ||
""", tableName); | ||
|
||
return entityManager.createNativeQuery(sql) | ||
.setParameter("ids", idList) | ||
.executeUpdate(); | ||
} | ||
|
||
@Override | ||
@Transactional | ||
public int deleteAllEmbeddings(String tableName) { | ||
String sql = String.format(""" | ||
DELETE FROM %s | ||
""", tableName); | ||
|
||
return entityManager.createNativeQuery(sql).executeUpdate(); | ||
} | ||
|
||
@Override | ||
public List<CricketWorldCup> cosineSimilaritySearch(String tableName, int topK, String queryEmbedding) { | ||
String sql = String.format(""" | ||
SELECT TOP :topK * FROM %s | ||
ORDER BY COSINE_SIMILARITY(EMBEDDING, TO_REAL_VECTOR(:queryEmbedding)) DESC | ||
""", tableName); | ||
|
||
return entityManager.createNativeQuery(sql, CricketWorldCup.class) | ||
.setParameter("topK", topK) | ||
.setParameter("queryEmbedding", queryEmbedding) | ||
.getResultList(); | ||
} | ||
} | ||
``` | ||
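To make the `cosineSimilaritySearch` query above easier to reason about, here is a minimal in-memory sketch of the same idea: score every row against the query vector by cosine similarity, sort in descending order, and keep the first `topK`. It is plain Java for illustration only; the class name is hypothetical, and in the actual setup the ranking is done by HANA, not by application code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative only: mirrors in memory what the SQL query asks the database to do.
final class CosineTopKSketch {

    // cos(a, b) = (a . b) / (|a| * |b|); returns 0.0 if either vector has zero length.
    static double cosineSimilarity(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the indices of the rows most similar to the query, best match first.
    static List<Integer> topK(List<float[]> rows, float[] query, int topK) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++) {
            indices.add(i);
        }
        indices.sort(Comparator
                .comparingDouble((Integer i) -> cosineSimilarity(rows.get(i), query))
                .reversed());
        return indices.subList(0, Math.min(topK, indices.size()));
    }
}
```

Cosine similarity ignores vector length and compares direction only, which is why the query orders by `DESC`: higher values mean closer matches.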
|
||
8. Create a **Configuration** class named `CricketWorldCupConfig` as follows: | ||
``` java | ||
package com.interviewpedia.spring.ai.openai.rag; | ||
|
||
import org.springframework.ai.embedding.EmbeddingClient; | ||
import org.springframework.ai.vectorstore.HanaCloudVectorStore; | ||
import org.springframework.ai.vectorstore.HanaCloudVectorStoreConfig; | ||
import org.springframework.ai.vectorstore.VectorStore; | ||
import org.springframework.beans.factory.annotation.Autowired; | ||
import org.springframework.beans.factory.annotation.Value; | ||
import org.springframework.context.annotation.Bean; | ||
import org.springframework.context.annotation.Configuration; | ||
import org.springframework.core.io.ResourceLoader; | ||
|
||
@Configuration | ||
public class CricketWorldCupConfig { | ||
@Autowired | ||
private ResourceLoader resourceLoader; | ||
|
||
@Value("${hana.table.name}") | ||
private String tableName; | ||
|
||
@Value("${hana.similarity.search.topK}") | ||
private int topK; | ||
|
||
@Bean | ||
public VectorStore hanaCloudVectorStore(CricketWorldCupRepository cricketWorldCupRepository, | ||
EmbeddingClient embeddingClient) { | ||
return new HanaCloudVectorStore(cricketWorldCupRepository, embeddingClient, | ||
HanaCloudVectorStoreConfig.builder() | ||
.tableName(tableName) | ||
.topK(topK) | ||
.build()); | ||
} | ||
} | ||
``` | ||
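Because the repository in step 7 interpolates the table name into SQL with `String.format` (identifiers cannot be bound as query parameters), the value of `hana.table.name` ends up verbatim in every statement. One possible safeguard, sketched below under the assumption that table names use only letters, digits, and underscores, is to validate the name once before it is handed to the vector store. `TableNameGuard` is a hypothetical helper and not part of the PR.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of this PR.
final class TableNameGuard {

    // Conservative assumption: letters, digits and underscores only,
    // and the name must not start with a digit.
    private static final Pattern SAFE_IDENTIFIER = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    // Returns the name unchanged if it looks like a plain identifier, otherwise throws.
    static String requireSafeTableName(String tableName) {
        if (tableName == null || !SAFE_IDENTIFIER.matcher(tableName).matches()) {
            throw new IllegalArgumentException("Unsafe table name: " + tableName);
        }
        return tableName;
    }
}
```

In the configuration class this could wrap the property value, for example `.tableName(TableNameGuard.requireSafeTableName(tableName))`.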
|
||
9. Create a **REST Controller** class named `CricketWorldCupHanaController` and **autowire** `ChatClient` and `VectorStore` as dependencies. | ||
In this controller class, create the following REST endpoints: | ||
- `/ai/hana-vector-store/cricket-world-cup/purge-embeddings` - purges all the embeddings from the vector store | ||
- `/ai/hana-vector-store/cricket-world-cup/upload` - uploads the Cricket_World_Cup.pdf so that its content is stored in | ||
the SAP HANA Cloud vector DB as embeddings | ||
- `/ai/hana-vector-store/cricket-world-cup` - implements **RAG** using [Cosine_Similarity in SAP HANA DB](https://help.sap.com/docs/hana-cloud-database/sap-hana-cloud-sap-hana-database-vector-engine-guide/vectors-vector-embeddings-and-metrics) | ||
``` java | ||
package com.interviewpedia.spring.ai.openai.rag; | ||
|
||
import lombok.extern.slf4j.Slf4j; | ||
import org.springframework.ai.chat.ChatClient; | ||
import org.springframework.ai.chat.messages.UserMessage; | ||
import org.springframework.ai.chat.prompt.Prompt; | ||
import org.springframework.ai.chat.prompt.SystemPromptTemplate; | ||
import org.springframework.ai.document.Document; | ||
import org.springframework.ai.reader.pdf.PagePdfDocumentReader; | ||
import org.springframework.ai.transformer.splitter.TokenTextSplitter; | ||
import org.springframework.ai.vectorstore.HanaCloudVectorStore; | ||
import org.springframework.ai.vectorstore.VectorStore; | ||
import org.springframework.beans.factory.annotation.Autowired; | ||
import org.springframework.core.io.Resource; | ||
import org.springframework.http.ResponseEntity; | ||
import org.springframework.web.bind.annotation.GetMapping; | ||
import org.springframework.web.bind.annotation.PostMapping; | ||
import org.springframework.web.bind.annotation.RequestParam; | ||
import org.springframework.web.bind.annotation.RestController; | ||
import org.springframework.web.multipart.MultipartFile; | ||
|
||
import java.io.IOException; | ||
import java.util.List; | ||
import java.util.Map; | ||
import java.util.function.Function; | ||
import java.util.function.Supplier; | ||
import java.util.stream.Collectors; | ||
|
||
@RestController | ||
@Slf4j | ||
public class CricketWorldCupHanaController { | ||
private final VectorStore hanaCloudVectorStore; | ||
private final ChatClient chatClient; | ||
|
||
@Autowired | ||
public CricketWorldCupHanaController(ChatClient chatClient, VectorStore hanaCloudVectorStore) { | ||
this.chatClient = chatClient; | ||
this.hanaCloudVectorStore = hanaCloudVectorStore; | ||
} | ||
|
||
@PostMapping("/ai/hana-vector-store/cricket-world-cup/purge-embeddings") | ||
public ResponseEntity<String> purgeEmbeddings() { | ||
int deleteCount = ((HanaCloudVectorStore) this.hanaCloudVectorStore).purgeEmbeddings(); | ||
log.info("{} embeddings purged from CRICKET_WORLD_CUP table in Hana DB", deleteCount); | ||
return ResponseEntity.ok().body(String.format("%d embeddings purged from CRICKET_WORLD_CUP table in Hana DB", deleteCount)); | ||
} | ||
|
||
@PostMapping("/ai/hana-vector-store/cricket-world-cup/upload") | ||
public ResponseEntity<String> handleFileUpload(@RequestParam("pdf") MultipartFile file) throws IOException { | ||
Resource pdf = file.getResource(); | ||
Supplier<List<Document>> reader = new PagePdfDocumentReader(pdf); | ||
Function<List<Document>, List<Document>> splitter = new TokenTextSplitter(); | ||
List<Document> documents = splitter.apply(reader.get()); | ||
log.info("{} documents created from pdf file: {}", documents.size(), pdf.getFilename()); | ||
hanaCloudVectorStore.accept(documents); | ||
return ResponseEntity.ok().body(String.format("%d documents created from pdf file: %s", | ||
documents.size(), pdf.getFilename())); | ||
} | ||
|
||
@GetMapping("/ai/hana-vector-store/cricket-world-cup") | ||
public Map<String, String> hanaVectorStoreSearch(@RequestParam(value = "message") String message) { | ||
var documents = this.hanaCloudVectorStore.similaritySearch(message); | ||
var inlined = documents.stream().map(Document::getContent).collect(Collectors.joining(System.lineSeparator())); | ||
var similarDocsMessage = new SystemPromptTemplate("Based on the following: {documents}") | ||
.createMessage(Map.of("documents", inlined)); | ||
|
||
var userMessage = new UserMessage(message); | ||
Prompt prompt = new Prompt(List.of(similarDocsMessage, userMessage)); | ||
String generation = chatClient.call(prompt).getResult().getOutput().getContent(); | ||
log.info("Generation: {}", generation); | ||
return Map.of("generation", generation); | ||
} | ||
} | ||
``` |
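The RAG endpoint above boils down to three steps: retrieve similar chunks, inline them into a system message, and send that together with the user's question. The middle step can be sketched in plain Java as below. This is a simplified stand-in that uses `String.replace` rather than Spring AI's `SystemPromptTemplate`, and the class name is hypothetical.

```java
import java.util.List;

// Illustrative only: shows how retrieved chunks become the system message text.
final class RagPromptSketch {

    // Same template text as in the controller; {documents} is the placeholder.
    private static final String TEMPLATE = "Based on the following: {documents}";

    static String buildSystemText(List<String> retrievedChunks) {
        // Join the chunk texts with the platform line separator, as the controller does.
        String inlined = String.join(System.lineSeparator(), retrievedChunks);
        return TEMPLATE.replace("{documents}", inlined);
    }
}
```

Since `hana.similarity.search.topK` bounds the number of retrieved chunks, it also bounds how much context is placed in the prompt.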
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,76 @@ | ||
<?xml version="1.0" encoding="UTF-8"?> | ||
<project xmlns="http://maven.apache.org/POM/4.0.0" | ||
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd"> | ||
<modelVersion>4.0.0</modelVersion> | ||
<parent> | ||
<groupId>org.springframework.ai</groupId> | ||
<artifactId>spring-ai</artifactId> | ||
<version>1.0.0-SNAPSHOT</version> | ||
<relativePath>../../pom.xml</relativePath> | ||
</parent> | ||
<artifactId>spring-ai-hanadb-store</artifactId> | ||
<packaging>jar</packaging> | ||
<name>Spring AI HanaDB Vector Store</name> | ||
<description>Spring AI HanaDB Vector Store</description> | ||
<url>https://github.com/spring-projects-experimental/spring-ai</url> | ||
|
||
<scm> | ||
<url>https://github.com/spring-projects/spring-ai</url> | ||
<connection>git://github.com/spring-projects/spring-ai.git</connection> | ||
<developerConnection>git@github.com:spring-projects/spring-ai.git</developerConnection> | ||
</scm> | ||
|
||
<dependencies> | ||
<dependency> | ||
<groupId>org.springframework.ai</groupId> | ||
<artifactId>spring-ai-core</artifactId> | ||
<version>${parent.version}</version> | ||
</dependency> | ||
|
||
<dependency> | ||
<groupId>org.springframework.boot</groupId> | ||
<artifactId>spring-boot-starter-data-jpa</artifactId> | ||
|
||
</dependency> | ||
|
||
<!-- HanaDB --> | ||
<dependency> | ||
<groupId>com.sap.cloud.db.jdbc</groupId> | ||
<artifactId>ngdbc</artifactId> | ||
<version>2.20.11</version> | ||
|
||
</dependency> | ||
|
||
<dependency> | ||
Review comment: Please don't use Lombok in the vector store implementation. Reply: +ACK |
||
<groupId>org.projectlombok</groupId> | ||
<artifactId>lombok</artifactId> | ||
<version>1.18.30</version> | ||
<scope>provided</scope> | ||
</dependency> | ||
|
||
<!-- TESTING --> | ||
<dependency> | ||
<groupId>org.springframework.ai</groupId> | ||
<artifactId>spring-ai-openai-spring-boot-starter</artifactId> | ||
<version>${parent.version}</version> | ||
<scope>test</scope> | ||
</dependency> | ||
|
||
<dependency> | ||
<groupId>org.springframework.ai</groupId> | ||
<artifactId>spring-ai-pdf-document-reader</artifactId> | ||
<version>${parent.version}</version> | ||
<scope>test</scope> | ||
</dependency> | ||
|
||
<dependency> | ||
<groupId>org.springframework.boot</groupId> | ||
<artifactId>spring-boot-starter-test</artifactId> | ||
<scope>test</scope> | ||
</dependency> | ||
<dependency> | ||
<groupId>org.testcontainers</groupId> | ||
<artifactId>junit-jupiter</artifactId> | ||
<version>${testcontainers.version}</version> | ||
<scope>test</scope> | ||
</dependency> | ||
</dependencies> | ||
</project> |
Review comment: Please convert the README into an adoc under vectordbs and add a link to it in nav.adoc.
The README can then contain a single-line link to the adoc documentation.
Reply: +ACK