Skip to content
Closed
Show file tree
Hide file tree
Changes from 2 commits
Commits
Show all changes
19 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions pom.xml
Original file line number Diff line number Diff line change
Expand Up @@ -70,6 +70,7 @@
<module>vector-stores/spring-ai-elasticsearch-store</module>
<module>spring-ai-spring-boot-starters/spring-ai-starter-watsonx-ai</module>
<module>spring-ai-spring-boot-starters/spring-ai-starter-elasticsearch-store</module>
<module>vector-stores/spring-ai-hanadb-store</module>
</modules>

<organization>
Expand Down
284 changes: 284 additions & 0 deletions vector-stores/spring-ai-hanadb-store/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,284 @@
# SpringAI using SAP HANA Cloud vector engine
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please convert the README into an adoc under the vectordbs and add a link to the nav.adoc.
The README can contain a single line link to the adoc documentation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+ACK


## How to setup a project that uses SAP Hana Cloud as the vector DB and leverage OpenAI to implement RAG pattern

1. Create a table `CRICKET_WORLD_CUP` in **SAP Hana DB**:
``` roomsql
CREATE TABLE CRICKET_WORLD_CUP (
_ID VARCHAR2(255) PRIMARY KEY,
CONTENT CLOB,
EMBEDDING REAL_VECTOR(1536)
)
```

2. Download information about [Cricket World Cup](https://en.wikipedia.org/wiki/Cricket_World_Cup) from wikipedia as a PDF file:

![Alt text](readme-assets/1.png?raw=true "Title")

3. Add the following dependencies in your `pom.xml`.
You may set the property **spring-ai-version** as `<spring-ai-version>1.0.0-SNAPSHOT</spring-ai-version>`:
``` xml
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>

<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-pdf-document-reader</artifactId>
<version>${spring-ai-version}</version>
</dependency>

<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-hanadb-store</artifactId>
<version>${spring-ai-version}</version>
</dependency>

<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-openai-spring-boot-starter</artifactId>
<version>${spring-ai-version}</version>
</dependency>

<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
<version>1.18.30</version>
<scope>provided</scope>
</dependency>
```

4. Add the following properties in `application.properties` file:
``` properties
spring.ai.openai.api-key=<YOUR_OPENAI_API_KEY>
spring.ai.openai.embedding.options.model=text-embedding-ada-002

spring.datasource.driver-class-name=com.sap.db.jdbc.Driver
spring.datasource.url=<YOUR_HANA_CLOUD_DB_URL>
spring.datasource.username=<HANA_DB_USERNAME>
spring.datasource.password=<HANA_DB_PASSWORD>

hana.table.name=CRICKET_WORLD_CUP
hana.similarity.search.topK=5
```

5. Add the **Cricket_World_Cup.pdf** to `src/main/resources` directory:

![Alt text](readme-assets/2.png?raw=true "Title")

6. Create an **Entity** class named `CricketWorldCup` that **extends** from `HanaVectorEntity`:
``` java
package com.interviewpedia.spring.ai.openai.rag;

import jakarta.persistence.Column;
import jakarta.persistence.Entity;
import jakarta.persistence.Table;
import lombok.Data;
import lombok.NoArgsConstructor;
import lombok.experimental.SuperBuilder;
import lombok.extern.jackson.Jacksonized;
import org.springframework.ai.vectorstore.HanaVectorEntity;

@Entity
@Table(name = "CRICKET_WORLD_CUP")
@Data
@Jacksonized
@SuperBuilder(toBuilder = true)
@NoArgsConstructor
public class CricketWorldCup extends HanaVectorEntity {
@Column(name = "content")
private String content;
}
```

7. Create a **Repository** named `CricketWorldCupRepository` that **implements** `HanaVectorRepository` **interface**:
``` java
package com.interviewpedia.spring.ai.openai.rag;

import jakarta.persistence.EntityManager;
import jakarta.persistence.PersistenceContext;
import jakarta.transaction.Transactional;
import org.springframework.ai.vectorstore.HanaVectorRepository;
import org.springframework.stereotype.Repository;

import java.util.List;

@Repository
public class CricketWorldCupRepository implements HanaVectorRepository<CricketWorldCup> {
@PersistenceContext
private EntityManager entityManager;

@Override
@Transactional
public void save(String tableName, String id, String embedding, String content) {
String sql = String.format("""
INSERT INTO %s (_ID, EMBEDDING, CONTENT)
VALUES(:_id, TO_REAL_VECTOR(:embedding), :content)
""", tableName);

entityManager.createNativeQuery(sql)
.setParameter("_id", id)
.setParameter("embedding", embedding)
.setParameter("content", content)
.executeUpdate();
}

@Override
@Transactional
public int deleteEmbeddingsById(String tableName, List<String> idList) {
String sql = String.format("""
DELETE FROM %s WHERE _ID IN (:ids)
""", tableName);

return entityManager.createNativeQuery(sql)
.setParameter("ids", idList)
.executeUpdate();
}

@Override
@Transactional
public int deleteAllEmbeddings(String tableName) {
String sql = String.format("""
DELETE FROM %s
""", tableName);

return entityManager.createNativeQuery(sql).executeUpdate();
}

@Override
public List<CricketWorldCup> cosineSimilaritySearch(String tableName, int topK, String queryEmbedding) {
String sql = String.format("""
SELECT TOP :topK * FROM %s
ORDER BY COSINE_SIMILARITY(EMBEDDING, TO_REAL_VECTOR(:queryEmbedding)) DESC
""", tableName);

return entityManager.createNativeQuery(sql, CricketWorldCup.class)
.setParameter("topK", topK)
.setParameter("queryEmbedding", queryEmbedding)
.getResultList();
}
}
```

8. Create a **Configuration** class named `CricketWorldCupConfig` as follows:
``` java
package com.interviewpedia.spring.ai.openai.rag;

import org.springframework.ai.embedding.EmbeddingClient;
import org.springframework.ai.image.ImageClient;
import org.springframework.ai.openai.OpenAiImageClient;
import org.springframework.ai.openai.api.OpenAiImageApi;
import org.springframework.ai.vectorstore.HanaCloudVectorStore;
import org.springframework.ai.vectorstore.HanaCloudVectorStoreConfig;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.ResourceLoader;

@Configuration
public class CricketWorldCupConfig {
@Autowired
private ResourceLoader resourceLoader;

@Value("${hana.table.name}")
private String tableName;

@Value("${hana.similarity.search.topK}")
private int topK;

@Bean
public VectorStore hanaCloudVectorStore(CricketWorldCupRepository cricketWorldCupRepository,
EmbeddingClient embeddingClient) {
return new HanaCloudVectorStore(cricketWorldCupRepository, embeddingClient,
HanaCloudVectorStoreConfig.builder()
.tableName(tableName)
.topK(topK)
.build());
}
}
```

9. Now, create a **REST Controller** class `CricketWorldCupHanaController`, and **autowire** `ChatClient` and `VectorStore` as dependencies:
In this controller class, create the following REST endpoints:
- `/ai/hana-vector-store/cricket-world-cup/purge-embeddings` - to purge all the embeddings from the Vector Store
- `/ai/hana-vector-store/cricket-world-cup/upload` - to upload the Cricket_World_Cup.pdf so that its data gets stored in
SAP Hana Cloud Vector DB as embeddings
- `/ai/hana-vector-store/cricket-world-cup` - to implement **RAG** using [Cosine_Similarity in SAP Hana DB](https://help.sap.com/docs/hana-cloud-database/sap-hana-cloud-sap-hana-database-vector-engine-guide/vectors-vector-embeddings-and-metrics)
``` java
package com.interviewpedia.spring.ai.openai.rag;

import lombok.extern.slf4j.Slf4j;
import org.springframework.ai.chat.ChatClient;
import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.chat.prompt.SystemPromptTemplate;
import org.springframework.ai.document.Document;
import org.springframework.ai.reader.pdf.PagePdfDocumentReader;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import org.springframework.ai.vectorstore.HanaCloudVectorStore;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.core.io.Resource;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.multipart.MultipartFile;

import java.io.IOException;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collectors;

@RestController
@Slf4j
public class CricketWorldCupHanaController {
private final VectorStore hanaCloudVectorStore;
private final ChatClient chatClient;

@Autowired
public CricketWorldCupHanaController(ChatClient chatClient, VectorStore hanaCloudVectorStore) {
this.chatClient = chatClient;
this.hanaCloudVectorStore = hanaCloudVectorStore;
}

@PostMapping("/ai/hana-vector-store/cricket-world-cup/purge-embeddings")
public ResponseEntity<String> purgeEmbeddings() {
int deleteCount = ((HanaCloudVectorStore) this.hanaCloudVectorStore).purgeEmbeddings();
log.info("{} embeddings purged from CRICKET_WORLD_CUP table in Hana DB", deleteCount);
return ResponseEntity.ok().body(String.format("%d embeddings purged from CRICKET_WORLD_CUP table in Hana DB", deleteCount));
}

@PostMapping("/ai/hana-vector-store/cricket-world-cup/upload")
public ResponseEntity<String> handleFileUpload(@RequestParam("pdf") MultipartFile file) throws IOException {
Resource pdf = file.getResource();
Supplier<List<Document>> reader = new PagePdfDocumentReader(pdf);
Function<List<Document>, List<Document>> splitter = new TokenTextSplitter();
List<Document> documents = splitter.apply(reader.get());
log.info("{} documents created from pdf file: {}", documents.size(), pdf.getFilename());
hanaCloudVectorStore.accept(documents);
return ResponseEntity.ok().body(String.format("%d documents created from pdf file: %s",
documents.size(), pdf.getFilename()));
}

@GetMapping("/ai/hana-vector-store/cricket-world-cup")
public Map<String, String> hanaVectorStoreSearch(@RequestParam(value = "message") String message) {
var documents = this.hanaCloudVectorStore.similaritySearch(message);
var inlined = documents.stream().map(Document::getContent).collect(Collectors.joining(System.lineSeparator()));
var similarDocsMessage = new SystemPromptTemplate("Based on the following: {documents}")
.createMessage(Map.of("documents", inlined));

var userMessage = new UserMessage(message);
Prompt prompt = new Prompt(List.of(similarDocsMessage, userMessage));
String generation = chatClient.call(prompt).getResult().getOutput().getContent();
log.info("Generation: {}", generation);
return Map.of("generation", generation);
}
}
```
76 changes: 76 additions & 0 deletions vector-stores/spring-ai-hanadb-store/pom.xml
Original file line number Diff line number Diff line change
@@ -0,0 +1,76 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai</artifactId>
<version>1.0.0-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>
<artifactId>spring-ai-hanadb-store</artifactId>
<packaging>jar</packaging>
<name>Spring AI HanaDB Vector Store</name>
<description>Spring AI HanaDB Vector Store</description>
<url>https://github.com/spring-projects-experimental/spring-ai</url>

<scm>
<url>https://github.com/spring-projects/spring-ai</url>
<connection>git://github.com/spring-projects/spring-ai.git</connection>
<developerConnection>[email protected]:spring-projects/spring-ai.git</developerConnection>
</scm>

<dependencies>
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-core</artifactId>
<version>${parent.version}</version>
</dependency>

<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-jpa</artifactId>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I missed this.
Please try to replace the boot starter with the relevant data jpa dependency.
We should not use starters here, not even for the testing.
(P.S. I know that we leaked the logging starter in some places that we will have to fix)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Should be fixed now. Thanks

</dependency>

<!-- HanaDB -->
<dependency>
<groupId>com.sap.cloud.db.jdbc</groupId>
<artifactId>ngdbc</artifactId>
<version>2.20.11</version>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move the version as a property in the parent pom.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+ACK

</dependency>

<dependency>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't use the lombok in the vector store implementation.
Later might be fine for end user applications, but because of multiple technical and compatibility issues is inappropriate for utilities/libraries such as spring-ai.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+ACK

<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
<version>1.18.30</version>
<scope>provided</scope>
</dependency>

<!-- TESTING -->
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-openai-spring-boot-starter</artifactId>
<version>${parent.version}</version>
<scope>test</scope>
</dependency>

<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-pdf-document-reader</artifactId>
<version>${parent.version}</version>
<scope>test</scope>
</dependency>

<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.testcontainers</groupId>
<artifactId>junit-jupiter</artifactId>
<version>${testcontainers.version}</version>
<scope>test</scope>
</dependency>
</dependencies>
</project>
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading