Commit d5b1589

Added new spring-ai couchbase vector store tutorial (#53)
* Added new spring-ai demo tutorial
* Changed section on metadata filtering
* Added more references, changed erroneous statements, sentences changed
1 parent 31a315b commit d5b1589

1 file changed: 230 additions and 0 deletions
@@ -0,0 +1,230 @@
---
# frontmatter
path: "/tutorial-java-spring-ai"
title: Couchbase Vector Search using Spring AI
short_title: Spring AI Vector Storage
description:
  - Learn how to configure and use Couchbase vector search with Spring AI
  - Learn how to vectorize data with Spring AI
  - Learn how to retrieve vector data from Couchbase
content_type: tutorial
filter: sdk
technology:
  - connectors
  - vector search
tags:
  - LangChain
  - Artificial Intelligence
  - Data Ingestion
sdk_language:
  - java
length: 10 Mins
---

## About This Tutorial
This tutorial is a quick guide to getting started with Spring AI and Couchbase as a vector store. It walks through configuring the store, vectorizing and loading sample data, and running a similarity search against that data.
### Example Source Code
The example source code for this tutorial is available in the [Spring AI demo application with Couchbase Vector Store](https://github.com/couchbase-examples/couchbase-spring-ai-demo) repository.
To get it, clone the repository using git:
```shell
git clone https://github.com/couchbase-examples/couchbase-spring-ai-demo.git
cd couchbase-spring-ai-demo
```

### What is Spring AI?

Spring AI is an extension of the Spring Framework that simplifies the integration of AI capabilities into Spring applications. It provides abstractions and integrations for working with various AI services and models, so developers can add AI functionality without managing low-level implementation details.

Key features of Spring AI include:
- **Model integrations**: Pre-built connectors to popular AI models (like OpenAI)
- **Prompt engineering**: Tools for crafting and managing prompts
- **Vector stores**: Abstractions for storing and retrieving vector embeddings
- **Document processing**: Utilities for working with unstructured data

##### Why Use Spring AI?

Spring AI brings several benefits to Java developers:
1. **Familiar programming model**: Uses Spring's dependency injection and configuration
2. **Abstraction layer**: Provides consistent interfaces across different AI providers
3. **Enterprise-ready**: Built with production use cases in mind
4. **Simplified development**: Reduces boilerplate code for AI integrations

For more information, see:
- [Spring AI](https://docs.spring.io/spring-ai/reference/index.html)
- [Spring AI GitHub Page](https://github.com/spring-projects/spring-ai)

## Couchbase Embedding Store
The Couchbase integration for Spring AI stores each embedding in a separate document and uses a Full Text Search (FTS) vector index to query the stored vectors.
- [Couchbase Integration with Spring AI Documentation](https://docs.spring.io/spring-ai/reference/api/vectordbs/couchbase.html)

## Project Structure

```
src/main/java/com/couchbase_spring_ai/demo/
├── Config.java                              # Application configuration
├── Controller.java                          # REST API endpoints
└── CouchbaseSpringAiDemoApplications.java   # Application entry point

src/main/resources/
├── application.properties                   # Application settings
└── bbc_news_data.json                       # Sample data
```

## Setup and Configuration

### Prerequisites
- [Couchbase Capella](https://docs.couchbase.com/cloud/get-started/create-account.html) account or locally installed [Couchbase Server](/tutorial-couchbase-installation-options) (vector search requires Couchbase Server 7.6 or later)
- Java 17
- Maven
- OpenAI API key

### Configuration Details

The application is configured in `application.properties`. Replace the placeholder values with your own API key, connection string, and credentials:

```properties
spring.application.name=spring-ai-demo
spring.ai.openai.api-key=your-openai-api-key
spring.couchbase.connection-string=couchbase://127.0.0.1
spring.couchbase.username=Administrator
spring.couchbase.password=password
```

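## Key Components

### Configuration Class (`Config.java`)

This class creates the necessary beans for:
- Connecting to the Couchbase cluster
- Setting up the OpenAI embedding model. The API key is read from the `spring.ai.openai.api-key` property; instead of hard-coding it in `application.properties`, you can supply it through the `SPRING_AI_OPENAI_API_KEY` environment variable, which Spring Boot maps to the same property.
- Configuring the Couchbase vector store

```java
// Package names below match Spring AI 1.0.x; adjust them if your version differs.
import com.couchbase.client.java.Cluster;
import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.ai.openai.OpenAiEmbeddingModel;
import org.springframework.ai.openai.api.OpenAiApi;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.ai.vectorstore.couchbase.CouchbaseSearchVectorStore;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class Config {
    // Connection settings injected from application.properties
    @Value("${spring.couchbase.connection-string}")
    private String connectionUrl;
    @Value("${spring.couchbase.username}")
    private String username;
    @Value("${spring.couchbase.password}")
    private String password;
    @Value("${spring.ai.openai.api-key}")
    private String openaiKey;

    // Shared connection to the Couchbase cluster
    @Bean
    public Cluster cluster() {
        return Cluster.connect(this.connectionUrl, this.username, this.password);
    }

    // Flag passed to the vector store builder's initializeSchema(...) setting
    @Bean
    public Boolean initializeSchema() {
        return true;
    }

    // Embedding model used to vectorize documents and queries
    @Bean
    public EmbeddingModel embeddingModel() {
        return new OpenAiEmbeddingModel(OpenAiApi.builder().apiKey(this.openaiKey).build());
    }

    // Vector store backed by the "test" bucket, scope, and collection
    @Bean
    public VectorStore couchbaseSearchVectorStore(Cluster cluster,
                                                  EmbeddingModel embeddingModel,
                                                  Boolean initializeSchema) {
        return CouchbaseSearchVectorStore
                .builder(cluster, embeddingModel)
                .bucketName("test")
                .scopeName("test")
                .collectionName("test")
                .initializeSchema(initializeSchema)
                .build();
    }
}
```
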
The vector store is configured to use:
- Bucket: "test"
- Scope: "test"
- Collection: "test"

### Vector Store Integration

The application uses `CouchbaseSearchVectorStore`, which:
- Stores document embeddings in Couchbase
- Provides similarity search capabilities
- Maintains metadata alongside vector embeddings

### Vector Index
The embedding store uses an FTS vector index to perform vector similarity lookups. If the configured index name does
not match an existing index on the cluster, the store attempts to create a new index with a default configuration
based on the provided initialization settings. Review the settings of the created index manually and adjust them to
your specific use case. More information about vector search and FTS index configuration can be found in the
[Couchbase Documentation](https://docs.couchbase.com/server/current/vector-search/vector-search.html).

### Controller Class (`Controller.java`)

Provides REST API endpoints:
- `/tutorial/load`: Loads sample BBC news data into Couchbase
- `/tutorial/search`: Performs a semantic search for sports-related news articles

##### Load functionality
```java
// ...
Document doc = new Document(String.format("%s", i + 1), j.getString("content"), Map.of("title", j.getString("title")));
// ...
this.couchbaseSearchVectorStore.add(List.of(doc)); // add(...) expects a List<Document>
// ...
```

- A new `Document` object is created for each article. Its ID is `String.format("%s", i + 1)`, derived from the loop index `i`, so every article gets a unique ID that stays the same across repeated calls to the endpoint. Metadata is added as a map with the key `"title"` and the corresponding value from the previously parsed JSON object `j`.
- The document is then added to `couchbaseSearchVectorStore`, the vector store bean that handles storing documents in Couchbase. The document is wrapped in a list because `add` accepts a `List<Document>`. This operation vectorizes the document content and stores it in a format suitable for vector search.
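
To illustrate why stable IDs are useful, here is a minimal, self-contained sketch. It does not use Spring AI or Couchbase: the `LinkedHashMap` is a hypothetical stand-in for a store that overwrites documents with the same ID (an assumption about the store's write behavior that you should verify for your version), and the `StableIdSketch` class and sample titles are invented for this example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: a Map stands in for a store that overwrites by document ID.
class StableIdSketch {

    // Mirrors the tutorial's ID scheme: the 1-based position of the article.
    static String idFor(int index) {
        return String.format("%s", index + 1);
    }

    // "Loads" every title into the stand-in store, keyed by its stable ID.
    static void load(Map<String, String> store, List<String> titles) {
        for (int i = 0; i < titles.size(); i++) {
            store.put(idFor(i), titles.get(i)); // put() overwrites an existing key
        }
    }

    public static void main(String[] args) {
        Map<String, String> store = new LinkedHashMap<>();
        List<String> titles = List.of("Cup final preview", "Budget update", "New phone review");

        load(store, titles);
        load(store, titles); // a second load reuses the same IDs

        System.out.println(store.keySet()); // prints [1, 2, 3]
        System.out.println(store.size());   // prints 3
    }
}
```

Under that assumption, calling the load endpoint twice leaves one entry per article instead of duplicates, because the second pass writes to the same keys.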


##### Search functionality
```java
List<Document> results = this.couchbaseSearchVectorStore.similaritySearch(SearchRequest.builder()
        .query("Give me some sports news")
        .similarityThreshold(0.75)
        .topK(15)
        .build());

// Reduce each Document to its text content and metadata
return results.stream()
        .map(doc -> Map.of("content", doc.getText(), "metadata", doc.getMetadata()))
        .collect(Collectors.toList());
```

- A `SearchRequest` is built with the query string "Give me some sports news". The `similarityThreshold` is set to 0.75, so only documents with a similarity score at or above this threshold are returned. The `topK` parameter is set to 15, so at most the 15 most similar documents are returned.
- The `similaritySearch` method of `couchbaseSearchVectorStore` is called with the built `SearchRequest`. This method performs a vector similarity search against the stored documents.
- The results, a list of `Document` objects, are processed using Java Streams. Each document is mapped to a simplified structure containing its text content and metadata. The final result is a list of maps, each representing a document with its content and metadata.
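
The interaction between the threshold and `topK` can be sketched in plain Java. This is only an illustration under stated assumptions: it uses toy three-dimensional vectors and cosine similarity, while the real store delegates scoring to the Couchbase vector index, whose similarity metric depends on the index configuration. The `ThresholdTopKSketch` class, its methods, and the sample data are hypothetical.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustration only: toy 3-dimensional vectors instead of real embeddings.
class ThresholdTopKSketch {

    // Cosine similarity: dot(a, b) / (|a| * |b|).
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Keeps IDs scoring at or above the threshold, best first, at most topK of them.
    static List<String> search(Map<String, double[]> stored, double[] query,
                               double threshold, int topK) {
        return stored.entrySet().stream()
                .map(e -> Map.entry(e.getKey(), cosine(e.getValue(), query)))
                .filter(e -> e.getValue() >= threshold)
                .sorted(Map.Entry.<String, Double>comparingByValue(Comparator.reverseOrder()))
                .limit(topK)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, double[]> stored = Map.of(
                "football", new double[] {1.0, 0.0, 0.0},
                "tennis",   new double[] {0.8, 0.6, 0.0},
                "politics", new double[] {0.0, 0.0, 1.0});
        double[] query = {1.0, 0.0, 0.0};

        System.out.println(search(stored, query, 0.75, 15)); // prints [football, tennis]
        System.out.println(search(stored, query, 0.75, 1));  // prints [football]
    }
}
```

The threshold removes weak matches regardless of how many slots are left, and `topK` then caps the number of results, so a search can return fewer than `topK` documents.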

## Using the Application

The application is a Spring Boot project that exposes two endpoints, `/tutorial/load` and `/tutorial/search`.
To run it, use the following command:
`./mvnw spring-boot:run`


### Loading Data

1. Start the application
2. Make a GET request to `http://localhost:8080/tutorial/load`
3. This loads BBC news articles from the included JSON file into Couchbase, creating embeddings via OpenAI

### Performing Similarity Searches

1. Make a GET request to `http://localhost:8080/tutorial/search`
2. The application searches for documents semantically similar to "Give me some sports news"
3. Results are returned with content and metadata, sorted by similarity score
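
The two requests above can also be issued from Java. The following is a minimal sketch using the JDK's built-in `java.net.http` client; it assumes the demo is running locally on port 8080, and the `TutorialClientSketch` class and its `get` helper are hypothetical names introduced for this example.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Illustration only: assumes the demo application is running on localhost:8080.
class TutorialClientSketch {

    static final String BASE_URL = "http://localhost:8080";

    // Builds a GET request for one of the tutorial endpoints, e.g. "/tutorial/load".
    static HttpRequest get(String path) {
        return HttpRequest.newBuilder(URI.create(BASE_URL + path)).GET().build();
    }

    public static void main(String[] args) throws InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        try {
            // Load the sample articles first; this may take some time because
            // the articles are embedded through the OpenAI API.
            HttpResponse<String> load =
                    client.send(get("/tutorial/load"), HttpResponse.BodyHandlers.ofString());
            System.out.println("load status: " + load.statusCode());

            // Then run the similarity search and print the raw response body.
            HttpResponse<String> search =
                    client.send(get("/tutorial/search"), HttpResponse.BodyHandlers.ofString());
            System.out.println(search.body());
        } catch (IOException e) {
            System.err.println("Request failed (is the application running?): " + e.getMessage());
        }
    }
}
```

Any HTTP client works equally well here, since both endpoints are plain GET requests without parameters.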


## Resources

- [Spring AI Documentation](https://docs.spring.io/spring-ai/reference/index.html)
- [Couchbase Vector Search](https://docs.couchbase.com/server/current/fts/vector-search.html)
- [OpenAI Embeddings Documentation](https://platform.openai.com/docs/guides/embeddings)
- [Spring Boot Documentation](https://docs.spring.io/spring-boot/docs/current/reference/html/)
