Commit c77fa76

Merge pull request #49 from TheovanKraay/java-rag-chatbot
spring java rag chatbot
2 parents 63f40c2 + 2948211 commit c77fa76

41 files changed

+7091
-0
lines changed

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
HELP.md
env.sh
target/
!.mvn/wrapper/maven-wrapper.jar
!**/src/main/**/target/
!**/src/test/**/target/

### STS ###
.apt_generated
.classpath
.factorypath
.project
.settings
.springBeans
.sts4-cache

### IntelliJ IDEA ###
.idea
*.iws
*.iml
*.ipr

### NetBeans ###
/nbproject/private/
/nbbuild/
/dist/
/nbdist/
/.nb-gradle/
build/
!**/src/main/**/build/
!**/src/test/**/build/

### VS Code ###
.vscode/

### AZD
.azure
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
MIT License

Copyright (c) Microsoft Corporation.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE
Lines changed: 81 additions & 0 deletions
@@ -0,0 +1,81 @@
# Spring ChatGPT Sample with Azure Cosmos DB

This sample shows how to build a ChatGPT-like application in Spring and run it on Azure Spring Apps with Azure Cosmos DB. The vector store in Azure Cosmos DB lets the application use your private data to answer questions.

### Application Architecture

This application uses the following Azure resources:

- [**Azure Spring Apps**](https://docs.microsoft.com/azure/spring-apps/) to host the application.
- [**Azure OpenAI**](https://docs.microsoft.com/azure/cognitive-services/openai/) for the chat completion and embedding APIs.
- [**Azure Cosmos DB NoSQL API**](https://learn.microsoft.com/azure/cosmos-db/nosql/vector-search) as the vector store database.

Here's a high-level architecture diagram that illustrates these components.

!["Application architecture diagram"](assets/resources.png)

## How it works

![Workflow](./docs/workflow.png)

1. Indexing flow (CLI)
   1. Load private documents from your local disk.
   1. Split the text into chunks.
   1. Convert the text chunks into embeddings.
   1. Save the embeddings into the Cosmos DB vector store.
1. Query flow (Web API)
   1. Convert the user's query text to an embedding.
   1. Query the top-K nearest text chunks from the Cosmos DB vector store (by cosine similarity).
   1. Populate the prompt template with the chunks.
   1. Call the Azure OpenAI chat completion API.
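
The query flow above can be sketched in plain Java. This is a minimal in-memory illustration under stated assumptions, not the sample's actual code: `QueryFlowSketch`, `Chunk`, `topK`, and `buildPrompt` are hypothetical names, the prompt wording is invented for illustration, and a real deployment would delegate the nearest-neighbor search to the Cosmos DB vector store instead of sorting in memory.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

/** Illustrative stand-in for the query flow; every name here is hypothetical. */
final class QueryFlowSketch {

    /** A text chunk paired with its embedding vector. */
    record Chunk(String text, double[] embedding) {}

    /** Cosine similarity between two vectors of equal length. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0; // a zero vector is treated as dissimilar to everything
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the k chunks most similar to the query embedding, best match first. */
    static List<Chunk> topK(double[] query, List<Chunk> chunks, int k) {
        return chunks.stream()
                .sorted(Comparator.comparingDouble((Chunk c) -> cosine(query, c.embedding())).reversed())
                .limit(k)
                .collect(Collectors.toList());
    }

    /** Fills a simple (assumed) prompt template with the retrieved chunks and the question. */
    static String buildPrompt(List<Chunk> context, String question) {
        String joined = context.stream().map(Chunk::text).collect(Collectors.joining("\n---\n"));
        return "Answer the question using only the context below.\n\nContext:\n"
                + joined + "\n\nQuestion: " + question;
    }
}
```

The string returned by `buildPrompt` is what would be sent to the chat completion API in the last step; the embedding of the query itself would come from the embedding API, which is omitted here.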

## Getting Started

### Prerequisites

The following prerequisites are required to use this application. Please ensure that you have them all installed or set up.

- [Git](http://git-scm.com/)
- [Java 17 or later](https://learn.microsoft.com/java/openjdk/install)
- [Azure Cosmos DB NoSQL API account](https://learn.microsoft.com/azure/cosmos-db/nosql/how-to-create-account)
- An Azure OpenAI account (see more [here](https://customervoice.microsoft.com/Pages/ResponsePage.aspx?id=v4j5cvGGr0GRqy180BHbR7en2Ais5pxKtso_Pz4b1_xUOFA5Qk1UWDRBMjg0WFhPMkIzTzhKQ1dWNyQlQCN0PWcu))

### Quickstart

1. Clone this repo with `git clone`.
2. Set the following system environment variables to the appropriate values (the `set` syntax below is for the Windows Command Prompt; on Linux or macOS, use `export NAME=value` instead):

```shell
set AZURE_OPENAI_EMBEDDINGDEPLOYMENTID=<Your Azure OpenAI embedding deployment id>
set AZURE_OPENAI_CHATDEPLOYMENTID=<Your Azure OpenAI chat deployment id>
set AZURE_OPENAI_ENDPOINT=<Your Azure OpenAI endpoint>
set AZURE_OPENAI_APIKEY=<Your Azure OpenAI API key>
set COSMOS_URI=<Cosmos DB NoSQL Account URI>
set COSMOS_KEY=<Cosmos DB NoSQL Account Key>
```

3. Build the application:

```shell
mvn clean package
```

4. The following command reads and processes your own private text documents, creates a Cosmos DB NoSQL API container with [vector indexing](https://learn.microsoft.com/azure/cosmos-db/nosql/vector-search#vector-indexing-policies) and [embeddings](https://learn.microsoft.com/azure/cosmos-db/nosql/vector-search#container-vector-policies) policies (see `com.microsoft.azure.spring.chatgpt.sample.common.store.CosmosDBVectorStore`), and loads the processed documents into it:

```shell
java -jar spring-chatgpt-sample-cli/target/spring-chatgpt-sample-cli-0.0.1-SNAPSHOT.jar --from=C:/<path to your private text docs>
```
> Note: if you don't run the above command to process your own documents, the application will, at first startup, read a pre-provided and pre-processed `vector-store.json` file in the `private-data` folder and load those documents into Cosmos DB instead.

5. Run the application:

```shell
java -jar spring-chatgpt-sample-webapi/target/spring-chatgpt-sample-webapi-0.0.1-SNAPSHOT.jar
```
6. Open your browser and navigate to `http://localhost:8080/`. You should see the page below. Test it out by typing in a question and clicking `Send`.

!["Screenshot of deployed chatgpt app"](assets/chatgpt.png)

<sup>Screenshot of the deployed chatgpt app</sup>
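
Step 2 of the Quickstart relies on six environment variables, and a missing one typically only surfaces as a startup failure. As a hedged illustration, the check below reports which of them are unset or blank. The class `RequiredEnvCheck` and its `missing` method are hypothetical helpers that are not part of this sample; only the variable names are taken from the README.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical pre-flight check for the environment variables listed in the Quickstart. */
final class RequiredEnvCheck {

    /** Variable names copied from step 2 of the Quickstart. */
    static final List<String> REQUIRED = List.of(
            "AZURE_OPENAI_EMBEDDINGDEPLOYMENTID",
            "AZURE_OPENAI_CHATDEPLOYMENTID",
            "AZURE_OPENAI_ENDPOINT",
            "AZURE_OPENAI_APIKEY",
            "COSMOS_URI",
            "COSMOS_KEY");

    /** Returns the required names that are absent or blank in the given environment. */
    static List<String> missing(Map<String, String> env) {
        List<String> result = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = env.get(name);
            if (value == null || value.isBlank()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> missing = missing(System.getenv());
        if (missing.isEmpty()) {
            System.out.println("All required environment variables are set.");
        } else {
            System.out.println("Missing environment variables: " + String.join(", ", missing));
        }
    }
}
```

Running `main` before steps 4 and 5 gives a quick readout of what still needs to be set; passing the environment in as a `Map` keeps `missing` easy to test.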
Lines changed: 72 additions & 0 deletions
@@ -0,0 +1,72 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.microsoft.azure</groupId>
    <artifactId>spring-chatgpt-sample-cosmos</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>spring-chatgpt-sample-cosmos</name>
    <description>Demo project for Spring Boot</description>
    <packaging>pom</packaging>
    <modules>
        <module>spring-chatgpt-sample-common</module>
        <module>spring-chatgpt-sample-webapi</module>
        <module>spring-chatgpt-sample-cli</module>
    </modules>

    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>3.0.6</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>

    <properties>
        <java.version>17</java.version>
        <maven.compiler.source>17</maven.compiler.source>
        <maven.compiler.target>17</maven.compiler.target>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>com.azure</groupId>
            <artifactId>azure-spring-data-cosmos</artifactId>
            <version>LATEST</version> <!-- LATEST makes builds non-reproducible; consider pinning an explicit version -->
        </dependency>
    </dependencies>
    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>com.azure</groupId>
                <artifactId>azure-ai-openai</artifactId>
                <version>1.0.0-beta.2</version>
            </dependency>
            <dependency>
                <groupId>com.knuddels</groupId>
                <artifactId>jtokkit</artifactId>
                <version>0.5.0</version>
            </dependency>
        </dependencies>
    </dependencyManagement>
    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
                <configuration>
                    <excludes>
                        <exclude>
                            <groupId>org.projectlombok</groupId>
                            <artifactId>lombok</artifactId>
                        </exclude>
                    </excludes>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>com.microsoft.azure</groupId>
        <artifactId>spring-chatgpt-sample-cosmos</artifactId>
        <version>0.0.1-SNAPSHOT</version>
        <relativePath>../pom.xml</relativePath>
    </parent>
    <artifactId>spring-chatgpt-sample-cli</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>spring-chatgpt-sample-cli</name>
    <description>Demo project for Spring Boot</description>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter</artifactId>
        </dependency>
        <dependency>
            <groupId>com.azure</groupId>
            <artifactId>azure-ai-openai</artifactId>
        </dependency>
        <dependency>
            <groupId>com.microsoft.azure</groupId>
            <artifactId>spring-chatgpt-sample-common</artifactId>
            <version>0.0.1-SNAPSHOT</version>
        </dependency>
        <dependency>
            <groupId>com.azure</groupId>
            <artifactId>azure-spring-data-cosmos</artifactId>
            <version>LATEST</version> <!-- LATEST makes builds non-reproducible; consider pinning an explicit version -->
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
                <configuration>
                    <skip>false</skip>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>
@@ -0,0 +1,33 @@
package com.microsoft.azure.spring.chatgpt.sample.cli;

import com.microsoft.azure.spring.chatgpt.sample.common.DocumentIndexPlanner;
import org.springframework.boot.ApplicationArguments;
import org.springframework.boot.ApplicationRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

import java.io.IOException;

@SpringBootApplication
public class CliApplication implements ApplicationRunner {

    public CliApplication(DocumentIndexPlanner indexPlanner) {
        this.indexPlanner = indexPlanner;
    }

    private final DocumentIndexPlanner indexPlanner;

    public static void main(String[] args) {
        SpringApplication.run(CliApplication.class, args);
    }

    @Override
    public void run(ApplicationArguments args) throws IOException {
        // getOptionValues returns null when --from is absent; exactly one value is expected.
        var from = args.getOptionValues("from");
        if (from == null || from.size() != 1) {
            System.err.println("argument --from is required.");
            System.exit(-1);
        }
        // Index every document found under the given folder.
        indexPlanner.buildFromFolder(from.get(0));
    }
}
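
The `--from` check in `run` depends on how option arguments are collected. The sketch below is a simplified, dependency-free approximation of that behavior for `--name=value` arguments. It is not Spring's parser, and `FromOptionSketch`, `optionValues`, and `isValidFrom` are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for option-argument handling; not Spring's implementation. */
final class FromOptionSketch {

    /**
     * Collects the values of every "--name=value" argument.
     * Returns null when the option never appears, and an empty list for a bare "--name".
     */
    static List<String> optionValues(String[] args, String name) {
        String flag = "--" + name;
        List<String> values = null;
        for (String arg : args) {
            if (arg.equals(flag)) {
                if (values == null) {
                    values = new ArrayList<>(); // option present, but without a value
                }
            } else if (arg.startsWith(flag + "=")) {
                if (values == null) {
                    values = new ArrayList<>();
                }
                values.add(arg.substring(flag.length() + 1));
            }
        }
        return values;
    }

    /** Mirrors the condition in run(): the option must be present with exactly one value. */
    static boolean isValidFrom(List<String> from) {
        return from != null && from.size() == 1;
    }
}
```

Under this reading, passing `--from` twice or as a bare flag is rejected just like omitting it, which is why the error path is guarded by `from.size() != 1` and not only by the null check.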
@@ -0,0 +1,86 @@
package com.microsoft.azure.spring.chatgpt.sample.cli;

import com.azure.ai.openai.OpenAIClientBuilder;
import com.azure.core.credential.AzureKeyCredential;
import com.azure.cosmos.CosmosClientBuilder;
import com.azure.cosmos.DirectConnectionConfig;
import com.azure.spring.data.cosmos.config.AbstractCosmosConfiguration;
import com.azure.spring.data.cosmos.repository.config.EnableCosmosRepositories;
import com.microsoft.azure.spring.chatgpt.sample.common.AzureOpenAIClient;
import com.microsoft.azure.spring.chatgpt.sample.common.DocumentIndexPlanner;
import com.microsoft.azure.spring.chatgpt.sample.common.store.CosmosDBVectorStore;
import com.microsoft.azure.spring.chatgpt.sample.common.store.CosmosEntityRepository;
import com.microsoft.azure.spring.chatgpt.sample.common.store.CosmosProperties;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.context.properties.EnableConfigurationProperties;
import org.springframework.context.ApplicationContext;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import java.io.IOException;

@Configuration
@EnableConfigurationProperties(CosmosProperties.class)
@EnableCosmosRepositories(basePackages = "com.microsoft.azure.spring.chatgpt.sample.common.store")
public class Config extends AbstractCosmosConfiguration {

    // Azure OpenAI settings, resolved from the environment variables of the same name.
    @Value("${AZURE_OPENAI_EMBEDDINGDEPLOYMENTID}")
    private String embeddingDeploymentId;

    @Value("${AZURE_OPENAI_CHATDEPLOYMENTID}")
    private String chatDeploymentId;

    @Value("${AZURE_OPENAI_ENDPOINT}")
    private String endpoint;

    @Value("${AZURE_OPENAI_APIKEY}")
    private String apiKey;

    @Autowired
    private CosmosProperties properties;

    @Autowired
    private CosmosEntityRepository cosmosEntityRepository;

    @Autowired
    private ApplicationContext applicationContext;

    public Config() throws IOException {
    }

    @Override
    protected String getDatabaseName() {
        return properties.getDatabaseName();
    }

    @Bean
    public DocumentIndexPlanner planner(AzureOpenAIClient openAIClient, CosmosDBVectorStore vectorStore) {
        return new DocumentIndexPlanner(openAIClient, vectorStore);
    }

    @Bean
    public AzureOpenAIClient AzureOpenAIClient() {
        var innerClient = new OpenAIClientBuilder()
                .endpoint(endpoint)
                .credential(new AzureKeyCredential(apiKey))
                .buildClient();
        // The third argument is left null here; presumably it is the chat deployment ID, which the CLI does not use.
        return new AzureOpenAIClient(innerClient, embeddingDeploymentId, null);
    }

    @Bean
    public CosmosClientBuilder cosmosClientBuilder() {
        // Connect to Cosmos DB in direct mode with the default connection settings.
        DirectConnectionConfig directConnectionConfig = DirectConnectionConfig.getDefaultConfig();
        return new CosmosClientBuilder()
                .endpoint(properties.getUri())
                .key(properties.getKey())
                .directMode(directConnectionConfig);
    }

    @Bean
    public CosmosDBVectorStore vectorStore() {
        CosmosDBVectorStore store = new CosmosDBVectorStore(cosmosEntityRepository, properties.getContainerName(),
                properties.getDatabaseName(), applicationContext);
        return store;
    }
}
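
The `planner` bean wired above drives the indexing flow, whose second step is splitting text into chunks. The `DocumentIndexPlanner` source is not shown in this part of the diff, so the following is only a sketch of one common approach: fixed-size character windows with overlap. The class `ChunkSplitterSketch` and its parameters are hypothetical, and because the parent POM manages a `jtokkit` dependency, the real code may well count tokens rather than characters.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical character-based chunker; the sample's real splitting logic may differ. */
final class ChunkSplitterSketch {

    /**
     * Splits text into windows of at most chunkSize characters,
     * where each window repeats the last `overlap` characters of the previous one.
     */
    static List<String> split(String text, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("require chunkSize > 0 and 0 <= overlap < chunkSize");
        }
        List<String> chunks = new ArrayList<>();
        int step = chunkSize - overlap; // how far the window advances each time
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + chunkSize, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last window already reached the end of the text
            }
        }
        return chunks;
    }
}
```

Overlap keeps a sentence that straddles a boundary retrievable from either neighboring chunk, at the cost of storing some text twice.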
