tutorial/markdown/generated/vector-search-cookbook/RAG_with_Couchbase_and_AzureOpenAI.md
Lines changed: 41 additions & 5 deletions
Original file line number
Diff line number
Diff line change
@@ -29,6 +29,31 @@ length: 60 Mins
29
29
# Introduction
30
30
In this guide, we will walk you through building a powerful semantic search engine using Couchbase as the backend database and [AzureOpenAI](https://azure.microsoft.com/) as the AI-powered embedding and language model provider. Semantic search goes beyond simple keyword matching by understanding the context and meaning behind the words in a query, making it an essential tool for applications that require intelligent information retrieval. This tutorial is designed to be beginner-friendly, with clear, step-by-step instructions that will equip you with the knowledge to create a fully functional semantic search system from scratch.
31
31
32
+
# How to run this tutorial
33
+
34
+
This tutorial is available as a Jupyter Notebook (`.ipynb` file) that you can run interactively. You can access the original notebook [here](https://github.com/couchbase-examples/vector-search-cookbook/blob/main/azure/RAG_with_Couchbase_and_AzureOpenAI.ipynb).
35
+
36
+
You can either download the notebook file and run it on [Google Colab](https://colab.research.google.com/) or run it on your system by setting up the Python environment.
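Because the notebook can run in either environment, it can help to detect at runtime which one is active. The following is a minimal sketch, assuming that the `google.colab` module is importable only inside Colab; the helper name `running_in_colab` is our own and is not part of the notebook.

```python
def running_in_colab() -> bool:
    """Return True when the google.colab module can be imported.

    Assumption: google.colab is only importable inside a Colab runtime.
    """
    try:
        import google.colab  # noqa: F401
    except ImportError:
        return False
    return True


if running_in_colab():
    print("Running in Google Colab: upload the index file when prompted.")
else:
    print("Running locally: set the path to your index definition file.")
```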
37
+
38
+
# Before you start
39
+
40
+
## Get Credentials for Azure OpenAI
41
+
42
+
Please follow the [instructions](https://learn.microsoft.com/en-us/azure/ai-services/openai/reference) to generate the Azure OpenAI credentials.
43
+
44
+
## Create and Deploy Your Free Tier Operational cluster on Capella
45
+
46
+
To get started with Couchbase Capella, create an account and use it to deploy a forever free tier operational cluster. This account provides you with an environment where you can explore and learn about Capella with no time constraint.
47
+
48
+
To learn more, please follow the [instructions](https://docs.couchbase.com/cloud/get-started/create-account.html).
49
+
50
+
### Couchbase Capella Configuration
51
+
52
+
When running Couchbase using [Capella](https://cloud.couchbase.com/sign-in), the following prerequisites need to be met.
53
+
54
+
* Create the [database credentials](https://docs.couchbase.com/cloud/clusters/manage-database-users.html) to access the travel-sample bucket (Read and Write) used in the application.
55
+
* [Allow access](https://docs.couchbase.com/cloud/clusters/allow-ip-address.html) to the Cluster from the IP on which the application is running.
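Once the database credentials exist, one way to keep them out of the notebook itself is to read them from environment variables. The sketch below is only an illustration: the variable names (`CB_CONNECTION_STRING`, `CB_USERNAME`, `CB_PASSWORD`, `CB_BUCKET_NAME`) and the `CapellaSettings` class are hypothetical, and the bucket default simply mirrors the travel-sample bucket mentioned above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class CapellaSettings:
    """Hypothetical container for the values needed to connect."""
    connection_string: str
    username: str
    password: str
    bucket: str


def load_settings(env: Mapping[str, str] = os.environ) -> CapellaSettings:
    """Read connection settings, failing early if a required one is missing."""
    required = ("CB_CONNECTION_STRING", "CB_USERNAME", "CB_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError("Missing settings: " + ", ".join(missing))
    return CapellaSettings(
        connection_string=env["CB_CONNECTION_STRING"],
        username=env["CB_USERNAME"],
        password=env["CB_PASSWORD"],
        # Assumed default: the bucket named in the prerequisites above.
        bucket=env.get("CB_BUCKET_NAME", "travel-sample"),
    )
```

Passing a plain dictionary instead of `os.environ` makes the function easy to try out without touching your shell configuration.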
56
+
32
57
# Setting the Stage: Installing Necessary Libraries
33
58
To build our semantic search engine, we need a robust set of tools. The libraries we install handle everything from connecting to databases to performing complex machine learning tasks. Each library has a specific role: Couchbase libraries manage database operations, LangChain handles AI model integrations, and AzureOpenAI provides advanced AI models for generating embeddings and understanding natural language. By setting up these libraries, we ensure our environment is equipped to handle the data-intensive and computationally complex tasks required for semantic search.
Semantic search requires an efficient way to retrieve relevant documents based on a user’s query. This is where the Couchbase Full-Text Search (FTS) index comes in. An FTS index is designed to enable full-text search capabilities, such as searching for words or phrases within documents stored in Couchbase. In this step, we load the FTS index definition from a JSON file, which specifies how the index should be structured. This includes the fields to be indexed and other parameters that determine how the search engine processes queries. By defining an index, we ensure that our search engine can quickly and accurately retrieve the most relevant documents, which is crucial for delivering fast and relevant search results to users.
236
+
# Loading Couchbase Vector Search Index
237
+
238
+
Semantic search requires an efficient way to retrieve relevant documents based on a user's query. This is where the Couchbase **Vector Search Index** comes into play. In this step, we load the Vector Search Index definition from a JSON file, which specifies how the index should be structured. This includes the fields to be indexed, the dimensions of the vectors, and other parameters that determine how the search engine processes queries based on vector similarity.
239
+
240
+
For more information on creating a vector search index, please follow the [instructions](https://docs.couchbase.com/cloud/vector-search/create-vector-search-index-ui.html).
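Before handing the definition to Couchbase, it can be useful to confirm that the file parses and declares the vector field you expect. The following is a rough sketch, assuming that vector fields appear in the JSON as objects with `"type": "vector"` and a `"dims"` entry; the helper names are ours, not the notebook's.

```python
import json
from typing import Any, Iterator


def load_index_definition(path: str) -> dict:
    """Parse the index definition JSON file into a dictionary."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def iter_vector_fields(node: Any) -> Iterator[dict]:
    """Yield every nested object whose "type" is "vector".

    Assumption: vector fields are marked this way in the definition.
    """
    if isinstance(node, dict):
        if node.get("type") == "vector":
            yield node
        for value in node.values():
            yield from iter_vector_fields(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_vector_fields(item)
```

For example, `[f.get("dims") for f in iter_vector_fields(load_index_definition(path))]` lists the dimension declared for each vector field, which should match the output size of your embedding model.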
# If you are running this script locally (not in Google Colab), uncomment the following line
246
+
# and provide the path to your index definition file.
247
+
248
+
# index_definition_path = '/path_to_your_index_file/azure_index.json' # Local setup: specify your file path here
218
249
219
-
#Prompt user to upload to google drive
250
+
# If you are running in Google Colab, use the following code to upload the index definition file
220
251
from google.colab import files
221
252
print("Upload your index definition file")
222
253
uploaded = files.upload()
@@ -236,8 +267,8 @@ except Exception as e:
236
267
237
268
238
269
# Creating or Updating Search Indexes
239
-
With the index definition loaded, the next step is to create or update the FTS index in Couchbase. This step is fundamental because it optimizes our database for text search operations, allowing us to perform searches based on the content of documents rather than just their metadata. By creating or updating an FTS index, we make it possible for our search engine to handle complex queries that involve finding specific text within documents, which is essential for a robust semantic search engine.
240
270
271
+
With the index definition loaded, the next step is to create or update the **Vector Search Index** in Couchbase. This step is crucial because it optimizes our database for vector similarity search operations, allowing us to perform searches based on the semantic content of documents rather than just keywords. By creating or updating a Vector Search Index, we enable our search engine to handle complex queries that involve finding semantically similar documents using vector embeddings, which is essential for a robust semantic search engine.
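To make "vector similarity search" concrete, here is a small brute-force sketch of the question a vector index answers: which stored vectors are closest to the query vector? It uses cosine similarity over plain Python lists and is not how Couchbase implements search internally; the function names and the tiny two-dimensional vectors are purely illustrative.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # A zero vector has no direction to compare.
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float],
          documents: Dict[str, Sequence[float]],
          k: int = 2) -> List[Tuple[str, float]]:
    """Score every document against the query and keep the k best."""
    scored = [(doc_id, cosine_similarity(query, vec))
              for doc_id, vec in documents.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

A real index avoids scoring every document on each query, which is why creating it up front matters for larger datasets.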
241
272
242
273
243
274
```python
@@ -531,6 +562,11 @@ logging.info("Successfully created RAG chain")
531
562
532
563
533
564
565
+
```python
566
+
567
+
```
568
+
569
+
534
570
```python
535
571
# Get responses
536
572
logging.disable(sys.maxsize) # Disable logging to prevent tqdm output
tutorial/markdown/generated/vector-search-cookbook/RAG_with_Couchbase_and_Claude(by_Anthropic).md
Lines changed: 37 additions & 5 deletions
Original file line number
Diff line number
Diff line change
@@ -29,6 +29,32 @@ length: 60 Mins
29
29
# Introduction
30
30
In this guide, we will walk you through building a powerful semantic search engine using Couchbase as the backend database, [OpenAI](https://openai.com/) as the AI-powered embedding provider, and [Anthropic](https://claude.ai/) as the language model provider. Semantic search goes beyond simple keyword matching by understanding the context and meaning behind the words in a query, making it an essential tool for applications that require intelligent information retrieval. This tutorial is designed to be beginner-friendly, with clear, step-by-step instructions that will equip you with the knowledge to create a fully functional semantic search system from scratch.
31
31
32
+
# How to run this tutorial
33
+
34
+
This tutorial is available as a Jupyter Notebook (`.ipynb` file) that you can run interactively. You can access the original notebook [here](https://github.com/couchbase-examples/vector-search-cookbook/blob/main/claudeai/RAG_with_Couchbase_and_Claude(by_Anthropic).ipynb).
35
+
36
+
You can either download the notebook file and run it on [Google Colab](https://colab.research.google.com/) or run it on your system by setting up the Python environment.
37
+
38
+
# Before you start
39
+
40
+
## Get Credentials for OpenAI and Anthropic
41
+
42
+
* Please follow the [instructions](https://platform.openai.com/docs/quickstart) to generate the OpenAI credentials.
43
+
* Please follow the [instructions](https://docs.anthropic.com/en/api/getting-started) to generate the Anthropic credentials.
44
+
45
+
## Create and Deploy Your Free Tier Operational cluster on Capella
46
+
47
+
To get started with Couchbase Capella, create an account and use it to deploy a forever free tier operational cluster. This account provides you with an environment where you can explore and learn about Capella with no time constraint.
48
+
49
+
To learn more, please follow the [instructions](https://docs.couchbase.com/cloud/get-started/create-account.html).
50
+
51
+
### Couchbase Capella Configuration
52
+
53
+
When running Couchbase using [Capella](https://cloud.couchbase.com/sign-in), the following prerequisites need to be met.
54
+
55
+
* Create the [database credentials](https://docs.couchbase.com/cloud/clusters/manage-database-users.html) to access the travel-sample bucket (Read and Write) used in the application.
56
+
* [Allow access](https://docs.couchbase.com/cloud/clusters/allow-ip-address.html) to the Cluster from the IP on which the application is running.
57
+
32
58
# Setting the Stage: Installing Necessary Libraries
33
59
To build our semantic search engine, we need a robust set of tools. The libraries we install handle everything from connecting to databases to performing complex machine learning tasks. Each library has a specific role: Couchbase libraries manage database operations, LangChain handles AI model integrations, OpenAI provides advanced AI models for generating embeddings, and Claude (by Anthropic) handles natural language understanding. By setting up these libraries, we ensure our environment is equipped to handle the data-intensive and computationally complex tasks required for semantic search.
Semantic search requires an efficient way to retrieve relevant documents based on a user’s query. This is where the Couchbase Full-Text Search (FTS) index comes in. An FTS index is designed to enable full-text search capabilities, such as searching for words or phrases within documents stored in Couchbase. In this step, we load the FTS index definition from a JSON file, which specifies how the index should be structured. This includes the fields to be indexed and other parameters that determine how the search engine processes queries. By defining an index, we ensure that our search engine can quickly and accurately retrieve the most relevant documents, which is crucial for delivering fast and relevant search results to users.
230
+
# Loading Couchbase Vector Search Index
231
+
232
+
Semantic search requires an efficient way to retrieve relevant documents based on a user's query. This is where the Couchbase **Vector Search Index** comes into play. In this step, we load the Vector Search Index definition from a JSON file, which specifies how the index should be structured. This includes the fields to be indexed, the dimensions of the vectors, and other parameters that determine how the search engine processes queries based on vector similarity.
233
+
234
+
For more information on creating a vector search index, please follow the [instructions](https://docs.couchbase.com/cloud/vector-search/create-vector-search-index-ui.html).
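Since the index definition fixes the dimensions of the vectors, a cheap guard is to check each embedding against that number before storing it. This is a minimal sketch under the assumption that a length mismatch is always a mistake worth catching early; `validate_embedding` is a hypothetical helper, and the expected dimension should come from your own index definition.

```python
import math
from typing import Sequence


def validate_embedding(vector: Sequence[float], expected_dims: int) -> None:
    """Raise ValueError if the vector cannot match the index's dimensions."""
    if len(vector) != expected_dims:
        raise ValueError(
            f"Expected {expected_dims} dimensions, got {len(vector)}"
        )
    for value in vector:
        # NaN or infinite components make similarity scores meaningless.
        if not isinstance(value, (int, float)) or not math.isfinite(value):
            raise ValueError(f"Non-finite or non-numeric component: {value!r}")
```

Calling this right after an embedding is generated surfaces a model or configuration mismatch at write time instead of as silently missing search results.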
# If you are running this script locally (not in Google Colab), uncomment the following line
240
+
# and provide the path to your index definition file.
241
+
242
+
# index_definition_path = '/path_to_your_index_file/claude_index.json' # Local setup: specify your file path here
211
243
212
-
#Prompt user to upload to google drive
244
+
# If you are running in Google Colab, use the following code to upload the index definition file
213
245
from google.colab import files
214
246
print("Upload your index definition file")
215
247
uploaded = files.upload()
@@ -229,8 +261,8 @@ except Exception as e:
229
261
230
262
231
263
# Creating or Updating Search Indexes
232
-
With the index definition loaded, the next step is to create or update the FTS index in Couchbase. This step is fundamental because it optimizes our database for text search operations, allowing us to perform searches based on the content of documents rather than just their metadata. By creating or updating an FTS index, we make it possible for our search engine to handle complex queries that involve finding specific text within documents, which is essential for a robust semantic search engine.
233
264
265
+
With the index definition loaded, the next step is to create or update the **Vector Search Index** in Couchbase. This step is crucial because it optimizes our database for vector similarity search operations, allowing us to perform searches based on the semantic content of documents rather than just keywords. By creating or updating a Vector Search Index, we enable our search engine to handle complex queries that involve finding semantically similar documents using vector embeddings, which is essential for a robust semantic search engine.
tutorial/markdown/generated/vector-search-cookbook/RAG_with_Couchbase_and_Cohere.md
Lines changed: 38 additions & 6 deletions
Original file line number
Diff line number
Diff line change
@@ -30,6 +30,31 @@ length: 60 Mins
30
30
In this guide, we will walk you through building a powerful semantic search engine using Couchbase as the backend database and [Cohere](https://cohere.com/)
31
31
as the AI-powered embedding and language model provider. Semantic search goes beyond simple keyword matching by understanding the context and meaning behind the words in a query, making it an essential tool for applications that require intelligent information retrieval. This tutorial is designed to be beginner-friendly, with clear, step-by-step instructions that will equip you with the knowledge to create a fully functional semantic search system from scratch.
32
32
33
+
# How to run this tutorial
34
+
35
+
This tutorial is available as a Jupyter Notebook (`.ipynb` file) that you can run interactively. You can access the original notebook [here](https://github.com/couchbase-examples/vector-search-cookbook/blob/main/cohere/RAG_with_Couchbase_and_Cohere.ipynb).
36
+
37
+
You can either download the notebook file and run it on [Google Colab](https://colab.research.google.com/) or run it on your system by setting up the Python environment.
38
+
39
+
# Before you start
40
+
41
+
## Get Credentials for Cohere
42
+
43
+
Please follow the [instructions](https://cohere.com/generate) to generate the Cohere credentials.
44
+
45
+
## Create and Deploy Your Free Tier Operational cluster on Capella
46
+
47
+
To get started with Couchbase Capella, create an account and use it to deploy a forever free tier operational cluster. This account provides you with an environment where you can explore and learn about Capella with no time constraint.
48
+
49
+
To learn more, please follow the [instructions](https://docs.couchbase.com/cloud/get-started/create-account.html).
50
+
51
+
### Couchbase Capella Configuration
52
+
53
+
When running Couchbase using [Capella](https://cloud.couchbase.com/sign-in), the following prerequisites need to be met.
54
+
55
+
* Create the [database credentials](https://docs.couchbase.com/cloud/clusters/manage-database-users.html) to access the travel-sample bucket (Read and Write) used in the application.
56
+
* [Allow access](https://docs.couchbase.com/cloud/clusters/allow-ip-address.html) to the Cluster from the IP on which the application is running.
57
+
33
58
# Setting the Stage: Installing Necessary Libraries
34
59
To build our semantic search engine, we need a robust set of tools. The libraries we install handle everything from connecting to databases to performing complex machine learning tasks.
The search index definition is loaded from a JSON file. This index defines how the data in Couchbase should be indexed for fast search and retrieval. Indexing is critical for optimizing search queries, especially when dealing with large datasets. The JSON file contains details about the index, such as its name, source type, and parameters.
237
+
# Loading Couchbase Vector Search Index
238
+
239
+
Semantic search requires an efficient way to retrieve relevant documents based on a user's query. This is where the Couchbase **Vector Search Index** comes into play. In this step, we load the Vector Search Index definition from a JSON file, which specifies how the index should be structured. This includes the fields to be indexed, the dimensions of the vectors, and other parameters that determine how the search engine processes queries based on vector similarity.
240
+
241
+
For more information on creating a vector search index, please follow the [instructions](https://docs.couchbase.com/cloud/vector-search/create-vector-search-index-ui.html).
# If you are running this script locally (not in Google Colab), uncomment the following line
247
+
# and provide the path to your index definition file.
248
+
249
+
# index_definition_path = '/path_to_your_index_file/cohere_index.json' # Local setup: specify your file path here
218
250
219
-
#Prompt user to upload to google drive
251
+
# If you are running in Google Colab, use the following code to upload the index definition file
220
252
from google.colab import files
221
253
print("Upload your index definition file")
222
254
uploaded = files.upload()
@@ -235,9 +267,9 @@ except Exception as e:
235
267
Saving cohere_index.json to cohere_index.json
236
268
237
269
238
-
# Create or Update Search Index
239
-
The script checks if the search index already exists in Couchbase. If it exists, the index is updated; if not, a new index is created. This step ensures that the data is properly indexed, allowing for efficient search operations later in the script. The index is associated with a specific bucket, scope, and collection in Couchbase, which organizes the data.
270
+
# Creating or Updating Search Indexes
240
271
272
+
With the index definition loaded, the next step is to create or update the **Vector Search Index** in Couchbase. This step is crucial because it optimizes our database for vector similarity search operations, allowing us to perform searches based on the semantic content of documents rather than just keywords. By creating or updating a Vector Search Index, we enable our search engine to handle complex queries that involve finding semantically similar documents using vector embeddings, which is essential for a robust semantic search engine.
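The create-or-update decision itself is simple: if an index with the same name already exists, replace it; otherwise create it. The sketch below shows only that control flow, using an in-memory stand-in instead of a live cluster. `InMemoryIndexManager` and `create_or_update` are hypothetical names and do not represent the Couchbase SDK's API.

```python
from typing import Dict, List


class InMemoryIndexManager:
    """Hypothetical stand-in for a search index manager (not the real SDK)."""

    def __init__(self) -> None:
        self._indexes: Dict[str, dict] = {}

    def names(self) -> List[str]:
        """Names of all indexes currently stored."""
        return sorted(self._indexes)

    def upsert(self, definition: dict) -> None:
        """Insert the definition, replacing any index with the same name."""
        self._indexes[definition["name"]] = definition


def create_or_update(manager: InMemoryIndexManager, definition: dict) -> str:
    """Upsert the index and report whether it was created or updated."""
    name = definition["name"]
    action = "updated" if name in manager.names() else "created"
    manager.upsert(definition)
    return action
```

Running `create_or_update` twice with the same definition reports "created" the first time and "updated" the second, which is the behaviour that makes it safe to re-run this step of the notebook.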