Commit a9a4688

Generated markdown tutorials from Jupyter Notebooks
Generated from: couchbase-examples/vector-search-cookbook
1 parent eb39246 commit a9a4688

5 files changed

+193
-27
lines changed

tutorial/markdown/generated/vector-search-cookbook/RAG_with_Couchbase_and_AzureOpenAI.md

Lines changed: 41 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -29,6 +29,31 @@ length: 60 Mins
2929
# Introduction
3030
In this guide, we will walk you through building a powerful semantic search engine using Couchbase as the backend database and [AzureOpenAI](https://azure.microsoft.com/) as the AI-powered embedding and language model provider. Semantic search goes beyond simple keyword matching by understanding the context and meaning behind the words in a query, making it an essential tool for applications that require intelligent information retrieval. This tutorial is designed to be beginner-friendly, with clear, step-by-step instructions that will equip you with the knowledge to create a fully functional semantic search system from scratch.
3131

32+
# How to run this tutorial
33+
34+
This tutorial is available as a Jupyter Notebook (`.ipynb` file) that you can run interactively. You can access the original notebook [here](https://github.com/couchbase-examples/vector-search-cookbook/blob/main/azure/RAG_with_Couchbase_and_AzureOpenAI.ipynb).
35+
36+
You can either download the notebook file and run it on [Google Colab](https://colab.research.google.com/) or run it on your system by setting up the Python environment.
37+
38+
# Before you start
39+
40+
## Get Credentials for Azure OpenAI
41+
42+
Please follow the [instructions](https://learn.microsoft.com/en-us/azure/ai-services/openai/reference) to generate the Azure OpenAI credentials.
43+
44+
## Create and Deploy Your Free Tier Operational cluster on Capella
45+
46+
To get started with Couchbase Capella, create an account and use it to deploy a forever free tier operational cluster. This account provides you with an environment where you can explore and learn about Capella with no time constraints.
47+
48+
To know more, please follow the [instructions](https://docs.couchbase.com/cloud/get-started/create-account.html).
49+
50+
### Couchbase Capella Configuration
51+
52+
When running Couchbase using [Capella](https://cloud.couchbase.com/sign-in), the following prerequisites need to be met.
53+
54+
* Create the [database credentials](https://docs.couchbase.com/cloud/clusters/manage-database-users.html) to access the travel-sample bucket (Read and Write) used in the application.
55+
* [Allow access](https://docs.couchbase.com/cloud/clusters/allow-ip-address.html) to the Cluster from the IP on which the application is running.
56+
3257
# Setting the Stage: Installing Necessary Libraries
3358
To build our semantic search engine, we need a robust set of tools. The libraries we install handle everything from connecting to databases to performing complex machine learning tasks. Each library has a specific role: Couchbase libraries manage database operations, LangChain handles AI model integrations, and AzureOpenAI provides advanced AI models for generating embeddings and understanding natural language. By setting up these libraries, we ensure our environment is equipped to handle the data-intensive and computationally complex tasks required for semantic search.
3459

@@ -208,15 +233,21 @@ setup_collection(cluster, CB_BUCKET_NAME, SCOPE_NAME, CACHE_COLLECTION)
208233

209234

210235

211-
# Loading the Index Definition
212-
Semantic search requires an efficient way to retrieve relevant documents based on a user’s query. This is where the Couchbase Full-Text Search (FTS) index comes in. An FTS index is designed to enable full-text search capabilities, such as searching for words or phrases within documents stored in Couchbase. In this step, we load the FTS index definition from a JSON file, which specifies how the index should be structured. This includes the fields to be indexed and other parameters that determine how the search engine processes queries. By defining an index, we ensure that our search engine can quickly and accurately retrieve the most relevant documents, which is crucial for delivering fast and relevant search results to users.
236+
# Loading Couchbase Vector Search Index
237+
238+
Semantic search requires an efficient way to retrieve relevant documents based on a user's query. This is where the Couchbase **Vector Search Index** comes into play. In this step, we load the Vector Search Index definition from a JSON file, which specifies how the index should be structured. This includes the fields to be indexed, the dimensions of the vectors, and other parameters that determine how the search engine processes queries based on vector similarity.
239+
240+
For more information on creating a vector search index, please follow the [instructions](https://docs.couchbase.com/cloud/vector-search/create-vector-search-index-ui.html).
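
As a rough illustration of the local (non-Colab) path, the sketch below reads an index definition from a JSON file and collects the vector dimensions it declares, so they can be compared with the size of the embeddings your model produces. The helper names `load_index_definition` and `find_vector_dims` are assumptions made for this example and are not part of the notebook. The lookup assumes vector fields are marked with `"type": "vector"` and a `"dims"` key, as in typical exported index definitions; adjust it if your file is structured differently.

```python
import json
from pathlib import Path


def load_index_definition(path):
    """Read a search index definition from a JSON file and return it as a dict."""
    with Path(path).open("r", encoding="utf-8") as handle:
        definition = json.load(handle)
    # A usable definition is a JSON object that at least names the index.
    if not isinstance(definition, dict) or "name" not in definition:
        raise ValueError(f"{path} does not look like an index definition (no 'name' key)")
    return definition


def find_vector_dims(node):
    """Recursively collect the 'dims' value of every field whose type is 'vector'."""
    found = []
    if isinstance(node, dict):
        if node.get("type") == "vector" and "dims" in node:
            found.append(node["dims"])
        for value in node.values():
            found.extend(find_vector_dims(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_vector_dims(item))
    return found
```

If `find_vector_dims` returns a value that differs from the length of the vectors your embedding model emits, the index is unlikely to match your documents, so it is worth checking before the index is created.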
213241

214242

215243

216244
```python
217-
# index_definition_path = '/path_to_your_index_file/azure_index.json'
245+
# If you are running this script locally (not in Google Colab), uncomment the following line
246+
# and provide the path to your index definition file.
247+
248+
# index_definition_path = '/path_to_your_index_file/azure_index.json' # Local setup: specify your file path here
218249

219-
# Prompt user to upload to google drive
250+
# If you are running in Google Colab, use the following code to upload the index definition file
220251
from google.colab import files
221252
print("Upload your index definition file")
222253
uploaded = files.upload()
@@ -236,8 +267,8 @@ except Exception as e:
236267

237268

238269
# Creating or Updating Search Indexes
239-
With the index definition loaded, the next step is to create or update the FTS index in Couchbase. This step is fundamental because it optimizes our database for text search operations, allowing us to perform searches based on the content of documents rather than just their metadata. By creating or updating an FTS index, we make it possible for our search engine to handle complex queries that involve finding specific text within documents, which is essential for a robust semantic search engine.
240270

271+
With the index definition loaded, the next step is to create or update the **Vector Search Index** in Couchbase. This step is crucial because it optimizes our database for vector similarity search operations, allowing us to perform searches based on the semantic content of documents rather than just keywords. By creating or updating a Vector Search Index, we enable our search engine to handle complex queries that involve finding semantically similar documents using vector embeddings, which is essential for a robust semantic search engine.
241272

242273

243274
```python
@@ -531,6 +562,11 @@ logging.info("Successfully created RAG chain")
531562

532563

533564

565+
```python
566+
567+
```
568+
569+
534570
```python
535571
# Get responses
536572
logging.disable(sys.maxsize) # Disable logging to prevent tqdm output

tutorial/markdown/generated/vector-search-cookbook/RAG_with_Couchbase_and_Claude(by_Anthropic).md

Lines changed: 37 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -29,6 +29,32 @@ length: 60 Mins
2929
# Introduction
3030
In this guide, we will walk you through building a powerful semantic search engine using Couchbase as the backend database, [OpenAI](https://openai.com/) as the AI-powered embedding provider, and [Anthropic](https://claude.ai/) as the language model provider. Semantic search goes beyond simple keyword matching by understanding the context and meaning behind the words in a query, making it an essential tool for applications that require intelligent information retrieval. This tutorial is designed to be beginner-friendly, with clear, step-by-step instructions that will equip you with the knowledge to create a fully functional semantic search system from scratch.
3131

32+
# How to run this tutorial
33+
34+
This tutorial is available as a Jupyter Notebook (`.ipynb` file) that you can run interactively. You can access the original notebook [here](https://github.com/couchbase-examples/vector-search-cookbook/blob/main/claudeai/RAG_with_Couchbase_and_Claude(by_Anthropic).ipynb).
35+
36+
You can either download the notebook file and run it on [Google Colab](https://colab.research.google.com/) or run it on your system by setting up the Python environment.
37+
38+
# Before you start
39+
40+
## Get Credentials for OpenAI and Anthropic
41+
42+
* Please follow the [instructions](https://platform.openai.com/docs/quickstart) to generate the OpenAI credentials.
43+
* Please follow the [instructions](https://docs.anthropic.com/en/api/getting-started) to generate the Anthropic credentials.
44+
45+
## Create and Deploy Your Free Tier Operational cluster on Capella
46+
47+
To get started with Couchbase Capella, create an account and use it to deploy a forever free tier operational cluster. This account provides you with an environment where you can explore and learn about Capella with no time constraints.
48+
49+
To know more, please follow the [instructions](https://docs.couchbase.com/cloud/get-started/create-account.html).
50+
51+
### Couchbase Capella Configuration
52+
53+
When running Couchbase using [Capella](https://cloud.couchbase.com/sign-in), the following prerequisites need to be met.
54+
55+
* Create the [database credentials](https://docs.couchbase.com/cloud/clusters/manage-database-users.html) to access the travel-sample bucket (Read and Write) used in the application.
56+
* [Allow access](https://docs.couchbase.com/cloud/clusters/allow-ip-address.html) to the Cluster from the IP on which the application is running.
57+
3258
# Setting the Stage: Installing Necessary Libraries
3359
To build our semantic search engine, we need a robust set of tools. The libraries we install handle everything from connecting to databases to performing complex machine learning tasks. Each library has a specific role: Couchbase libraries manage database operations, LangChain handles AI model integrations, OpenAI provides the models that generate embeddings, and Claude (by Anthropic) handles natural language understanding. By setting up these libraries, we ensure our environment is equipped to handle the data-intensive and computationally complex tasks required for semantic search.
3460

@@ -201,15 +227,21 @@ setup_collection(cluster, CB_BUCKET_NAME, SCOPE_NAME, CACHE_COLLECTION)
201227

202228

203229

204-
# Loading the Index Definition
205-
Semantic search requires an efficient way to retrieve relevant documents based on a user’s query. This is where the Couchbase Full-Text Search (FTS) index comes in. An FTS index is designed to enable full-text search capabilities, such as searching for words or phrases within documents stored in Couchbase. In this step, we load the FTS index definition from a JSON file, which specifies how the index should be structured. This includes the fields to be indexed and other parameters that determine how the search engine processes queries. By defining an index, we ensure that our search engine can quickly and accurately retrieve the most relevant documents, which is crucial for delivering fast and relevant search results to users.
230+
# Loading Couchbase Vector Search Index
231+
232+
Semantic search requires an efficient way to retrieve relevant documents based on a user's query. This is where the Couchbase **Vector Search Index** comes into play. In this step, we load the Vector Search Index definition from a JSON file, which specifies how the index should be structured. This includes the fields to be indexed, the dimensions of the vectors, and other parameters that determine how the search engine processes queries based on vector similarity.
233+
234+
For more information on creating a vector search index, please follow the [instructions](https://docs.couchbase.com/cloud/vector-search/create-vector-search-index-ui.html).
206235

207236

208237

209238
```python
210-
# index_definition_path = '/path_to_your_index_file/claude_index.json'
239+
# If you are running this script locally (not in Google Colab), uncomment the following line
240+
# and provide the path to your index definition file.
241+
242+
# index_definition_path = '/path_to_your_index_file/claude_index.json' # Local setup: specify your file path here
211243

212-
# Prompt user to upload to google drive
244+
# If you are running in Google Colab, use the following code to upload the index definition file
213245
from google.colab import files
214246
print("Upload your index definition file")
215247
uploaded = files.upload()
@@ -229,8 +261,8 @@ except Exception as e:
229261

230262

231263
# Creating or Updating Search Indexes
232-
With the index definition loaded, the next step is to create or update the FTS index in Couchbase. This step is fundamental because it optimizes our database for text search operations, allowing us to perform searches based on the content of documents rather than just their metadata. By creating or updating an FTS index, we make it possible for our search engine to handle complex queries that involve finding specific text within documents, which is essential for a robust semantic search engine.
233264

265+
With the index definition loaded, the next step is to create or update the **Vector Search Index** in Couchbase. This step is crucial because it optimizes our database for vector similarity search operations, allowing us to perform searches based on the semantic content of documents rather than just keywords. By creating or updating a Vector Search Index, we enable our search engine to handle complex queries that involve finding semantically similar documents using vector embeddings, which is essential for a robust semantic search engine.
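
The create-or-update step can be sketched roughly as follows. It assumes the scope-level search index manager of the Couchbase Python SDK (obtained with `cluster.bucket(...).scope(...).search_indexes()`), which exposes `get_all_indexes()` and `upsert_index()`, together with `SearchIndex.from_json` from `couchbase.management.search`. The function name `create_or_update_index` and its `build_index` parameter are hypothetical, introduced here so the logic can be exercised without a live cluster.

```python
def create_or_update_index(index_manager, index_definition, build_index):
    """Upsert a search index and report whether it was created or updated.

    index_manager    -- object with get_all_indexes() and upsert_index(index)
    index_definition -- dict loaded from the index definition JSON file
    build_index      -- callable turning the dict into an index object,
                        e.g. SearchIndex.from_json in the Couchbase SDK
    """
    index_name = index_definition["name"]
    # Look up existing indexes first so the caller can log what happened.
    existing_names = {index.name for index in index_manager.get_all_indexes()}
    action = "updated" if index_name in existing_names else "created"
    index_manager.upsert_index(build_index(index_definition))
    return action


# Assumed usage against a real cluster (not executed here):
#   from couchbase.management.search import SearchIndex
#   manager = cluster.bucket(CB_BUCKET_NAME).scope(SCOPE_NAME).search_indexes()
#   create_or_update_index(manager, index_definition, SearchIndex.from_json)
```

Passing the index builder in as a parameter keeps the decision logic separate from the SDK, which makes it easy to test with a stand-in manager before pointing it at Capella.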
234266

235267

236268
```python

tutorial/markdown/generated/vector-search-cookbook/RAG_with_Couchbase_and_Cohere.md

Lines changed: 38 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -30,6 +30,31 @@ length: 60 Mins
3030
In this guide, we will walk you through building a powerful semantic search engine using Couchbase as the backend database and [Cohere](https://cohere.com/)
3131
as the AI-powered embedding and language model provider. Semantic search goes beyond simple keyword matching by understanding the context and meaning behind the words in a query, making it an essential tool for applications that require intelligent information retrieval. This tutorial is designed to be beginner-friendly, with clear, step-by-step instructions that will equip you with the knowledge to create a fully functional semantic search system from scratch.
3232

33+
# How to run this tutorial
34+
35+
This tutorial is available as a Jupyter Notebook (`.ipynb` file) that you can run interactively. You can access the original notebook [here](https://github.com/couchbase-examples/vector-search-cookbook/blob/main/cohere/RAG_with_Couchbase_and_Cohere.ipynb).
36+
37+
You can either download the notebook file and run it on [Google Colab](https://colab.research.google.com/) or run it on your system by setting up the Python environment.
38+
39+
# Before you start
40+
41+
## Get Credentials for Cohere
42+
43+
Please follow the [instructions](https://cohere.com/generate) to generate the Cohere credentials.
44+
45+
## Create and Deploy Your Free Tier Operational cluster on Capella
46+
47+
To get started with Couchbase Capella, create an account and use it to deploy a forever free tier operational cluster. This account provides you with an environment where you can explore and learn about Capella with no time constraints.
48+
49+
To know more, please follow the [instructions](https://docs.couchbase.com/cloud/get-started/create-account.html).
50+
51+
### Couchbase Capella Configuration
52+
53+
When running Couchbase using [Capella](https://cloud.couchbase.com/sign-in), the following prerequisites need to be met.
54+
55+
* Create the [database credentials](https://docs.couchbase.com/cloud/clusters/manage-database-users.html) to access the travel-sample bucket (Read and Write) used in the application.
56+
* [Allow access](https://docs.couchbase.com/cloud/clusters/allow-ip-address.html) to the Cluster from the IP on which the application is running.
57+
3358
# Setting the Stage: Installing Necessary Libraries
3459
To build our semantic search engine, we need a robust set of tools. The libraries we install handle everything from connecting to databases to performing complex machine learning tasks.
3560

@@ -209,14 +234,21 @@ setup_collection(cluster, CB_BUCKET_NAME, SCOPE_NAME, CACHE_COLLECTION)
209234

210235

211236

212-
# Load Index Definition
213-
The search index definition is loaded from a JSON file. This index defines how the data in Couchbase should be indexed for fast search and retrieval. Indexing is critical for optimizing search queries, especially when dealing with large datasets. The JSON file contains details about the index, such as its name, source type, and parameters.
237+
# Loading Couchbase Vector Search Index
238+
239+
Semantic search requires an efficient way to retrieve relevant documents based on a user's query. This is where the Couchbase **Vector Search Index** comes into play. In this step, we load the Vector Search Index definition from a JSON file, which specifies how the index should be structured. This includes the fields to be indexed, the dimensions of the vectors, and other parameters that determine how the search engine processes queries based on vector similarity.
240+
241+
For more information on creating a vector search index, please follow the [instructions](https://docs.couchbase.com/cloud/vector-search/create-vector-search-index-ui.html).
242+
214243

215244

216245
```python
217-
# index_definition_path = '/path_to_your_index_file/cohere_index.json'
246+
# If you are running this script locally (not in Google Colab), uncomment the following line
247+
# and provide the path to your index definition file.
248+
249+
# index_definition_path = '/path_to_your_index_file/cohere_index.json' # Local setup: specify your file path here
218250

219-
# Prompt user to upload to google drive
251+
# If you are running in Google Colab, use the following code to upload the index definition file
220252
from google.colab import files
221253
print("Upload your index definition file")
222254
uploaded = files.upload()
@@ -235,9 +267,9 @@ except Exception as e:
235267
Saving cohere_index.json to cohere_index.json
236268

237269

238-
# Create or Update Search Index
239-
The script checks if the search index already exists in Couchbase. If it exists, the index is updated; if not, a new index is created. This step ensures that the data is properly indexed, allowing for efficient search operations later in the script. The index is associated with a specific bucket, scope, and collection in Couchbase, which organizes the data.
270+
# Creating or Updating Search Indexes
240271

272+
With the index definition loaded, the next step is to create or update the **Vector Search Index** in Couchbase. This step is crucial because it optimizes our database for vector similarity search operations, allowing us to perform searches based on the semantic content of documents rather than just keywords. By creating or updating a Vector Search Index, we enable our search engine to handle complex queries that involve finding semantically similar documents using vector embeddings, which is essential for a robust semantic search engine.
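
To make "vector similarity" concrete, here is a toy sketch that ranks a few documents by cosine similarity to a query vector. The three-dimensional vectors and the document names are made up for illustration. Real embeddings have far more dimensions, and Couchbase performs the comparison inside the Vector Search Index rather than in client code like this.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, documents, top_k=2):
    """Return the top_k (name, score) pairs, most similar first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in documents.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Made-up 3-dimensional "embeddings" for illustration only.
documents = {
    "doc_hotels": [0.9, 0.1, 0.0],
    "doc_flights": [0.1, 0.9, 0.0],
    "doc_recipes": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.0]
top_matches = rank_by_similarity(query, documents)
```

Here the query points in almost the same direction as `doc_hotels`, so that document ranks first even though no keywords are compared at any point.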
241273

242274

243275
```python
