Skip to content
Merged
Show file tree
Hide file tree
Changes from 14 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 0 additions & 20 deletions capella-model-services/llamaindex/__frontmatter.__md

This file was deleted.

Large diffs are not rendered by default.

22 changes: 22 additions & 0 deletions capella-model-services/llamaindex/query_based/frontmatter.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
---
# frontmatter
path: "/tutorial-capella-model-services-llamaindex-rag-with-hyperscale-and-composite-vector-index"
title: "RAG with LlamaIndex, Capella Model Services and Couchbase Hyperscale & Composite Vector Indexes"
short_title: "RAG with LlamaIndex, Capella Model Services and Hyperscale & Composite Vector Indexes"
description:
- Learn how to build a semantic search engine using Couchbase Hyperscale and Composite Vector Indexes.
- This tutorial demonstrates how LlamaIndex integrates Couchbase vector search capabilities with embeddings generated by Capella Model Services.
- Perform Retrieval-Augmented Generation (RAG) using LlamaIndex with Couchbase and Capella Model Services.
content_type: tutorial
filter: sdk
technology:
- vector search
tags:
- Artificial Intelligence
- LlamaIndex
- Hyperscale Vector Index
- Composite Vector Index
sdk_language:
- python
length: 60 Mins
---
Original file line number Diff line number Diff line change
Expand Up @@ -6,15 +6,15 @@
"source": [
"# Introduction\n",
"\n",
"In this guide, we will walk you through building a Retrieval Augmented Generation (RAG) application using Couchbase Capella as the database, [Llama 3.1 8B Instruct](https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_1/) model as the large language model provided by Couchbase Capella AI Services. We will use the [e5-mistral-7b-instruct](https://huggingface.co/intfloat/e5-mistral-7b-instruct) model for generating embeddings via the Capella AI Services.\n",
"In this guide, we will walk you through building a Retrieval Augmented Generation (RAG) application with LlamaIndex orchestrating Capella Model Services and Couchbase Capella. We will use the models hosted on Capella Model Services for response generation and generating embeddings.\n",
"\n",
"This notebook demonstrates how to build a RAG system using:\n",
"- The [BBC News dataset](https://huggingface.co/datasets/RealTimeData/bbc_news_alltime) containing news articles\n",
"- Couchbase Capella as the vector store\n",
"- LlamaIndex framework for the RAG pipeline\n",
"- Capella AI Services for embeddings and text generation\n",
"\n",
"Semantic search goes beyond simple keyword matching by understanding the context and meaning behind the words in a query, making it an essential tool for applications that require intelligent information retrieval. This tutorial will equip you with the knowledge to create a fully functional RAG system using Capella AI Services and LlamaIndex."
"Semantic search goes beyond simple keyword matching by understanding the context and meaning behind the words in a query, making it an essential tool for applications that require intelligent information retrieval. This tutorial will equip you with the knowledge to create a fully functional RAG system using Capella Model Services and LlamaIndex."
]
},
{
Expand Down Expand Up @@ -46,19 +46,21 @@
"\n",
"In order to create the RAG application, we need an embedding model to ingest the documents for Vector Search and a large language model (LLM) for generating the responses based on the context. \n",
"\n",
"Capella Model Service allows you to create both the embedding model and the LLM in the same VPC as your database. Currently, the service offers Llama 3.1 Instruct model with 8 Billion parameters as an LLM and the mistral model for embeddings. \n",
"Capella Model Service allows you to create both the embedding model and the LLM in the same VPC as your database. There are multiple options for both the Embedding & Large Language Models, along with Value Adds to the models.\n",
"\n",
"Create the models using the Capella AI Services interface. While creating the model, it is possible to cache the responses (both standard and semantic cache) and apply guardrails to the LLM responses.\n",
"Create the models using the Capella Model Services interface. While creating the model, it is possible to cache the responses (both standard and semantic cache) and apply guardrails to the LLM responses.\n",
"\n",
"For more details, please refer to the [documentation](https://preview2.docs-test.couchbase.com/ai/get-started/about-ai-services.html#model).\n"
"For more details, please refer to the [documentation](https://docs.couchbase.com/ai/build/model-service/model-service.html). These models are compatible with the [Haystack OpenAI integration](https://haystack.deepset.ai/integrations/openai).\n",
"\n",
"After the models are deployed, please create the API keys for them and whitelist the keys on the IP on which the tutorial is being run. For more details, please refer to the documentation on [generating the API keys](https://docs.couchbase.com/ai/api-guide/api-start.html#model-service-keys).\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Installing Necessary Libraries\n",
"To build our RAG system, we need a set of libraries. The libraries we install handle everything from connecting to databases to performing AI tasks. Each library has a specific role: Couchbase libraries manage database operations, LlamaIndex handles AI model integrations, and we will use the OpenAI SDK for generating embeddings and calling the LLM in Capella AI services.\n"
"To build our RAG system, we need a set of libraries. The libraries we install handle everything from connecting to databases to performing AI tasks. Each library has a specific role: Couchbase libraries manage database operations, LlamaIndex handles AI model integrations, and we will use the OpenAI SDK (compatible with Capella Model Services) for generating embeddings and calling language models.\n"
]
},
{
Expand All @@ -68,7 +70,7 @@
"outputs": [],
"source": [
"# Install required packages\n",
"%pip install datasets llama-index-vector-stores-couchbase==0.4.0 llama-index-embeddings-openai==0.3.1 llama-index-llms-openai-like==0.3.5 llama-index==0.12.37"
"%pip install datasets llama-index-vector-stores-couchbase==0.6.0 llama-index-embeddings-openai==0.5.1 llama-index-llms-openai-like==0.5.3 llama-index==0.14.10"
]
},
{
Expand All @@ -86,7 +88,6 @@
"outputs": [],
"source": [
"import getpass\n",
"import base64\n",
"import logging\n",
"import sys\n",
"import time\n",
Expand Down Expand Up @@ -116,16 +117,16 @@
"\n",
"The script also validates that all required inputs are provided, raising an error if any crucial information is missing. This approach ensures that your integration is both secure and correctly configured without hardcoding sensitive information, enhancing the overall security and maintainability of your code.\n",
"\n",
"CAPELLA_AI_ENDPOINT is the Capella AI Services endpoint found in the models section.\n",
"CAPELLA_MODEL_SERVICES_ENDPOINT is the Capella Model Services endpoint found in the models section.\n",
"\n",
"> Note that the Capella AI Endpoint also requires an additional `/v1` from the endpoint shown on the UI if it is not shown on the UI.\n",
"> Note that the Capella Model Services Endpoint also requires an additional `/v1` from the endpoint shown on the UI if it is not shown on the UI.\n",
"\n",
"INDEX_NAME is the name of the search index we will use for the vector search."
]
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
Expand All @@ -135,15 +136,19 @@
"CB_BUCKET_NAME = input(\"Couchbase Bucket: \")\n",
"SCOPE_NAME = input(\"Couchbase Scope: \")\n",
"COLLECTION_NAME = input(\"Couchbase Collection: \")\n",
"INDEX_NAME = input(\"Vector Search Index: \")\n",
"CAPELLA_AI_ENDPOINT = input(\"Enter your Capella AI Services Endpoint: \")\n",
"INDEX_NAME = \"vector_search\" # need to be matched with the search index name in the search_index.json file\n",
"\n",
"# Check if the variables are correctly loaded\n",
"if not all([CB_CONNECTION_STRING, CB_USERNAME, CB_PASSWORD, CB_BUCKET_NAME, SCOPE_NAME, COLLECTION_NAME, INDEX_NAME, CAPELLA_AI_ENDPOINT]):\n",
" raise ValueError(\"All configuration variables must be provided.\")\n",
"# Get Capella AI endpoint\n",
"CAPELLA_MODEL_SERVICES_ENDPOINT = input(\"Enter your Capella Model Services Endpoint: \")\n",
"LLM_MODEL_NAME = input(\"Enter the LLM name\")\n",
"LLM_API_KEY = getpass.getpass(\"Enter your Couchbase LLM API Key: \")\n",
"EMBEDDING_MODEL_NAME = input(\"Enter the Embedding Model name:\")\n",
"EMBEDDING_API_KEY = getpass.getpass(\"Enter your Couchbase Embedding Model API Key: \")\n",
"\n",
"# Generate a Capella AI key from the username and password\n",
"CAPELLA_AI_KEY = base64.b64encode(f\"{CB_USERNAME}:{CB_PASSWORD}\".encode(\"utf-8\")).decode(\"utf-8\")"
"# Check if the variables are correctly loaded\n",
"if not all([CB_CONNECTION_STRING, CB_USERNAME, CB_PASSWORD, CB_BUCKET_NAME, SCOPE_NAME, COLLECTION_NAME, INDEX_NAME, \n",
"CAPELLA_MODEL_SERVICES_ENDPOINT, LLM_MODEL_NAME, LLM_API_KEY, EMBEDDING_MODEL_NAME, EMBEDDING_API_KEY]):\n",
" raise ValueError(\"All configuration variables must be provided.\")"
]
},
{
Expand All @@ -156,7 +161,7 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
Expand Down Expand Up @@ -264,7 +269,7 @@
"outputs": [],
"source": [
"# Create search index from search_index.json file at scope level\n",
"with open('fts_index.json', 'r') as search_file:\n",
"with open('search_index.json', 'r') as search_file:\n",
" search_index_definition = SearchIndex.from_json(json.load(search_file))\n",
" \n",
" # Update search index definition with user inputs\n",
Expand All @@ -287,7 +292,7 @@
" existing_index = scope_search_manager.get_index(search_index_name)\n",
" print(f\"Search index '{search_index_name}' already exists at scope level.\")\n",
" except Exception as e:\n",
" print(f\"Search index '{search_index_name}' does not exist at scope level. Creating search index from fts_index.json...\")\n",
" print(f\"Search index '{search_index_name}' does not exist at scope level. Creating search index from search_index.json...\")\n",
" scope_search_manager.upsert_index(search_index_definition)\n",
" print(f\"Search index '{search_index_name}' created successfully at scope level.\")"
]
Expand Down Expand Up @@ -370,8 +375,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Creating Embeddings using Capella AI Service\n",
"Embeddings are numerical representations of text that capture semantic meaning. Unlike keyword-based search, embeddings enable semantic search to understand context and retrieve documents that are conceptually similar even without exact keyword matches. We'll use Capella AI's OpenAI-compatible API to create embeddings with the intfloat/e5-mistral-7b-instruct model. This model transforms our text data into vector representations that can be efficiently searched, with a batch size of 30 for optimal processing.\n"
"# Creating Embeddings using Capella Model Service\n",
"Embeddings are numerical representations of text that capture semantic meaning. Unlike keyword-based search, embeddings enable semantic search to understand context and retrieve documents that are conceptually similar even without exact keyword matches. We'll use the model deployed on Capella Model Services to create high-quality embeddings. This model transforms our text data into vector representations that can be efficiently searched, with a batch size of 30 for optimal processing.\n"
]
},
{
Expand All @@ -383,9 +388,9 @@
"try:\n",
" # Set up the embedding model\n",
" embed_model = OpenAIEmbedding(\n",
" api_key=CAPELLA_AI_KEY,\n",
" api_base=CAPELLA_AI_ENDPOINT,\n",
" model_name=\"intfloat/e5-mistral-7b-instruct\",\n",
" api_key=EMBEDDING_API_KEY,\n",
" api_base=CAPELLA_MODEL_SERVICES_ENDPOINT,\n",
" model_name=EMBEDDING_MODEL_NAME,\n",
" embed_batch_size=30\n",
" )\n",
" \n",
Expand Down Expand Up @@ -528,12 +533,12 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Using the Large Language Model (LLM) in Capella AI\n",
"Language language models are AI systems that are trained to understand and generate human language. We'll be using the `Llama3.1-8B-Instruct` large language model via the Capella AI services inside the same network as the Capella operational database to process user queries and generate meaningful responses. This model is a key component of our RAG system, allowing it to go beyond simple keyword matching and truly understand the intent behind a query. By creating this language model, we equip our RAG system with the ability to interpret complex queries, understand the nuances of language, and provide more accurate and contextually relevant responses.\n",
"# Using Capella Model Services Large Language Model (LLM)\n",
"Large language models are AI systems that are trained to understand and generate human language. We'll be using the model deployed on Capella Model Services to process user queries and generate meaningful responses based on the retrieved context from our Couchbase vector store. This model is a key component of our RAG system, allowing it to go beyond simple keyword matching and truly understand the intent behind a query. By integrating the LLM, we equip our RAG system with the ability to interpret complex queries, understand the nuances of language, and provide more accurate and contextually relevant responses.\n",
"\n",
"The language model's ability to understand context and generate coherent responses is what makes our RAG system truly intelligent. It can not only find the right information but also present it in a way that is useful and understandable to the user.\n",
"\n",
"The LLM has been created using the LangChain OpenAI provider as well with the model name, URL and the API key based on the Capella AI Services."
"The LLM is configured using LlamaIndex's OpenAI-like provider with your Capella Model Services API key for seamless integration."
]
},
{
Expand All @@ -545,10 +550,9 @@
"try:\n",
" # Set up the LLM\n",
" llm = OpenAILike(\n",
" api_base=CAPELLA_AI_ENDPOINT,\n",
" api_key=CAPELLA_AI_KEY,\n",
" model=\"meta-llama/Llama-3.1-8B-Instruct\",\n",
" \n",
" api_base=CAPELLA_MODEL_SERVICES_ENDPOINT,\n",
" api_key=LLM_API_KEY,\n",
" model=LLM_MODEL_NAME,\n",
" )\n",
" \n",
" \n",
Expand Down Expand Up @@ -620,7 +624,7 @@
"\n",
" # Display search results\n",
" print(f\"\\nSemantic Search Results (completed in {search_elapsed_time:.2f} seconds):\")\n",
" print(response)\n",
" print(response.response)\n",
"\n",
"except RecursionError as e:\n",
" raise RuntimeError(f\"Error performing semantic search: {e}\")"
Expand Down Expand Up @@ -685,13 +689,10 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## LLM Guardrails in Capella AI Services\n",
"\n",
"Capella AI services also provide input and response moderation using configurable LLM guardrails. These services can integrate with the LlamaGuard3-8B model from Meta.\n",
"- Categories to be blocked can be configured during the model creation process.\n",
"- Helps prevent unsafe or undesirable interactions with the LLM.\n",
"\n",
"By implementing caching and moderation mechanisms, Capella AI services ensure an efficient, cost-effective, and responsible approach to AI-powered recommendations."
"# LLM Guardrails in Capella Model Services\n",
"Capella Model services also have the ability to moderate the user inputs and the responses generated by the LLM. Capella Model Services can be configured to use the [Llama 3.1 NemoGuard 8B safety model](https://build.nvidia.com/nvidia/llama-3_1-nemoguard-8b-content-safety/modelcard) guardrails model from Meta. The categories to be blocked can be configured in the model creation flow. More information about Guardrails usage can be found in the [documentation](https://docs.couchbase.com/ai/build/model-service/configure-guardrails-security.html#guardrails).\n",
" \n",
"Here is an example of the Guardrails in action"
]
},
{
Expand Down Expand Up @@ -727,7 +728,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "base",
"display_name": "haystack",
"language": "python",
"name": "python3"
},
Expand All @@ -741,7 +742,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.7"
"version": "3.12.4"
}
},
"nbformat": 4,
Expand Down
21 changes: 21 additions & 0 deletions capella-model-services/llamaindex/search_based/frontmatter.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
---
# frontmatter
path: "/tutorial-capella-model-services-llamaindex-rag-with-search-vector-index"
title: "RAG with LlamaIndex, Capella Model Services and Couchbase Search Vector Index"
short_title: "RAG with LlamaIndex, Capella Model Services and Couchbase Search Vector Index"
description:
- Learn how to build a semantic search engine using Couchbase Search Vector Index.
- This tutorial demonstrates how LlamaIndex integrates Couchbase vector search capabilities with embeddings generated by Capella Model Services.
- Perform Retrieval-Augmented Generation (RAG) using LlamaIndex with Couchbase and Capella Model Services.
content_type: tutorial
filter: sdk
technology:
- vector search
tags:
- Artificial Intelligence
- LlamaIndex
- Search Vector Index
sdk_language:
- python
length: 60 Mins
---
Original file line number Diff line number Diff line change
Expand Up @@ -48,7 +48,7 @@
{
"vector_index_optimized_for": "recall",
"docvalues": true,
"dims": 4096,
"dims": 1024,
"include_in_all": false,
"include_term_vectors": false,
"index": true,
Expand Down
Loading