Commit 0889c96

Updated tutorial for Bedrock Serverless PDF Chat (#57)
* Updated tutorial for Bedrock Serverless PDF Chat: revised prerequisites to include uv, modified dependency installation instructions, clarified connection string requirements, and updated embedding model references. Fixed minor typos and improved clarity in several sections.
* Clarified connection string requirements in Bedrock Serverless PDF Chat tutorial by removing the `?tls_verify=none` part for improved accuracy.
1 parent bcb96b4 commit 0889c96

1 file changed: +25 -30 lines changed
tutorial/markdown/python/bedrock-serverless-pdf-chat/bedrock-serverless-pdf-chat.md

Lines changed: 25 additions & 30 deletions
@@ -40,8 +40,7 @@ This tutorial will demonstrate how to -
 
 ## Prerequisites
 
-- [Python](https://www.python.org/downloads/) 3.10 or higher installed.
-- Ensure that the Python version is [compatible](https://docs.couchbase.com/python-sdk/current/project-docs/compatibility.html#python-version-compat) with the Couchbase SDK.
+- Ensure [`uv`](https://docs.astral.sh/uv/) installed. `uv` helps us manage Python versions and create a lockfile for all the project's dependencies for ease of use.
 - Couchbase Cluster (Self Managed or Capella) version 7.6.2+ with [Search Service](https://docs.couchbase.com/server/current/fts/fts-introduction.html) and [Eventing Service](https://docs.couchbase.com/server/current/eventing/eventing-overview.html)
 
 > Note that this tutorial is designed to work with the latest Python SDK version (4.3.0+) for Couchbase. It will not work with the older Python SDK versions.
@@ -58,10 +57,10 @@ git clone https://github.com/couchbase-examples/rag-aws-bedrock-serverless.git
 
 ### Install Dependencies
 
-Any dependencies should be installed through `pip`, the default package manager for Python. You may use [virtual environment](https://docs.python.org/3/tutorial/venv.html) as well.
+Dependencies for the project are installed via `uv` as mentioned earlier. You can install the dependencies using the following command:
 
 ```shell
-python -m pip install -r requirements.txt
+uv sync
 ```
 
 ### Setup Database Configuration
@@ -199,7 +198,7 @@ CB_COLLECTION=name_of_collection_to_store_documents
 INDEX_NAME=name_of_fts_index_with_vector_support
 ```
 
-> The [connection string](https://docs.couchbase.com/python-sdk/current/howtos/managing-connections.html#connection-strings) expects the `couchbases://` or `couchbase://` part.
+> The [connection string](https://docs.couchbase.com/python-sdk/current/howtos/managing-connections.html#connection-strings) expects the `couchbases://` or `couchbase://` part. In the end, the connection string must look something like this: `couchbases://capella.connection.string.com`.
 
 > For this tutorial, `CB_BUCKET = pdf-chat`, `CB_SCOPE = shared`, `CB_COLLECTION = docs` and `INDEX_NAME = pdf_search`.
 
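
The note in the hunk above says the connection string must carry a `couchbases://` or `couchbase://` scheme. A minimal sketch of how a script might check that before connecting, using only the standard library. The helper name `check_connection_string` is hypothetical and is not part of the tutorial or the Couchbase SDK.

```python
from urllib.parse import urlsplit

# Schemes named in the tutorial's note.
ALLOWED_SCHEMES = {"couchbase", "couchbases"}


def check_connection_string(conn_str: str) -> str:
    """Return the stripped connection string if it has an expected scheme and a host."""
    cleaned = conn_str.strip()
    parts = urlsplit(cleaned)
    if parts.scheme not in ALLOWED_SCHEMES:
        raise ValueError(
            "connection string must start with couchbase:// or couchbases://"
        )
    if not parts.netloc:
        raise ValueError("connection string is missing a host")
    return cleaned
```

Calling it on the value read from `.env` before building the cluster object would surface a missing scheme early, rather than as a connection timeout.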
@@ -209,22 +208,22 @@ We need to set up our AWS Environment and run all the necessary services.
 
 #### Deploy Lambdas to ECR
 
-We will need to use Lambdas deployed as docker container in the [AWS Elastic Container Registry Service](https://aws.amazon.com/ecr/). We have two lambdas in the application at directory `src/lambads`. For Each of the lambdas a new ECR Repository needs to be created.
+We will need to use Lambdas deployed as docker container in the [AWS Elastic Container Registry Service](https://aws.amazon.com/ecr/). We have two lambdas in the application at directory `src/lambdas`. For Each of the lambdas a new ECR Repository needs to be created.
 
 Firstly build the docker image for the two of them using `docker build` in the respective folder
 
 ```bash
 docker build -t <lambda name> .
 ```
 
-Lambda name will be chat and ingest for the respective folders.
+Lambda name will be `chat` and `ingest` for the respective folders.
 Then use [this guide from AWS](https://docs.aws.amazon.com/AmazonECR/latest/userguide/docker-push-ecr-image.html) to understand how to push an image to ECR.
 
 Once it's pushed, we are ready for the next steps.
 
 #### Enable Bedrock
 
-You may need to allow access to models used in this example via [Amazon Bedrock](https://console.aws.amazon.com/bedrock). Click on get started to open Bedrock console. in the sidebar, there will be option of `Bedrock configurations`. Click on model access inside it. For this tutorial we are using models `Titan Multimodal Embeddings G1` and `Llama 3 70B Instruct`. You can change the lambda named chat if you need any other models. Click on modify model access and select models required. Accept terms and conditions and now the AI Model will be ready to use.
+You may need to allow access to models used in this example via [Amazon Bedrock](https://console.aws.amazon.com/bedrock). Click on get started to open Bedrock console. in the sidebar, there will be option of `Bedrock configurations`. Click on model access inside it. For this tutorial we are using models `Titan Text Embedding v2` and `Llama 3 70B Instruct`. You can change the lambda named chat if you need any other models. Click on modify model access and select models required. Accept terms and conditions and now the AI Model will be ready to use.
 
 #### Setup AWS CLI
 
@@ -247,10 +246,10 @@ You may enter a couple of `y` to approve the deployment. This step will ensure 2
 We will need to update the environment file to include one more variable at the end of `.env` file we created at [Setup Environment Config](#setup-environment-config). The variable name is `CHAT_URL`
 
 ```bash
-CHAT_URL=API_Gateway_Endpoint_of_ChatBedrockStack
+CHAT_URL=API_Gateway_Endpoint_of_CouchbaseChatStack
 ```
 
-This variable must have the API gateway endpoint of the ChatBedrockStack which came as part of cdk deploy. The exact endpoint also includes `/chat` so append it to the back of URL.
+This variable must have the API gateway endpoint of the CouchbaseChatStack which came as part of cdk deploy. The exact endpoint also includes `/chat` so append it to the back of URL.
 
 ### Setup Couchbase Eventing
 
@@ -260,7 +259,7 @@ We will use import function feature of couchbase eventing. Go to eventing tab in
 
 > We are using _default scope and _default collection for storing eventing temp files, you may use some other collection of your choice.
 
-Now, in the create bindings step, Change the `API_URL` variable to URL of the API gateway endpoint of `IngestBedrockStack`. The URL should be appended with `/send` to indicate exact endpoint of the function.
+Now, in the create bindings step, Change the `API_URL` variable to URL of the API gateway endpoint of `CouchbaseBedrockStack`. The URL should be appended with `/send` to indicate exact endpoint of the function.
 
 Once it's updated, click next, and you can see your JS eventing function. You may add any logging or any features you may require here. The default function sends data to API gateway which inside AWS calls SQS and Lambda which will update the file with necessary embeddings.
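
Both endpoint settings touched by this commit follow the same pattern: the API Gateway base URL printed by `cdk deploy`, with a route appended (`/chat` for `CHAT_URL`, `/send` for the eventing `API_URL`). A small sketch of joining the two without doubling or dropping the slash. The helper name `endpoint_url` is hypothetical, and the host in the usage line is a placeholder, not a real endpoint.

```python
def endpoint_url(api_gateway_endpoint: str, route: str) -> str:
    """Join an API Gateway base URL and a route with exactly one slash between them."""
    base = api_gateway_endpoint.strip().rstrip("/")
    return base + "/" + route.strip().strip("/")


# Placeholder base URL; substitute the value printed by `cdk deploy`.
chat_url = endpoint_url("https://example.execute-api.us-east-1.amazonaws.com/prod/", "/chat")
```

The same call with `"send"` as the route gives the value for the `API_URL` binding.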

@@ -462,34 +461,30 @@ The first step will be connecting to Couchbase. Couchbase Vector Store is requir
 The connection string and credentials are read from the environment variables. We perform some basic required checks for the environment variable not being set in the `.env`, and then proceed to connect to the Couchbase cluster. We connect to the cluster using [connect](https://docs.couchbase.com/python-sdk/current/hello-world/start-using-sdk.html#connect) method.
 
 ```python
-def connect_to_couchbase(connection_string, db_username, db_password):
-    """Connect to Couchbase"""
-    from couchbase.cluster import Cluster
-    from couchbase.auth import PasswordAuthenticator
-    from couchbase.options import ClusterOptions
-    from datetime import timedelta
-
-    auth = PasswordAuthenticator(db_username, db_password)
-    options = ClusterOptions(auth)
-    connect_string = connection_string
-    cluster = Cluster(connect_string, options)
-
-    # Wait until the cluster is ready for use.
-    cluster.wait_until_ready(timedelta(seconds=5))
-
-    return cluster
+def connect_to_couchbase(connection_string, username, password):
+    try:
+        auth = PasswordAuthenticator(username, password)
+        options = ClusterOptions(auth)
+        options.apply_profile("wan_development")
+        cluster = Cluster(connection_string, options)
+        cluster.wait_until_ready(timedelta(seconds=5))
+        logging.info("Cluster is ready")
+        return cluster
+    except Exception as e:
+        logging.error(f'Error while connecting: {str(e)}')
+        raise e
 ```
 
 #### Get bedrock embeddings
 
-This part of code uses AWS SDK to get bedrock. Then initializes the embedding object. For this example we are using `amazon.titan-embed-image-v1` embedding model.
+This part of code uses AWS SDK to get bedrock. Then initializes the embedding object. For this example we are using `amazon.titan-embed-text-v2` embedding model.
 
 ```python
 from langchain_aws.embeddings import BedrockEmbeddings
 
 bedrock = boto3.client('bedrock-runtime')
 cluster = connect_to_couchbase(connection_string, username, password)
-embedding = BedrockEmbeddings(client=bedrock, model_id="amazon.titan-embed-image-v1")
+embedding = BedrockEmbeddings(client=bedrock, model_id="amazon.titan-embed-text-v2:0")
 ```
 
 #### Use LangChain to get embeddings and store text
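
The hunk above mentions "basic required checks" for environment variables that are not set, but the diff does not show them. A minimal sketch of what such a check could look like, assuming the variable names from the tutorial's `.env` example. The helper names `missing_env_vars` and `require_env_vars` are hypothetical, and the names of the connection string and credential variables are not visible in this diff, so they are left out.

```python
import os

# Variable names taken from the tutorial's .env example; extend as needed.
REQUIRED_VARS = ("CB_BUCKET", "CB_SCOPE", "CB_COLLECTION", "INDEX_NAME")


def missing_env_vars(names=REQUIRED_VARS, environ=None):
    """Return the names that are unset or blank in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name, "").strip()]


def require_env_vars(names=REQUIRED_VARS, environ=None):
    """Return a dict of the required values, or raise if any is missing."""
    env = os.environ if environ is None else environ
    missing = missing_env_vars(names, env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in names}
```

Passing a plain dict as `environ` keeps the check easy to exercise without touching the real process environment.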
@@ -520,7 +515,7 @@ Similar to the last section, here again we will set up Couchbase Cluster using P
 ```python
 cluster = connect_to_couchbase(connection_string, username, password)
 bedrock = boto3.client('bedrock-runtime')
-embedding = BedrockEmbeddings(client=bedrock, model_id="amazon.titan-embed-image-v1")
+embedding = BedrockEmbeddings(client=bedrock, model_id="amazon.titan-embed-text-v2:0")
 ```
 
 #### Create Vector Store Retriever
