Replies: 1 comment
-
Hey @crman! 👋 I'm here to help you out with any bugs, questions, or contributions you have. Let's work together to solve this issue!

To achieve dynamic index refresh for your Retrieval-Augmented Generation (RAG) application on AWS with S3 integration using LlamaIndex, you can leverage AWS Lambda functions to handle S3 events and update the index and vector store accordingly. Here's a step-by-step guide:

Step-by-Step Guide
import json
import os
from urllib.parse import unquote_plus

import boto3
from llama_index.core import Settings
from llama_index.core.schema import TextNode
from llama_index.vector_stores.relyt import RelytVectorStore
from pgvecto_rs.sdk import PGVectoRs

# Initialize the Relyt client
URL = "postgresql+psycopg://{username}:{password}@{host}:{port}/{db_name}".format(
    port=os.getenv("RELYT_PORT", "5432"),
    host=os.getenv("RELYT_HOST", "localhost"),
    username=os.getenv("RELYT_USER", "postgres"),
    password=os.getenv("RELYT_PASS", "mysecretpassword"),
    db_name=os.getenv("RELYT_NAME", "postgres"),
)
client = PGVectoRs(
    db_url=URL,
    collection_name="example",
    dimension=1536,  # Using OpenAI's text-embedding-ada-002
)
vector_store = RelytVectorStore(client=client)


def lambda_handler(event, context):
    # Process the S3 event
    s3 = boto3.client('s3')
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        # Object keys arrive URL-encoded in S3 event notifications
        key = unquote_plus(record['s3']['object']['key'])
        # Get the object from S3
        response = s3.get_object(Bucket=bucket, Key=key)
        content = response['Body'].read().decode('utf-8')
        # Create a TextNode from the content (BaseNode is abstract and
        # cannot be instantiated directly). The vector store expects nodes
        # that already carry an embedding, so compute one here; this assumes
        # Settings.embed_model is configured with a 1536-dimension model.
        node = TextNode(
            id_=key,
            text=content,
            metadata={"ref_doc_id": key},
            embedding=Settings.embed_model.get_text_embedding(content),
        )
        # Add the node to the vector store
        vector_store.add([node])
    return {
        'statusCode': 200,
        'body': json.dumps('Successfully processed S3 event and updated vector store')
    }
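The handler above treats every record as a new or changed object. If the bucket also sends delete notifications, it may help to route records by event type first. The following is a minimal sketch, not part of the original answer: `split_s3_records` is a hypothetical helper, and the sample event only mimics the shape of an S3 notification.

```python
from urllib.parse import unquote_plus


def split_s3_records(event):
    """Return (upserts, deletes) as lists of (bucket, key) tuples.

    Hypothetical helper: it only inspects the event payload and does not
    call AWS or LlamaIndex.
    """
    upserts, deletes = [], []
    for record in event.get("Records", []):
        name = record.get("eventName", "")
        bucket = record["s3"]["bucket"]["name"]
        # Keys are URL-encoded in S3 notifications (a space arrives as "+").
        key = unquote_plus(record["s3"]["object"]["key"])
        if name.startswith("ObjectCreated:"):
            upserts.append((bucket, key))
        elif name.startswith("ObjectRemoved:"):
            deletes.append((bucket, key))
    return upserts, deletes


# Illustrative payload with one created and one removed object.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {"bucket": {"name": "docs"},
                   "object": {"key": "reports/q1+summary.txt"}},
        },
        {
            "eventName": "ObjectRemoved:Delete",
            "s3": {"bucket": {"name": "docs"},
                   "object": {"key": "old/draft%281%29.txt"}},
        },
    ]
}

upserts, deletes = split_s3_records(sample_event)
print(upserts)
print(deletes)
```

For a modified document, one option is to remove the previously stored nodes before re-adding the new content so stale chunks do not linger. LlamaIndex vector stores generally expose `delete(ref_doc_id=...)` for this, but it is worth checking how your store records `ref_doc_id` before relying on it.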
Best Practices
With this setup, whenever a document is added to or modified in the S3 bucket, the Lambda function is triggered, reads the document's content, creates a node from it, and adds that node to the vector store.
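For the Lambda function to be triggered at all, the bucket needs an event notification that points at it. Here is a rough sketch of that wiring with boto3, under a few assumptions: the ARN is a placeholder, the `.txt` suffix filter is only an example, and `build_notification_config` and `apply_notification_config` are hypothetical helper names.

```python
def build_notification_config(lambda_arn, suffix=".txt"):
    """Build the NotificationConfiguration payload for one Lambda target."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": lambda_arn,
                # Fire on uploads/overwrites and on deletions.
                "Events": ["s3:ObjectCreated:*", "s3:ObjectRemoved:*"],
                # Example filter: only react to keys ending in `suffix`.
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "suffix", "Value": suffix}]}
                },
            }
        ]
    }


def apply_notification_config(bucket, lambda_arn):
    """Attach the notification to the bucket (requires AWS credentials)."""
    import boto3  # imported lazily so building the config needs no AWS SDK

    s3 = boto3.client("s3")
    s3.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=build_notification_config(lambda_arn),
    )


# Placeholder ARN for illustration only.
config = build_notification_config(
    "arn:aws:lambda:us-east-1:123456789012:function:refresh-index"
)
print(config)
```

Two things to keep in mind: `put_bucket_notification_configuration` replaces the bucket's existing notification configuration rather than merging into it, and the Lambda function needs a resource-based permission that allows `s3.amazonaws.com` to invoke it.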
-
Hello,
I am working on creating a Retrieval-Augmented Generation (RAG) application on the AWS Cloud, with a specific requirement to dynamically update the index and vector store upon the arrival of new or modified documents in S3 storage.
Use Case:
Current Progress:
Question:
Objective:
I am looking for guidance on:
Thank you for your assistance!