@@ -14,47 +14,78 @@ Scaleway's robust infrastructure makes it easier than ever to implement RAG, as
By utilizing our managed inference services, managed databases, and object storage, you can effortlessly build and deploy a customized model tailored to your specific needs.

<Macro id="requirements" />
+
- A Scaleway account logged into the [console](https://console.scaleway.com)
- [Owner](/identity-and-access-management/iam/concepts/#owner) status or [IAM permissions](/identity-and-access-management/iam/concepts/#permission) allowing you to perform actions in the intended Organization
- [Inference Deployment](/ai-data/managed-inference/how-to/create-deployment/): Set up an inference deployment using [sentence-transformers/sentence-t5-xxl](/ai-data/managed-inference/reference-content/sentence-t5-xxl/) on an L4 instance to efficiently process embeddings.
- [Inference Deployment](/ai-data/managed-inference/how-to/create-deployment/) with the model of your choice.
- [Object Storage Bucket](/storage/object/how-to/create-a-bucket/) to store all the data you want to inject into your LLM model.
- [Managed Database](/managed-databases/postgresql-and-mysql/how-to/create-a-database/) to securely store all your embeddings.

-## Configure your developement environnement
+## Configure your development environment
+
1. Install necessary packages: run the following command to install the required packages:
-2. Configure your environnement variables: create a .env file and add the following variables. These will store your API keys, database connection details, and other configuration values.
+2. Configure your environment variables: create a .env file and add the following variables. These will store your API keys, database connection details, and other configuration values.
1. Connect to your PostgreSQL instance and install the pg_vector extension.
+
+```python
+conn = psycopg2.connect(
+    database="your_database_name",
+    user="your_db_user",
+    password=os.getenv("SCW_DB_PASSWORD"),
+    host="your_db_host",
+    port="your_db_port"
+)
+
+cur = conn.cursor()
+
+# Install pg_vector extension
+cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
+conn.commit()
+```
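The connection call above hard-codes placeholder values for everything except the password. As a minimal sketch, and not part of the change shown in this diff, the helper below gathers all connection parameters from environment variables in one place. Only `SCW_DB_PASSWORD` appears in the source; the other variable names (`SCW_DB_NAME`, `SCW_DB_USER`, `SCW_DB_HOST`, `SCW_DB_PORT`) and the function name `db_params_from_env` are assumptions chosen for illustration.

```python
import os
from typing import Mapping, Optional

# Hypothetical variable names: only SCW_DB_PASSWORD appears in the guide itself.
_ENV_TO_PARAM = {
    "SCW_DB_NAME": "database",
    "SCW_DB_USER": "user",
    "SCW_DB_PASSWORD": "password",
    "SCW_DB_HOST": "host",
    "SCW_DB_PORT": "port",
}


def db_params_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build keyword arguments for a database connection from environment variables.

    Raises KeyError naming every missing or empty variable, so a
    misconfigured .env file fails early instead of at connection time.
    """
    env = os.environ if env is None else env
    missing = [name for name in _ENV_TO_PARAM if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    # Map each environment variable onto the matching connection keyword.
    return {param: env[name] for name, param in _ENV_TO_PARAM.items()}
```

With such a helper, the connection could be opened as `conn = psycopg2.connect(**db_params_from_env())`, assuming `psycopg2` is installed and the variables have been loaded into the process environment.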
+2. To avoid reprocessing documents that have already been loaded and vectorized, create a table in your PostgreSQL database to track them. This ensures that new documents added to your object storage bucket are processed only once, preventing duplicate downloads and redundant vectorization.
+
+```python
+cur.execute("CREATE TABLE IF NOT EXISTS object_loaded (id SERIAL PRIMARY KEY, object_key TEXT)")