 title: Build a Retrieval Augmented Generation (RAG) AI
 products:
 - Workers
+- D1
 - Vectorize
 tags:
 - AI
@@ -24,7 +25,7 @@ At the end of this tutorial, you will have built an AI tool that allows you to s

 <Render file="prereqs" product="workers" />

-You will also need access to [Vectorize](/vectorize/platform/pricing/).
+You will also need access to [Vectorize](/vectorize/platform/pricing/). This tutorial also shows how to optionally integrate with [Anthropic Claude](http://anthropic.com); doing so requires an [Anthropic API key](https://docs.anthropic.com/en/api/getting-started).

 ## 1. Create a new Worker project

@@ -182,7 +183,42 @@ Now, we can add a new note to our database using `wrangler d1 execute`:
 npx wrangler d1 execute database --remote --command "INSERT INTO notes (text) VALUES ('The best pizza topping is pepperoni')"
 ```

-## 5. Creating notes and adding them to Vectorize
+## 5. Creating a workflow
+
+Before we begin creating notes, we will introduce a [Cloudflare Workflow](/workflows). This will allow us to define a durable workflow that can safely and robustly execute all the steps of the RAG process.
+
+To begin, add a new `[[workflows]]` block to `wrangler.toml`:
+
+```toml
+# ... existing wrangler configuration
+
+[[workflows]]
+name = "rag"
+binding = "RAG_WORKFLOW"
+class_name = "RAGWorkflow"
+```
+
+In `src/index.js`, add a new class called `RAGWorkflow` that extends `WorkflowEntrypoint`:
+
+```js
+import { WorkflowEntrypoint } from "cloudflare:workers";
+
+export class RAGWorkflow extends WorkflowEntrypoint {
+	async run(event, step) {
+		await step.do('example step', async () => {
+			console.log("Hello World!")
+		})
+	}
+}
+```
+
+This class will define a single workflow step that will log "Hello World!" to the console. You can add as many steps as you need to your workflow.
+
+On its own, this workflow will not do anything. To execute the workflow, we will call the `RAG_WORKFLOW` binding, passing in any parameters that the workflow needs to properly complete. Here is an example of how we can call the workflow:
+
+```js
+env.RAG_WORKFLOW.create({ params: { text } })
+```
+
+## 6. Creating notes and adding them to Vectorize

 To expand on your Workers function in order to handle multiple routes, we will add `hono`, a routing library for Workers. This will allow us to create a new route for adding notes to our database. Install `hono` using `npm`:

@@ -207,61 +243,69 @@ app.get("/", async (c) => {
 export default app;
 ```

-This will establish a route at the root path `/` that is functionally equivalent to the previous version of your application. Now, we can add a new route for adding notes to our database.
+This will establish a route at the root path `/` that is functionally equivalent to the previous version of your application.
+
+Now, we can update our workflow to begin adding notes to our database, and generating the related embeddings for them.

 This example features the [`@cf/baai/bge-base-en-v1.5` model](/workers-ai/models/bge-base-en-v1.5/), which can be used to create an embedding. Embeddings are stored and retrieved inside [Vectorize](/vectorize/), Cloudflare's vector database. The user query is also turned into an embedding so that it can be used for searching within Vectorize.

 ```js
-app.post("/notes", async (c) => {
-	const { text } = await c.req.json();
-	if (!text) {
-		return c.text("Missing text", 400);
-	}
-
-	const { results } = await c.env.DB.prepare(
-		"INSERT INTO notes (text) VALUES (?) RETURNING *",
+			const embeddings = await env.AI.run('@cf/baai/bge-base-en-v1.5', { text: text })
+			const values = embeddings.data[0]
+			if (!values) throw new Error("Failed to generate vector embedding")
+			return values
+		})
+
+		await step.do(`insert vector`, async () => {
+			return env.VECTOR_INDEX.upsert([
+				{
+					id: record.id.toString(),
+					values: embedding,
+				}
+			]);
+		})
 	}
-
-	const { data } = await c.env.AI.run("@cf/baai/bge-base-en-v1.5", {
-		text: [text],
-	});
-	const values = data[0];
-
-	if (!values) {
-		return c.text("Failed to generate vector embedding", 500);
-	}
-
-	const { id } = record;
-	const inserted = await c.env.VECTOR_INDEX.upsert([
-		{
-			id: id.toString(),
-			values,
-		},
-	]);
-
-	return c.json({ id, text, inserted });
-});
+}
 ```

-This function does the following things:
+The workflow does the following things:

-1. Parse the JSON body of the request to get the `text` field.
+1. Accept a `text` parameter.
 2. Insert a new row into the `notes` table in D1, and retrieve the `id` of the new row.
 3. Convert the `text` into a vector using the `embeddings` model of the LLM binding.
 4. Upsert the `id` and `vectors` into the `vector-index` index in Vectorize.
-5. Return the `id` and `text` of the new note as JSON.

 By doing this, you will create a new vector representation of the note, which can be used to retrieve the note later.

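The steps listed above can be tried out locally with stand-in objects. The following is a minimal sketch, not the tutorial's workflow code: `fakeStep`, `createFakeEnv`, and `runNoteWorkflow` are hypothetical names, the step names other than `insert vector` are assumptions, and `fakeStep.do` only runs each callback and returns its result, without the retries or persistence that a real workflow step provides.

```js
// Hypothetical stand-in for the workflow `step` object: it runs the callback
// right away and returns whatever the callback returns.
const fakeStep = {
	async do(name, callback) {
		return callback();
	},
};

// Hypothetical in-memory stand-ins for the database, embedding model, and vector index.
function createFakeEnv() {
	const notes = [];
	const vectors = [];
	return {
		notes,
		vectors,
		async insertNote(text) {
			const record = { id: notes.length + 1, text };
			notes.push(record);
			return record;
		},
		async embed(text) {
			// Not a real embedding: a tiny deterministic vector for illustration only.
			return [text.length, text.split(" ").length];
		},
		async upsertVectors(items) {
			vectors.push(...items);
			return { count: items.length };
		},
	};
}

// Follows the described flow: accept text, store the note, embed it, upsert the vector.
async function runNoteWorkflow(env, step, params) {
	const { text } = params;
	const record = await step.do("create database record", () => env.insertNote(text));
	const embedding = await step.do("generate embedding", () => env.embed(text));
	await step.do("insert vector", () =>
		env.upsertVectors([{ id: record.id.toString(), values: embedding }])
	);
	return record;
}
```

Each step returns a value that later steps consume: the record's `id` becomes the vector's `id`, which is what lets a matching vector be traced back to its note.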
-## 6. Querying Vectorize to retrieve notes
+To complete the code, we will add a route that allows users to submit notes to the database. This route will parse the JSON request body, get the `text` field, and create a new instance of the workflow, passing it as a parameter:
+
+```js
+app.post('/notes', async (c) => {
+	const { text } = await c.req.json();
+	if (!text) return c.text("Missing text", 400);
+	await c.env.RAG_WORKFLOW.create({ params: { text } })
+	return c.text("Created note", 201);
+})
+```
+
+## 7. Querying Vectorize to retrieve notes

 To complete your code, you can update the root path (`/`) to query Vectorize. You will convert the query into a vector, and then use the `vector-index` index to find the most similar vectors.

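To make "most similar vectors" concrete, here is a small sketch of a nearest-neighbour lookup that uses cosine similarity over an in-memory array. It is an illustration only: Vectorize performs the search for you, the distance metric it uses depends on how the index was created, and `cosineSimilarity` and `topK` are hypothetical helper names.

```js
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
	let dot = 0;
	let normA = 0;
	let normB = 0;
	for (let i = 0; i < a.length; i++) {
		dot += a[i] * b[i];
		normA += a[i] * a[i];
		normB += b[i] * b[i];
	}
	// A zero vector has no direction, so treat it as not similar to anything.
	if (normA === 0 || normB === 0) return 0;
	return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k stored vectors most similar to the query, best match first.
function topK(queryVector, storedVectors, k) {
	return storedVectors
		.map(({ id, values }) => ({ id, score: cosineSimilarity(queryVector, values) }))
		.sort((x, y) => y.score - x.score)
		.slice(0, k);
}

// Toy 3-dimensional vectors; real embeddings have far more dimensions.
const stored = [
	{ id: "1", values: [1, 0, 0] },
	{ id: "2", values: [0.9, 0.1, 0] },
	{ id: "3", values: [0, 0, 1] },
];
const matches = topK([1, 0.05, 0], stored, 2);
console.log(matches.map((m) => m.id));
```

The `id` of each match is what links a vector back to its row in the `notes` table, which is why the note's `id` is stored alongside its vector.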
@@ -319,7 +363,6 @@ app.get('/', async (c) => {
 	)

 	return c.text(answer);
-
 });

 app.onError((err, c) => {
@@ -329,7 +372,7 @@ app.onError((err, c) => {
 export default app;
 ```

-## 7. Deleting notes and vectors
+## 8. Deleting notes and vectors

 If you no longer need a note, you can delete it from the database. Any time that you delete a note, you will also need to delete the corresponding vector from Vectorize. You can implement this by building a `DELETE /notes/:id` route in your `src/index.js` file:
+For large pieces of text, it is recommended to split the text into smaller chunks. This allows LLMs to more effectively gather relevant context, without receiving _too much_ information.
+
+To implement this, we'll add a new npm package to our project, `@langchain/textsplitters`:
+
+<PackageManagers
+	type="install"
+	pkg="@langchain/textsplitters"
+/>
+
+The `RecursiveCharacterTextSplitter` class provided by this package will split the text into smaller chunks. It can be customized to your liking, but the default config works in most cases:
+Now, when large pieces of text are submitted to the `/notes` endpoint, they will be split into smaller chunks, and each chunk will be processed by the workflow.
+
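The idea of splitting text into chunks can be illustrated without any library. The following is a deliberately naive sketch that cuts text into fixed-size character windows with some overlap. It is not how `RecursiveCharacterTextSplitter` works internally, since that class tries a list of separators in turn rather than cutting at fixed offsets, and `splitIntoChunks`, `chunkSize`, and `overlap` are hypothetical names chosen for this example.

```js
// Split text into windows of at most `chunkSize` characters, where each window
// repeats the last `overlap` characters of the previous one.
function splitIntoChunks(text, chunkSize = 100, overlap = 20) {
	if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
		throw new Error("chunkSize must be positive and overlap must be smaller than chunkSize");
	}
	const chunks = [];
	const stride = chunkSize - overlap;
	for (let start = 0; start < text.length; start += stride) {
		chunks.push(text.slice(start, start + chunkSize));
		// Stop once the current window has reached the end of the text.
		if (start + chunkSize >= text.length) break;
	}
	return chunks;
}

const chunks = splitIntoChunks("a".repeat(250), 100, 20);
console.log(chunks.map((chunk) => chunk.length));
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk in many cases. Each chunk can then be stored and embedded in the same way as a short note.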
+## 10. Deploy your project

 If you did not deploy your Worker during [step 1](/workers/get-started/guide/#1-create-a-new-worker-project), deploy your Worker via Wrangler, to a `*.workers.dev` subdomain, or a [Custom Domain](/workers/configuration/routing/custom-domains/), if you have one configured. If you have not configured any subdomain or domain, Wrangler will prompt you during the publish process to set one up.

@@ -374,4 +496,4 @@ To do more:
 - Explore [Examples](/workers/examples/) to experiment with copy and paste Worker code.
 - Understand how Workers works in [Reference](/workers/reference/).
 - Learn about Workers features and functionality in [Platform](/workers/platform/).
-- Set up [Wrangler](/workers/wrangler/install-and-update/) to programmatically create, test, and deploy your Worker projects.
+- Set up [Wrangler](/workers/wrangler/install-and-update/) to programmatically create, test, and deploy your Worker projects.