
Commit bb9186f

Update RAG tutorial to use workflows
1 parent 969ed00 commit bb9186f

File tree: 1 file changed (+170, -48)

src/content/docs/workers-ai/tutorials/build-a-retrieval-augmented-generation-ai.mdx

Lines changed: 170 additions & 48 deletions
@@ -1,11 +1,12 @@
 ---
-updated: 2024-08-19
+updated: 2024-11-21
 difficulty: Beginner
 content_type: 📝 Tutorial
 pcx_content_type: tutorial
 title: Build a Retrieval Augmented Generation (RAG) AI
 products:
   - Workers
+  - D1
   - Vectorize
 tags:
   - AI
@@ -24,7 +25,7 @@ At the end of this tutorial, you will have built an AI tool that allows you to s
 
 <Render file="prereqs" product="workers" />
 
-You will also need access to [Vectorize](/vectorize/platform/pricing/).
+You will also need access to [Vectorize](/vectorize/platform/pricing/). During this tutorial, we will show how you can optionally integrate with [Anthropic Claude](https://www.anthropic.com) as well. You will need an [Anthropic API key](https://docs.anthropic.com/en/api/getting-started) to do so.
 
 ## 1. Create a new Worker project
 
@@ -182,7 +183,42 @@ Now, we can add a new note to our database using `wrangler d1 execute`:
 npx wrangler d1 execute database --remote --command "INSERT INTO notes (text) VALUES ('The best pizza topping is pepperoni')"
 ```
 
-## 5. Creating notes and adding them to Vectorize
+## 5. Creating a workflow
+
+Before we begin creating notes, we will introduce a [Cloudflare Workflow](/workflows). This will allow us to define a durable workflow that can safely and robustly execute all the steps of the RAG process.
+
+To begin, add a new `[[workflows]]` block to `wrangler.toml`:
+
+```toml
+# ... existing wrangler configuration
+
+[[workflows]]
+name = "rag"
+binding = "RAG_WORKFLOW"
+class_name = "RAGWorkflow"
+```
+
+In `src/index.js`, add a new class called `RAGWorkflow` that extends `WorkflowEntrypoint`:
+
+```js
+import { WorkflowEntrypoint } from "cloudflare:workers";
+
+export class RAGWorkflow extends WorkflowEntrypoint {
+  async run(event, step) {
+    await step.do('example step', async () => {
+      console.log("Hello World!")
+    })
+  }
+}
+```
+
+This class defines a single workflow step that logs "Hello World!" to the console. You can add as many steps as you need to your workflow.
+
+On its own, this workflow will not do anything. To execute the workflow, we will call the `RAG_WORKFLOW` binding, passing in any parameters that the workflow needs to properly complete. Here is an example of how we can call the workflow:
+
+```js
+env.RAG_WORKFLOW.create({ params: { text } })
+```
+
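Each `step.do` call is a durable checkpoint: the engine persists a completed step's return value, so when a failed instance retries, finished steps are not re-executed. The sketch below illustrates that idea in plain JavaScript; it is not the Cloudflare Workflows runtime, and `runStep`, `completed`, and `workflowRun` are invented names for illustration only.

```javascript
// Sketch only (plain JavaScript, not the Workflows API): a durable engine
// records each completed step's result, so a retried run reuses stored
// results instead of re-executing the step's callback.
const completed = new Map(); // stands in for durable step storage

function runStep(name, fn) {
  if (completed.has(name)) return completed.get(name); // step already done: reuse
  const result = fn();
  completed.set(name, result);
  return result;
}

let executions = 0;
function workflowRun() {
  // Two "steps", mirroring the create-record / generate-embedding shape.
  const record = runStep("create database record", () => {
    executions++;
    return { id: 1 };
  });
  const embedding = runStep("generate embedding", () => {
    executions++;
    return [0.1, 0.2, 0.3];
  });
  return { record, embedding };
}

workflowRun();             // first attempt: both steps execute
const out = workflowRun(); // simulated retry: both steps are skipped
console.log(executions);   // 2
```

This is why the tutorial wraps each side effect (database insert, embedding generation, vector upsert) in its own step: a transient failure in one step does not redo work that already succeeded.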
+## 6. Creating notes and adding them to Vectorize
186222

187223
To expand on your Workers function in order to handle multiple routes, we will add `hono`, a routing library for Workers. This will allow us to create a new route for adding notes to our database. Install `hono` using `npm`:
188224

@@ -207,61 +243,69 @@ app.get("/", async (c) => {
 export default app;
 ```
 
-This will establish a route at the root path `/` that is functionally equivalent to the previous version of your application. Now, we can add a new route for adding notes to our database.
+This will establish a route at the root path `/` that is functionally equivalent to the previous version of your application.
+
+Now, we can update our workflow to begin adding notes to our database, and generating the related embeddings for them.
 
 This example features the [`@cf/baai/bge-base-en-v1.5` model](/workers-ai/models/bge-base-en-v1.5/), which can be used to create an embedding. Embeddings are stored and retrieved inside [Vectorize](/vectorize/), Cloudflare's vector database. The user query is also turned into an embedding so that it can be used for searching within Vectorize.
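As an aside on what "searching within Vectorize" means: an embedding is just an array of numbers, and similarity search typically ranks stored vectors by a metric such as cosine similarity against the query vector. The sketch below is a toy illustration of that idea, not Vectorize's implementation; the vectors and the `cosineSimilarity` helper are made up for demonstration.

```javascript
// Toy illustration of similarity search over embeddings
// (not Vectorize's actual implementation).
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Pretend these are stored note embeddings, keyed by note id.
const index = [
  { id: "1", values: [0.9, 0.1, 0.0] },
  { id: "2", values: [0.0, 1.0, 0.2] },
];

// A query embedding that points in nearly the same direction as note 1.
const query = [1.0, 0.0, 0.1];

const best = index
  .map(v => ({ id: v.id, score: cosineSimilarity(query, v.values) }))
  .sort((x, y) => y.score - x.score)[0];

console.log(best.id); // "1"
```

In the real application, both the stored vectors and the query vector come from the same embedding model (`@cf/baai/bge-base-en-v1.5`), which is what makes their similarity scores meaningful.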
 
 ```js
-app.post("/notes", async (c) => {
-  const { text } = await c.req.json();
-  if (!text) {
-    return c.text("Missing text", 400);
-  }
-
-  const { results } = await c.env.DB.prepare(
-    "INSERT INTO notes (text) VALUES (?) RETURNING *",
-  )
-    .bind(text)
-    .run();
-
-  const record = results.length ? results[0] : null;
-
-  if (!record) {
-    return c.text("Failed to create note", 500);
+export class RAGWorkflow extends WorkflowEntrypoint {
+  async run(event, step) {
+    const env = this.env
+    const { text } = event.payload
+
+    const record = await step.do(`create database record`, async () => {
+      const query = "INSERT INTO notes (text) VALUES (?) RETURNING *"
+
+      const { results } = await env.DB.prepare(query)
+        .bind(text)
+        .run()
+
+      const record = results[0]
+      if (!record) throw new Error("Failed to create note")
+      return record;
+    })
+
+    const embedding = await step.do(`generate embedding`, async () => {
+      const embeddings = await env.AI.run('@cf/baai/bge-base-en-v1.5', { text: text })
+      const values = embeddings.data[0]
+      if (!values) throw new Error("Failed to generate vector embedding")
+      return values
+    })
+
+    await step.do(`insert vector`, async () => {
+      return env.VECTOR_INDEX.upsert([
+        {
+          id: record.id.toString(),
+          values: embedding,
+        }
+      ]);
+    })
   }
-
-  const { data } = await c.env.AI.run("@cf/baai/bge-base-en-v1.5", {
-    text: [text],
-  });
-  const values = data[0];
-
-  if (!values) {
-    return c.text("Failed to generate vector embedding", 500);
-  }
-
-  const { id } = record;
-  const inserted = await c.env.VECTOR_INDEX.upsert([
-    {
-      id: id.toString(),
-      values,
-    },
-  ]);
-
-  return c.json({ id, text, inserted });
-});
+}
 ```
 
-This function does the following things:
+The workflow does the following things:
 
-1. Parse the JSON body of the request to get the `text` field.
+1. Accepts a `text` parameter.
 2. Insert a new row into the `notes` table in D1, and retrieve the `id` of the new row.
 3. Convert the `text` into a vector using the `embeddings` model of the LLM binding.
 4. Upsert the `id` and `vectors` into the `vector-index` index in Vectorize.
-5. Return the `id` and `text` of the new note as JSON.
 
 By doing this, you will create a new vector representation of the note, which can be used to retrieve the note later.
 
-## 6. Querying Vectorize to retrieve notes
+To complete the code, we will add a route that allows users to submit notes to the database. This route will parse the JSON request body, get the `text` parameter, and create a new instance of the workflow, passing the parameter:
+
+```js
+app.post('/notes', async (c) => {
+  const { text } = await c.req.json();
+  if (!text) return c.text("Missing text", 400);
+  await c.env.RAG_WORKFLOW.create({ params: { text } })
+  return c.text("Created note", 201);
+})
+```
+
+## 7. Querying Vectorize to retrieve notes
 
 To complete your code, you can update the root path (`/`) to query Vectorize. You will convert the query into a vector, and then use the `vector-index` index to find the most similar vectors.
 
@@ -319,7 +363,6 @@ app.get('/', async (c) => {
   )
 
   return c.text(answer);
-
 });
 
 app.onError((err, c) => {
@@ -329,7 +372,7 @@ app.onError((err, c) => {
 export default app;
 ```
 
-## 7. Deleting notes and vectors
+## 8. Deleting notes and vectors
 
 If you no longer need a note, you can delete it from the database. Any time that you delete a note, you will also need to delete the corresponding vector from Vectorize. You can implement this by building a `DELETE /notes/:id` route in your `src/index.js` file:
 
@@ -346,7 +389,86 @@ app.delete("/notes/:id", async (c) => {
 });
 ```
 
-## 8. Deploy your project
+## 9. Text splitting (optional)
+
+For large pieces of text, it is recommended to split the text into smaller chunks. This allows LLMs to more effectively gather relevant context, without receiving _too much_ information.
+
+To implement this, we'll add a new NPM package to our project, `@langchain/textsplitters`:
+
+<PackageManagers
+  type="install"
+  pkg="@langchain/textsplitters"
+/>
+
+The `RecursiveCharacterTextSplitter` class provided by this package will split the text into smaller chunks. It can be customized to your liking, but the default config works in most cases:
+
+```js
+import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
+
+const text = "Some long piece of text...";
+
+const splitter = new RecursiveCharacterTextSplitter({
+  // These can be customized to change the chunking size
+  // chunkSize: 1000,
+  // chunkOverlap: 200,
+});
+
+const output = await splitter.createDocuments([text]);
+console.log(output) // [{ pageContent: 'Some long piece of text...' }]
+```
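To build intuition for what a text splitter does, here is a deliberately simplified chunker that cuts text into fixed-size pieces with a small overlap so that context is not lost at chunk boundaries. The `chunkText` helper is our own illustration, not part of `@langchain/textsplitters`; the real `RecursiveCharacterTextSplitter` additionally prefers natural boundaries such as paragraphs and sentences before falling back to size.

```javascript
// Simplified illustration of chunking with overlap (not the library's
// actual algorithm). Each chunk shares its last `chunkOverlap` characters
// with the start of the next chunk.
function chunkText(text, chunkSize = 10, chunkOverlap = 3) {
  const chunks = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached the end
    start += chunkSize - chunkOverlap;           // step forward, keeping overlap
  }
  return chunks;
}

const chunks = chunkText("abcdefghijklmnopqrst", 10, 3);
console.log(chunks); // ["abcdefghij", "hijklmnopq", "opqrst"]
```

The overlap ("hij", "opq" above) is why `chunkOverlap` exists in the real splitter: a sentence cut in half at a boundary still appears whole in one of the two neighboring chunks.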
+
+To use this splitter, we'll update the workflow to split the text into smaller chunks, and then run the rest of the workflow for every chunk of text:
+
+```js
+export class RAGWorkflow extends WorkflowEntrypoint {
+  async run(event, step) {
+    const env = this.env
+    const { text } = event.payload;
+
+    let texts = await step.do('split text', async () => {
+      const splitter = new RecursiveCharacterTextSplitter();
+      const output = await splitter.createDocuments([text]);
+      return output.map(doc => doc.pageContent);
+    })
+
+    console.log(`RecursiveCharacterTextSplitter generated ${texts.length} chunks`)
+
+    for (const index in texts) {
+      const text = texts[index]
+      const record = await step.do(`create database record: ${index}/${texts.length}`, async () => {
+        const query = "INSERT INTO notes (text) VALUES (?) RETURNING *"
+
+        const { results } = await env.DB.prepare(query)
+          .bind(text)
+          .run()
+
+        const record = results[0]
+        if (!record) throw new Error("Failed to create note")
+        return record;
+      })
+
+      const embedding = await step.do(`generate embedding: ${index}/${texts.length}`, async () => {
+        const embeddings = await env.AI.run('@cf/baai/bge-base-en-v1.5', { text: text })
+        const values = embeddings.data[0]
+        if (!values) throw new Error("Failed to generate vector embedding")
+        return values
+      })
+
+      await step.do(`insert vector: ${index}/${texts.length}`, async () => {
+        return env.VECTOR_INDEX.upsert([
+          {
+            id: record.id.toString(),
+            values: embedding,
+          }
+        ]);
+      })
+    }
+  }
+}
+```
+
+Now, when large pieces of text are submitted to the `/notes` endpoint, they will be split into smaller chunks, and each chunk will be processed by the workflow.
+
+## 10. Deploy your project
 
 If you did not deploy your Worker during [step 1](/workers/get-started/guide/#1-create-a-new-worker-project), deploy your Worker via Wrangler, to a `*.workers.dev` subdomain, or a [Custom Domain](/workers/configuration/routing/custom-domains/), if you have one configured. If you have not configured any subdomain or domain, Wrangler will prompt you during the publish process to set one up.
 
@@ -374,4 +496,4 @@ To do more:
 - Explore [Examples](/workers/examples/) to experiment with copy and paste Worker code.
 - Understand how Workers works in [Reference](/workers/reference/).
 - Learn about Workers features and functionality in [Platform](/workers/platform/).
-- Set up [Wrangler](/workers/wrangler/install-and-update/) to programmatically create, test, and deploy your Worker projects.
+- Set up [Wrangler](/workers/wrangler/install-and-update/) to programmatically create, test, and deploy your Worker projects.
