Commit 28aced7

Update RAG tutorial to use workflows (#18338)
1 parent 382a95e commit 28aced7


src/content/docs/workers-ai/tutorials/build-a-retrieval-augmented-generation-ai.mdx

Lines changed: 242 additions & 48 deletions
@@ -1,11 +1,12 @@
 ---
-updated: 2024-08-19
+updated: 2024-11-21
 difficulty: Beginner
 content_type: 📝 Tutorial
 pcx_content_type: tutorial
 title: Build a Retrieval Augmented Generation (RAG) AI
 products:
 - Workers
+- D1
 - Vectorize
 tags:
 - AI
@@ -24,7 +25,7 @@ At the end of this tutorial, you will have built an AI tool that allows you to s
 
 <Render file="prereqs" product="workers" />
 
-You will also need access to [Vectorize](/vectorize/platform/pricing/).
+You will also need access to [Vectorize](/vectorize/platform/pricing/). During this tutorial, we will show how you can optionally integrate with [Anthropic Claude](http://anthropic.com) as well. You will need an [Anthropic API key](https://docs.anthropic.com/en/api/getting-started) to do so.
 
 ## 1. Create a new Worker project
 
@@ -196,7 +197,42 @@ Now, we can add a new note to our database using `wrangler d1 execute`:
 npx wrangler d1 execute database --remote --command "INSERT INTO notes (text) VALUES ('The best pizza topping is pepperoni')"
 ```
 
-## 5. Creating notes and adding them to Vectorize
+## 5. Creating a workflow
+
+Before we begin creating notes, we will introduce a [Cloudflare Workflow](/workflows). This will allow us to define a durable workflow that can safely and robustly execute all the steps of the RAG process.
+
+To begin, add a new `[[workflows]]` block to `wrangler.toml`:
+
+```toml
+# ... existing wrangler configuration
+
+[[workflows]]
+name = "rag"
+binding = "RAG_WORKFLOW"
+class_name = "RAGWorkflow"
+```
+
+In `src/index.js`, add a new class called `RAGWorkflow` that extends `WorkflowEntrypoint`:
+
+```js
+import { WorkflowEntrypoint } from "cloudflare:workers";
+
+export class RAGWorkflow extends WorkflowEntrypoint {
+  async run(event, step) {
+    await step.do('example step', async () => {
+      console.log("Hello World!")
+    })
+  }
+}
+```
+
+This class defines a single workflow step that logs "Hello World!" to the console. You can add as many steps as you need to your workflow.
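Each `step.do` call is a separately checkpointed, durable unit of work. As a rough sketch that is not part of this commit, a workflow with several steps might look like the following; the class name is hypothetical, and the retry, timeout, and sleep options assume the step configuration described in the Workflows documentation:

```js
import { WorkflowEntrypoint } from "cloudflare:workers";

// Hypothetical workflow used only to illustrate multiple steps.
export class ExampleWorkflow extends WorkflowEntrypoint {
  async run(event, step) {
    // Each step is checkpointed; a failure retries only that step.
    const first = await step.do('first step', async () => {
      return { ok: true }
    })

    // Steps can optionally declare retry and timeout behaviour (assumed step config shape).
    await step.do(
      'second step',
      { retries: { limit: 3, delay: '5 seconds', backoff: 'exponential' }, timeout: '1 minute' },
      async () => {
        console.log('first step returned', first)
      }
    )

    // Workflows can also pause durably between steps.
    await step.sleep('wait a moment', '10 seconds')
  }
}
```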
+
+On its own, this workflow will not do anything. To execute the workflow, we will call the `RAG_WORKFLOW` binding, passing in any parameters that the workflow needs to properly complete. Here is an example of how we can call the workflow:
+
+```js
+env.RAG_WORKFLOW.create({ params: { text } })
+```
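
As an aside that is not part of this commit, `create()` returns a workflow instance, so you can hold on to it if you want to track a run. A small sketch, assuming the `id` property and `status()` method exposed on instances by the Workflows binding:

```js
// Sketch only: inspect the instance returned by create().
const instance = await env.RAG_WORKFLOW.create({ params: { text } })
console.log(instance.id)             // unique ID for this workflow run
console.log(await instance.status()) // e.g. { status: "running", ... }
```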
+
+## 6. Creating notes and adding them to Vectorize
 
 To expand on your Workers function in order to handle multiple routes, we will add `hono`, a routing library for Workers. This will allow us to create a new route for adding notes to our database. Install `hono` using `npm`:

@@ -221,61 +257,69 @@ app.get("/", async (c) => {
 export default app;
 ```
 
-This will establish a route at the root path `/` that is functionally equivalent to the previous version of your application. Now, we can add a new route for adding notes to our database.
+This will establish a route at the root path `/` that is functionally equivalent to the previous version of your application.
+
+Now, we can update our workflow to begin adding notes to our database, and generating the related embeddings for them.
 
 This example features the [`@cf/baai/bge-base-en-v1.5` model](/workers-ai/models/bge-base-en-v1.5/), which can be used to create an embedding. Embeddings are stored and retrieved inside [Vectorize](/vectorize/), Cloudflare's vector database. The user query is also turned into an embedding so that it can be used for searching within Vectorize.
 
 ```js
-app.post("/notes", async (c) => {
-  const { text } = await c.req.json();
-  if (!text) {
-    return c.text("Missing text", 400);
-  }
-
-  const { results } = await c.env.DB.prepare(
-    "INSERT INTO notes (text) VALUES (?) RETURNING *",
-  )
-    .bind(text)
-    .run();
-
-  const record = results.length ? results[0] : null;
-
-  if (!record) {
-    return c.text("Failed to create note", 500);
+export class RAGWorkflow extends WorkflowEntrypoint {
+  async run(event, step) {
+    const env = this.env
+    const { text } = event.payload
+
+    const record = await step.do(`create database record`, async () => {
+      const query = "INSERT INTO notes (text) VALUES (?) RETURNING *"
+
+      const { results } = await env.DB.prepare(query)
+        .bind(text)
+        .run()
+
+      const record = results[0]
+      if (!record) throw new Error("Failed to create note")
+      return record;
+    })
+
+    const embedding = await step.do(`generate embedding`, async () => {
+      const embeddings = await env.AI.run('@cf/baai/bge-base-en-v1.5', { text: text })
+      const values = embeddings.data[0]
+      if (!values) throw new Error("Failed to generate vector embedding")
+      return values
+    })
+
+    await step.do(`insert vector`, async () => {
+      return env.VECTOR_INDEX.upsert([
+        {
+          id: record.id.toString(),
+          values: embedding,
+        }
+      ]);
+    })
   }
-
-  const { data } = await c.env.AI.run("@cf/baai/bge-base-en-v1.5", {
-    text: [text],
-  });
-  const values = data[0];
-
-  if (!values) {
-    return c.text("Failed to generate vector embedding", 500);
-  }
-
-  const { id } = record;
-  const inserted = await c.env.VECTOR_INDEX.upsert([
-    {
-      id: id.toString(),
-      values,
-    },
-  ]);
-
-  return c.json({ id, text, inserted });
-});
+}
 ```
 
-This function does the following things:
+The workflow does the following things:
 
-1. Parse the JSON body of the request to get the `text` field.
+1. Accepts a `text` parameter.
 2. Insert a new row into the `notes` table in D1, and retrieve the `id` of the new row.
 3. Convert the `text` into a vector using the `embeddings` model of the LLM binding.
 4. Upsert the `id` and `vectors` into the `vector-index` index in Vectorize.
-5. Return the `id` and `text` of the new note as JSON.
 
 By doing this, you will create a new vector representation of the note, which can be used to retrieve the note later.
 
-## 6. Querying Vectorize to retrieve notes
+To complete the code, we will add a route that allows users to submit notes to the database. This route will parse the JSON request body, get the `text` parameter, and create a new instance of the workflow, passing the parameter:
+
+```js
+app.post('/notes', async (c) => {
+  const { text } = await c.req.json();
+  if (!text) return c.text("Missing text", 400);
+  await c.env.RAG_WORKFLOW.create({ params: { text } })
+  return c.text("Created note", 201);
+})
+```
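
Once the Worker is running, you can exercise the new endpoint from any HTTP client. A quick sketch using `fetch`; the hostname is a placeholder for your deployed or local dev URL:

```js
// Sketch only: submit a note to the /notes endpoint (replace the hostname with your Worker's URL).
const res = await fetch("https://example.workers.dev/notes", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ text: "The best pizza topping is pepperoni" }),
});
console.log(res.status); // 201 once the workflow instance has been created
```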
+
+## 7. Querying Vectorize to retrieve notes
 
 To complete your code, you can update the root path (`/`) to query Vectorize. You will convert the query into a vector, and then use the `vector-index` index to find the most similar vectors.
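
Most of this route's body is unchanged by this commit and therefore does not appear in the diff below. For reference, here is a rough sketch of the retrieval half it describes (embedding the question, querying Vectorize, and loading the matching note from D1), reusing the binding names from the code above; the query parameter name and `topK` value are assumptions:

```js
app.get('/', async (c) => {
  // The question to answer; assumed to arrive as a ?text= query parameter.
  const question = c.req.query('text') || "What is the square root of 9?"

  // Embed the question with the same model used when storing notes.
  const embeddings = await c.env.AI.run('@cf/baai/bge-base-en-v1.5', { text: question })
  const vectors = embeddings.data[0]

  // Find the closest stored vector, then load the matching note from D1.
  const vectorQuery = await c.env.VECTOR_INDEX.query(vectors, { topK: 1 });
  const vecId = vectorQuery.matches?.[0]?.id

  let notes = []
  if (vecId) {
    const query = `SELECT * FROM notes WHERE id = ?`
    const { results } = await c.env.DB.prepare(query).bind(vecId).all()
    if (results) notes = results.map(note => note.text)
  }

  const contextMessage = notes.length
    ? `Context:\n${notes.map(note => `- ${note}`).join("\n")}`
    : ""

  // ...the rest of the route passes contextMessage and question to a
  // text-generation model and returns c.text(answer), as shown in the diff below.
});
```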

@@ -333,7 +377,6 @@ app.get('/', async (c) => {
   )
 
   return c.text(answer);
-
 });
 
 app.onError((err, c) => {
@@ -343,7 +386,80 @@ app.onError((err, c) => {
 export default app;
 ```
 
-## 7. Deleting notes and vectors
+## 8. Adding Anthropic Claude model (optional)
+
+If you are working with larger documents, you have the option to use Anthropic's [Claude models](https://claude.ai/), which have large context windows and are well-suited to RAG workflows.
+
+To begin, install the `@anthropic-ai/sdk` package:
+
+```sh
+npm install @anthropic-ai/sdk
+```
+
+In `src/index.js`, you can update the `GET /` route to check for the `ANTHROPIC_API_KEY` environment variable. If it's set, we can generate text using the Anthropic SDK. If it isn't set, we'll fall back to the existing Workers AI code:
+
+```js
+import Anthropic from '@anthropic-ai/sdk';
+
+app.get('/', async (c) => {
+  // ... Existing code
+  const systemPrompt = `When answering the question or responding, use the context provided, if it is provided and relevant.`
+
+  let modelUsed = ""
+  let response = null
+
+  if (c.env.ANTHROPIC_API_KEY) {
+    const anthropic = new Anthropic({
+      apiKey: c.env.ANTHROPIC_API_KEY
+    })
+
+    const model = "claude-3-5-sonnet-latest"
+    modelUsed = model
+
+    const message = await anthropic.messages.create({
+      max_tokens: 1024,
+      model,
+      messages: [
+        { role: 'user', content: question }
+      ],
+      system: [systemPrompt, notes.length ? contextMessage : ''].join(" ")
+    })
+
+    response = {
+      response: message.content.map(content => content.text).join("\n")
+    }
+  } else {
+    const model = "@cf/meta/llama-3.1-8b-instruct"
+    modelUsed = model
+
+    response = await c.env.AI.run(
+      model,
+      {
+        messages: [
+          ...(notes.length ? [{ role: 'system', content: contextMessage }] : []),
+          { role: 'system', content: systemPrompt },
+          { role: 'user', content: question }
+        ]
+      }
+    )
+  }
+
+  if (response) {
+    c.header('x-model-used', modelUsed)
+    return c.text(response.response)
+  } else {
+    return c.text("We were unable to generate output", 500)
+  }
+})
+```
+
+Finally, you'll need to set the `ANTHROPIC_API_KEY` environment variable in your Workers application. You can do this by using `wrangler secret put`:
+
+```sh
+npx wrangler secret put ANTHROPIC_API_KEY
+```
+
+## 9. Deleting notes and vectors
 
 If you no longer need a note, you can delete it from the database. Any time that you delete a note, you will also need to delete the corresponding vector from Vectorize. You can implement this by building a `DELETE /notes/:id` route in your `src/index.js` file:
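
The body of this `DELETE` route is unchanged by this commit, so only its closing lines appear in the diff below. For reference, a rough sketch of the pattern it describes, reusing the binding names from the code above and Vectorize's `deleteByIds` method; the exact response shape is an assumption:

```js
app.delete("/notes/:id", async (c) => {
  const { id } = c.req.param();

  // Remove the note row from D1...
  const query = `DELETE FROM notes WHERE id = ?`;
  await c.env.DB.prepare(query).bind(id).run();

  // ...and remove the corresponding vector from Vectorize.
  await c.env.VECTOR_INDEX.deleteByIds([id]);

  // Respond with an empty 204 (assumed; any success response works here).
  return c.body(null, 204);
});
```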

@@ -360,7 +476,85 @@ app.delete("/notes/:id", async (c) => {
 });
 ```
 
-## 8. Deploy your project
+## 10. Text splitting (optional)
+
+For large pieces of text, it is recommended to split the text into smaller chunks. This allows LLMs to more effectively gather relevant context, without needing to retrieve large pieces of text.
+
+To implement this, we'll add a new NPM package to our project, `@langchain/textsplitters`:
+
+```sh
+npm install @langchain/textsplitters
+```
+
+The `RecursiveCharacterTextSplitter` class provided by this package will split the text into smaller chunks. It can be customized to your liking, but the default config works in most cases:
+
+```js
+import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
+
+const text = "Some long piece of text...";
+
+const splitter = new RecursiveCharacterTextSplitter({
+  // These can be customized to change the chunking size
+  // chunkSize: 1000,
+  // chunkOverlap: 200,
+});
+
+const output = await splitter.createDocuments([text]);
+console.log(output) // [{ pageContent: 'Some long piece of text...' }]
+```
+
+To use this splitter, we'll update the workflow to split the text into smaller chunks. We'll then iterate over the chunks and run the rest of the workflow for each chunk of text:
+
+```js
+export class RAGWorkflow extends WorkflowEntrypoint {
+  async run(event, step) {
+    const env = this.env
+    const { text } = event.payload;
+    let texts = await step.do('split text', async () => {
+      const splitter = new RecursiveCharacterTextSplitter();
+      const output = await splitter.createDocuments([text]);
+      return output.map(doc => doc.pageContent);
+    })
+
+    console.log(`RecursiveCharacterTextSplitter generated ${texts.length} chunks`)
+
+    for (const index in texts) {
+      const text = texts[index]
+      const record = await step.do(`create database record: ${index}/${texts.length}`, async () => {
+        const query = "INSERT INTO notes (text) VALUES (?) RETURNING *"
+
+        const { results } = await env.DB.prepare(query)
+          .bind(text)
+          .run()
+
+        const record = results[0]
+        if (!record) throw new Error("Failed to create note")
+        return record;
+      })
+
+      const embedding = await step.do(`generate embedding: ${index}/${texts.length}`, async () => {
+        const embeddings = await env.AI.run('@cf/baai/bge-base-en-v1.5', { text: text })
+        const values = embeddings.data[0]
+        if (!values) throw new Error("Failed to generate vector embedding")
+        return values
+      })
+
+      await step.do(`insert vector: ${index}/${texts.length}`, async () => {
+        return env.VECTOR_INDEX.upsert([
+          {
+            id: record.id.toString(),
+            values: embedding,
+          }
+        ]);
+      })
+    }
+  }
+}
+```
+
+Now, when large pieces of text are submitted to the `/notes` endpoint, they will be split into smaller chunks, and each chunk will be processed by the workflow.
+
+## 11. Deploy your project
 
 If you did not deploy your Worker during [step 1](/workers/get-started/guide/#1-create-a-new-worker-project), deploy your Worker via Wrangler, to a `*.workers.dev` subdomain, or a [Custom Domain](/workers/configuration/routing/custom-domains/), if you have one configured. If you have not configured any subdomain or domain, Wrangler will prompt you during the publish process to set one up.

@@ -388,4 +582,4 @@ To do more:
 - Explore [Examples](/workers/examples/) to experiment with copy and paste Worker code.
 - Understand how Workers works in [Reference](/workers/reference/).
 - Learn about Workers features and functionality in [Platform](/workers/platform/).
-- Set up [Wrangler](/workers/wrangler/install-and-update/) to programmatically create, test, and deploy your Worker projects.
+- Set up [Wrangler](/workers/wrangler/install-and-update/) to programmatically create, test, and deploy your Worker projects.
