
Commit bb9186f

Update RAG tutorial to use workflows
1 parent 969ed00 commit bb9186f

File tree: 1 file changed (+170, -48)

src/content/docs/workers-ai/tutorials/build-a-retrieval-augmented-generation-ai.mdx

Lines changed: 170 additions & 48 deletions
@@ -1,11 +1,12 @@
 ---
-updated: 2024-08-19
+updated: 2024-11-21
 difficulty: Beginner
 content_type: 📝 Tutorial
 pcx_content_type: tutorial
 title: Build a Retrieval Augmented Generation (RAG) AI
 products:
   - Workers
+  - D1
   - Vectorize
 tags:
   - AI
@@ -24,7 +25,7 @@ At the end of this tutorial, you will have built an AI tool that allows you to s
 
 <Render file="prereqs" product="workers" />
 
-You will also need access to [Vectorize](/vectorize/platform/pricing/).
+You will also need access to [Vectorize](/vectorize/platform/pricing/). During this tutorial, we will show how you can optionally integrate with [Anthropic Claude](https://www.anthropic.com) as well. You will need an [Anthropic API key](https://docs.anthropic.com/en/api/getting-started) to do so.
 
 ## 1. Create a new Worker project
 
@@ -182,7 +183,42 @@ Now, we can add a new note to our database using `wrangler d1 execute`:
 npx wrangler d1 execute database --remote --command "INSERT INTO notes (text) VALUES ('The best pizza topping is pepperoni')"
 ```
 
-## 5. Creating notes and adding them to Vectorize
+## 5. Creating a workflow
+
+Before we begin creating notes, we will introduce a [Cloudflare Workflow](/workflows). This will allow us to define a durable workflow that can safely and robustly execute all the steps of the RAG process.
+
+To begin, add a new `[[workflows]]` block to `wrangler.toml`:
+
+```toml
+# ... existing wrangler configuration
+
+[[workflows]]
+name = "rag"
+binding = "RAG_WORKFLOW"
+class_name = "RAGWorkflow"
+```
+
+In `src/index.js`, add a new class called `RAGWorkflow` that extends `WorkflowEntrypoint`:
+
+```js
+import { WorkflowEntrypoint } from "cloudflare:workers";
+
+export class RAGWorkflow extends WorkflowEntrypoint {
+  async run(event, step) {
+    await step.do('example step', async () => {
+      console.log("Hello World!")
+    })
+  }
+}
+```
+
+This class defines a single workflow step that logs "Hello World!" to the console. You can add as many steps as you need to your workflow.
+
+On its own, this workflow will not do anything. To execute the workflow, we will call the `RAG_WORKFLOW` binding, passing in any parameters that the workflow needs to properly complete. Here is an example of how we can call the workflow:
+
+```js
+env.RAG_WORKFLOW.create({ params: { text } })
+```
+
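Each `step.do` call is a durable checkpoint: the engine persists a completed step's return value, so when a failed instance retries, finished steps are not re-executed. The sketch below illustrates that idea in plain JavaScript; it is not the Cloudflare Workflows runtime, and `runStep`, `completed`, and `workflowRun` are invented names for illustration only.

```javascript
// Sketch only (plain JavaScript, not the Workflows API): a durable engine
// records each completed step's result, so a retried run reuses stored
// results instead of re-executing the step's callback.
const completed = new Map(); // stands in for durable step storage

function runStep(name, fn) {
  if (completed.has(name)) return completed.get(name); // step already done: reuse
  const result = fn();
  completed.set(name, result);
  return result;
}

let executions = 0;
function workflowRun() {
  // Two "steps", mirroring the create-record / generate-embedding shape.
  const record = runStep("create database record", () => {
    executions++;
    return { id: 1 };
  });
  const embedding = runStep("generate embedding", () => {
    executions++;
    return [0.1, 0.2, 0.3];
  });
  return { record, embedding };
}

workflowRun();             // first attempt: both steps execute
const out = workflowRun(); // simulated retry: both steps are skipped
console.log(executions);   // 2
```

This is why the tutorial wraps each side effect (database insert, embedding generation, vector upsert) in its own step: a transient failure in one step does not redo work that already succeeded.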
+## 6. Creating notes and adding them to Vectorize
186222

187223
To expand on your Workers function in order to handle multiple routes, we will add `hono`, a routing library for Workers. This will allow us to create a new route for adding notes to our database. Install `hono` using `npm`:
188224

@@ -207,61 +243,69 @@ app.get("/", async (c) => {
 export default app;
 ```
 
-This will establish a route at the root path `/` that is functionally equivalent to the previous version of your application. Now, we can add a new route for adding notes to our database.
+This will establish a route at the root path `/` that is functionally equivalent to the previous version of your application.
+
+Now, we can update our workflow to begin adding notes to our database, and generating the related embeddings for them.
 
 This example features the [`@cf/baai/bge-base-en-v1.5` model](/workers-ai/models/bge-base-en-v1.5/), which can be used to create an embedding. Embeddings are stored and retrieved inside [Vectorize](/vectorize/), Cloudflare's vector database. The user query is also turned into an embedding so that it can be used for searching within Vectorize.
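As an aside on what "searching within Vectorize" means: an embedding is just an array of numbers, and similarity search typically ranks stored vectors by a metric such as cosine similarity against the query vector. The sketch below is a toy illustration of that idea, not Vectorize's implementation; the vectors and the `cosineSimilarity` helper are made up for demonstration.

```javascript
// Toy illustration of similarity search over embeddings
// (not Vectorize's actual implementation).
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Pretend these are stored note embeddings, keyed by note id.
const index = [
  { id: "1", values: [0.9, 0.1, 0.0] },
  { id: "2", values: [0.0, 1.0, 0.2] },
];

// A query embedding that points in nearly the same direction as note 1.
const query = [1.0, 0.0, 0.1];

const best = index
  .map(v => ({ id: v.id, score: cosineSimilarity(query, v.values) }))
  .sort((x, y) => y.score - x.score)[0];

console.log(best.id); // "1"
```

In the real application, both the stored vectors and the query vector come from the same embedding model (`@cf/baai/bge-base-en-v1.5`), which is what makes their similarity scores meaningful.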
 
 ```js
-app.post("/notes", async (c) => {
-  const { text } = await c.req.json();
-  if (!text) {
-    return c.text("Missing text", 400);
-  }
-
-  const { results } = await c.env.DB.prepare(
-    "INSERT INTO notes (text) VALUES (?) RETURNING *",
-  )
-    .bind(text)
-    .run();
-
-  const record = results.length ? results[0] : null;
-
-  if (!record) {
-    return c.text("Failed to create note", 500);
+export class RAGWorkflow extends WorkflowEntrypoint {
+  async run(event, step) {
+    const env = this.env
+    const { text } = event.payload
+
+    const record = await step.do(`create database record`, async () => {
+      const query = "INSERT INTO notes (text) VALUES (?) RETURNING *"
+
+      const { results } = await env.DB.prepare(query)
+        .bind(text)
+        .run()
+
+      const record = results[0]
+      if (!record) throw new Error("Failed to create note")
+      return record;
+    })
+
+    const embedding = await step.do(`generate embedding`, async () => {
+      const embeddings = await env.AI.run('@cf/baai/bge-base-en-v1.5', { text: text })
+      const values = embeddings.data[0]
+      if (!values) throw new Error("Failed to generate vector embedding")
+      return values
+    })
+
+    await step.do(`insert vector`, async () => {
+      return env.VECTOR_INDEX.upsert([
+        {
+          id: record.id.toString(),
+          values: embedding,
+        }
+      ]);
+    })
   }
-
-  const { data } = await c.env.AI.run("@cf/baai/bge-base-en-v1.5", {
-    text: [text],
-  });
-  const values = data[0];
-
-  if (!values) {
-    return c.text("Failed to generate vector embedding", 500);
-  }
-
-  const { id } = record;
-  const inserted = await c.env.VECTOR_INDEX.upsert([
-    {
-      id: id.toString(),
-      values,
-    },
-  ]);
-
-  return c.json({ id, text, inserted });
-});
+}
 ```
 
-This function does the following things:
+The workflow does the following things:
 
-1. Parse the JSON body of the request to get the `text` field.
+1. Accepts a `text` parameter.
 2. Insert a new row into the `notes` table in D1, and retrieve the `id` of the new row.
 3. Convert the `text` into a vector using the `embeddings` model of the LLM binding.
 4. Upsert the `id` and `vectors` into the `vector-index` index in Vectorize.
-5. Return the `id` and `text` of the new note as JSON.
 
 By doing this, you will create a new vector representation of the note, which can be used to retrieve the note later.
 
-## 6. Querying Vectorize to retrieve notes
+To complete the code, we will add a route that allows users to submit notes to the database. This route will parse the JSON request body, get the `text` parameter, and create a new instance of the workflow, passing the parameter:
+
+```js
+app.post('/notes', async (c) => {
+  const { text } = await c.req.json();
+  if (!text) return c.text("Missing text", 400);
+  await c.env.RAG_WORKFLOW.create({ params: { text } })
+  return c.text("Created note", 201);
+})
+```
+
+## 7. Querying Vectorize to retrieve notes
 
 To complete your code, you can update the root path (`/`) to query Vectorize. You will convert the query into a vector, and then use the `vector-index` index to find the most similar vectors.
 
@@ -319,7 +363,6 @@ app.get('/', async (c) => {
   )
 
   return c.text(answer);
-
 });
 
 app.onError((err, c) => {
@@ -329,7 +372,7 @@ app.onError((err, c) => {
 export default app;
 ```
 
-## 7. Deleting notes and vectors
+## 8. Deleting notes and vectors
 
 If you no longer need a note, you can delete it from the database. Any time that you delete a note, you will also need to delete the corresponding vector from Vectorize. You can implement this by building a `DELETE /notes/:id` route in your `src/index.js` file:
 
@@ -346,7 +389,86 @@ app.delete("/notes/:id", async (c) => {
 });
 ```
 
-## 8. Deploy your project
+## 9. Text splitting (optional)
+
+For large pieces of text, it is recommended to split the text into smaller chunks. This allows LLMs to more effectively gather relevant context, without receiving _too much_ information.
+
+To implement this, we'll add a new NPM package to our project, `@langchain/textsplitters`:
+
+<PackageManagers
+  type="install"
+  pkg="@langchain/textsplitters"
+/>
+
+The `RecursiveCharacterTextSplitter` class provided by this package will split the text into smaller chunks. It can be customized to your liking, but the default config works in most cases:
+
+```js
+import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
+
+const text = "Some long piece of text...";
+
+const splitter = new RecursiveCharacterTextSplitter({
+  // These can be customized to change the chunking size
+  // chunkSize: 1000,
+  // chunkOverlap: 200,
+});
+
+const output = await splitter.createDocuments([text]);
+console.log(output) // [{ pageContent: 'Some long piece of text...' }]
+```
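To build intuition for what a text splitter does, here is a deliberately simplified chunker that cuts text into fixed-size pieces with a small overlap so that context is not lost at chunk boundaries. The `chunkText` helper is our own illustration, not part of `@langchain/textsplitters`; the real `RecursiveCharacterTextSplitter` additionally prefers natural boundaries such as paragraphs and sentences before falling back to size.

```javascript
// Simplified illustration of chunking with overlap (not the library's
// actual algorithm). Each chunk shares its last `chunkOverlap` characters
// with the start of the next chunk.
function chunkText(text, chunkSize = 10, chunkOverlap = 3) {
  const chunks = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached the end
    start += chunkSize - chunkOverlap;           // step forward, keeping overlap
  }
  return chunks;
}

const chunks = chunkText("abcdefghijklmnopqrst", 10, 3);
console.log(chunks); // ["abcdefghij", "hijklmnopq", "opqrst"]
```

The overlap ("hij", "opq" above) is why `chunkOverlap` exists in the real splitter: a sentence cut in half at a boundary still appears whole in one of the two neighboring chunks.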
+
+To use this splitter, we'll update the workflow to split the text into smaller chunks, and then run the rest of the workflow for every chunk of text:
+
+```js
+export class RAGWorkflow extends WorkflowEntrypoint {
+  async run(event, step) {
+    const env = this.env
+    const { text } = event.payload;
+
+    let texts = await step.do('split text', async () => {
+      const splitter = new RecursiveCharacterTextSplitter();
+      const output = await splitter.createDocuments([text]);
+      return output.map(doc => doc.pageContent);
+    })
+
+    console.log(`RecursiveCharacterTextSplitter generated ${texts.length} chunks`)
+
+    for (const index in texts) {
+      const text = texts[index]
+      const record = await step.do(`create database record: ${index}/${texts.length}`, async () => {
+        const query = "INSERT INTO notes (text) VALUES (?) RETURNING *"
+
+        const { results } = await env.DB.prepare(query)
+          .bind(text)
+          .run()
+
+        const record = results[0]
+        if (!record) throw new Error("Failed to create note")
+        return record;
+      })
+
+      const embedding = await step.do(`generate embedding: ${index}/${texts.length}`, async () => {
+        const embeddings = await env.AI.run('@cf/baai/bge-base-en-v1.5', { text: text })
+        const values = embeddings.data[0]
+        if (!values) throw new Error("Failed to generate vector embedding")
+        return values
+      })
+
+      await step.do(`insert vector: ${index}/${texts.length}`, async () => {
+        return env.VECTOR_INDEX.upsert([
+          {
+            id: record.id.toString(),
+            values: embedding,
+          }
+        ]);
+      })
+    }
+  }
+}
+```
+
+Now, when large pieces of text are submitted to the `/notes` endpoint, they will be split into smaller chunks, and each chunk will be processed by the workflow.
+
+## 10. Deploy your project
 
 If you did not deploy your Worker during [step 1](/workers/get-started/guide/#1-create-a-new-worker-project), deploy your Worker via Wrangler, to a `*.workers.dev` subdomain, or a [Custom Domain](/workers/configuration/routing/custom-domains/), if you have one configured. If you have not configured any subdomain or domain, Wrangler will prompt you during the publish process to set one up.
 
@@ -374,4 +496,4 @@ To do more:
 - Explore [Examples](/workers/examples/) to experiment with copy and paste Worker code.
 - Understand how Workers works in [Reference](/workers/reference/).
 - Learn about Workers features and functionality in [Platform](/workers/platform/).
-- Set up [Wrangler](/workers/wrangler/install-and-update/) to programmatically create, test, and deploy your Worker projects.
+- Set up [Wrangler](/workers/wrangler/install-and-update/) to programmatically create, test, and deploy your Worker projects.
