---
updated: 2025-03-04
difficulty: Beginner
pcx_content_type: tutorial
title: Whisper-large-v3-turbo with Cloudflare Workers AI
tags:
  - AI
---

In this tutorial you will learn how to:

- **Transcribe large audio files:** Use the [Whisper-large-v3-turbo](/workers-ai/models/whisper-large-v3-turbo/) model from Cloudflare Workers AI to perform automatic speech recognition (ASR) or translation.
- **Handle large files:** Split large audio files into smaller chunks for processing, which helps overcome memory and execution time limitations.
- **Deploy using Cloudflare Workers:** Create a scalable, low-latency transcription pipeline in a serverless environment.
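
The chunking approach can be sketched up front in plain TypeScript, independent of any Workers APIs (the 1 MB chunk size mirrors the Worker code used later in this tutorial):

```typescript
// Split an ArrayBuffer into fixed-size chunks; the final chunk may be smaller.
function splitIntoChunks(
	buffer: ArrayBuffer,
	chunkSize = 1024 * 1024, // 1 MB
): ArrayBuffer[] {
	const chunks: ArrayBuffer[] = [];
	for (let i = 0; i < buffer.byteLength; i += chunkSize) {
		chunks.push(buffer.slice(i, i + chunkSize));
	}
	return chunks;
}

// A 2.5 MB buffer yields two full 1 MB chunks plus a 0.5 MB remainder.
const chunks = splitIntoChunks(new ArrayBuffer(2.5 * 1024 * 1024));
console.log(chunks.length); // 3
console.log(chunks[2].byteLength); // 524288
```

Keeping each request to the model small like this is what sidesteps per-request memory and execution-time limits.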

## Step 1: Create a New Cloudflare Worker Project

import { Render, PackageManagers, WranglerConfig } from "~/components";

This guide will instruct you through setting up and deploying your first Workers AI project. You will use [Workers](/workers/), a Workers AI binding, and the Whisper-large-v3-turbo model to deploy an AI-powered transcription application on the Cloudflare global network.

<Render file="prereqs" product="workers" />

### 1. Create a Worker project

You will create a new Worker project using the `create-cloudflare` CLI (C3). [C3](https://github.com/cloudflare/workers-sdk/tree/main/packages/create-cloudflare) is a command-line tool designed to help you set up and deploy new applications to Cloudflare.

Create a new project named `whisper-tutorial` by running:

<PackageManagers type="create" pkg="cloudflare@latest" args={"whisper-tutorial"} />

Running `npm create cloudflare@latest` will prompt you to install the [`create-cloudflare` package](https://www.npmjs.com/package/create-cloudflare), and lead you through setup. C3 will also install [Wrangler](/workers/wrangler/), the Cloudflare Developer Platform CLI.

<Render
	file="c3-post-run-steps"
	product="workers"
	params={{
		category: "hello-world",
		type: "Worker only",
		lang: "TypeScript",
	}}
/>
| 43 | + |
| 44 | +This will create a new `hello-ai` directory. Your new `hello-ai` directory will include: |
| 45 | + |
| 46 | +- A `"Hello World"` [Worker](/workers/get-started/guide/#3-write-code) at `src/index.ts`. |
| 47 | +- A [`wrangler.jsonc`](/workers/wrangler/configuration/) configuration file. |
| 48 | + |
| 49 | +Go to your application directory: |
| 50 | + |
| 51 | +```sh |
| 52 | +cd hello-ai |
| 53 | +``` |

### 2. Connect your Worker to Workers AI

You must create an AI binding for your Worker to connect to Workers AI. [Bindings](/workers/runtime-apis/bindings/) allow your Workers to interact with resources, like Workers AI, on the Cloudflare Developer Platform.

To bind Workers AI to your Worker, add the following to the end of your Wrangler file:

<WranglerConfig>

```toml
[ai]
binding = "AI"
```

</WranglerConfig>

Your binding is [available in your Worker code](/workers/reference/migrate-to-module-workers/#bindings-in-es-modules-format) on [`env.AI`](/workers/runtime-apis/handlers/fetch/).
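
The call pattern your Worker will use through `env.AI` can be sketched with a stubbed binding. The `AiBinding` interface and `stubAI` object below are illustrative stand-ins, not part of the tutorial's code; in a deployed Worker the runtime injects the real binding, so you never construct it yourself:

```typescript
// Minimal, illustrative shape of the AI binding as used in this tutorial.
interface AiBinding {
	run(model: string, input: Record<string, unknown>): Promise<{ text: string }>;
}

// Stub standing in for env.AI during local experimentation.
const stubAI: AiBinding = {
	async run(model, _input) {
		return { text: `stub transcription from ${model}` };
	},
};

// The same call your Worker will make: model name plus Base64 audio.
async function transcribe(ai: AiBinding, base64Audio: string): Promise<string> {
	const res = await ai.run("@cf/openai/whisper-large-v3-turbo", {
		audio: base64Audio,
	});
	return res.text;
}

console.log(await transcribe(stubAI, "QUJD"));
// stub transcription from @cf/openai/whisper-large-v3-turbo
```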

## Step 2: Configure Wrangler

1. **Enable Node.js Compatibility:**

   In your Wrangler file, add or update the following settings to enable Node.js APIs and polyfills. Node.js compatibility is enabled through the `nodejs_compat` compatibility flag, with a compatibility date of 2024-09-23 or later:

<WranglerConfig>

```toml
compatibility_date = "2024-09-23"
compatibility_flags = ["nodejs_compat"]
```

</WranglerConfig>

2. **Confirm the AI Binding:**

   In the same file, confirm the AI binding you added in the previous section is present, so that you can use Cloudflare's AI models in your Worker:

<WranglerConfig>

```toml
[ai]
binding = "AI"
```

</WranglerConfig>

## Step 3: Full TypeScript Code – Handling Large Audio Files with Chunking

Replace the contents of your `src/index.ts` file with the following integrated code. This sample demonstrates how to:

- Extract an audio file URL from the query parameters.
- Fetch the audio file while explicitly following redirects.
- Split the audio file into smaller chunks (e.g., 1 MB chunks).
- Transcribe each chunk using the Whisper-large-v3-turbo model via the Cloudflare AI binding.
- Return the aggregated transcription as plain text.

```ts
import { Buffer } from "node:buffer";
import type { Ai } from "workers-ai";

export interface Env {
	AI: Ai;
	// If needed, add your KV namespace for storing transcripts.
	// MY_KV_NAMESPACE: KVNamespace;
}

/**
 * Fetches the audio file from the provided URL and splits it into chunks.
 * This function explicitly follows redirects.
 *
 * @param audioUrl - The URL of the audio file.
 * @returns An array of ArrayBuffers, each representing a chunk of the audio.
 */
async function getAudioChunks(audioUrl: string): Promise<ArrayBuffer[]> {
	const response = await fetch(audioUrl, { redirect: "follow" });
	if (!response.ok) {
		throw new Error(`Failed to fetch audio: ${response.status}`);
	}
	const arrayBuffer = await response.arrayBuffer();

	// Example: Split the audio into 1MB chunks.
	const chunkSize = 1024 * 1024; // 1MB
	const chunks: ArrayBuffer[] = [];
	for (let i = 0; i < arrayBuffer.byteLength; i += chunkSize) {
		chunks.push(arrayBuffer.slice(i, i + chunkSize));
	}
	return chunks;
}

/**
 * Transcribes a single audio chunk using the Whisper-large-v3-turbo model.
 * The function converts the audio chunk to a Base64-encoded string and
 * sends it to the model via the AI binding.
 *
 * @param chunkBuffer - The audio chunk as an ArrayBuffer.
 * @param env - The Cloudflare Worker environment, including the AI binding.
 * @returns The transcription text from the model.
 */
async function transcribeChunk(
	chunkBuffer: ArrayBuffer,
	env: Env,
): Promise<string> {
	const base64 = Buffer.from(chunkBuffer).toString("base64");
	const res = await env.AI.run("@cf/openai/whisper-large-v3-turbo", {
		audio: base64,
		// Optional parameters (uncomment and set if needed):
		// task: "transcribe", // or "translate"
		// language: "en",
		// vad_filter: false,
		// initial_prompt: "Provide context if needed.",
		// prefix: "Transcription:",
	});
	return res.text; // The transcription result includes a "text" property.
}

/**
 * The main fetch handler. It extracts the 'url' query parameter, fetches the audio,
 * processes it in chunks, and returns the full transcription.
 */
export default {
	async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
		// Extract the audio URL from the query parameters.
		const { searchParams } = new URL(request.url);
		const audioUrl = searchParams.get("url");

		if (!audioUrl) {
			return new Response("Missing 'url' query parameter", { status: 400 });
		}

		// Get the audio chunks.
		const audioChunks: ArrayBuffer[] = await getAudioChunks(audioUrl);
		let fullTranscript = "";

		// Process each chunk and build the full transcript.
		for (const chunk of audioChunks) {
			try {
				const transcript = await transcribeChunk(chunk, env);
				fullTranscript += transcript + "\n";
			} catch (error) {
				fullTranscript += "[Error transcribing chunk]\n";
			}
		}

		return new Response(fullTranscript, {
			headers: { "Content-Type": "text/plain" },
		});
	},
} satisfies ExportedHandler<Env>;
```
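
The per-chunk `try`/`catch` in the handler means one failed chunk degrades the transcript instead of failing the whole request. That behavior can be exercised in isolation with a deliberately flaky transcriber (the stub below is illustrative, not part of the Worker):

```typescript
type Transcriber = (chunk: ArrayBuffer) => Promise<string>;

// Mirrors the aggregation loop in the fetch handler: append each chunk's
// transcript, substituting a marker line when a chunk fails.
async function transcribeAll(
	chunks: ArrayBuffer[],
	transcribe: Transcriber,
): Promise<string> {
	let full = "";
	for (const chunk of chunks) {
		try {
			full += (await transcribe(chunk)) + "\n";
		} catch {
			full += "[Error transcribing chunk]\n";
		}
	}
	return full;
}

// Stub that fails on the second chunk only.
let calls = 0;
const flaky: Transcriber = async () => {
	calls += 1;
	if (calls === 2) throw new Error("model error");
	return `chunk ${calls}`;
};

const out = await transcribeAll(
	[new ArrayBuffer(1), new ArrayBuffer(1), new ArrayBuffer(1)],
	flaky,
);
console.log(out); // "chunk 1\n[Error transcribing chunk]\nchunk 3\n"
```

Whether to surface partial results like this or fail fast is a design choice; for long recordings, a partial transcript is usually more useful than an error page.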

---

## Step 4: Develop, Test, and Deploy

1. **Run the Worker Locally:**

   Use Wrangler's development mode to test your Worker. The `--remote` flag runs your Worker on Cloudflare's network so the AI binding is available during development:

```sh
npx wrangler dev --remote
```

   Open your browser and visit [http://localhost:8787](http://localhost:8787), or use curl:

```sh
curl "http://localhost:8787?url=https://raw.githubusercontent.com/your-username/your-repo/main/your-audio-file.mp3"
```

   Replace the URL query parameter with the direct link to your audio file. (For GitHub-hosted files, ensure you use the raw file URL.)
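
If your audio URL itself carries query parameters (a signed URL, for example), encode it before appending it to the Worker URL; otherwise everything after the first `&` is parsed as a separate parameter and the Worker sees a truncated URL. A quick sketch (the URLs are placeholders):

```typescript
const workerBase = "http://localhost:8787"; // local dev URL from wrangler dev
const audioUrl = "https://example.com/audio.mp3?token=abc&expires=123";

// encodeURIComponent protects the embedded "?", "&", and "=" characters.
const requestUrl = `${workerBase}/?url=${encodeURIComponent(audioUrl)}`;
console.log(requestUrl);
// http://localhost:8787/?url=https%3A%2F%2Fexample.com%2Faudio.mp3%3Ftoken%3Dabc%26expires%3D123

// The Worker's searchParams.get("url") decodes it back to the original.
const recovered = new URL(requestUrl).searchParams.get("url");
console.log(recovered === audioUrl); // true
```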
2. **Deploy the Worker:**

   Once testing is complete, deploy your Worker with:

```sh
npx wrangler deploy
```

3. **Test the Deployed Worker:**

   After deployment, test your Worker by passing the audio URL as a query parameter:

```sh
curl "https://<your-worker-subdomain>.workers.dev?url=https://raw.githubusercontent.com/your-username/your-repo/main/your-audio-file.mp3"
```

   Make sure to replace `<your-worker-subdomain>`, `your-username`, `your-repo`, and `your-audio-file.mp3` with your actual details.

If successful, the Worker will return a transcript of the audio file:

```sh
This is the transcript of the audio...
```