Commit c37fc9c: Vision tutorial
---
updated: 2025-01-17
difficulty: Beginner
content_type: 📝 Tutorial
pcx_content_type: tutorial
title: Llama 3.2 11B Vision Instruct model on Cloudflare Workers AI
tags:
  - AI
---

import { Details, Render, PackageManagers } from "~/components";

## 1: Prerequisites

Before you begin, ensure you have the following:

1. A [Cloudflare account](https://dash.cloudflare.com/sign-up) with Workers and Workers AI enabled.
2. Your `CLOUDFLARE_ACCOUNT_ID` and `CLOUDFLARE_AUTH_TOKEN`.
   - You can generate an API token in your Cloudflare dashboard under API Tokens.
3. Node.js installed for working with Cloudflare Workers (optional but recommended).

## 2: Agree to Meta's license

The first time you use the [Llama 3.2 11B Vision Instruct](/workers-ai/models/llama-3.2-11b-vision-instruct) model, you need to agree to Meta's License and Acceptable Use Policy.

```bash title="curl"
curl https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run/@cf/meta/llama-3.2-11b-vision-instruct \
  -X POST \
  -H "Authorization: Bearer $CLOUDFLARE_AUTH_TOKEN" \
  -d '{ "prompt": "agree" }'
```

Replace `$CLOUDFLARE_ACCOUNT_ID` and `$CLOUDFLARE_AUTH_TOKEN` with your actual account ID and API token.
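If you prefer to make this one-time call from Node.js rather than curl, the request can be sketched as follows. The helper name `buildAgreeRequest` is illustrative, not part of any SDK:

```typescript
// Build the one-time license-agreement request (the same call as the curl
// example above). accountId and token are supplied by the caller, e.g. from
// the CLOUDFLARE_ACCOUNT_ID and CLOUDFLARE_AUTH_TOKEN environment variables.
function buildAgreeRequest(accountId: string, token: string) {
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/run/@cf/meta/llama-3.2-11b-vision-instruct`,
    init: {
      method: "POST",
      headers: { Authorization: `Bearer ${token}` },
      body: JSON.stringify({ prompt: "agree" }),
    },
  };
}

// Usage (performs a network call, so shown only as a sketch):
//   const { url, init } = buildAgreeRequest(
//     process.env.CLOUDFLARE_ACCOUNT_ID!,
//     process.env.CLOUDFLARE_AUTH_TOKEN!,
//   );
//   const data = await fetch(url, init).then((r) => r.json());
```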

## 3: Set up your Cloudflare Worker

1. Create a Worker project

You will create a new Worker project using the `create-cloudflare` CLI (`C3`). This tool simplifies setting up and deploying new applications to Cloudflare.

Run the following command in your terminal:

<PackageManagers
  type="create"
  pkg="cloudflare@latest"
  args={"llama-vision-tutorial"}
/>

<Render
  file="c3-post-run-steps"
  product="workers"
  params={{
    category: "hello-world",
    type: "Hello World Worker",
    lang: "JavaScript",
  }}
/>

After completing the setup, a new directory called `llama-vision-tutorial` will be created.

2. Navigate to your application directory

Change into the project directory:

```bash
cd llama-vision-tutorial
```

3. Project structure

Your `llama-vision-tutorial` directory will include:

- A "Hello World" Worker at `src/index.ts`.
- A `wrangler.toml` configuration file for managing deployment settings.

## 4: Write the Worker code

Edit `src/index.ts` and replace its contents with the following TypeScript code:

```typescript
export interface Env {
  AI: Ai;
}

export default {
  async fetch(request, env): Promise<Response> {
    const messages = [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "Describe the image I'm providing." },
    ];

    // Replace this with your image data encoded as base64 or a URL
    const imageBase64 = "data:image/png;base64,IMAGE_DATA_HERE";

    const response = await env.AI.run("@cf/meta/llama-3.2-11b-vision-instruct", {
      messages,
      image: imageBase64,
    });

    return Response.json(response);
  },
} satisfies ExportedHandler<Env>;
```
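The hardcoded `imageBase64` above is the simplest starting point. If you instead want the Worker to read the image from the incoming POST body (the shape the test request later in this tutorial sends), that variant can be sketched as follows. The local `Env` stub is an assumption standing in for the types that `@cloudflare/workers-types` provides in a real project:

```typescript
// Minimal stub so this sketch type-checks outside the Workers runtime; in a
// real project, Env and the AI binding types come from @cloudflare/workers-types.
interface Env {
  AI: { run(model: string, input: unknown): Promise<unknown> };
}

const worker = {
  // Read the base64 image from the request body instead of hardcoding it.
  // Assumes the client sends JSON like { "image": "data:image/png;base64,..." }.
  async fetch(request: Request, env: Env): Promise<Response> {
    const body = (await request.json()) as { image?: string };
    if (!body.image) {
      return new Response("Missing 'image' field", { status: 400 });
    }
    const result = await env.AI.run("@cf/meta/llama-3.2-11b-vision-instruct", {
      messages: [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: "Describe the image I'm providing." },
      ],
      image: body.image,
    });
    return Response.json(result);
  },
};

export default worker;
```

Rejecting requests without an `image` field keeps malformed calls from reaching the model at all.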

## 5: Bind Workers AI to your Worker

1. Open `wrangler.toml` and add the following configuration:

```toml
[ai]
binding = "AI"
```

The model itself is selected in your Worker code (in the `env.AI.run()` call), not in `wrangler.toml`.

2. Save the file.

## 6: Deploy the Worker

Run the following command to deploy your Worker:

```bash
wrangler deploy
```

## 7: Test your Worker

1. After deployment, you will receive a unique URL for your Worker (for example, `https://llama-vision-tutorial.<your-subdomain>.workers.dev`).
2. Use a tool like `curl` or Postman to send a request to your Worker:

```bash
curl -X POST https://llama-vision-tutorial.<your-subdomain>.workers.dev \
  -H "Content-Type: application/json" \
  -d '{ "image": "BASE64_ENCODED_IMAGE" }'
```

Replace `BASE64_ENCODED_IMAGE` with an actual base64-encoded image string.
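One way to produce that string is a small Node.js helper; the sample bytes below are just the start of the PNG magic number, used for illustration:

```typescript
// Wrap raw image bytes in the data-URI form used earlier in this tutorial.
// The MIME type defaults to PNG; pass "image/jpeg" etc. for other formats.
function toDataUri(bytes: Buffer, mime = "image/png"): string {
  return `data:${mime};base64,${bytes.toString("base64")}`;
}

// In practice you would read the bytes from disk, e.g.:
//   import { readFileSync } from "node:fs";
//   const dataUri = toDataUri(readFileSync("photo.png"));
console.log(toDataUri(Buffer.from([0x89, 0x50, 0x4e, 0x47])));
// -> data:image/png;base64,iVBORw==
```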

## 8: Verify the response

The response will include the model's output, such as a description of, or an answer about, the image you provided.

Example response:

```json
{
  "result": "This is a golden retriever sitting in a grassy park."
}
```
