1+ {
2+ "cells" : [
3+ {
4+ "cell_type" : " markdown" ,
5+ "id" : " 930bc11c" ,
6+ "metadata" : {
7+ "id" : " 930bc11c"
8+ },
9+ "source" : [
10+ " # Meta Synthetic Data Generator (LLaMA 3.2 - 3B)\n " ,
11+ " \n " ,
12+ " This notebook demonstrates how to use Meta's LLaMA 3.2 3B model to generate synthetic data for use in AI training or application prototyping."
13+ ]
14+ },
15+ {
16+ "cell_type" : " markdown" ,
17+ "source" : [
18+ " [](https://colab.research.google.com/github/DhivyaBharathy-web/PraisonAI/blob/main/examples/cookbooks/Meta_LLaMA3_SyntheticData.ipynb)\n "
19+ ],
20+ "metadata" : {
21+ "id" : " KPi7FpbV9J2d"
22+ },
23+ "id" : " KPi7FpbV9J2d"
24+ },
25+ {
26+ "cell_type" : " markdown" ,
27+ "id" : " 80f68ecf" ,
28+ "metadata" : {
29+ "id" : " 80f68ecf"
30+ },
31+ "source" : [
32+ " ## Dependencies"
33+ ]
34+ },
{
"cell_type": "code",
"execution_count": null,
"id": "7eeb508d",
"metadata": {
"id": "7eeb508d"
},
"outputs": [],
"source": [
"# Install inference dependencies.\n",
"# Use %pip (not !pip) so the install targets the running kernel's environment.\n",
"%pip install -q transformers accelerate bitsandbytes"
]
},
47+ {
48+ "cell_type" : " markdown" ,
49+ "id" : " eda75a0c" ,
50+ "metadata" : {
51+ "id" : " eda75a0c"
52+ },
53+ "source" : [
54+ " ## Tools\n " ,
55+ " * `transformers` for model loading and text generation\n " ,
56+ " * `pipeline` for simplified inference\n " ,
57+ " * `AutoTokenizer`, `AutoModelForCausalLM` for LLaMA 3.2"
58+ ]
59+ },
60+ {
61+ "cell_type" : " markdown" ,
62+ "id" : " 6ee96383" ,
63+ "metadata" : {
64+ "id" : " 6ee96383"
65+ },
66+ "source" : [
67+ " ## YAML Prompt"
68+ ]
69+ },
{
"cell_type": "code",
"execution_count": null,
"id": "aa4e56ef",
"metadata": {
"id": "aa4e56ef"
},
"outputs": [],
"source": [
"# The YAML below is configuration text, not Python. Keep it inside a string so\n",
"# this code cell runs cleanly under Restart & Run All — bare YAML in a code\n",
"# cell (`prompt: |` ...) raises a SyntaxError.\n",
"yaml_prompt = \"\"\"\\\n",
"prompt: |\n",
"  task: \"Generate a customer complaint email\"\n",
"  style: \"Professional\"\n",
"\"\"\"\n",
"print(yaml_prompt)"
]
},
84+ {
85+ "cell_type" : " markdown" ,
86+ "id" : " e9fbe400" ,
87+ "metadata" : {
88+ "id" : " e9fbe400"
89+ },
90+ "source" : [
91+ " ## Main"
92+ ]
93+ },
{
"cell_type": "code",
"execution_count": null,
"id": "3e35336f",
"metadata": {
"id": "3e35336f"
},
"outputs": [],
"source": [
"from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline, set_seed\n",
"\n",
"# Seed the sampler so the synthetic output is reproducible across re-runs\n",
"set_seed(42)\n",
"\n",
"# Load tokenizer and model (LLaMA 3.2 - 3B).\n",
"# NOTE: the Llama 3.2 3B instruct checkpoint is published on the Hub as\n",
"# \"meta-llama/Llama-3.2-3B-Instruct\"; \"Meta-Llama-3-3B-Instruct\" does not exist.\n",
"model_id = \"meta-llama/Llama-3.2-3B-Instruct\"\n",
"tokenizer = AutoTokenizer.from_pretrained(model_id)\n",
"model = AutoModelForCausalLM.from_pretrained(model_id)\n",
"\n",
"# Create a simple text generation pipeline\n",
"generator = pipeline(\"text-generation\", model=model, tokenizer=tokenizer)\n",
"\n",
"# Example synthetic data prompt\n",
"prompt = \"Create a customer support query about a late delivery.\"\n",
"\n",
"# Generate synthetic text. max_new_tokens bounds only the generated\n",
"# continuation; max_length would also count the prompt's tokens and is\n",
"# deprecated for text-generation pipelines.\n",
"output = generator(prompt, max_new_tokens=60, do_sample=True)[0][\"generated_text\"]\n",
"print(\"📝 Synthetic Output:\", output)"
]
},
121+ {
122+ "cell_type" : " markdown" ,
123+ "id" : " ed28cc2e" ,
124+ "metadata" : {
125+ "id" : " ed28cc2e"
126+ },
127+ "source" : [
128+ " ## Output"
129+ ]
130+ },
131+ {
132+ "cell_type" : " markdown" ,
133+ "id" : " 6fc8f19e" ,
134+ "metadata" : {
135+ "id" : " 6fc8f19e"
136+ },
137+ "source" : [
138+ " 🖼️ Output Preview (Text Summary):\n " ,
139+ " \n " ,
140+ " Prompt: \" Create a customer support query about a late delivery.\"\n " ,
141+ " \n " ,
142+ " 📝 Output: The LLaMA model generates a realistic complaint, such as:\n " ,
143+ " \n " ,
144+ " \" Dear Support Team, I placed an order two weeks ago and have yet to receive it...\"\n " ,
145+ " \n " ,
146+ " 🎯 This illustrates how the model can be used to generate realistic synthetic data for tasks like training chatbots or support models.\n "
147+ ]
148+ }
149+ ],
150+ "metadata" : {
151+ "colab" : {
152+ "provenance" : []
153+ }
154+ },
155+ "nbformat" : 4 ,
156+ "nbformat_minor" : 5
157+ }