Add Gemma3-270M model export example colab.

sirakiin · copybara-github · commit fb75534d3b79 · 2025-09-22T13:48:23.000-07:00
PiperOrigin-RevId: 810140711
diff --git a/ai_edge_torch/generative/colabs/Gemma3_270M_convertion.ipynb b/ai_edge_torch/generative/colabs/Gemma3_270M_convertion.ipynb
@@ -0,0 +1,233 @@
+{
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "1wRAjEfxE-rV"
+      },
+      "source": [
+        "##### Copyright 2025 The AI Edge Torch Authors."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "cellView": "form",
+        "id": "qG29JvSAGKht"
+      },
+      "outputs": [],
+      "source": [
+        "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
+        "# you may not use this file except in compliance with the License.\n",
+        "# You may obtain a copy of the License at\n",
+        "#\n",
+        "# https://www.apache.org/licenses/LICENSE-2.0\n",
+        "#\n",
+        "# Unless required by applicable law or agreed to in writing, software\n",
+        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
+        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
+        "# See the License for the specific language governing permissions and\n",
+        "# limitations under the License."
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "O4F5iVzh--fg"
+      },
+      "source": [
+        "# Exporting Gemma3 270M with AI Edge Torch\n",
+        "\n",
+        "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/google-ai-edge/ai-edge-torch/blob/main/ai_edge_torch/generative/colabs/Gemma3_270M_convertion.ipynb)\n",
+        "\n",
+        "This colab shows how to export a Gemma-3-270M model to the LiteRT-LM format with AI Edge Torch.\n",
+        "\n",
+        "It works with the base Gemma-3-270M-it model and with models fine-tuned from it. For the latter, check out the [Full Model Fine-Tune using Hugging Face Transformers](https://ai.google.dev/gemma/docs/core/huggingface_text_full_finetune) tutorial.\n",
+        "\n",
+        "# Prerequisites for exporting google/gemma-3-270m-it\n",
+        "\n",
+        "- Create a Hugging Face token with permission to access\n",
+        "  - google/gemma-3-270m-it\n",
+        "\n",
+        "  This is needed to download the checkpoint and tokenizer.\n",
+        "\n",
+        "- Open Colab Secrets: In your Google Colab notebook, locate the Secrets icon in the left-hand sidebar and click on it.\n",
+        "- Add a new secret: Click the \"Add Secret\" button.\n",
+        "- Name your secret: Enter \"HF_TOKEN\" in the \"Name\" field.\n",
+        "- Paste your token: In the \"Value\" field, paste the Hugging Face token you created.\n",
+        "\n",
+        "# Prerequisites for exporting a fine-tuned model\n",
+        "\n",
+        "- Access to the fine-tuned repo on the Hugging Face Hub, or\n",
+        "\n",
+        "- Access to the fine-tuned checkpoint\n",
+        "\n",
+        "\n",
+        "## Note: When running notebooks in this repository with Google Colab, some users may see the following warning message:\n",
+        "\n",
+        "![Colab warning](https://github.com/google-ai-edge/ai-edge-torch/blob/main/docs/data/colab_warning.jpg?raw=true)\n",
+        "\n",
+        "If you see it, click `Restart Session` and run the notebook again.\n",
+        "\n",
+        "\n",
+        "This colab works with a free-tier Colab runtime.\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "Stdvqj8A-5sj"
+      },
+      "outputs": [],
+      "source": [
+        "# @title Install dependencies and environment setup\n",
+        "\n",
+        "!pip install ai-edge-litert-nightly==2.0.2.dev20250917\n",
+        "!pip uninstall -y tensorflow\n",
+        "!pip install ai-edge-torch-nightly==0.7.0.dev20250920  --force-reinstall"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "H-XUP1wA_oT5"
+      },
+      "outputs": [],
+      "source": [
+        "# Set up Hugging Face Hub credentials\n",
+        "\n",
+        "import os\n",
+        "from google.colab import userdata\n",
+        "os.environ[\"HF_TOKEN\"] = userdata.get('HF_TOKEN')"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "5gT64EZu_28k"
+      },
+      "outputs": [],
+      "source": [
+        "# @title Import needed packages.\n",
+        "from huggingface_hub import snapshot_download\n",
+        "from ai_edge_torch.generative.examples.gemma3 import gemma3\n",
+        "from ai_edge_torch.generative.utilities import converter\n",
+        "from ai_edge_torch.generative.utilities.export_config import ExportConfig\n",
+        "from ai_edge_torch.generative.layers import kv_cache"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "7T3G3Uj6MVk0"
+      },
+      "source": [
+        "# Exporting the checkpoint to LiteRT-LM format.\n",
+        "\n",
+        "This example uses the google/gemma-3-270m-it repo directly, but you can replace it with your fine-tuned model directory or repo ID.\n",
+        "\n",
+        "If you followed the fine-tuning colab and stored your checkpoint in Google Drive (the default setup), you can point to that checkpoint with the following instead of downloading the base checkpoint.\n",
+        "\n",
+        "```\n",
+        "from google.colab import drive\n",
+        "drive.mount('/content/drive')\n",
+        "checkpoint_dir = '/content/drive/MyDrive/MyGemmaNPC'\n",
+        "```\n"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "l8DSwaSq_8Er"
+      },
+      "outputs": [],
+      "source": [
+        "# @title Download checkpoint\n",
+        "\n",
+        "checkpoint_dir = snapshot_download('google/gemma-3-270m-it')\n"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "trGuvI-bAM1j"
+      },
+      "source": [
+        "# Convert to LiteRT-LM format\n",
+        "\n",
+        "After running the following cell, you can download the exported `.litertlm` file from `/content/`, which is accessible from the `Files` panel."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "waCdEPbcADAm"
+      },
+      "outputs": [],
+      "source": [
+        "output_dir = '/content/'\n",
+        "\n",
+        "# Import the weights and build the PyTorch model\n",
+        "pytorch_model = gemma3.build_model_270m(checkpoint_dir)\n",
+        "\n",
+        "# Set up the export configuration and parameters for text generation models.\n",
+        "export_config = ExportConfig()\n",
+        "export_config.kvcache_layout = kv_cache.KV_LAYOUT_TRANSPOSED\n",
+        "export_config.mask_as_input = True\n",
+        "\n",
+        "# Configs specific to the LiteRT-LM output: tokenizer, special token IDs, and prompt templates.\n",
+        "litertlm_config = {\n",
+        "    \"tokenizer_model_path\": os.path.join(checkpoint_dir, 'tokenizer.model'),\n",
+        "    \"start_token_id\": 2,  # \"\u003cbos\u003e\"\n",
+        "    \"stop_token_ids\": [1, 106],  # [\"\u003ceos\u003e\", \"\u003cend_of_turn\u003e\"]\n",
+        "    \"prompt_prefix\": \"\u003cstart_of_turn\u003euser\\n\",\n",
+        "    \"prompt_suffix\": \"\u003cend_of_turn\u003e\\n\u003cstart_of_turn\u003emodel\\n\",\n",
+        "    \"model_prompt_prefix\": \"\u003cstart_of_turn\u003emodel\\n\",\n",
+        "    \"model_prompt_suffix\": \"\u003cend_of_turn\u003e\\n\",\n",
+        "    \"user_prompt_prefix\": \"\u003cstart_of_turn\u003euser\\n\",\n",
+        "    \"user_prompt_suffix\": \"\u003cend_of_turn\u003e\\n\",\n",
+        "    \"output_format\": \"litertlm\",\n",
+        "}\n",
+        "\n",
+        "# Convert to LiteRT or LiteRT-LM Format\n",
+        "converter.convert_to_litert(\n",
+        "    pytorch_model,\n",
+        "    output_path=output_dir,\n",
+        "    output_name_prefix=\"gemma\",\n",
+        "    prefill_seq_len=2048,\n",
+        "    kv_cache_max_len=4096,\n",
+        "    quantize=\"dynamic_int8\",\n",
+        "    export_config=export_config,\n",
+        "    **litertlm_config\n",
+        ")\n"
+      ]
+    }
+  ],
+  "metadata": {
+    "colab": {
+      "private_outputs": true,
+      "provenance": [
+        {
+          "file_id": "1P33SZyxx2s_k8INd5dYEnKsBaJVTTGoF",
+          "timestamp": 1758487391162
+        }
+      ],
+      "toc_visible": true
+    },
+    "kernelspec": {
+      "display_name": "Python 3",
+      "name": "python3"
+    },
+    "language_info": {
+      "name": "python"
+    }
+  },
+  "nbformat": 4,
+  "nbformat_minor": 0
+}