|
102 | 102 | "\n", |
103 | 103 | "After you've completed the Gemma setup, move on to the next section, where you'll set environment variables for your Colab environment.\n", |
104 | 104 | "\n", |
105 | | - "### Set environment variables\n", |
| 105 | + "### 2. Set environment variables\n", |
106 | 106 | "\n", |
107 | 107 | "Set environment variables for `KAGGLE_USERNAME` and `KAGGLE_KEY`. When prompted with the \"Grant access?\" messages, agree to provide secret access." |
108 | 108 | ] |
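A minimal sketch of this step, assuming the Kaggle credentials are stored as Colab secrets named `KAGGLE_USERNAME` and `KAGGLE_KEY` and read with the `google.colab.userdata` helper after access is granted:

```python
import os
from google.colab import userdata  # Colab-only helper for reading notebook secrets

# Copy the Kaggle secrets into environment variables so the Kaggle tooling can find them.
os.environ["KAGGLE_USERNAME"] = userdata.get("KAGGLE_USERNAME")
os.environ["KAGGLE_KEY"] = userdata.get("KAGGLE_KEY")
```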
|
128 | 128 | "id": "m1UE1CEnE9ql" |
129 | 129 | }, |
130 | 130 | "source": [ |
131 | | - "### 2. Install the `gemma` library\n", |
| 131 | + "### 3. Install the `gemma` library\n", |
132 | 132 | "\n", |
133 | 133 | "Free Colab hardware acceleration is currently *insufficient* to run this notebook. If you are using [Colab Pay As You Go or Colab Pro](https://colab.research.google.com/signup), click on **Edit** > **Notebook settings** > Select **A100 GPU** > **Save** to enable hardware acceleration.\n", |
134 | 134 | "\n", |
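The install step itself is typically a single `pip` call straight from the GitHub repository; the exact pin or commit used by the notebook may differ, so treat this as a sketch:

```python
# Install the gemma library from source (the pinned commit/tag may vary).
!pip install -q git+https://github.com/google-deepmind/gemma.git
```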
|
170 | 170 | "id": "-mRkkT-iPYoq" |
171 | 171 | }, |
172 | 172 | "source": [ |
173 | | - "### 3. Import libraries\n", |
| 173 | + "### 4. Import libraries\n", |
174 | 174 | "\n", |
175 | 175 | "This notebook uses [Flax](https://flax.readthedocs.io) (for neural networks), core [JAX](https://jax.readthedocs.io), [SentencePiece](https://github.com/google/sentencepiece) (for tokenization), [Chex](https://chex.readthedocs.io/en/latest/) (a library of utilities for writing reliable JAX code), and TensorFlow Datasets." |
176 | 176 | ] |
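A sketch of the corresponding imports (the alias names are illustrative, not prescribed by the notebook):

```python
import chex                         # utilities for writing reliable JAX code
import jax
import jax.numpy as jnp
import flax.linen as nn             # Flax neural-network API
import sentencepiece as spm         # tokenization
import tensorflow_datasets as tfds  # training data
```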
|
912 | 912 | "source": [ |
913 | 913 | "## Configure the model\n", |
914 | 914 | "\n", |
915 | | - "Before you begin fine-tuning the Gemma model, configure it as follows:\n", |
| 915 | + "Before you begin fine-tuning the Gemma model, you need to configure it.\n", |
916 | 916 | "\n", |
917 | | - "Load and format the Gemma model checkpoint with the [`gemma.params`](https://github.com/google-deepmind/gemma/blob/main/gemma/params.py) method:" |
| 917 | + "First, load and format the Gemma model checkpoint with the [`gemma.params.load_and_format_params`](https://github.com/google-deepmind/gemma/blob/c6bd156c246530e1620a7c62de98542a377e3934/gemma/params.py#L27) method:" |
918 | 918 | ] |
919 | 919 | }, |
920 | 920 | { |
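As a sketch of the loading step (`ckpt_path` is a placeholder for wherever the Kaggle checkpoint download landed):

```python
from gemma import params as params_lib

# ckpt_path is a placeholder for the downloaded Gemma checkpoint directory.
params = params_lib.load_and_format_params(ckpt_path)
```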
|
934 | 934 | "id": "BtJhJkkZzsy1" |
935 | 935 | }, |
936 | 936 | "source": [ |
937 | | - "To automatically load the correct configuration from the Gemma model checkpoint, use [`gemma.transformer.TransformerConfig`](https://github.com/google-deepmind/gemma/blob/56e501ce147af4ea5c23cc0ddf5a9c4a6b7bd0d0/gemma/transformer.py#L65). The `cache_size` argument is the number of time steps in the Gemma `transformer` cache. Afterwards, instantiate the Gemma model as `transformer` with [`gemma.transformer.Transformer`](https://github.com/google-deepmind/gemma/blob/56e501ce147af4ea5c23cc0ddf5a9c4a6b7bd0d0/gemma/transformer.py#L136) (which inherits from [`flax.linen.Module`](https://flax.readthedocs.io/en/latest/api_reference/flax.linen/module.html).\n", |
| 937 | + "To automatically load the correct configuration from the Gemma model checkpoint, use [`gemma.transformer.TransformerConfig`](https://github.com/google-deepmind/gemma/blob/56e501ce147af4ea5c23cc0ddf5a9c4a6b7bd0d0/gemma/transformer.py#L65). The `cache_size` argument is the number of time steps in the Gemma `Transformer` cache. Afterwards, instantiate the Gemma model as `model_2b` with [`gemma.transformer.Transformer`](https://github.com/google-deepmind/gemma/blob/56e501ce147af4ea5c23cc0ddf5a9c4a6b7bd0d0/gemma/transformer.py#L136) (which inherits from [`flax.linen.Module`](https://flax.readthedocs.io/en/latest/api_reference/flax.linen/module.html)).\n", |
938 | 938 | "\n", |
939 | 939 | "**Note:** The vocabulary size is smaller than the number of input embeddings because of unused tokens in the current Gemma release." |
940 | 940 | ] |
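A sketch of the configuration and instantiation described above, assuming the `TransformerConfig.from_params` constructor in the linked `transformer.py`; the `cache_size` value is illustrative:

```python
from gemma import transformer as transformer_lib

# Derive the model configuration from the loaded checkpoint params;
# cache_size is the number of time steps kept in the Transformer cache.
config_2b = transformer_lib.TransformerConfig.from_params(params, cache_size=1024)

# Instantiate the Gemma model (a flax.linen.Module) as model_2b.
model_2b = transformer_lib.Transformer(config=config_2b)
```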
|
1375 | 1375 | "source": [ |
1376 | 1376 | "## Learn more\n", |
1377 | 1377 | "\n", |
1378 | | - "- You can learn more about the Google DeepMind [`gemma` library on GitHub](https://github.com/google-deepmind/gemma), which contains docstrings of methods you used in this tutorial, such as [`gemma.params`](https://github.com/google-deepmind/gemma/blob/main/gemma/params.py),\n", |
| 1378 | + "- You can learn more about the Google DeepMind [`gemma` library on GitHub](https://github.com/google-deepmind/gemma), which contains docstrings of modules you used in this tutorial, such as [`gemma.params`](https://github.com/google-deepmind/gemma/blob/main/gemma/params.py),\n", |
1379 | 1379 | "[`gemma.transformer`](https://github.com/google-deepmind/gemma/blob/main/gemma/transformer.py), and\n", |
1380 | 1380 | "[`gemma.sampler`](https://github.com/google-deepmind/gemma/blob/main/gemma/sampler.py).\n", |
1381 | 1381 | "- The following libraries have their own documentation sites: [core JAX](https://jax.readthedocs.io), [Flax](https://flax.readthedocs.io), [Chex](https://chex.readthedocs.io/en/latest/), [Optax](https://optax.readthedocs.io/en/latest/), and [Orbax](https://orbax.readthedocs.io/).\n", |
|