|
7 | 7 | "source": [
|
8 | 8 | "<a href=\"https://colab.research.google.com/github/meta-llama/llama-recipes/blob/main/recipes/quickstart/Prompt_Engineering_with_Llama_3.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
|
9 | 9 | "\n",
|
10 |
| - "# Prompt Engineering with Llama 3\n", |
| 10 | + "# Prompt Engineering with Llama 3.1\n", |
11 | 11 | "\n",
|
12 | 12 | "Prompt engineering is using natural language to produce a desired response from a large language model (LLM).\n",
|
13 | 13 | "\n",
|
14 |
| - "This interactive guide covers prompt engineering & best practices with Llama 3." |
| 14 | + "This interactive guide covers prompt engineering & best practices with Llama 3.1." |
15 | 15 | ]
|
16 | 16 | },
|
17 | 17 | {
|
|
45 | 45 | "\n",
|
46 | 46 | "Llama models come in varying parameter sizes. The smaller models are cheaper to deploy and run; the larger models are more capable.\n",
|
47 | 47 | "\n",
|
| 48 | + "#### Llama 3.1\n", |
| 49 | + "1. `llama-3.1-8b` - base pretrained 8 billion parameter model\n", |
| 50 | + "1. `llama-3.1-70b` - base pretrained 70 billion parameter model\n", |
| 51 | + "1. `llama-3.1-405b` - base pretrained 405 billion parameter model\n", |
| 52 | + "1. `llama-3.1-8b-instruct` - instruction fine-tuned 8 billion parameter model\n", |
| 53 | + "1. `llama-3.1-70b-instruct` - instruction fine-tuned 70 billion parameter model\n", |
| 54 | + "1. `llama-3.1-405b-instruct` - instruction fine-tuned 405 billion parameter model (flagship)\n", |
| 55 | + "\n", |
| 56 | + "\n", |
48 | 57 | "#### Llama 3\n",
|
49 | 58 | "1. `llama-3-8b` - base pretrained 8 billion parameter model\n",
|
50 | 59 | "1. `llama-3-70b` - base pretrained 70 billion parameter model\n",
|
|
133 | 142 | "\n",
|
134 | 143 | "Tokens matter most when you consider API pricing and internal behavior (ex. hyperparameters).\n",
|
135 | 144 | "\n",
|
136 |
| - "Each model has a maximum context length that your prompt cannot exceed. That's 8K tokens for Llama 3, 4K for Llama 2, and 100K for Code Llama. \n" |
| 145 | + "Each model has a maximum context length that your prompt cannot exceed. That's 128k tokens for Llama 3.1, 4K for Llama 2, and 100K for Code Llama.\n" |
137 | 146 | ]
|
138 | 147 | },
|
139 | 148 | {
|
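Because prompts are both billed and bounded in tokens, it can help to count tokens before sending a request. Below is a minimal sketch using the Hugging Face `transformers` tokenizer; the model ID and gated-access requirement are assumptions, and any Llama-compatible tokenizer would work the same way.

```python
# Sketch: counting tokens to check a prompt against the context window.
# Assumes `pip install transformers` and access to the gated Llama 3.1
# tokenizer on Hugging Face; substitute any compatible tokenizer.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")

MAX_CONTEXT_TOKENS = 128_000  # Llama 3.1 context length

prompt = "Explain prompt engineering in one paragraph."
n_tokens = len(tokenizer.encode(prompt))
print(f"{n_tokens} tokens; fits in context: {n_tokens <= MAX_CONTEXT_TOKENS}")
```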
|
143 | 152 | "source": [
|
144 | 153 | "## Notebook Setup\n",
|
145 | 154 | "\n",
|
146 |
| - "The following APIs will be used to call LLMs throughout the guide. As an example, we'll call Llama 3 chat using [Grok](https://console.groq.com/playground?model=llama3-70b-8192).\n", |
| 155 | + "The following APIs will be used to call LLMs throughout the guide. As an example, we'll call Llama 3.1 chat using [Grok](https://console.groq.com/playground?model=llama3-70b-8192).\n", |
147 | 156 | "\n",
|
148 | 157 | "To install prerequisites run:"
|
149 | 158 | ]
|
|
171 | 180 | "# Get a free API key from https://console.groq.com/keys\n",
|
172 | 181 | "os.environ[\"GROQ_API_KEY\"] = \"YOUR_GROQ_API_KEY\"\n",
|
173 | 182 | "\n",
|
174 |
| - "LLAMA3_70B_INSTRUCT = \"llama3-70b-8192\"\n", |
175 |
| - "LLAMA3_8B_INSTRUCT = \"llama3-8b-8192\"\n", |
| 183 | + "LLAMA3_405B_INSTRUCT = \"llama-3.1-405b-reasoning\" # Note: Groq currently only gives access here to paying customers for 405B model\n", |
| 184 | + "LLAMA3_70B_INSTRUCT = \"llama-3.1-70b-versatile\"\n", |
| 185 | + "LLAMA3_8B_INSTRUCT = \"llama3.1-8b-instant\"\n", |
176 | 186 | "\n",
|
177 | 187 | "DEFAULT_MODEL = LLAMA3_70B_INSTRUCT\n",
|
178 | 188 | "\n",
|
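With the key and model names set, a raw chat call through the Groq SDK looks roughly like this. A minimal sketch, assuming the `groq` package installed above and the `DEFAULT_MODEL` constant from the setup cell:

```python
# Sketch: one chat completion through the Groq SDK.
from groq import Groq

client = Groq()  # picks up GROQ_API_KEY from the environment

response = client.chat.completions.create(
    model=DEFAULT_MODEL,
    messages=[{"role": "user", "content": "Say hello in exactly one word."}],
)
print(response.choices[0].message.content)
```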
|
225 | 235 | "source": [
|
226 | 236 | "### Completion APIs\n",
|
227 | 237 | "\n",
|
228 |
| - "Let's try Llama 3!" |
| 238 | + "Let's try Llama 3.1!" |
229 | 239 | ]
|
230 | 240 | },
|
231 | 241 | {
|
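The guide's helper cells aren't included in this hunk; as a rough sketch, a completion helper over the Groq client might look like the following. The names `completion` and `complete_and_print` and their defaults are illustrative assumptions, not necessarily the notebook's actual definitions.

```python
# Sketch of completion helpers; names and defaults are illustrative.
from groq import Groq

client = Groq()  # assumes GROQ_API_KEY is set in the environment

def completion(prompt: str, model: str = DEFAULT_MODEL,
               temperature: float = 0.6, top_p: float = 0.9) -> str:
    # Wrap a single-turn chat call so the rest of the guide can
    # treat it like a plain text-completion API.
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
        top_p=top_p,
    )
    return response.choices[0].message.content

def complete_and_print(prompt: str, model: str = DEFAULT_MODEL) -> None:
    print(f"==============\n{prompt}\n==============")
    print(completion(prompt, model), "\n")

complete_and_print("The typical color of the sky is: ")
```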
|
488 | 498 | "\n",
|
489 | 499 | "Simply adding a phrase encouraging step-by-step thinking \"significantly improves the ability of large language models to perform complex reasoning\" ([Wei et al. (2022)](https://arxiv.org/abs/2201.11903)). This technique is called \"CoT\" or \"Chain-of-Thought\" prompting.\n",
|
490 | 500 | "\n",
|
491 |
| - "Llama 3 now reasons step-by-step naturally without the addition of the phrase. This section remains for completeness." |
| 501 | + "Llama 3.1 now reasons step-by-step naturally without the addition of the phrase. This section remains for completeness." |
492 | 502 | ]
|
493 | 503 | },
|
494 | 504 | {
|
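For example, contrasting a bare question with one carrying an explicit step-by-step cue makes the effect easy to see (using the hypothetical `complete_and_print` helper sketched earlier):

```python
# Sketch: the same question with and without a chain-of-thought cue.
prompt = "Who lived longer, Mozart or Elvis?"

complete_and_print(prompt)
complete_and_print(f"{prompt} Let's think through this carefully, step by step.")
```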
|
704 | 714 | "source": [
|
705 | 715 | "### Limiting Extraneous Tokens\n",
|
706 | 716 | "\n",
|
707 |
| - "A common struggle with Llama 2 is getting output without extraneous tokens (ex. \"Sure! Here's more information on...\"), even if explicit instructions are given to Llama 2 to be concise and no preamble. Llama 3 can better follow instructions.\n", |
| 717 | + "A common struggle with Llama 2 is getting output without extraneous tokens (ex. \"Sure! Here's more information on...\"), even if explicit instructions are given to Llama 2 to be concise and no preamble. Llama 3.x can better follow instructions.\n", |
708 | 718 | "\n",
|
709 | 719 | "Check out this improvement that combines a role, rules and restrictions, explicit instructions, and an example:"
|
710 | 720 | ]
|
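The improved cell itself isn't included in this hunk; a sketch in the same spirit, again via the hypothetical `complete_and_print` helper, might look like:

```python
# Sketch: role + rules + explicit instructions + a worked example,
# to suppress preamble and force a strict output format.
complete_and_print(
    """
    You are a robot that only outputs JSON.
    You reply in JSON format with the field 'zip_code'.
    Example question: What is the zip code of the Empire State Building?
    Example answer: {'zip_code': 10118}
    Now here is my question: What is the zip code of Menlo Park?
    """
)
```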
|