nerdy-tech-com-gitub
diff --git a/‎recipes/quickstart/Running_Llama2_Anywhere/Running_Llama_on_Mac.ipynb
Lines changed: 0 additions & 219 deletions b/‎recipes/quickstart/Running_Llama2_Anywhere/Running_Llama_on_Mac.ipynb
Lines changed: 0 additions & 219 deletions
diff --git a/‎recipes/quickstart/Running_Llama2_Anywhere/Running_Llama_on_Mac_Windows_Linux.ipynb
Lines changed: 166 additions & 0 deletions b/‎recipes/quickstart/Running_Llama2_Anywhere/Running_Llama_on_Mac_Windows_Linux.ipynb
Lines changed: 166 additions & 0 deletions
@@ -0,0 +1,166 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Running Llama 3 on Mac, Windows or Linux\n",
+    "This notebook goes over how you can set up and run Llama 3 locally on a Mac, Windows or Linux using [Ollama](https://ollama.com/)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Steps at a glance:\n",
+    "1. Download and install Ollama.\n",
+    "2. Download and test run Llama 3.\n",
+    "3. Use local Llama 3 via Python.\n",
+    "4. Use local Llama 3 via LangChain.\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### 1. Download and install Ollama\n",
+    "\n",
+    "On Mac or Windows, go to the Ollama download page [here](https://ollama.com/download) and select your platform to download it, then double click the downloaded file to install Ollama.\n",
+    "\n",
+    "On Linux, you can simply run on a terminal `curl -fsSL https://ollama.com/install.sh | sh` to download and install Ollama."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### 2. Download and test run Llama 3\n",
+    "\n",
+    "On a terminal or console, run `ollama pull llama3` to download the Llama 3 8b chat model, in the 4-bit quantized format with size about 4.7 GB.\n",
+    "\n",
+    "Run `ollama pull llama3:70b` to download the Llama 3 70b chat model, also in the 4-bit quantized format with size 39GB.\n",
+    "\n",
+    "Then you can run `ollama run llama3` and ask Llama 3 questions such as \"who wrote the book godfather?\" or \"who wrote the book godfather? answer in one sentence.\" You can also try `ollama run llama3:70b`, but the inference speed will most likely be too slow - for example, on an Apple M1 Pro with 32GB RAM, it takes over 10 seconds to generate one token (vs over 10 tokens per second with Llama 3 7b chat).\n",
+    "\n",
+    "You can also run the following command to test Llama 3 (7b chat):\n",
+    "```\n",
+    " curl http://localhost:11434/api/chat -d '{\n",
+    "  \"model\": \"llama3\",\n",
+    "  \"messages\": [\n",
+    "    {\n",
+    "      \"role\": \"user\",\n",
+    "      \"content\": \"who wrote the book godfather?\"\n",
+    "    }\n",
+    "  ],\n",
+    "  \"stream\": false\n",
+    "}'\n",
+    "```\n",
+    "\n",
+    "The complete Ollama API doc is [here](https://github.com/ollama/ollama/blob/main/docs/api.md)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### 3. Use local Llama 3 via Python\n",
+    "\n",
+    "The Python code below is the port of the curl command above."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import requests\n",
+    "import json\n",
+    "\n",
+    "url = \"http://localhost:11434/api/chat\"\n",
+    "\n",
+    "def llama3(prompt):\n",
+    "    data = {\n",
+    "        \"model\": \"llama3\",\n",
+    "        \"messages\": [\n",
+    "            {\n",
+    "              \"role\": \"user\",\n",
+    "              \"content\": prompt\n",
+    "            }\n",
+    "        ],\n",
+    "        \"stream\": False\n",
+    "    }\n",
+    "    \n",
+    "    headers = {\n",
+    "        'Content-Type': 'application/json'\n",
+    "    }\n",
+    "    \n",
+    "    response = requests.post(url, headers=headers, json=data)\n",
+    "    \n",
+    "    return(response.json()['message']['content'])"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "response = llama3(\"who wrote the book godfather\")\n",
+    "print(response)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### 4. Use local Llama 3 via LangChain\n",
+    "\n",
+    "Code below use LangChain with Ollama to query Llama 3 running locally. For a more advanced example of using local Llama 3 with LangChain and agent-powered RAG, see [this](https://github.com/langchain-ai/langgraph/blob/main/examples/rag/langgraph_rag_agent_llama3_local.ipynb)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!pip install langchain"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain_community.chat_models import ChatOllama\n",
+    "\n",
+    "llm = ChatOllama(model=\"llama3\", temperature=0)\n",
+    "response = llm.invoke(\"who wrote the book godfather?\")\n",
+    "print(response.content)\n"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.9"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}