
Commit 0e09458

updated Running_Llama_on_Mac and HelloLlamaCloud.ipynb for Llama 3 (meta-llama#459)
2 parents e9d1ba9 + b1aad34 commit 0e09458

File tree

5 files changed: +228, -664 lines changed


recipes/quickstart/Running_Llama2_Anywhere/Running_Llama_on_Mac.ipynb

Lines changed: 0 additions & 219 deletions
This file was deleted.
Lines changed: 166 additions & 0 deletions (new file)
@@ -0,0 +1,166 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Running Llama 3 on Mac, Windows or Linux\n",
    "This notebook goes over how you can set up and run Llama 3 locally on a Mac, Windows or Linux using [Ollama](https://ollama.com/)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Steps at a glance:\n",
    "1. Download and install Ollama.\n",
    "2. Download and test run Llama 3.\n",
    "3. Use local Llama 3 via Python.\n",
    "4. Use local Llama 3 via LangChain.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 1. Download and install Ollama\n",
    "\n",
    "On Mac or Windows, go to the Ollama download page [here](https://ollama.com/download), select your platform to download the installer, then double-click the downloaded file to install Ollama.\n",
    "\n",
    "On Linux, simply run `curl -fsSL https://ollama.com/install.sh | sh` in a terminal to download and install Ollama."
   ]
  },
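  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Optionally, you can confirm the install before moving on. The quick check below is a minimal sketch: it assumes the Ollama server is already running locally on its default port 11434.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Print the installed Ollama version, then ping the local server\n",
    "# (assumes the default endpoint http://localhost:11434).\n",
    "!ollama --version\n",
    "!curl http://localhost:11434"
   ]
  },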
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 2. Download and test run Llama 3\n",
    "\n",
    "On a terminal or console, run `ollama pull llama3` to download the Llama 3 8B chat model in 4-bit quantized format (about 4.7 GB).\n",
    "\n",
    "Run `ollama pull llama3:70b` to download the Llama 3 70B chat model, also in 4-bit quantized format (about 39 GB).\n",
    "\n",
    "Then you can run `ollama run llama3` and ask Llama 3 questions such as \"who wrote the book godfather?\" or \"who wrote the book godfather? answer in one sentence.\" You can also try `ollama run llama3:70b`, but the inference speed will most likely be too slow - for example, on an Apple M1 Pro with 32GB RAM, it takes over 10 seconds to generate one token (vs over 10 tokens per second with the Llama 3 8B chat model).\n",
    "\n",
    "You can also run the following command to test the Llama 3 8B chat model:\n",
    "```\n",
    "curl http://localhost:11434/api/chat -d '{\n",
    "    \"model\": \"llama3\",\n",
    "    \"messages\": [\n",
    "        {\n",
    "            \"role\": \"user\",\n",
    "            \"content\": \"who wrote the book godfather?\"\n",
    "        }\n",
    "    ],\n",
    "    \"stream\": false\n",
    "}'\n",
    "```\n",
    "\n",
    "The complete Ollama API doc is [here](https://github.com/ollama/ollama/blob/main/docs/api.md)."
   ]
  },
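  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick sanity check, the snippet below lists the models you have pulled so far by calling Ollama's `/api/tags` endpoint (see the API doc linked above). It is a small sketch that assumes the server is running on the default `http://localhost:11434`.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import requests\n",
    "\n",
    "# List locally available models; each entry includes the model name and size in bytes.\n",
    "models = requests.get(\"http://localhost:11434/api/tags\").json()\n",
    "for m in models[\"models\"]:\n",
    "    print(m[\"name\"], m[\"size\"])"
   ]
  },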
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 3. Use local Llama 3 via Python\n",
    "\n",
    "The Python code below is a port of the curl command above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import requests\n",
    "import json\n",
    "\n",
    "url = \"http://localhost:11434/api/chat\"\n",
    "\n",
    "def llama3(prompt):\n",
    "    # Build a single-turn chat request; stream=False returns the full reply at once.\n",
    "    data = {\n",
    "        \"model\": \"llama3\",\n",
    "        \"messages\": [\n",
    "            {\n",
    "                \"role\": \"user\",\n",
    "                \"content\": prompt\n",
    "            }\n",
    "        ],\n",
    "        \"stream\": False\n",
    "    }\n",
    "\n",
    "    headers = {\n",
    "        'Content-Type': 'application/json'\n",
    "    }\n",
    "\n",
    "    # POST to the local Ollama chat endpoint and return the assistant's reply text.\n",
    "    response = requests.post(url, headers=headers, json=data)\n",
    "\n",
    "    return response.json()['message']['content']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "response = llama3(\"who wrote the book godfather\")\n",
    "print(response)"
   ]
  },
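  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Because `/api/chat` accepts the full message history, you can also hold a multi-turn conversation by appending each exchange to the `messages` list. The helper below is a minimal sketch along those lines, reusing the `requests` import and `url` defined above; the `llama3_chat` name and the follow-up question are just for illustration.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def llama3_chat(messages):\n",
    "    # Send the accumulated conversation history in a single non-streaming request.\n",
    "    data = {\"model\": \"llama3\", \"messages\": messages, \"stream\": False}\n",
    "    response = requests.post(url, headers={\"Content-Type\": \"application/json\"}, json=data)\n",
    "    return response.json()[\"message\"][\"content\"]\n",
    "\n",
    "history = [{\"role\": \"user\", \"content\": \"who wrote the book godfather?\"}]\n",
    "answer = llama3_chat(history)\n",
    "print(answer)\n",
    "\n",
    "# Append the assistant's reply and a follow-up question, then ask again.\n",
    "history.append({\"role\": \"assistant\", \"content\": answer})\n",
    "history.append({\"role\": \"user\", \"content\": \"when was it published?\"})\n",
    "print(llama3_chat(history))"
   ]
  },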
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### 4. Use local Llama 3 via LangChain\n",
    "\n",
    "The code below uses LangChain with Ollama to query Llama 3 running locally. For a more advanced example of using local Llama 3 with LangChain and agent-powered RAG, see [this notebook](https://github.com/langchain-ai/langgraph/blob/main/examples/rag/langgraph_rag_agent_llama3_local.ipynb)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install langchain"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain_community.chat_models import ChatOllama\n",
    "\n",
    "llm = ChatOllama(model=\"llama3\", temperature=0)\n",
    "response = llm.invoke(\"who wrote the book godfather?\")\n",
    "print(response.content)\n"
   ]
  },
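  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "LangChain chat models also expose a `stream` method, so you can print tokens as they are generated instead of waiting for the full reply. The cell below is a small sketch of that usage with the same `ChatOllama` instance defined above.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Stream the response and print each chunk as it arrives.\n",
    "for chunk in llm.stream(\"who wrote the book godfather? answer in one sentence.\"):\n",
    "    print(chunk.content, end=\"\", flush=True)"
   ]
  }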
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
