
Commit d24ea27

replace groq llama 2 with replicate (meta-llama#546)
2 parents f3a5388 + b1939b1 commit d24ea27

File tree

1 file changed: +110 additions, -79 deletions


recipes/quickstart/Getting_to_know_Llama.ipynb

Lines changed: 110 additions & 79 deletions
@@ -196,7 +196,7 @@
196196
"### **1.1 - What is Llama 3?**\n",
197197
"\n",
198198
"* State of the art (SOTA), Open Source LLM\n",
199-
"* 8B, 70B\n",
199+
"* 8B, 70B - base and instruct models\n",
200200
"* Choosing model: Size, Quality, Cost, Speed\n",
201201
"* Pretrained + Chat\n",
202202
"* [Meta Llama 3 Blog](https://ai.meta.com/blog/meta-llama-3/)\n",
@@ -275,9 +275,7 @@
275275
"source": [
276276
"## **2 - Using and Comparing Llama 3 and Llama 2**\n",
277277
"\n",
278-
"In this notebook, we will use the Llama 2 70b chat and Llama 3 8b and 70b instruct models hosted on [Groq](https://console.groq.com/). You'll need to first [sign in](https://console.groq.com/) with your github or gmail account, then get an [API token](https://console.groq.com/keys) to try Groq out for free. (Groq runs Llama models very fast and they only support one Llama 2 model: the Llama 2 70b chat).\n",
279-
"\n",
280-
"**Note: You can also use other Llama hosting providers such as [Replicate](https://replicate.com/blog/run-llama-3-with-an-api?input=python), [Togther](https://docs.together.ai/docs/quickstart). Simply click the links here to see how to run `pip install` and use their freel trial API key with example code to modify the following three cells in 2.1 and 2.2.**\n"
278+
"We will be using the Llama 2 7b & 70b chat and Llama 3 8b & 70b instruct models hosted on [Replicate](https://replicate.com/search?query=llama) to run the examples here. You will need to first sign in to Replicate with your GitHub account, then create a free trial API token [here](https://replicate.com/account/api-tokens). You can also use other Llama 3 cloud providers such as [Groq](https://console.groq.com/), [Together](https://api.together.xyz/playground/language/meta-llama/Llama-3-8b-hf), or [Anyscale](https://app.endpoints.anyscale.com/playground).\n"
281279
]
282280
},
283281
{
@@ -297,15 +295,15 @@
297295
},
298296
"outputs": [],
299297
"source": [
300-
"!pip install groq"
298+
"!pip install replicate"
301299
]
302300
},
303301
{
304302
"cell_type": "markdown",
305303
"metadata": {},
306304
"source": [
307305
"### **2.2 - Create helpers for Llama 2 and Llama 3**\n",
308-
"First, set your Groq API token as environment variables.\n"
306+
"First, set your Replicate API token as an environment variable.\n"
309307
]
310308
},
311309
{
@@ -319,16 +317,16 @@
319317
"import os\n",
320318
"from getpass import getpass\n",
321319
"\n",
322-
"GROQ_API_TOKEN = getpass()\n",
320+
"REPLICATE_API_TOKEN = getpass()\n",
323321
"\n",
324-
"os.environ[\"GROQ_API_KEY\"] = GROQ_API_TOKEN"
322+
"os.environ[\"REPLICATE_API_TOKEN\"] = REPLICATE_API_TOKEN"
325323
]
326324
},
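The token cell above can be made re-runnable with a small guard. A minimal sketch (the `ensure_replicate_token` helper name and the already-set check are additions for illustration, not part of the notebook):

```python
import os
from getpass import getpass

def ensure_replicate_token():
    # Prompt for the token only when REPLICATE_API_TOKEN isn't already set;
    # the replicate client reads it from the environment.
    if not os.environ.get("REPLICATE_API_TOKEN"):
        os.environ["REPLICATE_API_TOKEN"] = getpass("REPLICATE_API_TOKEN: ")
    return os.environ["REPLICATE_API_TOKEN"]
```

This avoids re-prompting when the cell is executed twice in one session.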
327325
{
328326
"cell_type": "markdown",
329327
"metadata": {},
330328
"source": [
331-
"Create Llama 2 and Llama 3 helper functions - for chatbot type of apps, we'll use Llama 3 8b/70b instruct models, not the base models."
329+
"Create Llama 2 and Llama 3 helper functions. For chatbot-type apps, we'll use the Llama 3 instruct and Llama 2 chat models, not the base models."
332330
]
333331
},
334332
{
@@ -339,53 +337,35 @@
339337
},
340338
"outputs": [],
341339
"source": [
342-
"from groq import Groq\n",
343-
"\n",
344-
"client = Groq(\n",
345-
" api_key=os.environ.get(\"GROQ_API_KEY\"),\n",
346-
")\n",
340+
"import replicate\n",
347341
"\n",
348-
"def llama2(prompt, temperature=0.0, input_print=True):\n",
349-
" chat_completion = client.chat.completions.create(\n",
350-
" messages=[\n",
351-
" {\n",
352-
" \"role\": \"user\",\n",
353-
" \"content\": prompt,\n",
354-
" }\n",
355-
" ],\n",
356-
" model=\"llama2-70b-4096\",\n",
357-
" temperature=temperature,\n",
358-
" )\n",
342+
"def llama2_7b(prompt):\n",
343+
" output = replicate.run(\n",
344+
" \"meta/llama-2-7b-chat\",\n",
345+
" input={\"prompt\": prompt}\n",
346+
" )\n",
347+
" return ''.join(output)\n",
359348
"\n",
360-
" return (chat_completion.choices[0].message.content)\n",
349+
"def llama2_70b(prompt):\n",
350+
" output = replicate.run(\n",
351+
" \"meta/llama-2-70b-chat\",\n",
352+
" input={\"prompt\": prompt}\n",
353+
" )\n",
354+
" return ''.join(output)\n",
361355
"\n",
362-
"def llama3_8b(prompt, temperature=0.0, input_print=True):\n",
363-
" chat_completion = client.chat.completions.create(\n",
364-
" messages=[\n",
365-
" {\n",
366-
" \"role\": \"user\",\n",
367-
" \"content\": prompt,\n",
368-
" }\n",
369-
" ],\n",
370-
" model=\"llama3-8b-8192\",\n",
371-
" temperature=temperature,\n",
372-
" )\n",
356+
"def llama3_8b(prompt):\n",
357+
" output = replicate.run(\n",
358+
" \"meta/meta-llama-3-8b-instruct\",\n",
359+
" input={\"prompt\": prompt}\n",
360+
" )\n",
361+
" return ''.join(output)\n",
373362
"\n",
374-
" return (chat_completion.choices[0].message.content)\n",
375-
"\n",
376-
"def llama3_70b(prompt, temperature=0.0, input_print=True):\n",
377-
" chat_completion = client.chat.completions.create(\n",
378-
" messages=[\n",
379-
" {\n",
380-
" \"role\": \"user\",\n",
381-
" \"content\": prompt,\n",
382-
" }\n",
383-
" ],\n",
384-
" model=\"llama3-70b-8192\",\n",
385-
" temperature=temperature,\n",
386-
" )\n",
387-
"\n",
388-
" return (chat_completion.choices[0].message.content)"
363+
"def llama3_70b(prompt):\n",
364+
" output = replicate.run(\n",
365+
" \"meta/meta-llama-3-70b-instruct\",\n",
366+
" input={\"prompt\": prompt}\n",
367+
" )\n",
368+
" return ''.join(output)"
389369
]
390370
},
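The four helpers in the new cell differ only in the model id, so they can be generated from one factory. A minimal sketch with an injected `run_fn` so it runs without network access (the `make_llama_helper` name and the `fake_run` stub are illustrative; in the notebook `run_fn` would be `replicate.run`):

```python
def make_llama_helper(model_id, run_fn):
    # Build a prompt -> text helper for one Replicate model id.
    # run_fn is injected so this sketch needs no network; in the
    # notebook it would be replicate.run.
    def helper(prompt):
        # replicate.run yields text chunks; join them into one string
        output = run_fn(model_id, input={"prompt": prompt})
        return ''.join(output)
    return helper

def fake_run(model_id, input):
    # stand-in for replicate.run, used only in this sketch
    return ["model=", model_id, " prompt=", input["prompt"]]

llama3_8b = make_llama_helper("meta/meta-llama-3-8b-instruct", fake_run)
print(llama3_8b("hello"))
```

Swapping `fake_run` for `replicate.run` recovers the behavior of the notebook's helpers.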
391371
{
@@ -406,7 +386,7 @@
406386
"outputs": [],
407387
"source": [
408388
"prompt = \"The typical color of a llama is: \"\n",
409-
"output = llama2(prompt)\n",
389+
"output = llama2_7b(prompt)\n",
410390
"md(output)"
411391
]
412392
},
@@ -420,6 +400,16 @@
420400
"md(output)"
421401
]
422402
},
403+
{
404+
"cell_type": "code",
405+
"execution_count": null,
406+
"metadata": {},
407+
"outputs": [],
408+
"source": [
409+
"output = llama2_7b(\"The typical color of a llama is what? Answer in one word.\")\n",
410+
"md(output)"
411+
]
412+
},
423413
{
424414
"cell_type": "code",
425415
"execution_count": null,
@@ -430,6 +420,13 @@
430420
"md(output)"
431421
]
432422
},
423+
{
424+
"cell_type": "markdown",
425+
"metadata": {},
426+
"source": [
427+
"**Note: Llama 3 follows instructions better than Llama 2 in single-turn chat.**"
428+
]
429+
},
433430
{
434431
"cell_type": "markdown",
435432
"metadata": {
@@ -457,7 +454,7 @@
457454
"outputs": [],
458455
"source": [
459456
"prompt_chat = \"What is the average lifespan of a Llama? Answer the question in few words.\"\n",
460-
"output = llama2(prompt_chat)\n",
457+
"output = llama2_7b(prompt_chat)\n",
461458
"md(output)"
462459
]
463460
},
@@ -483,7 +480,7 @@
483480
"source": [
484481
"# example without previous context. LLM's are stateless and cannot understand \"they\" without previous context\n",
485482
"prompt_chat = \"What animal family are they? Answer the question in few words.\"\n",
486-
"output = llama2(prompt_chat)\n",
483+
"output = llama2_7b(prompt_chat)\n",
487484
"md(output)"
488485
]
489486
},
@@ -497,6 +494,16 @@
497494
"md(output)"
498495
]
499496
},
497+
{
498+
"cell_type": "code",
499+
"execution_count": null,
500+
"metadata": {},
501+
"outputs": [],
502+
"source": [
503+
"output = llama2_70b(prompt_chat)\n",
504+
"md(output)"
505+
]
506+
},
500507
{
501508
"cell_type": "code",
502509
"execution_count": null,
@@ -536,7 +543,7 @@
536543
"Assistant: 15-20 years.\n",
537544
"User: What animal family are they?\n",
538545
"\"\"\"\n",
539-
"output = llama2(prompt_chat)\n",
546+
"output = llama2_7b(prompt_chat)\n",
540547
"md(output)"
541548
]
542549
},
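The multi-turn prompt in the cell above is a flattened `User:`/`Assistant:` transcript. A helper to assemble it from prior turns (a sketch; `build_chat_prompt` is a hypothetical name, not part of the notebook):

```python
def build_chat_prompt(history, user_msg):
    # history: list of (user, assistant) turns, flattened into the
    # User:/Assistant: transcript format used in the cell above
    lines = []
    for user, assistant in history:
        lines.append(f"User: {user}")
        lines.append(f"Assistant: {assistant}")
    lines.append(f"User: {user_msg}")
    return "\n".join(lines) + "\n"

prompt_chat = build_chat_prompt(
    [("What is the average lifespan of a Llama?", "15-20 years.")],
    "What animal family are they?",
)
print(prompt_chat)
```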
@@ -579,7 +586,17 @@
579586
"\n",
580587
"Answer the question with one word.\n",
581588
"\"\"\"\n",
582-
"output = llama2(prompt_chat)\n",
589+
"output = llama2_7b(prompt_chat)\n",
590+
"md(output)"
591+
]
592+
},
593+
{
594+
"cell_type": "code",
595+
"execution_count": null,
596+
"metadata": {},
597+
"outputs": [],
598+
"source": [
599+
"output = llama2_70b(prompt_chat)\n",
583600
"md(output)"
584601
]
585602
},
@@ -597,7 +614,7 @@
597614
"cell_type": "markdown",
598615
"metadata": {},
599616
"source": [
600-
"**Both Llama 3 8b and Llama 2 70b follows instructions (e.g. \"Answer the question with one word\") better than Llama 2 7b.**"
617+
"**Both Llama 3 8b and Llama 2 70b follow instructions (e.g. \"Answer the question with one word\") better than Llama 2 7b in multi-turn chat.**"
601618
]
602619
},
603620
{
@@ -640,7 +657,7 @@
640657
"\n",
641658
"Give one word response.\n",
642659
"'''\n",
643-
"output = llama2(prompt)\n",
660+
"output = llama2_7b(prompt)\n",
644661
"md(output)"
645662
]
646663
},
@@ -684,7 +701,7 @@
684701
"Give one word response.\n",
685702
"'''\n",
686703
"\n",
687-
"output = llama2(prompt)\n",
704+
"output = llama2_7b(prompt)\n",
688705
"md(output)"
689706
]
690707
},
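The few-shot cells above repeat the same `Message:`/`Sentiment:` layout, so the prompt can be built programmatically. A sketch assuming that layout (`few_shot_prompt` is a hypothetical helper, not in the notebook):

```python
def few_shot_prompt(examples, query, instruction="Give one word response."):
    # examples: (message, sentiment) pairs shown before the new query,
    # mirroring the few-shot sentiment cells above
    shots = "\n".join(f"Message: {m}\nSentiment: {s}" for m, s in examples)
    return f"{shots}\nMessage: {query}\nSentiment:\n\n{instruction}\n"

print(few_shot_prompt([("Hi, amazing!", "Positive")], "Not bad."))
```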
@@ -704,7 +721,7 @@
704721
"cell_type": "markdown",
705722
"metadata": {},
706723
"source": [
707-
"**Note: Llama 2, with few shots, has the same output \"Neutral\" as Llama 3.**"
724+
"**Note: Llama 2, with few-shot examples, gives the same \"Neutral\" output as Llama 3, but Llama 2 doesn't follow the instruction (\"Give one word response\") as well.**"
708725
]
709726
},
710727
{
@@ -894,6 +911,7 @@
894911
"outputs": [],
895912
"source": [
896913
"!pip install langchain\n",
914+
"!pip install langchain-community\n",
897915
"!pip install sentence-transformers\n",
898916
"!pip install faiss-cpu\n",
899917
"!pip install bs4\n",
@@ -936,40 +954,53 @@
936954
"vectorstore = FAISS.from_documents(all_splits, HuggingFaceEmbeddings(model_name=\"sentence-transformers/all-mpnet-base-v2\"))"
937955
]
938956
},
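The `all_splits` fed to FAISS above come from splitting the loaded page into overlapping chunks. The idea can be sketched with a naive character-based splitter (illustrative only; the notebook would use a LangChain text splitter, which respects sentence and paragraph boundaries):

```python
def split_text(text, chunk_size=500, overlap=50):
    # Naive character-based splitter producing overlapping chunks,
    # a stand-in for the all_splits built earlier in the notebook.
    # The overlap keeps context that straddles a chunk boundary
    # retrievable from both neighbors.
    assert overlap < chunk_size
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```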
957+
{
958+
"cell_type": "markdown",
959+
"metadata": {},
960+
"source": [
961+
"You'll first need to sign in to [Groq](https://console.groq.com/login) with your GitHub or Gmail account, then get an API token to try Groq for free."
962+
]
963+
},
939964
{
940965
"cell_type": "code",
941966
"execution_count": null,
942967
"metadata": {},
943968
"outputs": [],
944969
"source": [
945-
"from langchain_groq import ChatGroq\n",
946-
"llm = ChatGroq(temperature=0, model_name=\"llama3-8b-8192\")\n",
970+
"import os\n",
971+
"from getpass import getpass\n",
947972
"\n",
948-
"from langchain.chains import ConversationalRetrievalChain\n",
949-
"chain = ConversationalRetrievalChain.from_llm(llm,\n",
950-
" vectorstore.as_retriever(),\n",
951-
" return_source_documents=True)\n",
973+
"GROQ_API_TOKEN = getpass()\n",
952974
"\n",
953-
"result = chain({\"question\": \"What’s new with Llama 3?\", \"chat_history\": []})\n",
954-
"md(result['answer'])\n"
975+
"os.environ[\"GROQ_API_KEY\"] = GROQ_API_TOKEN"
955976
]
956977
},
957978
{
958979
"cell_type": "code",
959980
"execution_count": null,
960-
"metadata": {
961-
"id": "NmEhBe3Kiyre"
962-
},
981+
"metadata": {},
982+
"outputs": [],
983+
"source": [
984+
"from langchain_groq import ChatGroq\n",
985+
"llm = ChatGroq(temperature=0, model_name=\"llama3-8b-8192\")"
986+
]
987+
},
988+
{
989+
"cell_type": "code",
990+
"execution_count": null,
991+
"metadata": {},
963992
"outputs": [],
964993
"source": [
965-
"# Query against your own data\n",
966994
"from langchain.chains import ConversationalRetrievalChain\n",
967-
"chain = ConversationalRetrievalChain.from_llm(llm, vectorstore.as_retriever(), return_source_documents=True)\n",
968995
"\n",
969-
"chat_history = []\n",
970-
"query = \"What’s new with Llama 3?\"\n",
971-
"result = chain({\"question\": query, \"chat_history\": chat_history})\n",
972-
"md(result['answer'])"
996+
"# Query against your own data\n",
997+
"chain = ConversationalRetrievalChain.from_llm(llm,\n",
998+
" vectorstore.as_retriever(),\n",
999+
" return_source_documents=True)\n",
1000+
"\n",
1001+
"# no chat history passed\n",
1002+
"result = chain({\"question\": \"What’s new with Llama 3?\", \"chat_history\": []})\n",
1003+
"md(result['answer'])\n"
9731004
]
9741005
},
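`ConversationalRetrievalChain` expects prior turns as `(question, answer)` pairs in `chat_history`, which is why the cell above passes `[]` on the first call. A sketch of a follow-up loop with a stubbed chain (the `ask` helper and `fake_chain` stub are illustrative; in the notebook the real chain would be called):

```python
def ask(chain, question, chat_history):
    # Call the chain, then record the (question, answer) turn so a
    # follow-up question can resolve references to earlier answers.
    result = chain({"question": question, "chat_history": chat_history})
    chat_history.append((question, result["answer"]))
    return result["answer"]

def fake_chain(inputs):
    # stand-in for ConversationalRetrievalChain, for this sketch only
    n_turns = len(inputs["chat_history"])
    return {"answer": f"answer #{n_turns + 1} to: {inputs['question']}"}

history = []
print(ask(fake_chain, "What's new with Llama 3?", history))
print(ask(fake_chain, "How big is it?", history))
```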
9751006
{
@@ -1083,7 +1114,7 @@
10831114
"name": "python",
10841115
"nbconvert_exporter": "python",
10851116
"pygments_lexer": "ipython3",
1086-
"version": "3.11.7"
1117+
"version": "3.10.14"
10871118
}
10881119
},
10891120
"nbformat": 4,
