
Commit b203238

updating API examples to use Llama3 and latest langchain function signature
1 parent 2e27aae commit b203238

File tree

3 files changed: +129 -214 lines changed


recipes/llama_api_providers/OctoAI_API_examples/Getting_to_know_Llama.ipynb

Lines changed: 38 additions & 37 deletions
@@ -10,6 +10,41 @@
 "Our goal in this session is to provide a guided tour of Llama 3, including understanding different Llama 3 models, how and where to access them, Generative AI and Chatbot architectures, prompt engineering, RAG (Retrieval Augmented Generation), Fine-tuning and more. All this is implemented with a starter code for you to take it and use it in your Llama 3 projects."
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {
+"id": "h3YGMDJidHtH"
+},
+"source": [
+"### **Install dependencies**"
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {
+"id": "VhN6hXwx7FCp"
+},
+"outputs": [],
+"source": [
+"# Install dependencies and initialize\n",
+"%pip install \\\n",
+" langchain==0.1.19 \\\n",
+" matplotlib \\\n",
+" octoai-sdk==0.10.1 \\\n",
+" openai \\\n",
+" sentence_transformers \\\n",
+" pdf2image \\\n",
+" pdfminer \\\n",
+" pdfminer.six \\\n",
+" unstructured \\\n",
+" faiss-cpu \\\n",
+" pillow-heif \\\n",
+" opencv-python \\\n",
+" unstructured-inference \\\n",
+" pikepdf"
+]
+},
 {
 "cell_type": "markdown",
 "metadata": {
@@ -244,40 +279,6 @@
 "In this notebook, we are going to access [Llama 3 8b instruct model](https://octoai.cloud/text/chat?model=meta-llama-3-8b-instruct&mode=api) using hosted API from OctoAI."
 ]
 },
-{
-"cell_type": "markdown",
-"metadata": {
-"id": "h3YGMDJidHtH"
-},
-"source": [
-"### **2.1 - Install dependencies**"
-]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"metadata": {
-"id": "VhN6hXwx7FCp"
-},
-"outputs": [],
-"source": [
-"# Install dependencies and initialize\n",
-"%pip install -qU \\\n",
-" langchain==0.1.19 \\\n",
-" octoai-sdk==0.10.1 \\\n",
-" openai \\\n",
-" sentence_transformers \\\n",
-" pdf2image \\\n",
-" pdfminer \\\n",
-" pdfminer.six \\\n",
-" unstructured \\\n",
-" faiss-cpu \\\n",
-" pillow-heif \\\n",
-" opencv-python \\\n",
-" unstructured-inference \\\n",
-" pikepdf"
-]
-},
 {
 "cell_type": "code",
 "execution_count": null,
@@ -359,7 +360,7 @@
 "id": "5Jxq0pmf6L73"
 },
 "source": [
-"### **2.2 - Basic completion**"
+"# **2.1 - Basic completion**"
 ]
 },
 {
@@ -380,7 +381,7 @@
 "id": "StccjUDh6W0Q"
 },
 "source": [
-"### **2.3 - System prompts**\n"
+"## **2.2 - System prompts**\n"
 ]
 },
 {
@@ -404,7 +405,7 @@
 "id": "Hp4GNa066pYy"
 },
 "source": [
-"### **2.4 - Response formats**\n",
+"### **2.3 - Response formats**\n",
 "* Can support different formatted outputs e.g. text, JSON, etc."
 ]
 },
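The relocated install cell pins langchain==0.1.19, which the updated API calls later in the notebook depend on. A minimal sanity check one could run after that cell, as a sketch; it assumes the pinned packages expose their usual top-level modules (octoai-sdk installs as the `octoai` package):

import langchain
import octoai  # provided by octoai-sdk (assumed module name)

# The invoke()-style calls later in this notebook target the 0.1.x line.
print(langchain.__version__)  # expected: 0.1.19 per the pinned install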

recipes/llama_api_providers/OctoAI_API_examples/HelloLlamaCloud.ipynb

Lines changed: 24 additions & 34 deletions
@@ -6,13 +6,12 @@
 "metadata": {},
 "source": [
 "## This demo app shows:\n",
-"* How to run Llama2 in the cloud hosted on OctoAI\n",
+"* How to run Llama 3 in the cloud hosted on OctoAI\n",
 "* How to use LangChain to ask Llama general questions and follow up questions\n",
-"* How to use LangChain to load a recent PDF doc - the Llama2 paper pdf - and chat about it. This is the well known RAG (Retrieval Augmented Generation) method to let LLM such as Llama2 be able to answer questions about the data not publicly available when Llama2 was trained, or about your own data. RAG is one way to prevent LLM's hallucination\n",
-"* You should also review the [HelloLlamaLocal](HelloLlamaLocal.ipynb) notebook for more information on RAG\n",
+"* How to use LangChain to load a recent PDF doc - the Llama paper pdf - and chat about it. This is the well-known RAG (Retrieval Augmented Generation) method that lets LLMs such as Llama answer questions about your own data. RAG is one way to prevent LLM hallucination\n",
 "\n",
 "**Note** We will be using OctoAI to run the examples here. You will need to first sign into [OctoAI](https://octoai.cloud/) with your Github or Google account, then create a free API token [here](https://octo.ai/docs/getting-started/how-to-create-an-octoai-access-token) that you can use for a while (a month or $10 in OctoAI credits, whichever one runs out first).\n",
-"After the free trial ends, you will need to enter billing info to continue to use Llama2 hosted on OctoAI."
+"After the free trial ends, you will need to enter billing info to continue to use Llama 3 hosted on OctoAI."
 ]
 },
 {
{
@@ -35,7 +34,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"!pip install langchain octoai-sdk sentence-transformers chromadb pypdf"
+"%pip install langchain==0.1.19 octoai-sdk==0.10.1 openai sentence-transformers chromadb pypdf"
 ]
 },
 {
@@ -57,15 +56,17 @@
 "id": "3e8870c1",
 "metadata": {},
 "source": [
-"Next we call the Llama 2 model from OctoAI. In this example we will use the Llama 2 13b chat FP16 model. You can find more on Llama 2 models on the [OctoAI text generation solution page](https://octoai.cloud/tools/text).\n",
+"Next we call the Llama 3 model from OctoAI. In this example we will use the Llama 3 8b instruct model. You can find more on Llama models on the [OctoAI text generation solution page](https://octoai.cloud/text).\n",
 "\n",
 "At the time of writing this notebook the following Llama models are available on OctoAI:\n",
-"* llama-2-13b-chat\n",
-"* llama-2-70b-chat\n",
+"* meta-llama-3-8b-instruct\n",
+"* meta-llama-3-70b-instruct\n",
 "* codellama-7b-instruct\n",
 "* codellama-13b-instruct\n",
 "* codellama-34b-instruct\n",
-"* codellama-70b-instruct"
+"* llama-2-13b-chat\n",
+"* llama-2-70b-chat\n",
+"* llamaguard-7b"
 ]
 },
 {
{
@@ -77,21 +78,11 @@
 "source": [
 "from langchain.llms.octoai_endpoint import OctoAIEndpoint\n",
 "\n",
-"llama2_13b = \"llama-2-13b-chat-fp16\"\n",
+"llama3_8b = \"meta-llama-3-8b-instruct\"\n",
 "llm = OctoAIEndpoint(\n",
-" endpoint_url=\"https://text.octoai.run/v1/chat/completions\",\n",
-" model_kwargs={\n",
-" \"model\": llama2_13b,\n",
-" \"messages\": [\n",
-" {\n",
-" \"role\": \"system\",\n",
-" \"content\": \"You are a helpful, respectful and honest assistant.\"\n",
-" }\n",
-" ],\n",
-" \"max_tokens\": 500,\n",
-" \"top_p\": 1,\n",
-" \"temperature\": 0.01\n",
-" },\n",
+" model=llama3_8b,\n",
+" max_tokens=500,\n",
+" temperature=0.01\n",
 ")"
 ]
 },
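The rewritten cell relies on the OctoAIEndpoint integration in langchain 0.1.19 accepting model parameters directly instead of through model_kwargs. A self-contained sketch of the updated usage, assuming the integration reads the API token from the standard OCTOAI_API_TOKEN environment variable:

import os

from langchain.llms.octoai_endpoint import OctoAIEndpoint

# Assumption: the LangChain OctoAI integration picks the token up from the env.
os.environ["OCTOAI_API_TOKEN"] = "<your-octoai-token>"

llama3_8b = "meta-llama-3-8b-instruct"
llm = OctoAIEndpoint(
    model=llama3_8b,   # model name as listed on OctoAI
    max_tokens=500,    # cap on generated tokens
    temperature=0.01,  # near-deterministic output
)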
@@ -111,7 +102,7 @@
 "outputs": [],
 "source": [
 "question = \"who wrote the book Innovator's dilemma?\"\n",
-"answer = llm(question)\n",
+"answer = llm.invoke(question)\n",
 "print(answer)"
 ]
 },
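The change from llm(question) to llm.invoke(question) is the "latest langchain function signature" named in the commit message: in LangChain 0.1.x, invoke() is the standard Runnable entry point and calling the LLM object directly is deprecated. A brief sketch:

# Old style, deprecated in langchain 0.1.x: answer = llm(question)
# New Runnable style:
answer = llm.invoke("who wrote the book Innovator's dilemma?")
print(answer)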
@@ -134,7 +125,7 @@
 "source": [
 "# chat history not passed so Llama doesn't have the context and doesn't know this is more about the book\n",
 "followup = \"tell me more\"\n",
-"followup_answer = llm(followup)\n",
+"followup_answer = llm.invoke(followup)\n",
 "print(followup_answer)"
 ]
 },
@@ -162,7 +153,7 @@
 "memory = ConversationBufferMemory()\n",
 "conversation = ConversationChain(\n",
 " llm=llm, \n",
-" memory = memory,\n",
+" memory=memory,\n",
 " verbose=False\n",
 ")"
 ]
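With memory attached, the follow-up that failed against the bare LLM can be re-run through the chain, which replays the stored turns. A sketch, assuming the conversation object defined in the cell above:

# ConversationBufferMemory keeps prior turns, so "tell me more"
# now carries the original book question as context.
conversation.predict(input="who wrote the book Innovator's dilemma?")
print(conversation.predict(input="tell me more"))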
@@ -208,11 +199,10 @@
 "id": "fc436163",
 "metadata": {},
 "source": [
-"Next, let's explore using Llama 2 to answer questions using documents for context. \n",
-"This gives us the ability to update Llama 2's knowledge thus giving it better context without needing to finetune. \n",
-"For a more in-depth study of this, see the notebook on using Llama 2 locally [here](HelloLlamaLocal.ipynb)\n",
+"Next, let's explore using Llama 3 to answer questions using documents for context. \n",
+"This gives us the ability to update Llama 3's knowledge, giving it better context without needing to fine-tune. \n",
 "\n",
-"We will use the PyPDFLoader to load in a pdf, in this case, the Llama 2 paper."
+"We will use the PyPDFLoader to load in a pdf, in this case, the Llama paper."
 ]
 },
 {
@@ -301,7 +291,7 @@
 "id": "54ad02d7",
 "metadata": {},
 "source": [
-"We then use ` RetrievalQA` to retrieve the documents from the vector database and give the model more context on Llama 2, thereby increasing its knowledge.\n",
+"We then use `RetrievalQA` to retrieve the documents from the vector database and give the model more context on Llama, thereby increasing its knowledge.\n",
 "\n",
 "For each question, LangChain performs a semantic similarity search of it in the vector db, then passes the search results as the context to Llama to answer the question."
 ]
@@ -321,7 +311,7 @@
 " retriever=vectordb.as_retriever()\n",
 ")\n",
 "\n",
-"question = \"What is llama2?\"\n",
+"question = \"What is llama?\"\n",
 "result = qa_chain({\"query\": question})\n",
 "print(result['result'])"
 ]
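The qa_chain cell draws on a vector store built in cells outside this diff. A condensed sketch of that pipeline under the notebook's installed dependencies (pypdf, sentence-transformers, chromadb); the PDF path, chunk sizes, and embedding model here are illustrative assumptions, not the notebook's exact values:

from langchain.chains import RetrievalQA
from langchain.document_loaders import PyPDFLoader
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma

# Load and chunk the paper, embed the chunks, and index them in Chroma.
docs = PyPDFLoader("llama_paper.pdf").load()  # assumed local path
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
vectordb = Chroma.from_documents(splitter.split_documents(docs), HuggingFaceEmbeddings())

# RetrievalQA stuffs the top-matching chunks into the prompt as context.
qa_chain = RetrievalQA.from_chain_type(llm, retriever=vectordb.as_retriever())
print(qa_chain({"query": "What is llama?"})["result"])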
@@ -344,7 +334,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"# no context passed so Llama2 doesn't have enough context to answer so it lets its imagination go wild\n",
+"# no context passed so Llama doesn't have enough context to answer so it lets its imagination go wild\n",
 "result = qa_chain({\"query\": \"what are its use cases?\"})\n",
 "print(result['result'])"
 ]
@@ -376,7 +366,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"# let's ask the original question \"What is llama2?\" again\n",
+"# let's ask the original question \"What is llama?\" again\n",
 "result = chat_chain({\"question\": question, \"chat_history\": []})\n",
 "print(result['answer'])"
 ]
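The chat_chain in this last hunk takes a question plus an explicit chat_history, which matches LangChain's ConversationalRetrievalChain interface; its construction cell is outside this diff, so the following is an assumed sketch rather than the notebook's exact code:

from langchain.chains import ConversationalRetrievalChain

# Assumed construction: pair the OctoAI llm with the Chroma retriever.
chat_chain = ConversationalRetrievalChain.from_llm(llm, vectordb.as_retriever())

result = chat_chain({"question": "What is llama?", "chat_history": []})
print(result["answer"])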
