Commit 850ad18

updated LiveData notebook for Llama 3
1 parent 5f9ac63 commit 850ad18

recipes/use_cases/LiveData.ipynb

Lines changed: 44 additions & 135 deletions
@@ -7,22 +7,7 @@
  "source": [
  "## This demo app shows:\n",
  "* How to use LlamaIndex, an open source library to help you build custom data augmented LLM applications\n",
- "* How to ask Llama questions about recent live data via the You.com live search API and LlamaIndex\n",
- "\n",
- "The LangChain package is used to facilitate the call to Llama2 hosted on Replicate\n",
- "\n",
- "**Note** We will be using Replicate to run the examples here. You will need to first sign in with Replicate with your github account, then create a free API token [here](https://replicate.com/account/api-tokens) that you can use for a while. \n",
- "After the free trial ends, you will need to enter billing info to continue to use Llama2 hosted on Replicate."
- ]
- },
- {
- "cell_type": "markdown",
- "id": "68cf076e",
- "metadata": {},
- "source": [
- "We start by installing the necessary packages:\n",
- "- [langchain](https://python.langchain.com/docs/get_started/introduction) which provides RAG capabilities\n",
- "- [llama-index](https://docs.llamaindex.ai/en/stable/) for data augmentation."
+ "* How to ask Llama 3 questions about recent live data via the [Tavily](https://tavily.com) live search API"
  ]
  },
  {
@@ -32,37 +17,27 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "!pip install llama-index langchain"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "21fe3849",
- "metadata": {},
- "outputs": [],
- "source": [
- "# use ServiceContext to configure the LLM used and the custom embeddings \n",
- "from llama_index import ServiceContext\n",
- "\n",
- "# VectorStoreIndex is used to index custom data \n",
- "from llama_index import VectorStoreIndex\n",
- "\n",
- "from langchain.llms import Replicate"
+ "!pip install llama-index \n",
+ "!pip install llama-index-core\n",
+ "!pip install llama-index-llms-replicate\n",
+ "!pip install llama-index-embeddings-huggingface\n",
+ "!pip install tavily-python"
  ]
  },
  {
  "cell_type": "markdown",
- "id": "73e8e661",
+ "id": "83639e83-2baa-4156-93a2-b9b6d4baf7d6",
  "metadata": {},
  "source": [
- "Next we set up the Replicate token."
+ "You will be using [Replicate](https://replicate.com/meta/meta-llama-3-8b-instruct) to run the examples here. You will need to first sign in to Replicate with your GitHub account, then create a free API token [here](https://replicate.com/account/api-tokens) that you can use for a while. You can also use other Llama 3 cloud providers such as [Groq](https://console.groq.com/), [Together](https://api.together.xyz/playground/language/meta-llama/Llama-3-8b-hf), or [Anyscale](https://app.endpoints.anyscale.com/playground) - see Section 2 of the Getting to Know Llama [notebook](https://github.com/meta-llama/llama-recipes/blob/main/recipes/quickstart/Getting_to_know_Llama.ipynb) for more information.\n",
+ "\n",
+ "If you'd like to run Llama 3 locally for the benefits of privacy, no cost, and no rate limits (some Llama 3 hosting providers limit the queries or tokens per second or minute on their free plans), see [Running Llama Locally](https://github.com/meta-llama/llama-recipes/blob/main/recipes/quickstart/Running_Llama2_Anywhere/Running_Llama_on_Mac_Windows_Linux.ipynb)."
  ]
  },
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "d9d76e33",
+ "id": "e6affd70-c909-4340-924f-f282912765d5",
  "metadata": {},
  "outputs": [],
  "source": [
@@ -75,58 +50,61 @@
  },
  {
  "cell_type": "markdown",
- "id": "f8ff812b",
+ "id": "18582e1f-30b1-4dc5-918a-de2995eb5b46",
  "metadata": {},
  "source": [
- "In this example we will use the [YOU.com](https://you.com/) search engine to augment the LLM's responses.\n",
- "To use the You.com Search API, you can email [email protected] to request an API key. "
+ "You'll set up the Llama 3 8b chat model from Replicate. You can also use the Llama 3 70b model by replacing the `model` name with \"meta/meta-llama-3-70b-instruct\"."
  ]
  },
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "75275628-5235-4b55-8033-601c76107528",
+ "id": "21fe3849",
  "metadata": {},
  "outputs": [],
  "source": [
+ "from llama_index.core import Settings, VectorStoreIndex\n",
+ "from llama_index.embeddings.huggingface import HuggingFaceEmbedding\n",
+ "from llama_index.llms.replicate import Replicate\n",
+ "\n",
+ "Settings.llm = Replicate(\n",
+ "    model=\"meta/meta-llama-3-8b-instruct\",\n",
+ "    temperature=0.0,\n",
+ "    additional_kwargs={\"top_p\": 1, \"max_new_tokens\": 500},\n",
+ ")\n",
  "\n",
- "YOUCOM_API_KEY = getpass()\n",
- "os.environ[\"YOUCOM_API_KEY\"] = YOUCOM_API_KEY"
+ "Settings.embed_model = HuggingFaceEmbedding(\n",
+ "    model_name=\"BAAI/bge-small-en-v1.5\"\n",
+ ")"
  ]
  },
  {
  "cell_type": "markdown",
- "id": "cb210c7c",
+ "id": "f8ff812b",
  "metadata": {},
  "source": [
- "We then call the Llama 2 model from replicate. \n",
- "\n",
- "We will use the llama 2 13b chat model. You can find more Llama 2 models by searching for them on the [Replicate model explore page](https://replicate.com/explore?query=llama).\n",
- "You can add them here in the format: model_name/version"
+ "Next you will use the [Tavily](https://tavily.com/) search engine to augment Llama 3's responses. To create a free trial Tavily Search API key, sign in with your Google or GitHub account [here](https://app.tavily.com/sign-in)."
  ]
  },
  {
  "cell_type": "code",
  "execution_count": null,
- "id": "c12fc2cb",
+ "id": "75275628-5235-4b55-8033-601c76107528",
  "metadata": {},
  "outputs": [],
  "source": [
- "# set llm to be using Llama2 hosted on Replicate\n",
- "llama2_13b_chat = \"meta/llama-2-13b-chat:f4e2de70d66816a838a89eeeb621910adffb0dd0baba3976c96980970978018d\"\n",
+ "from tavily import TavilyClient\n",
  "\n",
- "llm = Replicate(\n",
- "    model=llama2_13b_chat,\n",
- "    model_kwargs={\"temperature\": 0.01, \"top_p\": 1, \"max_new_tokens\":500}\n",
- ")"
+ "TAVILY_API_KEY = getpass()\n",
+ "tavily = TavilyClient(api_key=TAVILY_API_KEY)"
  ]
  },
  {
  "cell_type": "markdown",
  "id": "476d72da",
  "metadata": {},
  "source": [
- "Using our api key we set up earlier, we make a request from YOU.com for live data on a particular topic."
+ "Do a live web search on \"Llama 3 fine-tuning\"."
  ]
  },
  {
@@ -136,15 +114,8 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "\n",
- "import requests\n",
- "\n",
- "query = \"Meta Connect\" # you can try other live data query about sports score, stock market and weather info \n",
- "headers = {\"X-API-Key\": os.environ[\"YOUCOM_API_KEY\"]}\n",
- "data = requests.get(\n",
- "    f\"https://api.ydc-index.io/search?query={query}\",\n",
- "    headers=headers,\n",
- ").json()"
+ "response = tavily.search(query=\"Llama 3 fine-tuning\")\n",
+ "context = [{\"url\": obj[\"url\"], \"content\": obj[\"content\"]} for obj in response['results']]"
  ]
  },
  {
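From the fields the new cell reads, each entry of `context` pairs a result URL with its page content. A small illustrative sketch of the expected shape; the keys come from the diff, while the values here are placeholders rather than real search results:

```python
# Illustrative only: the keys mirror what the notebook reads from each Tavily
# result; the values are placeholders, not actual search output.
context = [
    {
        "url": "https://example.com/some-llama-3-fine-tuning-article",
        "content": "Text snippet extracted from the page by Tavily...",
    },
    # ...one dict per result returned by tavily.search()
]
```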
@@ -154,55 +125,15 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "# check the query result in JSON\n",
- "import json\n",
- "\n",
- "print(json.dumps(data, indent=2))"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "b196e697",
- "metadata": {},
- "source": [
- "We then use the [`JSONLoader`](https://llamahub.ai/l/file-json) to extract the text from the returned data. The `JSONLoader` gives us the ability to load the data into LamaIndex.\n",
- "In the next cell we show how to load the JSON result with key info stored as \"snippets\".\n",
- "\n",
- "However, you can also add the snippets in the query result to documents like below:\n",
- "```python \n",
- "from llama_index import Document\n",
- "snippets = [snippet for hit in data[\"hits\"] for snippet in hit[\"snippets\"]]\n",
- "documents = [Document(text=s) for s in snippets]\n",
- "```\n",
- "This can be handy if you just need to add a list of text strings to doc"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "7c40e73f-ca13-4f4a-a753-e613df3d389e",
- "metadata": {},
- "outputs": [],
- "source": [
- "# one way to load the JSON result with key info stored as \"snippets\"\n",
- "from llama_index import download_loader\n",
- "\n",
- "JsonDataReader = download_loader(\"JsonDataReader\")\n",
- "loader = JsonDataReader()\n",
- "documents = loader.load_data([hit[\"snippets\"] for hit in data[\"hits\"]])\n"
+ "context"
  ]
  },
  {
  "cell_type": "markdown",
  "id": "8e5e3b4e",
  "metadata": {},
  "source": [
- "With the data set up, we create a vector store for the data and a query engine for it.\n",
- "\n",
- "For our embeddings we will use `HuggingFaceEmbeddings` whose default embedding model is sentence-transformers/all-mpnet-base-v2. This model provides a good balance between speed and performance.\n",
- "To change the default model, call `HuggingFaceEmbeddings(model_name=<another_embedding_model>)`. \n",
- "\n",
- "For more info see https://huggingface.co/blog/mteb. "
+ "Create documents based on the search results, index and save them to a vector store, then create a query engine."
  ]
  },
  {
@@ -212,21 +143,11 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "# use HuggingFace embeddings \n",
- "from langchain.embeddings.huggingface import HuggingFaceEmbeddings\n",
- "from llama_index import LangchainEmbedding\n",
+ "from llama_index.core import Document\n",
  "\n",
+ "documents = [Document(text=ct['content']) for ct in context]\n",
+ "index = VectorStoreIndex.from_documents(documents)\n",
  "\n",
- "embeddings = LangchainEmbedding(HuggingFaceEmbeddings())\n",
- "print(embeddings)\n",
- "\n",
- "# create a ServiceContext instance to use Llama2 and custom embeddings\n",
- "service_context = ServiceContext.from_defaults(llm=llm, chunk_size=800, chunk_overlap=20, embed_model=embeddings)\n",
- "\n",
- "# create vector store index from the documents created above\n",
- "index = VectorStoreIndex.from_documents(documents, service_context=service_context)\n",
- "\n",
- "# create query engine from the index\n",
  "query_engine = index.as_query_engine(streaming=True)"
  ]
  },
@@ -235,7 +156,7 @@
  "id": "2c4ea012",
  "metadata": {},
  "source": [
- "We are now ready to ask Llama 2 a question about the live data using our query engine."
+ "You are now ready to ask Llama 3 questions about the live data using the query engine."
  ]
  },
  {
@@ -245,7 +166,6 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "# ask Llama2 a summary question about the search result\n",
  "response = query_engine.query(\"give me a summary\")\n",
  "response.print_response_stream()"
  ]
@@ -257,8 +177,7 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "# more questions\n",
- "query_engine.query(\"what products were announced\").print_response_stream()"
+ "query_engine.query(\"what's the latest about Llama 3 fine-tuning?\").print_response_stream()"
  ]
  },
  {
@@ -268,17 +187,7 @@
  "metadata": {},
  "outputs": [],
  "source": [
- "query_engine.query(\"tell me more about Meta AI assistant\").print_response_stream()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "16a56542",
- "metadata": {},
- "outputs": [],
- "source": [
- "query_engine.query(\"what are Generative AI stickers\").print_response_stream()"
+ "query_engine.query(\"tell me more about Llama 3 fine-tuning\").print_response_stream()"
  ]
  }
  ],
@@ -298,7 +207,7 @@
  "name": "python",
  "nbconvert_exporter": "python",
  "pygments_lexer": "ipython3",
- "version": "3.8.18"
+ "version": "3.11.9"
  }
  },
  "nbformat": 4,

Comments (0)