pathway-labs · szymondudycz · Feb 5, 2024
diff --git a/README.md b/README.md
@@ -1,7 +1,7 @@
 # ChatGPT Python API for sales
 
 This is an AI app to find **real-time** discounts/deals/sales prices from various online markets around the world. The project
-exposes an HTTP REST endpoint to answer user queries about current sales like [Amazon deals](https://www.amazon.com/gp/goldbox?ref_=nav_cs_gb) in a specific location or from the given any input file such as (CSV, Jsonlines, PDF, Markdown, Txt). It uses Pathway’s [LLM App features](https://github.com/pathwaycom/llm-app) to build real-time LLM(Large Language Model)-enabled data pipeline in Python and join data from multiple input sources, leverages OpenAI API [Embeddings](https://platform.openai.com/docs/api-reference/embeddings) and [Chat Completion](https://platform.openai.com/docs/api-reference/completions) endpoints to generate AI assistant responses.
+exposes an HTTP REST endpoint to answer user queries about current sales, like [Amazon deals](https://www.amazon.com/gp/goldbox?ref_=nav_cs_gb), in a specific location or from any given input file (CSV, JSON Lines, PDF, Markdown, TXT). It uses Pathway’s [LLM App features](https://github.com/pathwaycom/pathway) to build a real-time LLM (Large Language Model)-enabled data pipeline in Python and join data from multiple input sources, and leverages the OpenAI API [Embeddings](https://platform.openai.com/docs/api-reference/embeddings) and [Chat Completion](https://platform.openai.com/docs/api-reference/completions) endpoints to generate AI assistant responses.
 
 Currently, the project supports two types of data sources and it is **possible to extend sources** by adding custom input connectors:
 
@@ -86,7 +86,7 @@ pw.run()
 - Real-time data.
 - Including discount information.
 
-The model might not answer such queries properly. Because it is not aware of the context or historical data or it needs additional details. In this case, you can use LLM App efficiently to give context to this search or answer process.  See how LLM App [works](https://github.com/pathwaycom/llm-app#how-it-works).
+The model might not answer such queries properly because it is not aware of the context or historical data, or because it needs additional details. In this case, you can use the Pathway LLM xpack to give context to this search or answer process. See how an LLM application built with Pathway [works](https://github.com/pathwaycom/llm-app#how-it-works).
 
 For example, a typical response you can get from the OpenAI [Chat Completion endpoint](https://platform.openai.com/docs/api-reference/chat) or [ChatGPT UI](https://chat.openai.com/) interface without context is:
 
@@ -250,4 +250,4 @@ Ship Date: 2024-08-09
 
 1. [Set environment variables](#step-2-set-environment-variables)
 2. From the project root folder, open your terminal and run `docker compose up`.
-3. Navigate to `localhost:8501` on your browser when docker installion is successful.
+3. Navigate to `localhost:8501` in your browser once the Docker installation is successful.
diff --git a/common/openaiapi_helper.py b/common/openaiapi_helper.py
@@ -1,6 +1,8 @@
 from dotenv import load_dotenv
 import os
-from llm_app.model_wrappers import OpenAIEmbeddingModel, OpenAIChatGPTModel
+import pathway as pw
+from pathway.xpacks.llm.embedders import OpenAIEmbedder
+from pathway.xpacks.llm.llms import OpenAIChat, prompt_chat_single_qa
 
 load_dotenv()
 
@@ -13,17 +15,24 @@
 
 
 def openai_embedder(data):
-    embedder = OpenAIEmbeddingModel(api_key=api_key)
+    embedder = OpenAIEmbedder(
+        api_key=api_key,
+        model=embedder_locator,
+        retry_strategy=pw.asynchronous.FixedDelayRetryStrategy(),
+        cache_strategy=pw.asynchronous.DefaultCache(),
+    )
 
-    return embedder.apply(text=data, locator=embedder_locator)
+    return embedder(data)
 
 
 def openai_chat_completion(prompt):
-    model = OpenAIChatGPTModel(api_key=api_key)
-
-    return model.apply(
-            prompt,
-            locator=model_locator,
-            temperature=temperature,
-            max_tokens=max_tokens,
-        )
+    model = OpenAIChat(
+        api_key=api_key,
+        model=model_locator,
+        temperature=temperature,
+        retry_strategy=pw.asynchronous.FixedDelayRetryStrategy(),
+        cache_strategy=pw.asynchronous.DefaultCache(),
+        max_tokens=max_tokens,
+    )
+
+    return model(prompt_chat_single_qa(prompt))
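
Note on the `retry_strategy` arguments added above: the name `FixedDelayRetryStrategy` suggests that a failed API call is retried after a constant wait. The following is a minimal standard-library sketch of that general idea only, not Pathway's implementation; `retry_fixed_delay`, `max_retries`, `delay_s`, and `sleep` are hypothetical names introduced for illustration.

```python
import time


def retry_fixed_delay(func, max_retries=3, delay_s=1.0, sleep=time.sleep):
    """Call func(); if it raises, wait delay_s seconds and try again.

    Hypothetical helper for illustration. It makes at most
    max_retries + 1 attempts and re-raises the last exception
    once the retries are exhausted.
    """
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the original error
            sleep(delay_s)  # same wait before every retry (no backoff)


if __name__ == "__main__":
    attempts = {"count": 0}

    def flaky_call():
        # Stand-in for a remote API call that fails twice, then succeeds.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise RuntimeError("temporary failure")
        return "ok"

    print(retry_fixed_delay(flaky_call, max_retries=3, delay_s=0.1))
```

The wait is injected through the `sleep` parameter so the behaviour can be exercised without real delays.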
diff --git a/examples/rainforest/rainforestapi_helper.py b/examples/rainforest/rainforestapi_helper.py
@@ -1,6 +1,7 @@
 import os
 import requests
 import json
+import json5
 from dotenv import load_dotenv
 from urllib.parse import urlencode
 
@@ -29,7 +30,8 @@ def send_request(data_dir, params):
     response = requests.get(get_url(params))
 
     if response.status_code == 200:
-        data = response.json()
+        # needed because of a trailing comma in the returned JSON array
+        data = json5.loads(response.text)
 
         deals_results = data.get('deals_results', [])
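
Note on the `json5` change above: Python's built-in `json` module rejects trailing commas, which is why a lenient parser is needed for this response. A possible variant, sketched under the assumption that strict parsing should stay the default, tries `json.loads` first and falls back to `json5` only on failure; `parse_lenient` is a hypothetical helper name, not part of this change.

```python
import json


def parse_lenient(text):
    """Parse JSON strictly, falling back to json5 for non-standard input.

    Hypothetical helper for illustration. The fallback import is deferred
    so json5 is only required when strict parsing actually fails.
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        import json5  # third-party package, added to requirements.txt in this change

        return json5.loads(text)


if __name__ == "__main__":
    strict_ok = '{"deals_results": [{"title": "example"}]}'
    print(parse_lenient(strict_ok))

    # A trailing comma inside the array makes the strict parser fail.
    with_trailing_comma = '{"deals_results": [{"title": "example"},]}'
    try:
        json.loads(with_trailing_comma)
    except json.JSONDecodeError as err:
        print("strict json rejected the input:", err)
```

Trying the strict parser first keeps well-formed responses on the standard-library path and limits the slower, more permissive parser to malformed responses.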
 

diff --git a/examples/ui/app.py b/examples/ui/app.py
@@ -19,11 +19,11 @@
     st.markdown("# About")
     st.markdown(
         "AI app to find real-time discounts from various online markets [Amazon deals](https://www.amazon.com/gp/goldbox?ref_=nav_cs_gb) in a specific location. "
-        "It uses Pathway’s [LLM App features](https://github.com/pathwaycom/llm-app) "
+        "It uses Pathway’s [LLM App features](https://github.com/pathwaycom/pathway) "
         "to build real-time LLM(Large Language Model)-enabled data pipeline in Python and join data from multiple input sources\n"
 
     )
-    st.markdown("[View the source code on GitHub](https://github.com/Boburmirzo/chatgpt-api-python-sales)")
+    st.markdown("[View the source code on GitHub](https://github.com/pathway-labs/chatgpt-api-python-sales)")
 
 # Load environment variables
 load_dotenv()

diff --git a/requirements.txt b/requirements.txt
@@ -2,6 +2,7 @@ pathway
 pandas
 requests
 datetime
-llm_app
 streamlit
-python-dotenv
+python-dotenv
+litellm
+json5