Commit ec8d923

Merge branch 'master' into valekjo-webhook-globals

2 parents: 68fe85a + 982f4b6

8 files changed, +244 −20 lines changed

nginx.conf

Lines changed: 4 additions & 0 deletions

@@ -305,6 +305,10 @@ server {
     # Removed pages
     # GPT plugins were discontinued April 9th, 2024 - https://help.openai.com/en/articles/8988022-winding-down-the-chatgpt-plugins-beta
     rewrite ^/platform/integrations/chatgpt-plugin$ https://blog.apify.com/add-custom-actions-to-your-gpts/ redirect;
+
+    # Python docs
+    rewrite ^/api/client/python/docs$ /api/client/python/docs/overview/introduction permanent;
 }

 # Temporarily used to route crawlee.dev to the Crawlee GitHub pages.
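The new `rewrite` rule above matches exactly the bare `/api/client/python/docs` path and issues a permanent (301) redirect to the introduction page. The regex part of the rule can be sanity-checked locally with Python's `re` module; this is only a rough sketch of the matching behavior, not of nginx's full location/rewrite semantics:

```python
import re

# The new rule: redirect the bare Python client docs URL to its introduction page.
PATTERN = re.compile(r"^/api/client/python/docs$")
TARGET = "/api/client/python/docs/overview/introduction"

def rewrite(path: str) -> str:
    """Return the redirect target if the rule matches, else the path unchanged."""
    return TARGET if PATTERN.match(path) else path

print(rewrite("/api/client/python/docs"))          # redirected
print(rewrite("/api/client/python/docs/overview"))  # unchanged (the $ anchor prevents a match)
```

Note the `$` anchor: only the exact docs root is redirected, so deeper paths such as `/api/client/python/docs/overview/introduction` are untouched and no redirect loop can occur.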

sources/platform/actors/development/builds_and_runs/state_persistence.md

Lines changed: 51 additions & 4 deletions

@@ -51,7 +51,7 @@ By default, an Actor keeps its state in the server's memory. During a server swi

 The [Apify SDKs](/sdk) handle state persistence automatically.

-This is done using the `Actor.on()` method and the `migrating` event.
+This is done using the `Actor.on()` method and the `migrating` event.

 - The `migrating` event is triggered just before a migration occurs, allowing you to save your state.
 - To retrieve previously saved state, you can use the [`Actor.getValue`](/sdk/js/reference/class/Actor#getValue)/[`Actor.get_value`](/sdk/python/reference/class/Actor#get_value) methods.

@@ -81,15 +81,15 @@ await Actor.exit();
 <TabItem value="Python" label="Python">

 ```python
-from apify import Actor
+from apify import Actor, Event

-async def actor_migrate():
+async def actor_migrate(_event_data):
     await Actor.set_value('my-crawling-state', {'foo': 'bar'})

 async def main():
     async with Actor:
         # ...
-        Actor.on('migrating', actor_migrate)
+        Actor.on(Event.MIGRATING, actor_migrate)
         # ...
 ```

@@ -128,3 +128,50 @@ async def main():
 </Tabs>

 For improved Actor performance consider [caching repeated page data](/academy/expert-scraping-with-apify/saving-useful-stats).
+
+## Speeding up migrations
+
+Once your Actor receives the `migrating` event, the Apify platform will shut it down and restart it on a new server within one minute.
+To speed this process up, once you have persisted the Actor state,
+you can manually reboot the Actor in the `migrating` event handler using the `Actor.reboot()` method
+available in the [Apify SDK for JavaScript](/sdk/js/reference/class/Actor#reboot) or [Apify SDK for Python](/sdk/python/reference/class/Actor#reboot).
+
+<Tabs groupId="main">
+<TabItem value="JavaScript" label="JavaScript">
+
+```js
+import { Actor } from 'apify';
+
+await Actor.init();
+// ...
+Actor.on('migrating', async () => {
+    // ...
+    // save state
+    // ...
+    await Actor.reboot();
+});
+// ...
+await Actor.exit();
+```
+
+</TabItem>
+<TabItem value="Python" label="Python">
+
+```python
+from apify import Actor, Event
+
+async def actor_migrate(_event_data):
+    # ...
+    # save state
+    # ...
+    await Actor.reboot()
+
+async def main():
+    async with Actor:
+        # ...
+        Actor.on(Event.MIGRATING, actor_migrate)
+        # ...
+```
+
+</TabItem>
+</Tabs>
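The persist-and-restore cycle this doc describes (save on `migrating`, read back on startup) can be illustrated with a self-contained toy. `InMemoryStore` and `run_actor` here are hypothetical stand-ins invented for the sketch; in a real Actor you would call `Actor.get_value()` / `Actor.set_value()` against the key-value store instead:

```python
class InMemoryStore:
    """Toy stand-in for the Apify key-value store."""

    def __init__(self):
        self._data = {}

    def set_value(self, key, value):
        self._data[key] = value

    def get_value(self, key, default=None):
        return self._data.get(key, default)


def run_actor(store):
    """One simulated run: restore state, do work, persist before 'migration'."""
    # On startup, restore previously persisted state (or start fresh).
    state = dict(store.get_value('my-crawling-state', {'processed': 0}))
    # ... do some work ...
    state['processed'] += 10
    # In the `migrating` handler, persist the state before shutdown.
    store.set_value('my-crawling-state', state)
    return state['processed']


store = InMemoryStore()
first = run_actor(store)   # fresh start: 10 items processed
second = run_actor(store)  # after a "migration": resumes at 10, ends at 20
print(first, second)       # 10 20
```

The second run picks up exactly where the first left off, which is the whole point of persisting state before a migration.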

sources/platform/integrations/ai/haystack.md

Lines changed: 1 addition & 0 deletions

@@ -185,4 +185,5 @@ To run it, you can use the following command: `python apify_integration.py`
 - [Apify-haystack integration documentation](https://haystack.deepset.ai/integrations/apify)
 - [Apify-haystack integration source code](https://github.com/apify/apify-haystack)
 - [Example: RAG - Extract and use website content for question answering](https://haystack.deepset.ai/cookbook/apify_haystack_rag)
+- [Example: RAG: Web Search and Analysis with Apify and Haystack](https://haystack.deepset.ai/cookbook/apify_haystack_rag_web_browser)
 - [Example: Analyze Your Instagram Comments’ Vibe](https://haystack.deepset.ai/cookbook/apify_haystack_instagram_comments_analysis)

sources/platform/integrations/ai/langchain.md

Lines changed: 25 additions & 13 deletions

@@ -20,7 +20,7 @@ If you prefer to use JavaScript, you can follow the [JavaScript LangChain docum

 Before we start with the integration, we need to install all dependencies:

-`pip install apify-client langchain langchain_community langchain_openai openai tiktoken`
+`pip install langchain langchain-openai langchain-apify`

 After successful installation of all dependencies, we can start writing code.

@@ -30,9 +30,10 @@ First, import all required packages:
 import os

 from langchain.indexes import VectorstoreIndexCreator
-from langchain_community.utilities import ApifyWrapper
-from langchain_core.document_loaders.base import Document
-from langchain_openai import OpenAI
+from langchain_apify import ApifyWrapper
+from langchain_core.documents import Document
+from langchain_core.vectorstores import InMemoryVectorStore
+from langchain_openai import ChatOpenAI
 from langchain_openai.embeddings import OpenAIEmbeddings
 ```

@@ -49,6 +50,7 @@ Note that if you already have some results in an Apify dataset, you can load the

 ```python
 apify = ApifyWrapper()
+llm = ChatOpenAI(model="gpt-4o-mini")

 loader = apify.call_actor(
     actor_id="apify/website-content-crawler",

@@ -68,14 +70,17 @@ The Actor call may take some time as it crawls the LangChain documentation websi
 Initialize the vector index from the crawled documents:

 ```python
-index = VectorstoreIndexCreator(embedding=OpenAIEmbeddings()).from_loaders([loader])
+index = VectorstoreIndexCreator(
+    vectorstore_cls=InMemoryVectorStore,
+    embedding=OpenAIEmbeddings()
+).from_loaders([loader])
 ```

 And finally, query the vector index:

 ```python
 query = "What is LangChain?"
-result = index.query_with_sources(query, llm=OpenAI())
+result = index.query_with_sources(query, llm=llm)

 print("answer:", result["answer"])
 print("source:", result["sources"])

@@ -87,15 +92,17 @@ If you want to test the whole example, you can simply create a new file, `langch
 import os

 from langchain.indexes import VectorstoreIndexCreator
-from langchain_community.utilities import ApifyWrapper
-from langchain_core.document_loaders.base import Document
-from langchain_openai import OpenAI
+from langchain_apify import ApifyWrapper
+from langchain_core.documents import Document
+from langchain_core.vectorstores import InMemoryVectorStore
+from langchain_openai import ChatOpenAI
 from langchain_openai.embeddings import OpenAIEmbeddings

 os.environ["OPENAI_API_KEY"] = "Your OpenAI API key"
 os.environ["APIFY_API_TOKEN"] = "Your Apify API token"

 apify = ApifyWrapper()
+llm = ChatOpenAI(model="gpt-4o-mini")

 print("Call website content crawler ...")
 loader = apify.call_actor(

@@ -104,9 +111,12 @@ loader = apify.call_actor(
     dataset_mapping_function=lambda item: Document(page_content=item["text"] or "", metadata={"source": item["url"]}),
 )
 print("Compute embeddings...")
-index = VectorstoreIndexCreator(embedding=OpenAIEmbeddings()).from_loaders([loader])
+index = VectorstoreIndexCreator(
+    vectorstore_cls=InMemoryVectorStore,
+    embedding=OpenAIEmbeddings()
+).from_loaders([loader])
 query = "What is LangChain?"
-result = index.query_with_sources(query, llm=OpenAI())
+result = index.query_with_sources(query, llm=llm)

 print("answer:", result["answer"])
 print("source:", result["sources"])

@@ -117,9 +127,11 @@ To run it, you can use the following command: `python langchain_integration.py`
 After running the code, you should see the following output:

 ```text
-answer: LangChain is a framework for developing applications powered by language models. It provides standard, extendable interfaces, external integrations, and end-to-end implementations for off-the-shelf use. It also integrates with other LLMs, systems, and products to create a vibrant and thriving ecosystem.
+answer: LangChain is a framework designed for developing applications powered by large language models (LLMs). It simplifies the entire application lifecycle, from development to productionization and deployment. LangChain provides open-source components and integrates with various third-party tools, making it easier to build and optimize applications using language models.

-source: https://python.langchain.com
+source: https://python.langchain.com/docs/get_started/introduction
 ```

 LangChain is a standard interface through which you can interact with a variety of large language models (LLMs).
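The `dataset_mapping_function` in the diff above turns each Apify dataset item into a LangChain `Document`. The mapping itself is plain Python and can be checked against stub data without calling any Actor; the `Document` class below is a minimal stand-in for `langchain_core.documents.Document`, not the real import:

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    """Minimal stand-in for langchain_core.documents.Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)

# Same mapping as passed to apify.call_actor(...) in the example above.
def mapping(item):
    return Document(page_content=item["text"] or "", metadata={"source": item["url"]})

doc = mapping({"text": "LangChain is a framework ...", "url": "https://python.langchain.com"})
print(doc.page_content)        # the crawled page text
print(doc.metadata["source"])  # the page URL

# The `or ""` guard keeps pages with missing text from producing None content.
empty = mapping({"text": None, "url": "https://example.com"})
print(repr(empty.page_content))
```

The `or ""` fallback matters because Website Content Crawler can emit items whose `text` field is empty, and downstream embedding code generally expects a string, not `None`.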
sources/platform/integrations/ai/langgraph.md

Lines changed: 160 additions & 0 deletions

New file (all lines added):

---
title: 🦜🔘➡️ LangGraph integration
sidebar_label: LangGraph
description: Learn how to build AI Agents with Apify and LangGraph 🦜🔘➡️.
sidebar_position: 1
slug: /integrations/langgraph
---

**Learn how to build AI Agents with Apify and LangGraph.**

---

## What is LangGraph

[LangGraph](https://www.langchain.com/langgraph) is a framework designed for constructing stateful, multi-agent applications with Large Language Models (LLMs), allowing developers to build complex AI agent workflows that can leverage tools, APIs, and databases.

:::note Explore LangGraph

For more in-depth details on LangGraph, check out its [official documentation](https://langchain-ai.github.io/langgraph/).

:::

## How to use Apify with LangGraph

This guide will demonstrate how to use Apify Actors with LangGraph by building a ReAct agent that uses the [RAG Web Browser](https://apify.com/apify/rag-web-browser) Actor to search Google for TikTok profiles and the [TikTok Data Extractor](https://apify.com/clockworks/free-tiktok-scraper) Actor to extract data from those profiles and analyze them.

### Prerequisites

- **Apify API token**: To use Apify Actors in LangGraph, you need an Apify API token. If you don't have one, you can learn how to obtain it in the [Apify documentation](https://docs.apify.com/platform/integrations/api).

- **OpenAI API key**: In order to work with agents in LangGraph, you need an OpenAI API key. If you don't have one, you can get it from the [OpenAI platform](https://platform.openai.com/account/api-keys).

- **Python packages**: You need to install the following Python packages:

  ```bash
  pip install langgraph langchain-apify langchain-openai
  ```

### Building the TikTok profile search and analysis agent

First, import all required packages:

```python
import os

from langchain_apify import ApifyActorsTool
from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent
```

Next, set the environment variables for the Apify API token and OpenAI API key:

```python
os.environ["OPENAI_API_KEY"] = "Your OpenAI API key"
os.environ["APIFY_API_TOKEN"] = "Your Apify API token"
```

Instantiate the LLM and the Apify Actors tools:

```python
llm = ChatOpenAI(model="gpt-4o-mini")

browser = ApifyActorsTool("apify/rag-web-browser")
tiktok = ApifyActorsTool("clockworks/free-tiktok-scraper")
```

Create the ReAct agent with the LLM and Apify Actors tools:

```python
tools = [browser, tiktok]
agent_executor = create_react_agent(llm, tools)
```

Finally, run the agent and stream the messages:

```python
for state in agent_executor.stream(
    stream_mode="values",
    input={
        "messages": [
            HumanMessage(content="Search the web for OpenAI TikTok profile and analyze their profile.")
        ]
    }):
    state["messages"][-1].pretty_print()
```

:::note Search and analysis may take some time

The agent tool call may take some time as it searches the web for OpenAI TikTok profiles and analyzes them.

:::

You will see the agent's messages in the console, which will show each step of the agent's workflow.

```text
================================ Human Message =================================

Search the web for OpenAI TikTok profile and analyze their profile.
================================== AI Message ==================================
Tool Calls:
  apify_actor_apify_rag-web-browser (call_y2rbmQ6gYJYC2lHzWJAoKDaq)
 Call ID: call_y2rbmQ6gYJYC2lHzWJAoKDaq
  Args:
    run_input: {"query":"OpenAI TikTok profile","maxResults":1}

...

================================== AI Message ==================================

The OpenAI TikTok profile is titled "OpenAI (@openai) Official." Here are some key details about the profile:

- **Followers**: 592.3K
- **Likes**: 3.3M
- **Description**: The profile features "low key research previews" and includes videos that showcase their various projects and research developments.

### Profile Overview:
- **Profile URL**: [OpenAI TikTok Profile](https://www.tiktok.com/@openai?lang=en)
- **Content Focus**: The posts primarily involve previews of OpenAI's research and various AI-related innovations.

...
```

If you want to test the whole example, you can simply create a new file, `langgraph_integration.py`, and copy the whole code into it.

```python
import os

from langchain_apify import ApifyActorsTool
from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

os.environ["OPENAI_API_KEY"] = "Your OpenAI API key"
os.environ["APIFY_API_TOKEN"] = "Your Apify API token"

llm = ChatOpenAI(model="gpt-4o-mini")

browser = ApifyActorsTool("apify/rag-web-browser")
tiktok = ApifyActorsTool("clockworks/free-tiktok-scraper")

tools = [browser, tiktok]
agent_executor = create_react_agent(llm, tools)

for state in agent_executor.stream(
    stream_mode="values",
    input={
        "messages": [
            HumanMessage(content="Search the web for OpenAI TikTok profile and analyze their profile.")
        ]
    }):
    state["messages"][-1].pretty_print()
```

## Resources

- [Apify Actors](https://docs.apify.com/platform/actors)
- [LangGraph - How to Create a ReAct Agent](https://langchain-ai.github.io/langgraph/how-tos/create-react-agent/)
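The ReAct loop that `create_react_agent` wires up alternates between model calls and tool calls until the model emits a final answer. The control flow can be sketched with stubs so it runs without any API keys; `fake_llm`, `fake_browser`, and `react_agent` below are hypothetical stand-ins for the real chat model, the RAG Web Browser tool, and LangGraph's prebuilt agent:

```python
def fake_llm(messages):
    """Stand-in for the chat model: first requests a tool, then answers."""
    if not any(role == "tool" for role, _ in messages):
        return ("tool_call", "rag-web-browser", {"query": "OpenAI TikTok profile"})
    return ("final", "The profile is https://www.tiktok.com/@openai")

def fake_browser(args):
    """Stand-in for the RAG Web Browser Actor tool."""
    return f"search results for {args['query']!r}"

def react_agent(question):
    """Minimal ReAct loop: call the model, run requested tools, repeat."""
    messages = [("human", question)]
    while True:
        kind, *payload = fake_llm(messages)
        if kind == "final":
            messages.append(("ai", payload[0]))
            return messages
        _tool_name, args = payload
        # Tool output is appended and fed back to the model on the next turn.
        messages.append(("tool", fake_browser(args)))

history = react_agent("Search the web for OpenAI TikTok profile.")
for role, content in history:
    print(role, ":", content)
```

In the real integration the loop lives inside the compiled LangGraph graph, and each `ApifyActorsTool` call is what shows up as a `Tool Calls` block in the streamed output above.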

sources/platform/storage/dataset.md

Lines changed: 1 addition & 1 deletion

@@ -147,7 +147,7 @@ You can then use that variable to [access the dataset's items and manage it](/ap

 > When using the [`.list_items()`](/api/client/python/reference/class/DatasetClient#list_items) method, if you fill both `omit` and `field` parameters with the same value, then `omit` parameter will take precedence and the field is excluded from the results.

-Check out the [Python API client documentation](/api/client/python/reference/class/DatasetClient) for [help with setup](/api/client/python/docs) and more details.
+Check out the [Python API client documentation](/api/client/python/reference/class/DatasetClient) for [help with setup](/api/client/python/docs/overview/introduction) and more details.

 ### Apify SDKs
sources/platform/storage/key_value_store.md

Lines changed: 1 addition & 1 deletion

@@ -124,7 +124,7 @@ my_key_val_store_client = apify_client.key_value_store('jane-doe/my-key-val-stor

 You can then use that variable to [access the key-value store's items and manage it](/api/client/python/reference/class/KeyValueStoreClient).

-Check out the [Python API client documentation](/api/client/python/reference/class/KeyValueStoreClient) for [help with setup](/api/client/python/docs) and more details.
+Check out the [Python API client documentation](/api/client/python/reference/class/KeyValueStoreClient) for [help with setup](/api/client/python/docs/overview/introduction) and more details.

 ### Apify SDKs

sources/platform/storage/request_queue.md

Lines changed: 1 addition & 1 deletion

@@ -135,7 +135,7 @@ my_queue_client = apify_client.request_queue('jane-doe/my-request-queue')

 You can then use that variable to [access the request queue's items and manage it](/api/client/python/reference/class/RequestQueueClient).

-Check out the [Python API client documentation](/api/client/python/reference/class/RequestQueueClient) for [help with setup](/api/client/python/docs) and more details.
+Check out the [Python API client documentation](/api/client/python/reference/class/RequestQueueClient) for [help with setup](/api/client/python/docs/overview/introduction) and more details.

 ### Apify SDKs
