Commit c0d6a7a

jirispilka, mtrunkat, and TC-MO authored
docs: Add documentation for Qdrant integration (#1077)
Documentation for Qdrant database

---------

Co-authored-by: Marek Trunkát <[email protected]>
Co-authored-by: Michał Olender <[email protected]>
1 parent ea7d38b commit c0d6a7a

6 files changed: +191 -0 lines changed

Lines changed: 117 additions & 0 deletions
@@ -0,0 +1,117 @@
---
title: Qdrant integration
description: Learn how to integrate Apify with Qdrant to feed data crawled from the web into the Qdrant vector database.
sidebar_label: Qdrant
sidebar_position: 4
slug: /integrations/qdrant
toc_min_heading_level: 2
toc_max_heading_level: 4
---

**Learn how to integrate Apify with Qdrant to transfer crawled data into the Qdrant vector database.**

---

[Qdrant](https://qdrant.tech) is a high-performance managed vector database that lets you store and query dense vectors for next-generation AI applications such as recommendation systems, semantic search, and retrieval-augmented generation (RAG).

The Apify integration for Qdrant lets you export results from Apify Actors and dataset items into a specific Qdrant collection.

## Prerequisites

Before you begin, ensure that you have the following:

- A [Qdrant database](https://qdrant.tech) set up.
- The Qdrant URL of the database and a Qdrant API key.
- An [OpenAI API key](https://openai.com/index/openai-api/) to compute text embeddings.
- An [Apify API token](https://docs.apify.com/platform/integrations/api#api-token) to access [Apify Actors](https://apify.com/store).
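
These credentials are easier to keep out of source control if a script reads them from environment variables. The following is a minimal sketch rather than part of the integration itself; the environment variable names and the `load_credentials` helper are assumptions chosen for illustration.

```python
import os

# Hypothetical environment variable names; rename them to match your setup.
REQUIRED_VARS = ("APIFY_API_TOKEN", "OPENAI_API_KEY", "QDRANT_URL", "QDRANT_API_KEY")


def load_credentials(env=os.environ):
    """Return the required credentials as a dict, or raise if any are missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling `load_credentials()` at the top of a script fails early with a clear message when a value is missing, instead of failing later inside an Actor run.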

### Integration Methods

You can integrate Apify with Qdrant using either the Apify Console or the Apify API client for Python.

:::note Website Content Crawler usage

The examples use the Website Content Crawler Actor, which deeply crawls websites, cleans HTML by removing modals and navigation elements, and converts HTML to Markdown for training AI models or providing web content to LLMs and generative AI applications.

:::

#### Apify Console

1. Set up the [Website Content Crawler](https://apify.com/apify/website-content-crawler) Actor in the [Apify Console](https://console.apify.com). Refer to this guide on how to set up a [website content crawl for your project](https://blog.apify.com/talk-to-your-website-with-large-language-models/).

1. Once the crawler is ready, navigate to the integration section and add the Qdrant integration via Connect Actor or Task.

    ![Website Content Crawler with Qdrant integration](../images/qdrant-wcc-integration.png)

1. Search for the Qdrant integration and connect it.

    ![Website Content Crawler with Qdrant integration](../images/qdrant-wcc-integration-connect.png)

1. Select when to trigger this integration (typically when a run succeeds) and fill in all the required fields for the Qdrant integration. You can learn more about the input parameters at the [Qdrant integration input schema](https://apify.com/apify/qdrant-integration).

    ![Qdrant integration configuration](../images/qdrant-integration-setup.png)

1. For a comprehensive explanation of how to combine Actors to accomplish more complex tasks, refer to the guide on [Actor-to-Actor](https://blog.apify.com/connecting-scrapers-apify-integration/) integrations.

#### Python

Another way to run the integration is from Python, using the [Apify API client for Python](https://docs.apify.com/api/client/python/).

1. Install the Apify API client for Python by running the following command:

    ```bash
    pip install apify-client
    ```

1. Create a Python script, import the client, and define your credentials:

    ```python
    from apify_client import ApifyClient

    # Replace the placeholders with your own credentials.
    APIFY_API_TOKEN = "YOUR-APIFY-TOKEN"
    OPENAI_API_KEY = "YOUR-OPENAI-API-KEY"

    QDRANT_URL = "YOUR-QDRANT-URL"
    QDRANT_API_KEY = "YOUR-QDRANT-API-KEY"

    client = ApifyClient(APIFY_API_TOKEN)
    ```

1. Call the [Website Content Crawler](https://apify.com/apify/website-content-crawler) Actor to crawl the Qdrant documentation and extract text content from the web pages:

    ```python
    # Run the crawler and wait for the run to finish.
    actor_call = client.actor("apify/website-content-crawler").call(
        run_input={"startUrls": [{"url": "https://qdrant.tech/documentation/"}]}
    )
    ```

1. Call Apify's Qdrant integration to store all the data in the Qdrant vector database:

    ```python
    qdrant_integration_inputs = {
        "qdrantUrl": QDRANT_URL,
        "qdrantApiKey": QDRANT_API_KEY,
        "qdrantCollectionName": "apify",
        "qdrantAutoCreateCollection": True,
        # Read the items from the crawler run's default dataset.
        "datasetId": actor_call["defaultDatasetId"],
        "datasetFields": ["text"],
        "enableDeltaUpdates": True,
        "deltaUpdatesPrimaryDatasetFields": ["url"],
        "expiredObjectDeletionPeriodDays": 30,
        "embeddingsProvider": "OpenAI",
        "embeddingsApiKey": OPENAI_API_KEY,
        "performChunking": True,
        "chunkSize": 1000,
        "chunkOverlap": 0,
    }
    actor_call = client.actor("apify/qdrant-integration").call(run_input=qdrant_integration_inputs)
    ```
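
To get an intuition for the `performChunking`, `chunkSize`, and `chunkOverlap` inputs, the sketch below splits text into fixed-size character windows. It is a simplified illustration that assumes sizes are counted in characters. The `chunk_text` helper is hypothetical, and the integration's actual splitter may behave differently, for example by preferring paragraph or sentence boundaries.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=0):
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks
```

Under this scheme, a 2,500-character page with a chunk size of 1000 and no overlap produces three chunks of 1000, 1000, and 500 characters, each of which would be embedded separately.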

You have successfully integrated Apify with Qdrant, and the data is now stored in the Qdrant vector database.
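
The `enableDeltaUpdates` and `deltaUpdatesPrimaryDatasetFields` inputs used above suggest that repeated runs only need to write items that are new or changed, with each item identified by its primary fields (here, `url`). The sketch below shows one plausible way such a comparison could work. It is an assumption made for illustration, not the Actor's actual implementation, and the helper names are hypothetical.

```python
import hashlib


def content_checksum(item, fields):
    """Hash the selected fields so that content changes can be detected."""
    joined = "\n".join(str(item.get(field, "")) for field in fields)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def plan_delta_update(items, stored_checksums, primary_fields=("url",), content_fields=("text",)):
    """Classify crawled items as new, changed, or unchanged.

    stored_checksums maps a primary-key tuple to the checksum saved by an earlier run.
    """
    plan = {"add": [], "update": [], "skip": []}
    for item in items:
        key = tuple(item.get(field) for field in primary_fields)
        checksum = content_checksum(item, content_fields)
        if key not in stored_checksums:
            plan["add"].append(key)
        elif stored_checksums[key] != checksum:
            plan["update"].append(key)
        else:
            plan["skip"].append(key)
    return plan
```

In this model, only the keys under `add` and `update` would need new embeddings, which is the main cost that incremental updates avoid.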

## Additional Resources

- [Apify Qdrant integration](https://apify.com/apify/qdrant-integration)
- [Qdrant documentation](https://qdrant.tech/documentation/)

sources/platform/integrations/index.mdx

Lines changed: 6 additions & 0 deletions
@@ -171,6 +171,12 @@ If you are working on an AI/LLM-related project, we recommend you look into the
         imageUrl="/img/platform/integrations/pinecone.svg"
         smallImage
     />
+    <Card
+        title="Qdrant"
+        to="./integrations/qdrant"
+        imageUrl="/img/platform/integrations/qdrant.svg"
+        smallImage
+    />
 </CardGrid>
 
 ## Other Actors
Lines changed: 68 additions & 0 deletions