Commit 7c62100

Merge branch 'pre/beta' of https://github.com/vedovati-matteo/Scrapegraph-ai into pre/beta
2 parents f17cef9 + 8423f10 commit 7c62100

78 files changed: +1848 -212 lines changed
CHANGELOG.md

Lines changed: 143 additions & 0 deletions
@@ -1,3 +1,145 @@
+## [1.13.3](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.13.2...v1.13.3) (2024-08-10)
+
+
+### Bug Fixes
+
+* conditional node ([778efd4](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/778efd4c87c69754bfbbf7a80d652f4cfd31a361))
+
+## [1.13.2](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.13.1...v1.13.2) (2024-08-10)
+
+
+### Bug Fixes
+
+* fetch node ([f01b55e](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/f01b55e89b1365760f0dce4fa15ac0e74d280c57))
+
+
+### chore
+
+* update gemini model to "gemini-pro" ([a7264ce](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/a7264cebd28857b4a13e7db2f27e80e5b57e4407))
+
+## [1.13.1](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.13.0...v1.13.1) (2024-08-09)
+
+
+### Bug Fixes
+
+* conditional node ([ce00345](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/ce003454953e5785d4746223c252de38cd5d07ea))
+
+## [1.13.0](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.12.2...v1.13.0) (2024-08-09)
+## [1.13.0-beta.9](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.13.0-beta.8...v1.13.0-beta.9) (2024-08-10)
+
+
+
+### Features
+
+* add grok integration ([fa651d4](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/fa651d4cd9ab8ae9cf58280f1256ceb4171ef088))
+* add mistral support ([17f2707](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/17f2707313f65a1e96443b3c8a1f5137892f2c5a))
+* update base_graph ([0571b6d](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/0571b6da55920bfe691feef2e1ecb5f3760dabf7))
+
+
+### Bug Fixes
+
+* **chunking:** count tokens from words instead of characters ([5ec2de9](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/5ec2de9e1a14def5596738b6cdf769f5039a246d)), closes [#513](https://github.com/ScrapeGraphAI/Scrapegraph-ai/issues/513)
+* **FetchNode:** handling of missing browser_base key ([07720b6](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/07720b6e0ca10ba6ce3c1359706a09baffcc4ad0))
+* **AbstractGraph:** LangChain warnings handling, Mistral tokens ([786af99](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/786af992f8fbdadfdc3d2d6a06c0cfd81289f8f2))
+* **FetchNode:** missing bracket syntax error ([50edbcc](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/50edbcc7f80e419f72f3f69249fec4a37597ef9a))
+* refactoring of fetch_node ([29ad140](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/29ad140fa399e9cdd98289a70506269db25fb599))
+* refactoring of fetch_node adding comment ([bfc6852](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/bfc6852b77b643e34543f7e436349f73d4ba1b5a))
+* refactoring of fetch_node qixed error ([1ea2ad8](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/1ea2ad8e79e9777c60f86565ed4930ee46e1ca53))
+* refactoring of merge_answer_node ([898e5a7](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/898e5a7af504fbf4c1cabb14103e66184037de49))
+
+
+### chore
+
+* **models_tokens:** add mistral models ([5e82432](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/5e824327c3acb69d53f3519344d0f8c2e3defa8b))
+* **mistral:** create examples ([f8ad616](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/f8ad616e10c271443e2dcb4123c8ddb91de2ff69))
+* **examples:** fix Mistral examples ([b0ffc51](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/b0ffc51e5415caec562a565710f5195afe1fbcb2))
+* update requirements for mistral ([9868555](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/986855512319541d1d02356df9ad61ab7fc5d807))
+
+
+### CI
+
+* **release:** 1.11.0-beta.11 [skip ci] ([579d3f3](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/579d3f394b54636673baf8e9f619f1c57a2ecce4))
+* **release:** 1.11.0-beta.12 [skip ci] ([cf2a17e](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/cf2a17ed5d79c62271fd9ea8ec89793884b04b56))
+* **release:** 1.13.0-beta.1 [skip ci] ([8eb66f6](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/8eb66f6e22d6b53f0fb73d0da18302e7b00b99e3))
+* **release:** 1.13.0-beta.2 [skip ci] ([684d01a](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/684d01a2cb979c076a0f9d64855debd79b32ad58))
+* **release:** 1.13.0-beta.3 [skip ci] ([6b053cf](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/6b053cfc95655f122baef999325888c13f4af883))
+* **release:** 1.13.0-beta.4 [skip ci] ([7f1f750](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/7f1f7503f7c83c2e4d41a906fb3aa6012a2e0f52))
+* **release:** 1.13.0-beta.5 [skip ci] ([2eba73b](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/2eba73b784ee443260117e98ab7c943934b3018d)), closes [#513](https://github.com/ScrapeGraphAI/Scrapegraph-ai/issues/513)
+* **release:** 1.13.0-beta.6 [skip ci] ([e75b574](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/e75b574b67040e127599da9ee1b0eee13d234cb9))
+* **release:** 1.13.0-beta.7 [skip ci] ([6e56925](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/6e56925355c424edae290c70fd98646ab5f420ee))
+* add refactoring of default temperature ([6c3b37a](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/6c3b37ab001b80c09ea9ffb56d4c3df338e33a7a))
+
+## [1.13.0-beta.8](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.13.0-beta.7...v1.13.0-beta.8) (2024-08-09)
+
+
+### Bug Fixes
+
+* broken node ([1272273](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/127227349915deeb0dede34aa575ad269ed7cbe3))
+
+
+## [1.13.0-beta.7](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.13.0-beta.6...v1.13.0-beta.7) (2024-08-09)
+
+
+### Bug Fixes
+
+* generate answer node omni ([b52e4a3](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/b52e4a390bb23ca55922e47046db558e1969a047))
+* generate answer node pdf has a bug ([625ca9f](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/625ca9f22a91a292a844ddb45e0edc767bf24711))
+
+
+### CI
+
+* **release:** 1.12.1 [skip ci] ([928f704](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/928f7040ab1ef3a87f1cbad599b888940fa835c4))
+* **release:** 1.12.2 [skip ci] ([ece605e](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/ece605e3ee0aa110501f6642eb687831a4d0660b))
+
+## [1.12.2](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.12.1...v1.12.2) (2024-08-07)
+
+
+
+### Bug Fixes
+
+* generate answer node omni ([b52e4a3](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/b52e4a390bb23ca55922e47046db558e1969a047))
+
+## [1.12.1](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.12.0...v1.12.1) (2024-08-07)
+
+* **FetchNode:** missing bracket syntax error ([50edbcc](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/50edbcc7f80e419f72f3f69249fec4a37597ef9a))
+
+## [1.13.0-beta.5](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.13.0-beta.4...v1.13.0-beta.5) (2024-08-08)
+
+
+### Bug Fixes
+
+* generate answer node pdf has a bug ([625ca9f](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/625ca9f22a91a292a844ddb45e0edc767bf24711))
+
+* **chunking:** count tokens from words instead of characters ([5ec2de9](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/5ec2de9e1a14def5596738b6cdf769f5039a246d)), closes [#513](https://github.com/ScrapeGraphAI/Scrapegraph-ai/issues/513)
+
+## [1.13.0-beta.4](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.13.0-beta.3...v1.13.0-beta.4) (2024-08-07)
+
+
+### Bug Fixes
+
+* refactoring of merge_answer_node ([898e5a7](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/898e5a7af504fbf4c1cabb14103e66184037de49))
+
+## [1.13.0-beta.3](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.13.0-beta.2...v1.13.0-beta.3) (2024-08-07)
+
+
+### Features
+
+* add mistral support ([17f2707](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/17f2707313f65a1e96443b3c8a1f5137892f2c5a))
+
+
+### Bug Fixes
+
+* **FetchNode:** handling of missing browser_base key ([07720b6](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/07720b6e0ca10ba6ce3c1359706a09baffcc4ad0))
+* **AbstractGraph:** LangChain warnings handling, Mistral tokens ([786af99](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/786af992f8fbdadfdc3d2d6a06c0cfd81289f8f2))
+
+
+### chore
+
+* **models_tokens:** add mistral models ([5e82432](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/5e824327c3acb69d53f3519344d0f8c2e3defa8b))
+* **mistral:** create examples ([f8ad616](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/f8ad616e10c271443e2dcb4123c8ddb91de2ff69))
+* **examples:** fix Mistral examples ([b0ffc51](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/b0ffc51e5415caec562a565710f5195afe1fbcb2))
+* update requirements for mistral ([9868555](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/986855512319541d1d02356df9ad61ab7fc5d807))
+
 ## [1.13.0-beta.2](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.13.0-beta.1...v1.13.0-beta.2) (2024-08-07)
 
 
@@ -21,6 +163,7 @@
 * **release:** 1.11.0-beta.11 [skip ci] ([579d3f3](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/579d3f394b54636673baf8e9f619f1c57a2ecce4))
 * **release:** 1.11.0-beta.12 [skip ci] ([cf2a17e](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/cf2a17ed5d79c62271fd9ea8ec89793884b04b56))
 
+
 ## [1.12.0](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.11.3...v1.12.0) (2024-08-06)
 
 
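
Note on the **chunking** entry in this changelog ("count tokens from words instead of characters", closes #513): the diff for that fix is not shown on this page, so the following is only a rough sketch of what word-based chunking can look like, not the library's implementation. The function name `chunk_by_words` and the `max_words` parameter are hypothetical, and splitting on whitespace is an approximation of a tokenizer.

```python
def chunk_by_words(text: str, max_words: int) -> list[str]:
    """Split `text` into chunks of at most `max_words` whitespace-separated words.

    Hypothetical helper for illustration only: it approximates a token
    budget by counting words rather than characters.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


if __name__ == "__main__":
    sentence = "count tokens from words instead of characters"
    # A character count and a word count give very different budgets.
    print(len(sentence), "characters vs", len(sentence.split()), "words")
    print(chunk_by_words(sentence, 3))
```

Counting characters overestimates the token budget consumed by ordinary prose, so chunks come out smaller than necessary; counting words is closer to what a tokenizer reports, though still an estimate.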

examples/gemini/pdf_scraper_graph_gemini.py

Lines changed: 2 additions & 1 deletion
@@ -6,6 +6,7 @@
 from dotenv import load_dotenv
 from scrapegraphai.utils import prettify_exec_info
 from scrapegraphai.graphs import PDFScraperGraph
+
 load_dotenv()
 
 
@@ -18,7 +19,7 @@
 graph_config = {
     "llm": {
         "api_key": gemini_key,
-        "model": "gemini-pr",
+        "model": "gemini-pro",
     },
 }
 
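
Note on the `"gemini-pr"` to `"gemini-pro"` fix: a misspelled model name like this usually only fails once the graph runs. As a minimal sketch, assuming the config keeps the `{"llm": {"model": ...}}` shape used in this example, a check such as the one below could catch the typo earlier. `KNOWN_MODELS` and `check_model` are hypothetical names made up for illustration; they are not part of scrapegraphai.

```python
import difflib

# Hypothetical allow-list for illustration; not the library's model registry.
KNOWN_MODELS = {"gemini-pro"}


def check_model(config: dict, known: set = KNOWN_MODELS) -> str:
    """Return the configured model name, or raise if it is not in `known`."""
    model = config["llm"]["model"]
    if model not in known:
        # Suggest the closest known name, if any is similar enough.
        hint = difflib.get_close_matches(model, sorted(known), n=1)
        suggestion = f" Did you mean {hint[0]!r}?" if hint else ""
        raise ValueError(f"Unknown model {model!r}.{suggestion}")
    return model
```

Calling `check_model(graph_config)` before constructing the graph would turn the old value into an immediate `ValueError` that names the likely intended model.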

examples/local_models/smart_scraper_ollama.py

Lines changed: 7 additions & 4 deletions
@@ -14,18 +14,21 @@
         "format": "json", # Ollama needs the format to be specified explicitly
         # "base_url": "http://localhost:11434", # set ollama URL arbitrarily
     },
-
     "verbose": True,
     "headless": False
 }
 
 # ************************************************
 # Create the SmartScraperGraph instance and run it
 # ************************************************
-
 smart_scraper_graph = SmartScraperGraph(
-    prompt="List me all the titles of the website",
-    source="https://sport.sky.it/nba?gr=www",
+<<<<<<< Updated upstream
+    prompt="Find some information about what does the company do, the name and a contact email.",
+    source="https://scrapegraphai.com/",
+=======
+    prompt="List all the projects with their descriptions",
+    source="https://perinim.github.io/projects/",
+>>>>>>> Stashed changes
     config=graph_config
 )
 
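
Note on the hunk above: the added lines include unresolved conflict markers (`<<<<<<< Updated upstream`, `=======`, `>>>>>>> Stashed changes`), so the example file is not valid Python as committed. A small check along the following lines could flag such files before they are merged. This is a sketch only: `find_conflict_markers` is a hypothetical helper, and the pattern assumes markers start at column 0 with exactly seven characters, so a seven-character `=======` line used for another purpose would be a false positive.

```python
import re

# Seven marker characters at the start of a line, followed by a space
# or the end of the line.
CONFLICT_MARKER = re.compile(r"^(<{7}|={7}|>{7})( |$)")


def find_conflict_markers(source: str) -> list[int]:
    """Return the 1-based line numbers in `source` that look like conflict markers."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if CONFLICT_MARKER.match(line)
    ]
```

Run over each changed file, a non-empty result would be a reason to fail the check and resolve the conflict by keeping one `prompt`/`source` pair.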

examples/mistral/.env.example

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+MISTRAL_API_KEY="YOUR MISTRAL API KEY"
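
Note on `.env.example`: the Mistral examples in this commit read the key with `os.getenv("MISTRAL_API_KEY")`, which returns `None` when the variable is missing, so a forgotten `.env` file only shows up later as an authentication error. A minimal sketch of failing early instead, using a hypothetical `require_env` helper that is not part of the examples or the library:

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; copy .env.example to .env and fill in the key"
        )
    return value
```

In the examples this would be used as `mistral_key = require_env("MISTRAL_API_KEY")`, placed after `load_dotenv()`.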
Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
+"""
+Basic example of scraping pipeline using CSVScraperMultiGraph from CSV documents
+"""
+
+import os
+from dotenv import load_dotenv
+import pandas as pd
+from scrapegraphai.graphs import CSVScraperMultiGraph
+from scrapegraphai.utils import convert_to_csv, convert_to_json, prettify_exec_info
+
+load_dotenv()
+# ************************************************
+# Read the CSV file
+# ************************************************
+
+FILE_NAME = "inputs/username.csv"
+curr_dir = os.path.dirname(os.path.realpath(__file__))
+file_path = os.path.join(curr_dir, FILE_NAME)
+
+text = pd.read_csv(file_path)
+
+# ************************************************
+# Define the configuration for the graph
+# ************************************************
+mistral_key = os.getenv("MISTRAL_API_KEY")
+
+graph_config = {
+    "llm": {
+        "api_key": mistral_key,
+        "model": "mistral/open-mistral-nemo",
+    },
+}
+
+# ************************************************
+# Create the CSVScraperMultiGraph instance and run it
+# ************************************************
+
+csv_scraper_graph = CSVScraperMultiGraph(
+    prompt="List me all the last names",
+    source=[str(text), str(text)],
+    config=graph_config
+)
+
+result = csv_scraper_graph.run()
+print(result)
+
+# ************************************************
+# Get graph execution info
+# ************************************************
+
+graph_exec_info = csv_scraper_graph.get_execution_info()
+print(prettify_exec_info(graph_exec_info))
+
+# Save to json or csv
+convert_to_csv(result, "result")
+convert_to_json(result, "result")
Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
+"""
+Basic example of scraping pipeline using CSVScraperGraph from CSV documents
+"""
+
+import os
+from dotenv import load_dotenv
+import pandas as pd
+from scrapegraphai.graphs import CSVScraperGraph
+from scrapegraphai.utils import convert_to_csv, convert_to_json, prettify_exec_info
+load_dotenv()
+
+# ************************************************
+# Read the CSV file
+# ************************************************
+
+FILE_NAME = "inputs/username.csv"
+curr_dir = os.path.dirname(os.path.realpath(__file__))
+file_path = os.path.join(curr_dir, FILE_NAME)
+
+text = pd.read_csv(file_path)
+
+# ************************************************
+# Define the configuration for the graph
+# ************************************************
+
+mistral_key = os.getenv("MISTRAL_API_KEY")
+
+graph_config = {
+    "llm": {
+        "api_key": mistral_key,
+        "model": "mistral/open-mistral-nemo",
+    },
+}
+
+# ************************************************
+# Create the CSVScraperGraph instance and run it
+# ************************************************
+
+csv_scraper_graph = CSVScraperGraph(
+    prompt="List me all the last names",
+    source=str(text),  # Pass the content of the file, not the file object
+    config=graph_config
+)
+
+result = csv_scraper_graph.run()
+print(result)
+
+# ************************************************
+# Get graph execution info
+# ************************************************
+
+graph_exec_info = csv_scraper_graph.get_execution_info()
+print(prettify_exec_info(graph_exec_info))
+
+# Save to json or csv
+convert_to_csv(result, "result")
+convert_to_json(result, "result")
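
Note on `source=str(text)` in the two CSV examples: `text` is a pandas `DataFrame`, so `str(text)` produces its printed representation, which pandas abbreviates once the frame has more rows than its display limit. For a small `inputs/username.csv` this makes no difference, but for larger files the model may never see the middle rows. One possible alternative, sketched below, is to pass the file's raw text. `read_csv_text` is a hypothetical helper, and whether the graphs handle raw CSV text as well as the printed table is an assumption that has not been checked against the library.

```python
from pathlib import Path


def read_csv_text(path: Path, encoding: str = "utf-8") -> str:
    """Return the full, untruncated text of the CSV file at `path`."""
    return Path(path).read_text(encoding=encoding)
```

In the example this would replace the `pd.read_csv` step, for instance `source=read_csv_text(Path(file_path))`.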
