Skip to content

Commit 5065aa0

Browse files
authored
Merge branch 'pre/beta' into supercoder-327
2 parents d0e300a + b4d7532 commit 5065aa0

File tree

160 files changed

+5127
-542
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

160 files changed

+5127
-542
lines changed

.gitignore

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -23,6 +23,7 @@ docs/source/_static/
2323
venv/
2424
.venv/
2525
.vscode/
26+
.conda/
2627

2728
# exclude pdf, mp3
2829
*.pdf
@@ -38,3 +39,6 @@ lib/
3839
*.html
3940
.idea
4041

42+
# extras
43+
cache/
44+
run_smart_scraper.py

CHANGELOG.md

Lines changed: 200 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,95 @@
1+
## [1.7.0-beta.4](https://github.com/VinciGit00/Scrapegraph-ai/compare/v1.7.0-beta.3...v1.7.0-beta.4) (2024-06-12)
2+
3+
4+
### Bug Fixes
5+
6+
* common params ([6b4cdf9](https://github.com/VinciGit00/Scrapegraph-ai/commit/6b4cdf92b82fa143e4217a2e5da46d04f2585de8))
7+
8+
## [1.7.0-beta.3](https://github.com/VinciGit00/Scrapegraph-ai/compare/v1.7.0-beta.2...v1.7.0-beta.3) (2024-06-11)
9+
10+
11+
### Features
12+
13+
* add caching ([d790361](https://github.com/VinciGit00/Scrapegraph-ai/commit/d79036149a3197a385b73553f29df66d36480c38))
14+
* add dynamic caching ([7ed2fe8](https://github.com/VinciGit00/Scrapegraph-ai/commit/7ed2fe8ef0d16fd93cb2ff88840bcaa643349e33))
15+
* add new chunking function ([e1f045b](https://github.com/VinciGit00/Scrapegraph-ai/commit/e1f045b2809fc7db0c252f4c6f2f9a435c66ba91))
16+
* **merge:** add scriptcreatormulti, rag cache and semchunk ([15421ef](https://github.com/VinciGit00/Scrapegraph-ai/commit/15421eff7009b80293f7d84df5086d22944dfb99))
17+
* **schema:** merge scripts to follow pydantic schema ([5d692bf](https://github.com/VinciGit00/Scrapegraph-ai/commit/5d692bff9e4f124146dd37e573f7c3c0aa8d9a23))
18+
* refactoring of rag node ([7a13a68](https://github.com/VinciGit00/Scrapegraph-ai/commit/7a13a6819ff35a6f6197ee837d0eb8ea65e31776))
19+
20+
21+
### Bug Fixes
22+
23+
* **cache:** correctly pass the node arguments and logging ([c881f64](https://github.com/VinciGit00/Scrapegraph-ai/commit/c881f64209a86a69ddd3105f5d0360d9ed183490))
24+
* **node:** fixed generate answer node pydantic schema ([ab00f23](https://github.com/VinciGit00/Scrapegraph-ai/commit/ab00f23d859c64995ccfe329b24379cf3c14d73c))
25+
26+
27+
### Docs
28+
29+
* **cache:** added cache_path param ([edddb68](https://github.com/VinciGit00/Scrapegraph-ai/commit/edddb682d06262088885e340b7b73cc70adf9583))
30+
* **scriptcreator:** enhance documentation ([650c3aa](https://github.com/VinciGit00/Scrapegraph-ai/commit/650c3aaa60dab169358c2c04bfca9dee8d1a5d68))
31+
32+
## [1.7.0-beta.2](https://github.com/VinciGit00/Scrapegraph-ai/compare/v1.7.0-beta.1...v1.7.0-beta.2) (2024-06-10)
33+
34+
35+
### Features
36+
37+
* Add tests for RobotsNode and update test setup ([b0511ae](https://github.com/VinciGit00/Scrapegraph-ai/commit/b0511aeaaac55570c8dad25b7cac7237bd20ef4c))
38+
* Add tests for SmartScraperGraph using sample text and configuration fixtures ([@tejhande](https://github.com/tejhande)) ([c927145](https://github.com/VinciGit00/Scrapegraph-ai/commit/c927145bd06693d0fad02b2285b426276b7d61a8))
39+
* Add tests for SmartScraperGraph using sample text and configuration fixtures ([@tejhande](https://github.com/tejhande)) ([9e7038c](https://github.com/VinciGit00/Scrapegraph-ai/commit/9e7038c5962563f53e0d44943d5c604cb1a2b035))
40+
* Add tests for SmartScraperGraph using sample text and configuration fixtures ([@tejhande](https://github.com/tejhande)) ([c286b16](https://github.com/VinciGit00/Scrapegraph-ai/commit/c286b1649e75d6c655698f38d695b58e3efa6270))
41+
* Add tests for SmartScraperGraph using sample text and configuration fixtures ([@tejhande](https://github.com/tejhande)) ([08f1be6](https://github.com/VinciGit00/Scrapegraph-ai/commit/08f1be682b0509f1e06148269fec1fa2897c394e))
42+
43+
## [1.7.0-beta.1](https://github.com/VinciGit00/Scrapegraph-ai/compare/v1.6.0...v1.7.0-beta.1) (2024-06-09)
44+
45+
46+
### Features
47+
48+
* add csv scraper and xml scraper multi ([b408655](https://github.com/VinciGit00/Scrapegraph-ai/commit/b4086550cc9dc42b2fd91ee7ef60c6a2c2ac3fd2))
49+
* **indexify-node:** add example ([5d1fbf8](https://github.com/VinciGit00/Scrapegraph-ai/commit/5d1fbf806a20746931ebb7fcb32c383d9d549d93))
50+
* add forcing format as json ([5cfc101](https://github.com/VinciGit00/Scrapegraph-ai/commit/5cfc10178abf0b7a3e0b2229512396e243305438))
51+
* add json as output ([5d20186](https://github.com/VinciGit00/Scrapegraph-ai/commit/5d20186bf20fb2384f2a9e7e81c2e875ff50a4f3))
52+
* add json multiscraper ([5bda918](https://github.com/VinciGit00/Scrapegraph-ai/commit/5bda918a39e4b50d86d784b4c592cc2ea1a68986))
53+
* add pdf scraper multi graph ([f5cbd80](https://github.com/VinciGit00/Scrapegraph-ai/commit/f5cbd80c977f51233ac1978d8450fcf0ec2ff461))
54+
* **pydantic:** added pydantic output schema ([376f758](https://github.com/VinciGit00/Scrapegraph-ai/commit/376f758a76e3e111dc34416dedf8e294dc190963))
55+
* **append_node:** append node to existing graph ([f8b08e0](https://github.com/VinciGit00/Scrapegraph-ai/commit/f8b08e0b33ca31124c2773f47a624eeb0a4f302f))
56+
* fix an if ([c8d556d](https://github.com/VinciGit00/Scrapegraph-ai/commit/c8d556da4e4b8730c6c35f1d448270b8e26923f2))
57+
* refactoring of abstract graph ([fff89f4](https://github.com/VinciGit00/Scrapegraph-ai/commit/fff89f431f60b5caa4dd87643a1bb8895bf96d48))
58+
* refactoring of an in if ([244aada](https://github.com/VinciGit00/Scrapegraph-ai/commit/244aada2de1f3bc88782fa90e604e8b936b79aa4))
59+
* removed a bug ([8de720d](https://github.com/VinciGit00/Scrapegraph-ai/commit/8de720d37958e31b73c5c89bc21f474f3303b42b))
60+
* removed rag node ([930f673](https://github.com/VinciGit00/Scrapegraph-ai/commit/930f67374752561903462a25728c739946f9449b))
61+
* **version:** update burr version ([cfa1336](https://github.com/VinciGit00/Scrapegraph-ai/commit/cfa13368f4d5c7dd8be27aabe19c7602d24686da))
62+
63+
64+
### Bug Fixes
65+
66+
* **schema:** fixed json output ([5c9843f](https://github.com/VinciGit00/Scrapegraph-ai/commit/5c9843f1410a78568892635e53872793d5ba0d6f))
67+
* oneapi model ([4fcb990](https://github.com/VinciGit00/Scrapegraph-ai/commit/4fcb9902fe4c147c61a1622a919ade338c03b8d8))
68+
* typo in prompt ([4639f0c](https://github.com/VinciGit00/Scrapegraph-ai/commit/4639f0cac5029c6802a6caded7103d247f4f06dd))
69+
70+
71+
### Docs
72+
73+
* stylize badges in readme ([8696ade](https://github.com/VinciGit00/Scrapegraph-ai/commit/8696adede79cf9557c49a8b30a095b76ec3d02f6))
74+
75+
76+
### CI
77+
78+
* **release:** 1.5.3-beta.1 [skip ci] ([6ea1d2c](https://github.com/VinciGit00/Scrapegraph-ai/commit/6ea1d2c4d0aaf7a341a2ea6ea7070438a7610fe4))
79+
* **release:** 1.5.3-beta.2 [skip ci] ([b57bcef](https://github.com/VinciGit00/Scrapegraph-ai/commit/b57bcef5c18530ce03ff6ec65e9e33d00d9f6515))
80+
* **release:** 1.5.5-beta.1 [skip ci] ([38d138e](https://github.com/VinciGit00/Scrapegraph-ai/commit/38d138e36faa718632b7560fab197c25e24da9de))
81+
* **release:** 1.6.0-beta.1 [skip ci] ([1d217e4](https://github.com/VinciGit00/Scrapegraph-ai/commit/1d217e4ae682ddf16d911b6db6973dc05445660c))
82+
* **release:** 1.6.0-beta.10 [skip ci] ([4d0d8fa](https://github.com/VinciGit00/Scrapegraph-ai/commit/4d0d8fa453f411927f49d75b9f67fb08ab168759))
83+
* **release:** 1.6.0-beta.11 [skip ci] ([3453ac0](https://github.com/VinciGit00/Scrapegraph-ai/commit/3453ac01f5da9148c8d10f29724b4a1c20d0a6e8))
84+
* **release:** 1.6.0-beta.2 [skip ci] ([ed1dc0b](https://github.com/VinciGit00/Scrapegraph-ai/commit/ed1dc0be08faf7e050f627c175897ae9c0eccbcf))
85+
* **release:** 1.6.0-beta.3 [skip ci] ([b70cb37](https://github.com/VinciGit00/Scrapegraph-ai/commit/b70cb37c623d56f5508650937bc314724ceec0e9))
86+
* **release:** 1.6.0-beta.4 [skip ci] ([08a14ef](https://github.com/VinciGit00/Scrapegraph-ai/commit/08a14efdd334ae645cb5cfe0dec04332659b99d5))
87+
* **release:** 1.6.0-beta.5 [skip ci] ([dde0c7e](https://github.com/VinciGit00/Scrapegraph-ai/commit/dde0c7e27deb55a0005691d402406a13ee507420))
88+
* **release:** 1.6.0-beta.6 [skip ci] ([ac8e7c1](https://github.com/VinciGit00/Scrapegraph-ai/commit/ac8e7c12fe677a357b8b1b8d42a1aca8503de727))
89+
* **release:** 1.6.0-beta.7 [skip ci] ([cab5f68](https://github.com/VinciGit00/Scrapegraph-ai/commit/cab5f6828cac926a82d9ecfe7a97596aaabfa385))
90+
* **release:** 1.6.0-beta.8 [skip ci] ([7a6f016](https://github.com/VinciGit00/Scrapegraph-ai/commit/7a6f016f9231f92e1bb99059e08b431ce99b14cf))
91+
* **release:** 1.6.0-beta.9 [skip ci] ([ca8aff8](https://github.com/VinciGit00/Scrapegraph-ai/commit/ca8aff8d8849552159ff1b86fd175fa5e9fe7c1f))
92+
193
## [1.6.0](https://github.com/VinciGit00/Scrapegraph-ai/compare/v1.5.7...v1.6.0) (2024-06-09)
294

395

@@ -13,41 +105,141 @@
13105
## [1.5.7](https://github.com/VinciGit00/Scrapegraph-ai/compare/v1.5.6...v1.5.7) (2024-06-06)
14106

15107

108+
16109
### Bug Fixes
17110

111+
* bug on generate_answer_node ([1d38ed1](https://github.com/VinciGit00/Scrapegraph-ai/commit/1d38ed146afae95dae1f35ac51180a1882bf8a29))
112+
* getter ([67d83cf](https://github.com/VinciGit00/Scrapegraph-ai/commit/67d83cff46d8ea606b8972c364ab4c56e6fa4fe4))
18113
* update openai tts class ([10672d6](https://github.com/VinciGit00/Scrapegraph-ai/commit/10672d6ebb06d950bbf8b66cc9a2d420c183013d))
19114

20-
## [1.5.6](https://github.com/VinciGit00/Scrapegraph-ai/compare/v1.5.5...v1.5.6) (2024-06-05)
115+
116+
### Docs
117+
118+
* add Japanese README ([4559ab6](https://github.com/VinciGit00/Scrapegraph-ai/commit/4559ab6db845a0d94371a09d0ed1e1623eed9ee2))
119+
* update japanese.md ([f0042a8](https://github.com/VinciGit00/Scrapegraph-ai/commit/f0042a8e33f8fb8b113681ee0a9995d329bb0faa))
120+
* update README.md ([871e398](https://github.com/VinciGit00/Scrapegraph-ai/commit/871e398a26786d264dbd1b2743864ed2cc12b3da))
121+
122+
123+
### Test
124+
125+
* Enhance JSON scraping pipeline test ([d845a1b](https://github.com/VinciGit00/Scrapegraph-ai/commit/d845a1ba7d6e7f7574b92b51b6d5326bbfb3d1c6))
126+
127+
128+
### CI
129+
130+
* **release:** 1.5.5 [skip ci] ([3629215](https://github.com/VinciGit00/Scrapegraph-ai/commit/36292150daf6449d6af58fc18ced1771e70e45cc))
131+
* **release:** 1.5.6 [skip ci] ([49cdadf](https://github.com/VinciGit00/Scrapegraph-ai/commit/49cdadf11722abe5b60b49f1c7f90186771356cc))
132+
* **release:** 1.5.7 [skip ci] ([c17daca](https://github.com/VinciGit00/Scrapegraph-ai/commit/c17daca409fd3aaa5eaf0c3372c14127aeaf7d3d))
133+
134+
## [1.6.0-beta.10](https://github.com/VinciGit00/Scrapegraph-ai/compare/v1.6.0-beta.9...v1.6.0-beta.10) (2024-06-08)
135+
136+
137+
### Features
138+
139+
* **version:** update burr version ([cfa1336](https://github.com/VinciGit00/Scrapegraph-ai/commit/cfa13368f4d5c7dd8be27aabe19c7602d24686da))
140+
141+
### Docs
142+
143+
* stylize badges in readme ([8696ade](https://github.com/VinciGit00/Scrapegraph-ai/commit/8696adede79cf9557c49a8b30a095b76ec3d02f6))
144+
145+
## [1.6.0-beta.9](https://github.com/VinciGit00/Scrapegraph-ai/compare/v1.6.0-beta.8...v1.6.0-beta.9) (2024-06-07)
146+
147+
148+
### Features
149+
150+
* **indexify-node:** add example ([5d1fbf8](https://github.com/VinciGit00/Scrapegraph-ai/commit/5d1fbf806a20746931ebb7fcb32c383d9d549d93))
21151

22152

23153
### Bug Fixes
24154

25-
* getter ([67d83cf](https://github.com/VinciGit00/Scrapegraph-ai/commit/67d83cff46d8ea606b8972c364ab4c56e6fa4fe4))
155+
* **schema:** fixed json output ([5c9843f](https://github.com/VinciGit00/Scrapegraph-ai/commit/5c9843f1410a78568892635e53872793d5ba0d6f))
156+
157+
## [1.6.0-beta.8](https://github.com/VinciGit00/Scrapegraph-ai/compare/v1.6.0-beta.7...v1.6.0-beta.8) (2024-06-05)
158+
159+
160+
### Features
161+
162+
* add json as output ([5d20186](https://github.com/VinciGit00/Scrapegraph-ai/commit/5d20186bf20fb2384f2a9e7e81c2e875ff50a4f3))
163+
164+
## [1.6.0-beta.7](https://github.com/VinciGit00/Scrapegraph-ai/compare/v1.6.0-beta.6...v1.6.0-beta.7) (2024-06-05)
165+
166+
167+
### Features
168+
169+
* **pydantic:** added pydantic output schema ([376f758](https://github.com/VinciGit00/Scrapegraph-ai/commit/376f758a76e3e111dc34416dedf8e294dc190963))
170+
* **append_node:** append node to existing graph ([f8b08e0](https://github.com/VinciGit00/Scrapegraph-ai/commit/f8b08e0b33ca31124c2773f47a624eeb0a4f302f))
171+
172+
## [1.6.0-beta.6](https://github.com/VinciGit00/Scrapegraph-ai/compare/v1.6.0-beta.5...v1.6.0-beta.6) (2024-06-04)
173+
174+
175+
### Features
176+
177+
* refactoring of abstract graph ([fff89f4](https://github.com/VinciGit00/Scrapegraph-ai/commit/fff89f431f60b5caa4dd87643a1bb8895bf96d48))
178+
179+
## [1.6.0-beta.5](https://github.com/VinciGit00/Scrapegraph-ai/compare/v1.6.0-beta.4...v1.6.0-beta.5) (2024-06-04)
180+
181+
182+
### Features
183+
184+
* refactoring of an in if ([244aada](https://github.com/VinciGit00/Scrapegraph-ai/commit/244aada2de1f3bc88782fa90e604e8b936b79aa4))
185+
186+
## [1.6.0-beta.4](https://github.com/VinciGit00/Scrapegraph-ai/compare/v1.6.0-beta.3...v1.6.0-beta.4) (2024-06-03)
187+
188+
189+
### Features
190+
191+
* fix an if ([c8d556d](https://github.com/VinciGit00/Scrapegraph-ai/commit/c8d556da4e4b8730c6c35f1d448270b8e26923f2))
192+
193+
## [1.6.0-beta.3](https://github.com/VinciGit00/Scrapegraph-ai/compare/v1.6.0-beta.2...v1.6.0-beta.3) (2024-06-03)
194+
195+
196+
### Features
197+
198+
* removed a bug ([8de720d](https://github.com/VinciGit00/Scrapegraph-ai/commit/8de720d37958e31b73c5c89bc21f474f3303b42b))
199+
200+
## [1.6.0-beta.2](https://github.com/VinciGit00/Scrapegraph-ai/compare/v1.6.0-beta.1...v1.6.0-beta.2) (2024-06-03)
201+
202+
203+
### Features
204+
205+
* add csv scraper and xml scraper multi ([b408655](https://github.com/VinciGit00/Scrapegraph-ai/commit/b4086550cc9dc42b2fd91ee7ef60c6a2c2ac3fd2))
206+
* add json multiscraper ([5bda918](https://github.com/VinciGit00/Scrapegraph-ai/commit/5bda918a39e4b50d86d784b4c592cc2ea1a68986))
207+
* add pdf scraper multi graph ([f5cbd80](https://github.com/VinciGit00/Scrapegraph-ai/commit/f5cbd80c977f51233ac1978d8450fcf0ec2ff461))
208+
* removed rag node ([930f673](https://github.com/VinciGit00/Scrapegraph-ai/commit/930f67374752561903462a25728c739946f9449b))
209+
210+
## [1.6.0-beta.1](https://github.com/VinciGit00/Scrapegraph-ai/compare/v1.5.5-beta.1...v1.6.0-beta.1) (2024-06-02)
211+
26212

27-
## [1.5.5](https://github.com/VinciGit00/Scrapegraph-ai/compare/v1.5.4...v1.5.5) (2024-06-05)
213+
### Features
214+
215+
* add forcing format as json ([5cfc101](https://github.com/VinciGit00/Scrapegraph-ai/commit/5cfc10178abf0b7a3e0b2229512396e243305438))
216+
217+
## [1.5.5-beta.1](https://github.com/VinciGit00/Scrapegraph-ai/compare/v1.5.4...v1.5.5-beta.1) (2024-05-31)
28218

29219

30220
### Bug Fixes
31221

32-
* bug on generate_answer_node ([1d38ed1](https://github.com/VinciGit00/Scrapegraph-ai/commit/1d38ed146afae95dae1f35ac51180a1882bf8a29))
222+
* oneapi model ([4fcb990](https://github.com/VinciGit00/Scrapegraph-ai/commit/4fcb9902fe4c147c61a1622a919ade338c03b8d8))
223+
* typo in prompt ([4639f0c](https://github.com/VinciGit00/Scrapegraph-ai/commit/4639f0cac5029c6802a6caded7103d247f4f06dd))
33224

34225

35-
### Docs
226+
### CI
36227

37-
* add Japanese README ([4559ab6](https://github.com/VinciGit00/Scrapegraph-ai/commit/4559ab6db845a0d94371a09d0ed1e1623eed9ee2))
38-
* update japanese.md ([f0042a8](https://github.com/VinciGit00/Scrapegraph-ai/commit/f0042a8e33f8fb8b113681ee0a9995d329bb0faa))
39-
* update README.md ([871e398](https://github.com/VinciGit00/Scrapegraph-ai/commit/871e398a26786d264dbd1b2743864ed2cc12b3da))
228+
* **release:** 1.5.3-beta.1 [skip ci] ([6ea1d2c](https://github.com/VinciGit00/Scrapegraph-ai/commit/6ea1d2c4d0aaf7a341a2ea6ea7070438a7610fe4))
229+
* **release:** 1.5.3-beta.2 [skip ci] ([b57bcef](https://github.com/VinciGit00/Scrapegraph-ai/commit/b57bcef5c18530ce03ff6ec65e9e33d00d9f6515))
40230

41231
## [1.5.4](https://github.com/VinciGit00/Scrapegraph-ai/compare/v1.5.3...v1.5.4) (2024-05-31)
42232

43233

234+
44235
### Bug Fixes
45236

46237
* **3.9:** python 3.9 logging fix ([8be27ba](https://github.com/VinciGit00/Scrapegraph-ai/commit/8be27bad8022e75379309deccc8f6878ee1a362d))
47238

48239
## [1.5.3](https://github.com/VinciGit00/Scrapegraph-ai/compare/v1.5.2...v1.5.3) (2024-05-30)
49240

50241

242+
51243
### Bug Fixes
52244

53245
* typo in generate_screper_node ([c4ce361](https://github.com/VinciGit00/Scrapegraph-ai/commit/c4ce36111f17526fd167c613a58ae09e361b62e1))

README.md

Lines changed: 10 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -2,11 +2,11 @@
22
# 🕷️ ScrapeGraphAI: You Only Scrape Once
33
[English](https://github.com/VinciGit00/Scrapegraph-ai/blob/main/README.md) | [中文](https://github.com/VinciGit00/Scrapegraph-ai/blob/main/docs/chinese.md) | [日本語](https://github.com/VinciGit00/Scrapegraph-ai/blob/main/docs/japanese.md)
44

5-
[![Downloads](https://static.pepy.tech/badge/scrapegraphai)](https://pepy.tech/project/scrapegraphai)
6-
[![linting: pylint](https://img.shields.io/badge/linting-pylint-yellowgreen)](https://github.com/pylint-dev/pylint)
7-
[![Pylint](https://github.com/VinciGit00/Scrapegraph-ai/actions/workflows/pylint.yml/badge.svg)](https://github.com/VinciGit00/Scrapegraph-ai/actions/workflows/pylint.yml)
8-
[![CodeQL](https://github.com/VinciGit00/Scrapegraph-ai/actions/workflows/codeql.yml/badge.svg)](https://github.com/VinciGit00/Scrapegraph-ai/actions/workflows/codeql.yml)
9-
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
5+
[![Downloads](https://img.shields.io/pepy/dt/scrapegraphai?style=for-the-badge)](https://pepy.tech/project/scrapegraphai)
6+
[![linting: pylint](https://img.shields.io/badge/linting-pylint-yellowgreen?style=for-the-badge)](https://github.com/pylint-dev/pylint)
7+
[![Pylint](https://img.shields.io/github/actions/workflow/status/VinciGit00/Scrapegraph-ai/pylint.yml?style=for-the-badge)](https://github.com/VinciGit00/Scrapegraph-ai/actions/workflows/pylint.yml)
8+
[![CodeQL](https://img.shields.io/github/actions/workflow/status/VinciGit00/Scrapegraph-ai/codeql.yml?style=for-the-badge)](https://github.com/VinciGit00/Scrapegraph-ai/actions/workflows/codeql.yml)
9+
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg?style=for-the-badge)](https://opensource.org/licenses/MIT)
1010
[![](https://dcbadge.vercel.app/api/server/gkxQDAjfeX)](https://discord.gg/gkxQDAjfeX)
1111

1212
ScrapeGraphAI is a *web scraping* python library that uses LLM and direct graph logic to create scraping pipelines for websites and local documents (XML, HTML, JSON, etc.).
@@ -43,11 +43,14 @@ The documentation for ScrapeGraphAI can be found [here](https://scrapegraph-ai.r
4343
Check out also the Docusaurus [here](https://scrapegraph-doc.onrender.com/).
4444

4545
## 💻 Usage
46-
There are three main scraping pipelines that can be used to extract information from a website (or local file):
46+
There are multiple standard scraping pipelines that can be used to extract information from a website (or local file):
4747
- `SmartScraperGraph`: single-page scraper that only needs a user prompt and an input source;
4848
- `SearchGraph`: multi-page scraper that extracts information from the top n search results of a search engine;
4949
- `SpeechGraph`: single-page scraper that extracts information from a website and generates an audio file.
50-
- `SmartScraperMultiGraph`: multiple page scraper given a single prompt
50+
- `ScriptCreatorGraph`: single-page scraper that extracts information from a website and generates a Python script.
51+
52+
- `SmartScraperMultiGraph`: multi-page scraper that extracts information from multiple pages given a single prompt and a list of sources;
53+
- `ScriptCreatorMultiGraph`: multi-page scraper that generates a Python script for extracting information from multiple pages given a single prompt and a list of sources.
5154

5255
It is possible to use different LLM through APIs, such as **OpenAI**, **Groq**, **Azure** and **Gemini**, or local models using **Ollama**.
5356

docs/assets/scriptcreatorgraph.png

53.7 KB
Loading

docs/source/scrapers/graph_config.rst

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -13,6 +13,7 @@ Some interesting ones are:
1313
- `loader_kwargs`: A dictionary with additional parameters to be passed to the `Loader` class, such as `proxy`.
1414
- `burr_kwargs`: A dictionary with additional parameters to enable `Burr` graphical user interface.
1515
- `max_images`: The maximum number of images to be analyzed. Useful in `OmniScraperGraph` and `OmniSearchGraph`.
16+
- `cache_path`: The path where the cache files will be saved. If already exists, the cache will be loaded from this path.
1617

1718
.. _Burr:
1819

0 commit comments

Comments
 (0)