Commit 367dea5

Merge branch 'pre/beta' into feat/parallel-node-execution
2 parents a8d5e7d + c0d26d6 commit 367dea5

62 files changed: 2288 additions and 605 deletions

CHANGELOG.md

Lines changed: 62 additions & 0 deletions
@@ -1,3 +1,65 @@
+## [0.11.0-beta.7](https://github.com/VinciGit00/Scrapegraph-ai/compare/v0.11.0-beta.6...v0.11.0-beta.7) (2024-05-13)
+
+
+### Bug Fixes
+
+* bug for claude ([d0167de](https://github.com/VinciGit00/Scrapegraph-ai/commit/d0167dee71779a3c1e1e042e17a41134b93b3c78))
+
+
+### Docs
+
+* **refactor:** changed example ([c7ec114](https://github.com/VinciGit00/Scrapegraph-ai/commit/c7ec114274da64f0b61cee80afe908a36ad26b78))
+
+## [0.11.0-beta.6](https://github.com/VinciGit00/Scrapegraph-ai/compare/v0.11.0-beta.5...v0.11.0-beta.6) (2024-05-13)
+
+
+### Bug Fixes
+
+* **fetch-node:** removed isSoup from default ([0c15947](https://github.com/VinciGit00/Scrapegraph-ai/commit/0c1594737f878ed5672f4c889fdf9b4e0d7ec49a))
+
+## [0.11.0-beta.5](https://github.com/VinciGit00/Scrapegraph-ai/compare/v0.11.0-beta.4...v0.11.0-beta.5) (2024-05-13)
+
+
+### Features
+
+* **webdriver-backend:** add dynamic import scripts from module and file ([db2234b](https://github.com/VinciGit00/Scrapegraph-ai/commit/db2234bf5d2f2589b080cd4136f33c4f4443bdfb))
+* **proxy-rotation:** add parse (IP address) or search (from broker) functionality for proxy rotation ([2170131](https://github.com/VinciGit00/Scrapegraph-ai/commit/217013181da06abe8d71d9db70e809ea4ebd8236))
+* added proxy rotation ([0c36a7e](https://github.com/VinciGit00/Scrapegraph-ai/commit/0c36a7ec1f32ee073d9e0f534a2cb97aba3d7a1f))
+* **safe-web-driver:** enchanced the original `AsyncChromiumLoader` web driver with proxy protection and flexible kwargs and backend ([768719c](https://github.com/VinciGit00/Scrapegraph-ai/commit/768719cce80953fa6cbe283e442420116c438f16))
+
+
+### Bug Fixes
+
+* **pytest:** add dependency for mocking testing functions ([2f4fd45](https://github.com/VinciGit00/Scrapegraph-ai/commit/2f4fd45700ebf1db0c429b5a6249386d1a111615))
+* **chromium-loader:** ensure it subclasses langchain's base loader ([b54d984](https://github.com/VinciGit00/Scrapegraph-ai/commit/b54d984c134c8cbc432fd111bb161d3d53cf4a85))
+* **proxy-rotation:** removed duplicated arg and passed the loader_kwarhs correctly to the node ([1e9a564](https://github.com/VinciGit00/Scrapegraph-ai/commit/1e9a56461632999c5dc09f5aa930c14c954025ad))
+* **proxy-rotation:** removed max_shape duplicate ([5d6d996](https://github.com/VinciGit00/Scrapegraph-ai/commit/5d6d996e8f6132101d4c3af835d74f0674baffa1))
+
+
+### Docs
+
+* **refactor:** added proxy-rotation usage and refactor readthedocs ([e256b75](https://github.com/VinciGit00/Scrapegraph-ai/commit/e256b758b2ada641f97b23b1cf6c6b0174563d8a))
+
+## [0.11.0-beta.4](https://github.com/VinciGit00/Scrapegraph-ai/compare/v0.11.0-beta.3...v0.11.0-beta.4) (2024-05-12)
+
+
+### Features
+
+* add new prompt info ([e2350ed](https://github.com/VinciGit00/Scrapegraph-ai/commit/e2350eda6249d8e121344d12c92645a3887a5b76))
+
+## [0.11.0-beta.3](https://github.com/VinciGit00/Scrapegraph-ai/compare/v0.11.0-beta.2...v0.11.0-beta.3) (2024-05-12)
+
+
+### Features
+
+* add support for deepseek-chat ([156b67b](https://github.com/VinciGit00/Scrapegraph-ai/commit/156b67b91e1798f67082123e2c0087d358a32d4d)), closes [#222](https://github.com/VinciGit00/Scrapegraph-ai/issues/222)
+
+
+### Docs
+
+* add diagram showing general structure/flow of the library ([13ae918](https://github.com/VinciGit00/Scrapegraph-ai/commit/13ae9180ac5e7ef11dad1a210cf8790e797397dd))
+* update overview diagram with more models ([b441b30](https://github.com/VinciGit00/Scrapegraph-ai/commit/b441b30a5c60dda105964f69bd4cef06825f5c74))
+
 ## [0.11.0-beta.2](https://github.com/VinciGit00/Scrapegraph-ai/compare/v0.11.0-beta.1...v0.11.0-beta.2) (2024-05-10)
 

51.8 KB
Binary file not shown.
82 KB

docs/assets/searchgraph.png

50.2 KB

docs/assets/smartscrapergraph.png

58.2 KB

docs/assets/speechgraph.png

45.8 KB

docs/source/conf.py

Lines changed: 0 additions & 1 deletion
@@ -30,4 +30,3 @@
 # https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output
 
 html_theme = 'sphinx_rtd_theme'
-html_static_path = ['_static']

docs/source/getting_started/examples.rst

Lines changed: 5 additions & 2 deletions
@@ -1,7 +1,9 @@
 Examples
 ========
 
-Here some example of the different ways to scrape with ScrapegraphAI
+Let's suppose you want to scrape a website to get a list of projects with their descriptions.
+You can use the `SmartScraperGraph` class to do that.
+The following examples show how to use the `SmartScraperGraph` class with OpenAI models and local models.
 
 OpenAI models
 ^^^^^^^^^^^^^
@@ -78,7 +80,7 @@ After that, you can run the following code, using only your machine resources br
    # ************************************************
 
    smart_scraper_graph = SmartScraperGraph(
-      prompt="List me all the news with their description.",
+      prompt="List me all the projects with their description.",
       # also accepts a string with the already downloaded HTML code
       source="https://perinim.github.io/projects",
       config=graph_config
@@ -87,3 +89,4 @@ After that, you can run the following code, using only your machine resources br
    result = smart_scraper_graph.run()
    print(result)
 
+To find out how you can customize the `graph_config` dictionary by using different LLMs and adding new parameters, check the `Scrapers` section!
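
The `graph_config` dictionary passed to `SmartScraperGraph` is defined outside the hunks shown in this diff. The following is only a minimal sketch of what such a dictionary might look like. It assumes an `llm` section with `api_key` and `model` keys plus a top-level `verbose` flag. The helper name `build_graph_config`, the environment variable name and the default model name are hypothetical choices made for illustration and are not taken from this commit.

```python
import os


def build_graph_config(model: str = "gpt-3.5-turbo",
                       api_key_env: str = "OPENAI_API_KEY") -> dict:
    """Assemble a hypothetical graph_config dictionary.

    The key names used here ("llm", "api_key", "model", "verbose") are
    assumptions for illustration; check the Scrapers section of the docs
    for the keys the library actually accepts.
    """
    # Read the key from the environment so it is never hard-coded.
    api_key = os.environ.get(api_key_env, "")
    return {
        "llm": {
            "api_key": api_key,
            "model": model,
        },
        "verbose": True,
    }


graph_config = build_graph_config()
```

Under these assumptions, the resulting dictionary would be passed as the `config=graph_config` argument seen in the hunk above.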

docs/source/getting_started/installation.rst

Lines changed: 15 additions & 6 deletions
@@ -7,26 +7,35 @@ for this project.
 Prerequisites
 ^^^^^^^^^^^^^
 
-- `Python 3.8+ <https://www.python.org/downloads/>`_
-- `pip <https://pip.pypa.io/en/stable/getting-started/>`
-- `ollama <https://ollama.com/>` *optional for local models
+- `Python >=3.9,<3.12 <https://www.python.org/downloads/>`_
+- `pip <https://pip.pypa.io/en/stable/getting-started/>`_
+- `Ollama <https://ollama.com/>`_ (optional for local models)
 
 
 Install the library
 ^^^^^^^^^^^^^^^^^^^^
 
+The library is available on PyPI, so it can be installed using the following command:
+
 .. code-block:: bash
 
    pip install scrapegraphai
 
+**Note:** It is highly recommended to install the library in a virtual environment (conda, venv, etc.)
+
+If you clone the repository, you can install the library using `poetry <https://python-poetry.org/docs/>`_:
+
+.. code-block:: bash
+
+   poetry install
+
 Additionally on Windows when using WSL
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
+If you are using Windows Subsystem for Linux (WSL) and you are facing issues with the installation of the library, you might need to install the following packages:
+
 .. code-block:: bash
 
    sudo apt-get -y install libnss3 libnspr4 libgbm1 libasound2
 
-As simple as that! You are now ready to scrape gnamgnamgnam 👿👿👿
-
-
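
The updated prerequisite pins Python to `>=3.9,<3.12`. As a rough illustration of what that range means, the sketch below checks an interpreter version against it using only the standard library. The function name `python_version_supported` is hypothetical and is not part of the library.

```python
import sys


def python_version_supported(version=None) -> bool:
    """Return True if the version falls inside the >=3.9,<3.12 range.

    `version` is any tuple starting with (major, minor); when omitted,
    the running interpreter's version is used.
    """
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    # Lower bound inclusive, upper bound exclusive, as in ">=3.9,<3.12".
    return (3, 9) <= major_minor < (3, 12)


if __name__ == "__main__":
    print(python_version_supported())
```

A check like this could run before `pip install scrapegraphai` to catch an unsupported interpreter early.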

docs/source/index.rst

Lines changed: 13 additions & 6 deletions
@@ -3,12 +3,6 @@
    You can adapt this file completely to your liking, but it should at least
    contain the root `toctree` directive.
 
-Welcome to scrapegraphai-ai's documentation!
-=======================================
-
-Here you will find all the information you need to get started.
-The following sections will guide you through the installation process and the usage of the library.
-
 .. toctree::
    :maxdepth: 2
    :caption: Introduction
@@ -22,6 +16,19 @@ The following sections will guide you through the installation process and the u
 
    getting_started/installation
    getting_started/examples
+
+.. toctree::
+   :maxdepth: 2
+   :caption: Scrapers
+
+   scrapers/graphs
+   scrapers/llm
+   scrapers/graph_config
+
+.. toctree::
+   :maxdepth: 2
+   :caption: Modules
+
    modules/modules
 
 Indices and tables
