README.md (20 additions, 4 deletions)
@@ -2,11 +2,27 @@
Tobey is a throughput-optimizing yet friendly web crawler that scales from a single instance to a cluster. It features intelligent rate limiting, distributed coordination, and flexible deployment options. Tobey honors resource exclusions in a host's robots.txt and tries its best not to overwhelm a host.

-## Getting Started
+## Quickstart

```sh
-go run . # Start the crawler.
-curl -X POST http://127.0.0.1:8080 -d 'https://www.example.org/' # Submit a crawl request.
+go run . # Start the crawler as a server.
+curl -X POST http://127.0.0.1:8080 -d 'https://www.example.org' # Submit a crawl request.
```
+
```sh
+go build -o /usr/local/bin/tobey
+tobey -u https://example.org # Use the CLI to submit a crawl request.
```
+
+## CLI Mode
+
+Tobey offers a CLI mode, albeit a limited one, that lets you run ad hoc crawls. Target URLs can be provided via the `-u` flag. By default, results
+are saved in the current directory. Use the `-o` flag to specify a different output directory. For all remaining options, see the CLI help via `-h`.
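
As a sketch of the CLI usage described in the added section, the hypothetical wrapper below combines the `-u` and `-o` flags to run an ad hoc crawl into a chosen directory. The function name `crawl_to` and the default directory `./crawl-results` are illustrative assumptions and not part of Tobey. Creating the directory up front is only a precaution, since the README does not say whether Tobey creates it.

```sh
# crawl_to URL [DIR]: run an ad hoc crawl of URL and store results in DIR.
# Hypothetical helper; only the -u and -o flags come from the README.
crawl_to() {
  url="$1"
  out="${2:-./crawl-results}"   # assumed default, not a Tobey default

  if [ -z "$url" ]; then
    echo "usage: crawl_to URL [DIR]" >&2
    return 2
  fi

  mkdir -p "$out" || return 1   # precaution: make sure the directory exists
  tobey -u "$url" -o "$out"
}
```

For example, `crawl_to https://example.org ./results` would invoke `tobey -u https://example.org -o ./results`.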
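
The server-mode request from the Quickstart can be wrapped in a similar way to submit several URLs. This is a minimal sketch, assuming the server listens on `http://127.0.0.1:8080` and accepts one URL per POST body, as in the Quickstart `curl` call. The function name `submit_crawls` and the `TOBEY_ENDPOINT` variable are hypothetical.

```sh
# submit_crawls URL...: POST each URL to a running Tobey server.
# Hypothetical helper; endpoint and request shape mirror the Quickstart curl call.
submit_crawls() {
  endpoint="${TOBEY_ENDPOINT:-http://127.0.0.1:8080}"   # assumed variable name
  for url in "$@"; do
    # -f makes curl exit non-zero on HTTP errors (4xx/5xx); stop at the first failure.
    curl -fsS -X POST "$endpoint" -d "$url" || return 1
  done
}
```

Stopping at the first failed request keeps the sketch simple; a longer-running script might prefer to log the failure and continue with the remaining URLs.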