README.md: 16 additions & 14 deletions
@@ -17,13 +17,14 @@ tobey -u https://example.org # Use the cli interface to submit a crawl request.
 ## CLI Mode
 
 Tobey offers an - albeit limited - cli mode that allows you to run ad hoc crawls. Target URLs can be provided via the `-u` flag. By default results
-will be saved in the current directory. Use the `-o` flag to specify a different output directory. Use the `-i` flag to specify paths to ignore. For all remaining options, please review the cli help via `-h`.
+will be saved in the current directory. Use the `-o` flag to specify a different output directory. Use the `-i` flag to specify paths to ignore. Use the `-oc` flag to store response bodies directly on disk without wrapping them in JSON files. For all remaining options, please review the cli help via `-h`.
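As a rough illustration of how the flags described in this hunk might be combined, the sketch below assembles an ad hoc crawl command and prints it for review instead of running it. The output directory and ignore patterns are placeholders, and it is an assumption that `-oc` can be passed as a bare switch and that several ignore paths can be joined with commas in a single `-i` value; `tobey -h` remains the authoritative reference.

```shell
# Placeholder values; adjust to your site and filesystem.
TARGET_URL="https://example.org"
OUTPUT_DIR="./crawl-output"
IGNORE_PATHS="/search,*.pdf"   # comma-separated; quoted so the shell does not expand the glob

# Assemble the command. -oc is assumed to work as a bare boolean switch.
CMD="tobey -u $TARGET_URL -o $OUTPUT_DIR -i '$IGNORE_PATHS' -oc"

# Dry run: print the command for review, then paste it into your shell to start the crawl.
echo "$CMD"
```

Printing the command first keeps the example harmless on machines where `tobey` is not installed and makes the quoting of the ignore patterns easy to inspect.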
 |`TOBEY_DEBUG`, `-debug`| Both |`false`|`true`, `false`| Controls debug mode. |
+|`TOBEY_SKIP_CACHE`, `-no-cache`| Both |`false`|`true`, `false`| Controls caching access. |
+|`TOBEY_UA`, `-ua`| Both |`Tobey/0`| any string | User-Agent to identify with. |
+|`TOBEY_HOST`, `-host`| Service | empty | e.g. `localhost`, `127.0.0.1`| Address to bind the HTTP server to. Empty means listen on all interfaces. |
+|`TOBEY_PORT`, `-port`| Service |`8080`|`1-65535`| Port to bind the HTTP server to. |
+|`TOBEY_WORKERS`, `-w`| Both |`5`|`1-128`| Number of workers to start. |
+|`TOBEY_REDIS_DSN`| Service | empty | e.g. `redis://localhost:6379`| DSN to reach a Redis instance for coordinating multiple instances. |
+|`TOBEY_PROGRESS_DSN`| Service |`noop://`|`memory://`, `factorial://host:port`, `console://`, `noop://`| DSN for progress reporting service. |
+|`TOBEY_RESULT_REPORTER_DSN`| Service |`disk://results`|`disk:///path`, `webhook://host/path`, `noop://`| DSN specifying where crawl results should be stored. |
+|`TOBEY_TELEMETRY`, `-telemetry`| Service | empty |`metrics`, `traces`, `pulse`| Space-separated list of the kinds of telemetry to emit. |
+|`-i`| CLI | empty | comma-separated paths, e.g. `/search`, `'*.pdf'`, `'*.(pdf|asc)'` | Paths to ignore during crawling. |
+|`-oc`| CLI |`false`|`true`, `false`| Store response bodies directly on disk without JSON wrapper. |

 _Note:_ When enabling telemetry, ensure you are also providing [OpenTelemetry environment variables](https://opentelemetry.io/docs/languages/sdk-configuration/otlp-exporter/).
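To make the table more concrete, here is a minimal sketch of a service-mode environment built from the variables listed above. All values are illustrative placeholders, not recommendations. `OTEL_EXPORTER_OTLP_ENDPOINT` is one of the standard OpenTelemetry exporter variables the note refers to; the endpoint shown assumes a collector listening locally on the default OTLP/gRPC port.

```shell
# Bind the HTTP server to the loopback interface on the default port.
export TOBEY_HOST="127.0.0.1"
export TOBEY_PORT="8080"

# Run more workers than the default of 5 (the table gives 1-128 as the range).
export TOBEY_WORKERS="8"

# Coordinate multiple instances through Redis (placeholder DSN).
export TOBEY_REDIS_DSN="redis://localhost:6379"

# Store crawl results on disk; this is the documented default.
export TOBEY_RESULT_REPORTER_DSN="disk://results"

# Telemetry kinds are given as a space-separated list.
export TOBEY_TELEMETRY="metrics traces"

# Standard OpenTelemetry variable; assumes a local collector (placeholder endpoint).
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317"
```

Start the service from the same shell afterwards so that it inherits these variables. Where a variable has a flag counterpart in the table (for example `-port` or `-w`), the flag could presumably be used instead.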