Commit f87d3f3

Document cli flags
1 parent 3284a25 commit f87d3f3

2 files changed

+17
-14
lines changed

.cursorrules

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
Don't add Getter Methods. When adding new features, ensure the README.md mentions them as well.

README.md

Lines changed: 16 additions & 14 deletions
@@ -17,13 +17,14 @@ tobey -u https://example.org # Use the cli interface to submit a crawl request.
1717
## CLI Mode
1818

1919
Tobey offers an - albeit limited - cli mode that allows you to run ad hoc crawls. Target URLs can be provided via the `-u` flag. By default results
20-
will be saved in the current directory. Use the `-o` flag to specify a different output directory. Use the `-i` flag to specify paths to ignore. For all remaining options, please review the cli help via `-h`.
20+
will be saved in the current directory. Use the `-o` flag to specify a different output directory. Use the `-i` flag to specify paths to ignore. Use the `-oc` flag to store response bodies directly on disk without wrapping them in JSON files. For all remaining options, please review the cli help via `-h`.
2121

2222
```sh
2323
tobey -h
2424
tobey -u https://example.org
2525
tobey -u https://example.org/blog,https://example.org/values -o results
2626
tobey -u https://example.org -i search/,admin/
27+
tobey -u https://example.org -oc # Store response bodies directly with appropriate file extensions
2728
```
2829

2930
## Submitting Crawl Requests
@@ -287,19 +288,20 @@ Using sane defaults, tobey works out of the box without additional configuration
287288
the following environment variables and/or command line flags can be used. Please also see `.env.example` for a working
288289
example configuration.
289290

290-
| Variable / Flag | Default | Supported Values | Description |
291-
|----------------|----------------|------------------|----------------------------------|
292-
| `TOBEY_DEBUG`, `-debug` | `false` | `true`, `false` | Controls debug mode. |
293-
| `TOBEY_SKIP_CACHE`, `-no-cache` | `false` | `true`, `false` | Controls caching access. |
294-
| `TOBEY_UA`, `-ua` | `Tobey/0`| any string | User-Agent to identify with. |
295-
| `TOBEY_HOST`, `-host` | empty | e.g. `localhost`, `127.0.0.1` | Address to bind the HTTP server to. Empty means listen on all. |
296-
| `TOBEY_PORT`, `-port` | `8080` | `1-65535` | Port to bind the HTTP server to. Alternatively you can use the `-port` command line flag. |
297-
| `TOBEY_WORKERS`, `-w`| `5` | `1-128` | Number of workers to start. |
298-
| `TOBEY_REDIS_DSN` | empty | e.g. `redis://localhost:6379` | DSN to reach a Redis instance for coordinating multiple instances. |
299-
| `TOBEY_PROGRESS_DSN` | `noop://` | `memory://`, `factorial://host:port`, `console://`, `noop://` | DSN for progress reporting service. |
300-
| `TOBEY_RESULT_REPORTER_DSN` | `disk://results` | `disk:///path`, `webhook://host/path`, `noop://` | DSN specifying where crawl results should be stored. |
301-
| `TOBEY_TELEMETRY`, `-telemetry` | empty | `metrics`, `traces`, `pulse` | Space separated list of what kind of telemetry is emitted. |
302-
| `-i` | empty | comma-separated paths, i.e. `/search`, `'*.pdf'`, `'*.(pdf|asc)'` | Paths to ignore during crawling (cli only). |
291+
| Variable / Flag | Mode | Default | Supported Values | Description |
292+
|----------------|------|----------------|------------------|----------------------------------|
293+
| `TOBEY_DEBUG`, `-debug` | Both | `false` | `true`, `false` | Controls debug mode. |
294+
| `TOBEY_SKIP_CACHE`, `-no-cache` | Both | `false` | `true`, `false` | Controls caching access. |
295+
| `TOBEY_UA`, `-ua` | Both | `Tobey/0`| any string | User-Agent to identify with. |
296+
| `TOBEY_HOST`, `-host` | Service | empty | e.g. `localhost`, `127.0.0.1` | Address to bind the HTTP server to. Empty means listen on all. |
297+
| `TOBEY_PORT`, `-port` | Service | `8080` | `1-65535` | Port to bind the HTTP server to. |
298+
| `TOBEY_WORKERS`, `-w`| Both | `5` | `1-128` | Number of workers to start. |
299+
| `TOBEY_REDIS_DSN` | Service | empty | e.g. `redis://localhost:6379` | DSN to reach a Redis instance for coordinating multiple instances. |
300+
| `TOBEY_PROGRESS_DSN` | Service | `noop://` | `memory://`, `factorial://host:port`, `console://`, `noop://` | DSN for progress reporting service. |
301+
| `TOBEY_RESULT_REPORTER_DSN` | Service | `disk://results` | `disk:///path`, `webhook://host/path`, `noop://` | DSN specifying where crawl results should be stored. |
302+
| `TOBEY_TELEMETRY`, `-telemetry` | Service | empty | `metrics`, `traces`, `pulse` | Space separated list of what kind of telemetry is emitted. |
303+
| `-i` | CLI | empty | comma-separated paths, i.e. `/search`, `'*.pdf'`, `'*.(pdf|asc)'` | Paths to ignore during crawling. |
304+
| `-oc` | CLI | `false` | `true`, `false` | Store response bodies directly on disk without JSON wrapper. |
303305

304306
_Note:_ When enabling telemetry ensure you are also providing [OpenTelemetry environment variables](https://opentelemetry.io/docs/languages/sdk-configuration/otlp-exporter/).
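
As a rough illustration of how the settings in the table might combine in CLI mode, the sketch below exports two environment variables marked `Both` and then passes the CLI flags documented above. The `ExampleBot/1.0` User-Agent, the worker count, and the `results` directory are placeholder values. That environment variables and flags can be mixed in one invocation is an assumption based on the "and/or" wording above, not something this diff confirms. The `command -v` guard only keeps the snippet harmless when `tobey` is not installed.

```sh
# Settings marked "Both" in the table; the values are illustrative placeholders.
export TOBEY_WORKERS=2
export TOBEY_UA="ExampleBot/1.0"

# Run an ad hoc crawl only if the tobey binary is on the PATH.
if command -v tobey >/dev/null 2>&1; then
  # -u: target URL, -i: paths to ignore, -o: output directory,
  # -oc: store response bodies directly instead of wrapping them in JSON.
  tobey -u https://example.org -i search/,admin/ -o results -oc
else
  echo "tobey not found in PATH; skipping crawl" >&2
fi
```

Service-only settings such as `TOBEY_HOST` or `TOBEY_REDIS_DSN` are left out here because the table lists them as not applying to CLI mode.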