docs/cli/usage.md (35 additions & 12 deletions)
@@ -25,7 +25,7 @@ The Kreuzberg CLI provides command-line access to all extraction features. This
 --8<-- "snippets/cli/install_go_sdk.md"

 !!! info "Feature Availability"
-**Homebrew Installation:**
+**Homebrew Installation:**

 - ✅ Text extraction (PDF, Office, images, 75+ formats)
 - ✅ OCR with Tesseract
@@ -165,22 +165,37 @@ Configure OCR backend, language, and Tesseract options in your config file (see

 ### Using Config Files

-Kreuzberg automatically discovers configuration files by searching the current directory and parent directories for:
-
-1. `./kreuzberg.{toml,yaml,yml,json}` in the current directory
-2. `../kreuzberg.{toml,yaml,yml,json}` in the parent directory (and so on, up the directory tree)
+Kreuzberg automatically discovers a configuration file by searching the current directory and parent directories for **`kreuzberg.toml`** only. If you use YAML or JSON, specify the file explicitly with `--config`.

 ```bash title="Terminal"
-# Extract using discovered configuration
+# Extract using discovered configuration (finds kreuzberg.toml)
 kreuzberg extract document.pdf
 ```

 ### Specify Config File

+You can load TOML, YAML (`.yaml` or `.yml`), or JSON via `--config`:
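
To make the discovery rule added in this hunk concrete, here is a minimal shell sketch of an upward search for `kreuzberg.toml`. The function name `find_kreuzberg_config` is hypothetical, and the real lookup inside Kreuzberg may differ in detail (for example, where it stops searching). The sketch only illustrates the documented behavior: the nearest `kreuzberg.toml` is used, and YAML or JSON files are ignored unless passed via `--config`.

```bash
# Hypothetical helper (not part of Kreuzberg): walk from a start directory
# up to the filesystem root and print the first kreuzberg.toml found.
find_kreuzberg_config() {
  local dir
  # Resolve the start directory (defaults to the current one).
  dir="$(cd "${1:-.}" && pwd)" || return 1
  while true; do
    if [ -f "$dir/kreuzberg.toml" ]; then
      printf '%s\n' "$dir/kreuzberg.toml"
      return 0
    fi
    # Stop once the root has been checked; nothing was found.
    if [ "$dir" = "/" ]; then
      return 1
    fi
    dir="$(dirname "$dir")"
  done
}
```

Running `find_kreuzberg_config` in a project subdirectory prints the path that auto-discovery would be expected to pick up. A non-zero exit status suggests no TOML config is in scope, so one would have to be passed explicitly.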
+Use the CLI image `ghcr.io/kreuzberg-dev/kreuzberg-cli:latest` for command-line usage. The full image `ghcr.io/kreuzberg-dev/kreuzberg:latest` also includes the CLI.
+
 ### Basic Docker

 ```bash title="Terminal"
 # Extract document using Docker with mounted directory
-docker run -v $(pwd):/data ghcr.io/kreuzberg-dev/kreuzberg:latest \
+docker run -v $(pwd):/data ghcr.io/kreuzberg-dev/kreuzberg-cli:latest \
 extract /data/document.pdf

 # Extract and save output to host directory using shell redirection
-docker run -v $(pwd):/data ghcr.io/kreuzberg-dev/kreuzberg:latest \
+docker run -v $(pwd):/data ghcr.io/kreuzberg-dev/kreuzberg-cli:latest \
 extract /data/document.pdf > output.txt
 ```

 ### Docker with OCR

 ```bash title="Terminal"
 # Extract with OCR using Docker
-docker run -v $(pwd):/data ghcr.io/kreuzberg-dev/kreuzberg:latest \
+docker run -v $(pwd):/data ghcr.io/kreuzberg-dev/kreuzberg-cli:latest \
 extract /data/scanned.pdf --ocr true
 ```

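
The `docker run` examples in this hunk each handle a single file. As a hedged sketch, the same mount-and-extract pattern could be looped over every PDF in the current directory. The image name and the `extract` subcommand come from the diff. The `extract_all` function, the `DOCKER` override, the `--rm` flag, and the `.txt` output naming are assumptions added for illustration and are not part of the Kreuzberg docs.

```bash
# Image name taken from the updated docs.
IMAGE="ghcr.io/kreuzberg-dev/kreuzberg-cli:latest"

# Illustrative helper (an assumption, not a Kreuzberg feature): run the CLI image
# once per PDF in the current directory and write <name>.txt next to each file.
extract_all() {
  # DOCKER can be overridden, e.g. DOCKER=echo for a dry run.
  local docker_bin="${DOCKER:-docker}"
  local f
  for f in ./*.pdf; do
    [ -e "$f" ] || continue   # skip when no PDFs match the glob
    "$docker_bin" run --rm -v "$(pwd):/data" "$IMAGE" \
      extract "/data/${f#./}" > "${f%.pdf}.txt"
  done
}
```

Calling `DOCKER=echo extract_all` writes the command line that would have been run into each `.txt` file instead of invoking Docker, which is a cheap way to check the mount path and file names first.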
@@ -472,11 +489,11 @@ docker run -v $(pwd):/data ghcr.io/kreuzberg-dev/kreuzberg:latest \
 **docker-compose.yaml:**

 ```yaml title="docker-compose.yaml"
-version: '3.8'
+version: "3.8"

 services:
   kreuzberg:
-    image: ghcr.io/kreuzberg-dev/kreuzberg:latest
+    image: ghcr.io/kreuzberg-dev/kreuzberg-cli:latest
     volumes:
       - ./documents:/input
     command: extract /input/document.pdf --ocr true
@@ -558,8 +575,9 @@ The `serve` command starts a RESTful HTTP API server:
 # Start server on default host (127.0.0.1) and port (8000)
 kreuzberg serve

-# Start server on specific host and port
+# Start server on specific host and port (-H / -p are short forms)