feat: add debug module with PageImage for visualization (#18)
* feat: add debug module with PageImage for visualization
* feat: support string color names and RGBA tuples in debug methods
* fix: fix lint errors in debug module
File: `CHANGELOG.md` (1 addition, 0 deletions)
@@ -15,6 +15,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Add `Page.page_idx` property: zero-based index of the page within its document
- Add `Page.rotation_degrees` property: clockwise rotation of the page in degrees
- Add `Page.clear_cache()` method as the canonical name for clearing cached objects
- Add `tablers.debug` module with `PageImage` class for visualizing detected tables and edges on a rendered page image; requires the optional `debug` extra (`pip install tablers[debug]`)
File: `docs/getting_started/installation.md` (12 additions, 0 deletions)
@@ -18,6 +18,18 @@ The recommended way to install Tablers is via pip:
```bash
pip install tablers
```

## Optional Dependencies

### Debug / Visualization

The `tablers.debug` module provides tools for visualizing detected tables, edges, and intersection points on a rendered page image. It requires two additional packages, which the `debug` extra installs:

```bash
pip install tablers[debug]
```

This installs `pillow` and `pypdfium2` alongside Tablers. If these packages are not present, importing `tablers.debug` will raise an `ImportError`.

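If visualization is optional in your own code, one way to handle the `ImportError` described above is to guard the import. The sketch below is an illustration only; the helper name `load_page_image` is hypothetical and not part of Tablers.

```python
def load_page_image():
    """Return the PageImage class, or None when the debug extra is missing."""
    try:
        # Fails with ImportError when pillow / pypdfium2 (or tablers) is absent.
        from tablers.debug import PageImage
    except ImportError:
        return None
    return PageImage


PageImage = load_page_image()
if PageImage is None:
    print("Visualization disabled; run: pip install tablers[debug]")
```

Returning `None` instead of re-raising keeps the rest of a pipeline usable on machines where only the base package is installed.
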
## Building from Source

If you need to build Tablers from source, follow these steps:
| `antialias` | `bool` | `False` | Enable anti-aliasing during rendering |

**Raises:** `RuntimeError` if `original` is `None` and the document has already been closed.

!!! note "Password-protected PDFs"
    `PageImage` rendering supports only documents **without a password**. For password-protected PDFs, use `Document.save_to_bytes()` to obtain a decrypted copy, then open it with `Document(bytes=...)` and pass the resulting page to `PageImage`.

**Attributes:**

| Attribute | Type | Description |
|-----------|------|-------------|
| `original` | `PIL.Image.Image` | The unmodified rendered page image |
| `annotated` | `PIL.Image.Image` | The working copy with all annotations applied |
| `scale` | `float` | Ratio of image pixels to page points (`image_width / page_width`) |
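
As a worked illustration of the `scale` attribute defined above, the sketch below converts a bounding box from page points to image pixels. The page and image sizes are made-up values, and `to_pixels` is a hypothetical helper rather than a Tablers function.

```python
def to_pixels(bbox, scale):
    """Scale an (x0, y0, x1, y1) bbox from page points to image pixels."""
    return tuple(round(v * scale, 2) for v in bbox)


page_width = 612.0   # points; hypothetical US Letter page
image_width = 1275   # pixels; hypothetical rendered width
scale = image_width / page_width  # same formula as the `scale` attribute

print(round(scale, 4))
print(to_pixels((50, 100, 250, 400), scale))
```
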
| `debug_table(table, fill, stroke, stroke_width)` | Draw a filled rectangle over every cell in a `Table` |
| `debug_tablefinder(tf_settings, **kwargs)` | Draw all detected tables (cell outlines) and detected edges |

**Color arguments** (`fill` and `stroke` in the methods above) accept either an RGBA tuple `(r, g, b, a)` or a string. String colors are resolved via PIL's [`ImageColor.getrgb`](https://pillow.readthedocs.io/en/stable/reference/ImageColor.html); see that reference for the supported string formats. String colors are always fully opaque (alpha 255); for transparency, pass an RGBA tuple.

**Default color constants** (importable from `tablers.debug`):

| Constant | Value | Description |
|----------|-------|-------------|
| `DEFAULT_FILL` | `(0, 0, 255, 50)` | Semi-transparent blue fill |
| `DEFAULT_STROKE` | `(255, 0, 0, 200)` | Near-opaque red stroke |
| `DEFAULT_STROKE_WIDTH` | `1` | Stroke width in pixels |
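
The tuple-versus-string rule for color arguments can be sketched in plain Python. This is not the module's implementation: `to_rgba` is a hypothetical helper that only understands `#rrggbb` strings, whereas the documentation says `tablers.debug` delegates string parsing to `ImageColor.getrgb`.

```python
DEFAULT_FILL = (0, 0, 255, 50)  # value copied from the constants table


def to_rgba(color):
    """Normalize a color argument to an (r, g, b, a) tuple."""
    if isinstance(color, str):
        if len(color) != 7 or not color.startswith("#"):
            raise ValueError(f"this sketch only handles '#rrggbb', got {color!r}")
        r, g, b = (int(color[i:i + 2], 16) for i in (1, 3, 5))
        return (r, g, b, 255)  # string colors are always fully opaque
    r, g, b, a = color  # RGBA tuples pass through and keep their alpha
    return (r, g, b, a)


print(to_rgba("#0000ff"))
print(to_rgba(DEFAULT_FILL))
```

The practical consequence is that a string such as `"blue"` does not reproduce `DEFAULT_FILL`: the string form is opaque, while the default has an alpha of 50.
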
File: `docs/usage/advanced.md` (86 additions, 0 deletions)
@@ -401,6 +401,92 @@ except RuntimeError as e:
```python
    print(f"Runtime error: {e}")
```

## Visualizing Table Detection

The optional `tablers.debug` module lets you render a page to an image and annotate it with detected tables, edges, and intersection points. Install the extra dependencies first:

```bash
pip install tablers[debug]
```

Rendering supports only **documents without a password**. For password-protected PDFs, use `Document.save_to_bytes()` to get a decrypted copy, then open it with `Document(bytes=...)` and pass the resulting page to `PageImage`.

### Quick Visual Debug

`debug_tablefinder()` renders all detection results in one call: cell outlines (blue fill, red border) and detected edges (red lines). You can pass custom colors to `debug_table()` and the drawing methods; `fill` and `stroke` accept either RGBA tuples or strings. For supported string color formats, see the [PIL ImageColor reference](https://pillow.readthedocs.io/en/stable/reference/ImageColor.html).

```python
from tablers import Document
from tablers.debug import PageImage

with Document("example.pdf") as doc:
    page = doc.get_page(0)
    img = PageImage(page, resolution=150)
    img.debug_tablefinder()

    # Save to file
    img.save("debug.png", quantize=False)

    # Or display inline in Jupyter (auto-detected via _repr_png_)
    img
```

Pass `TfSettings` or keyword arguments to use non-default detection settings:

Use `debug_table()` to annotate specific tables, or combine it with other drawing methods. Color arguments (`fill`, `stroke`) accept RGBA tuples or strings; for supported string formats see the [PIL ImageColor reference](https://pillow.readthedocs.io/en/stable/reference/ImageColor.html).

```python
from tablers import Document, find_tables
from tablers.debug import PageImage

with Document("example.pdf") as doc:
    page = doc.get_page(0)
    tables = find_tables(page, extract_text=False)

    img = PageImage(page)

    # Annotate each table individually with custom colors.
    # String colors are fully opaque, so the fill uses an RGBA tuple
    # (the same value as DEFAULT_FILL) to keep the cell contents visible.
    for table in tables:
        img.debug_table(table, fill=(0, 0, 255, 50), stroke="red")

    img.save("tables.png", quantize=False)
```

### Drawing Primitives

`PageImage` provides low-level drawing helpers that all return `self` for chaining:

```python
img = (
    PageImage(page)
    .draw_hline(200.0)                      # horizontal guide line
    .draw_vline(300.0)                      # vertical guide line
    .draw_rect((50, 100, 250, 400))         # arbitrary bbox
    .draw_circle((150.0, 250.0), radius=5)  # point of interest
)
img.save("annotated.png", quantize=False)
```

### Resetting and Copying

```python
img = PageImage(page)
img.debug_tablefinder()

# Remove all annotations and start fresh
img.reset()

# Create an independent copy to try different annotations
img2 = img.copy()
img2.debug_tablefinder(vertical_strategy="text")
```

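The relationship between `original`, `annotated`, `reset()`, and `copy()` can be modeled with a toy class. This is a guess at the semantics implied by the attribute descriptions, not Tablers code: `ToyPageImage` is hypothetical and stores annotations as a list of labels instead of pixels. The source does not state whether the real `copy()` carries over existing annotations.

```python
import copy


class ToyPageImage:
    def __init__(self, original):
        self.original = original         # never modified after construction
        self.annotated = list(original)  # working copy that receives annotations

    def draw(self, label):
        self.annotated.append(label)
        return self                      # returning self allows chaining

    def reset(self):
        self.annotated = list(self.original)  # discard every annotation
        return self

    def copy(self):
        return copy.deepcopy(self)       # assumption: the copy shares no state


img = ToyPageImage(["page"]).draw("hline").draw("rect")
img.reset()
img2 = img.copy()
img2.draw("text-strategy tables")

print(img.annotated)
print(img2.annotated)
```

Because the copy is independent in this model, annotating `img2` leaves `img` untouched, which is what makes side-by-side comparison of detection settings possible.
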
## Next Steps

- See [Settings Reference](../reference/settings.md) for all configuration options