bug: XLSX markdown output lost tabular structure in 4.3.x

Hi team,

After upgrading from `4.2.15` to `4.3.x` (tested with `4.3.0` and `4.3.6`), XLSX extraction with `output_format="markdown"` appears to have regressed: it now emits line-based text instead of structured markdown tables. This removes the explicit table structure and makes downstream LLM parsing harder.

#### Config
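```python
import asyncio
from pathlib import Path

from kreuzberg import ExtractionConfig, extract_bytes

# Standard MIME type for .xlsx workbooks
mime_type = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"


async def main() -> None:
    # "sample.xlsx" is a placeholder path; any workbook with tabular data reproduces it
    file_bytes = Path("sample.xlsx").read_bytes()
    config = ExtractionConfig(
        force_ocr=False,
        output_format="markdown",
    )
    result = await extract_bytes(file_bytes, mime_type, config)
    print(result.content)


asyncio.run(main())
```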

#### Observed change (same XLSX input)

**Before (`4.2.15`)**
```text
## Segment A

| Survey Segment A |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| Segment A | Col 1 | Col 2 | Col 3 | Col 4 | Col 5 | Col 6 |
| Metric X | value_a1 | 2.0 | value_a2 | 4.0 | value_a3 | 4.0 |
| Metric Y | value_b1 | 3.0 | value_b2 | 7.0 | value_b3 | 5.0 |
```

**After (`4.3.x`)**
```text
Survey Segment A

Segment A Col 1 Col 2 Col 3 Col 4 Col 5 Col 6
Metric X value_a1 2.0 value_a2 4.0 value_a3 4.0
Metric Y value_b1 3.0 value_b2 7.0 value_b3 5.0
```

#### Expected
When `output_format="markdown"`, XLSX output should preserve explicit tabular structure (headers + row/column boundaries), similar to prior behavior.

#### Actual
Output is linearized into line-based text and no longer includes markdown table syntax (`| ... |`), so row/column structure becomes implicit.
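
In case it helps with a regression test, here is a minimal sketch of how the difference could be detected on the extracted string. It is not part of kreuzberg; `has_markdown_table` and `SEPARATOR_ROW` are hypothetical names, and the check only looks for a pipe-delimited header row followed by a `| --- |` separator row.

```python
import re

# Matches a markdown table separator row such as "| --- | --- |",
# optionally with alignment colons (":---", "---:").
SEPARATOR_ROW = re.compile(r"^\s*\|?\s*:?-{3,}:?\s*(\|\s*:?-{3,}:?\s*)*\|?\s*$")


def has_markdown_table(text: str) -> bool:
    """Return True if text contains a header row followed by a separator row."""
    lines = text.splitlines()
    for header, separator in zip(lines, lines[1:]):
        if "|" in header and "|" in separator and SEPARATOR_ROW.match(separator):
            return True
    return False


# Excerpts of the two outputs shown above.
before_output = """## Segment A

| Survey Segment A |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| Segment A | Col 1 | Col 2 | Col 3 | Col 4 | Col 5 | Col 6 |
"""

after_output = """Survey Segment A

Segment A Col 1 Col 2 Col 3 Col 4 Col 5 Col 6
Metric X value_a1 2.0 value_a2 4.0 value_a3 4.0
"""

print(has_markdown_table(before_output))
print(has_markdown_table(after_output))
```

With the samples from this report, the `4.2.15` output passes the check and the `4.3.x` output does not.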

#### Impact
This is a functional regression for LLM-oriented downstream consumers that rely on markdown table structure for reliable interpretation of spreadsheet content.
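
To illustrate why the linearized form is hard to recover from, here is a small sketch using the sample rows above. The helper names (`table_cells`, `naive_columns`) are hypothetical and only for illustration. Splitting a markdown row on `|` gives a stable cell count, while splitting the linearized row on whitespace does not, because cell values such as `Segment A` or `Col 1` contain spaces themselves.

```python
def table_cells(row: str) -> list[str]:
    """Split one markdown table row into its cells."""
    return [cell.strip() for cell in row.strip().strip("|").split("|")]


def naive_columns(line: str) -> list[str]:
    """Best-effort split of a linearized row: whitespace is the only delimiter left."""
    return line.split()


# Rows from the 4.2.15 output
markdown_rows = [
    "| Segment A | Col 1 | Col 2 | Col 3 | Col 4 | Col 5 | Col 6 |",
    "| Metric X | value_a1 | 2.0 | value_a2 | 4.0 | value_a3 | 4.0 |",
]

# The same rows from the 4.3.x output
linearized_rows = [
    "Segment A Col 1 Col 2 Col 3 Col 4 Col 5 Col 6",
    "Metric X value_a1 2.0 value_a2 4.0 value_a3 4.0",
]

markdown_counts = [len(table_cells(row)) for row in markdown_rows]
linearized_counts = [len(naive_columns(row)) for row in linearized_rows]

print(markdown_counts)    # [7, 7]
print(linearized_counts)  # [14, 8]
```

The header and data rows no longer agree on a column count, so a consumer has to guess where cell boundaries were.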
