
fix(docx): preserve parent state when processing table cells #3076

Closed
Br1an67 wants to merge 1 commit into docling-project:main from Br1an67:fix/issue-2668-fix-2668

Br1an67 commented Mar 7, 2026

Issue resolved by this Pull Request:
Resolves #2668

Summary

Fixed a bug where section headers appearing after tables in DOCX documents were incorrectly being added as children of table header cells. This caused the document hierarchy to be corrupted, resulting in malformed markdown and other exports.

Root Cause

The _handle_tables method in msword_backend.py calls _walk_linear to process rich table cell content. However, _walk_linear modifies self.parents (especially when cells contain headings). This parent state was never restored after processing each cell, polluting the parent hierarchy for subsequent document elements.

Changes

  • Save self.parents.copy() before calling _walk_linear in two places within _handle_tables:
    1. For rich table cells (multi-cell tables)
    2. For 1x1 tables treated as furniture
  • Restore self.parents after processing each cell

This follows the same pattern already used in _handle_textbox_content.
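
The save-and-restore idea described above can be sketched with a toy model. `ToyBackend`, its list-of-tuples input, and the helper `parent_of_trailing_header` are hypothetical stand-ins made up for illustration; they are not the code in msword_backend.py, and the real `self.parents` structure may differ from the plain dict assumed here.

```python
class ToyBackend:
    """Hypothetical stand-in for the DOCX backend; not docling's real class."""

    def __init__(self):
        # Assumed shape: heading level -> current parent (a label here).
        self.parents = {0: "document"}
        self.placed = []  # (element text, parent label) in document order

    def _current_parent(self):
        return self.parents[max(self.parents)]

    def _walk_linear(self, elements):
        for kind, text in elements:
            self.placed.append((text, self._current_parent()))
            if kind == "heading":
                # A heading becomes the parent of whatever follows it.
                self.parents[max(self.parents) + 1] = text

    def handle_table_cell(self, cell_elements, restore):
        saved = self.parents.copy()  # snapshot before descending into the cell
        self._walk_linear(cell_elements)
        if restore:
            self.parents = saved  # discard parent changes made inside the cell


def parent_of_trailing_header(restore):
    backend = ToyBackend()
    backend.handle_table_cell([("heading", "Cell heading")], restore=restore)
    backend._walk_linear([("heading", "Section after table")])
    return dict(backend.placed)["Section after table"]
```

Without the restore step, `parent_of_trailing_header(False)` returns `"Cell heading"`, which mirrors the reported bug; with it, `parent_of_trailing_header(True)` returns `"document"`.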

Testing

  • All existing msword backend tests pass
  • Verified fix with a test case that creates a table followed by a section header - the header now correctly appears at the document level, not nested within the table

Checklist:

  • Documentation has been updated, if necessary. (Not necessary - internal fix)
  • Examples have been added, if necessary. (Not necessary)
  • Tests have been added, if necessary. (Existing tests cover this area)

When processing rich table cells and 1x1 tables, _walk_linear modifies
self.parents (especially if cells contain headings). This corrupted
the parent hierarchy, causing section headers after tables to be
incorrectly added as children of table header cells.

Save and restore self.parents around _walk_linear calls in _handle_tables
to prevent parent state pollution, following the same pattern used in
_handle_textbox_content.

github-actions bot commented Mar 7, 2026

DCO Check Failed

Hi @Br1an67, your pull request has failed the Developer Certificate of Origin (DCO) check.

This repository supports remediation commits, so you can fix this without rewriting history — but you must follow the required message format.


🛠 Quick Fix: Add a remediation commit

Run this command:

git commit --allow-empty -s -m "DCO Remediation Commit for Br1an67 <932039080@qq.com>

I, Br1an67 <932039080@qq.com>, hereby add my Signed-off-by to this commit: 14bff6bdf19692003f85899bf42d07e15caf5031"
git push

🔧 Advanced: Sign off each commit directly

For the latest commit:

git commit --amend --signoff
git push --force-with-lease

For multiple commits:

git rebase --signoff origin/main
git push --force-with-lease

More info: DCO check report

mergify bot commented Mar 7, 2026

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:
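
As a rough local check, the title pattern above can be tried with Python's `re` module. This is only a sketch: the helper name `follows_rule` is made up here, and Mergify's own regex matching may differ in details from Python's.

```python
import re

# Pattern copied from the merge protection rule quoted above.
TITLE_RULE = re.compile(
    r"^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:"
)


def follows_rule(title):
    """Return True if the title starts with a conventional-commit prefix."""
    return TITLE_RULE.search(title) is not None


ok = follows_rule("fix(docx): preserve parent state when processing table cells")
bad = follows_rule("Fix table cell parents")
```

Here `ok` is `True` because the title begins with `fix(docx):`, while `bad` is `False` because the type is capitalized and the colon is missing.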

codecov bot commented Mar 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


Br1an67 commented Mar 17, 2026

Closing this in favor of #3047, which includes more comprehensive state restoration (saves/restores self.parents, self.level, and self.level_at_new_list) and has test coverage. I'll incorporate the 1x1 furniture table fix from this PR into #3047.
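
One way to restore several pieces of state together is a small context manager. The sketch below is an assumption about how such a helper could look, not the implementation in #3047; `preserved_state` and `FakeBackend` are hypothetical names, and only the three attribute names come from the comment above.

```python
import copy
from contextlib import contextmanager

# Attribute names taken from the comment above; everything else is illustrative.
STATE_ATTRS = ("parents", "level", "level_at_new_list")


@contextmanager
def preserved_state(backend, attrs=STATE_ATTRS):
    # Shallow-copy each value so in-place edits to containers are not kept.
    snapshot = {name: copy.copy(getattr(backend, name)) for name in attrs}
    try:
        yield
    finally:
        # Restore even if processing the cell raises.
        for name, value in snapshot.items():
            setattr(backend, name, value)


class FakeBackend:
    """Hypothetical object holding only the three state attributes."""

    def __init__(self):
        self.parents = {0: None}
        self.level = 0
        self.level_at_new_list = None


backend = FakeBackend()
with preserved_state(backend):
    # Simulate what walking a cell with a heading and a list might change.
    backend.parents[1] = "heading inside cell"
    backend.level = 1
    backend.level_at_new_list = 1
```

After the `with` block, all three attributes are back to their initial values. Using `try`/`finally` means the state is restored on error paths as well, which separate save and restore statements would not guarantee.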

Br1an67 closed this Mar 17, 2026

Development

Successfully merging this pull request may close these issues.

Section after table being incorrectly added as child of table header
