Unused namespaces in root get lost on load #86

@sjml

In general sxd-document is pretty good at round-tripping XML, at least semantically, but I've noticed that if it loads a document that has multiple xmlns:* attributes in the root, it only retains ones that are actually used in the document.

For the most part, this is ok, since the data is unnecessary and the XML parses fine without the extra namespaces. However, Microsoft Word, for reasons known only to Microsoft, will choke if a docx file is missing some namespaces, even if they aren't used.

The styles document, for instance, starts like this:

<?xml version="1.0" encoding="UTF-8"?>
<w:styles xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
    xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
    xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml"
    xmlns:w15="http://schemas.microsoft.com/office/word/2012/wordml"
    xmlns:w16="http://schemas.microsoft.com/office/word/2018/wordml"
    xmlns:w16cex="http://schemas.microsoft.com/office/word/2018/wordml/cex"
    xmlns:w16cid="http://schemas.microsoft.com/office/word/2016/wordml/cid"
    xmlns:w16sdtdh="http://schemas.microsoft.com/office/word/2020/wordml/sdtdatahash"
    xmlns:w16se="http://schemas.microsoft.com/office/word/2015/wordml/symex" mc:Ignorable="w14 w15 w16se w16cid w16 w16cex w16sdtdh">
    <!-- ... -->
</w:styles>

Only two of those xmlns attributes get used in the document, and so they're the only ones present if I write it out again.

<?xml version="1.0"?>
<w:styles mc:Ignorable="w14 w15 w16se w16cid w16 w16cex w16sdtdh"
    xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main'
    xmlns:mc='http://schemas.openxmlformats.org/markup-compatibility/2006'>
    <!-- ... -->
</w:styles>
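
To see exactly which prefixes disappear, the root start tags of the two files can be compared at the text level. The following is a minimal sketch that uses only the Rust standard library and none of sxd-document's API; `root_namespace_decls` and `dropped_prefixes` are hypothetical helper names, and the scan assumes that no `>` appears inside an attribute value, that the text `xmlns:` appears only in namespace declarations, and that no comment or DOCTYPE precedes the root element.

```rust
/// Collect `xmlns:prefix="uri"` declarations from the root element's start tag.
/// Text-level sketch, not a real XML parser (see the assumptions above).
fn root_namespace_decls(xml: &str) -> Vec<(String, String)> {
    let mut decls = Vec::new();
    // Skip the XML declaration (`<?xml ... ?>`) if present.
    let body = match xml.find("?>") {
        Some(i) if xml.trim_start().starts_with("<?xml") => &xml[i + 2..],
        _ => xml,
    };
    // The root start tag runs from the first `<` to the next `>`.
    let start = match body.find('<') {
        Some(i) => i,
        None => return decls,
    };
    let end = match body[start..].find('>') {
        Some(i) => start + i,
        None => return decls,
    };
    let mut rest = &body[start..end];
    while let Some(pos) = rest.find("xmlns:") {
        let after = &rest[pos + "xmlns:".len()..];
        let eq = match after.find('=') {
            Some(i) => i,
            None => break,
        };
        let prefix = after[..eq].trim();
        let value = after[eq + 1..].trim_start();
        // Attribute values may be quoted with either `"` or `'`.
        let quote = match value.chars().next() {
            Some(q) if q == '"' || q == '\'' => q,
            _ => break,
        };
        let close = match value[1..].find(quote) {
            Some(i) => i,
            None => break,
        };
        decls.push((prefix.to_string(), value[1..1 + close].to_string()));
        rest = &value[close + 2..];
    }
    decls
}

/// Prefixes declared on the original root but absent from the rewritten root.
fn dropped_prefixes(original: &str, rewritten: &str) -> Vec<String> {
    let kept = root_namespace_decls(rewritten);
    root_namespace_decls(original)
        .into_iter()
        .filter(|(p, _)| !kept.iter().any(|(k, _)| k == p))
        .map(|(p, _)| p)
        .collect()
}

fn main() {
    // Shortened versions of the two documents shown above.
    let original = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<w:styles xmlns:mc=\"http://schemas.openxmlformats.org/markup-compatibility/2006\" xmlns:r=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships\" xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\" xmlns:w14=\"http://schemas.microsoft.com/office/word/2010/wordml\" mc:Ignorable=\"w14\"></w:styles>";
    let rewritten = "<?xml version=\"1.0\"?>\n<w:styles mc:Ignorable=\"w14\" xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main' xmlns:mc='http://schemas.openxmlformats.org/markup-compatibility/2006'></w:styles>";
    for prefix in dropped_prefixes(original, rewritten) {
        println!("dropped: xmlns:{}", prefix);
    }
}
```

Matching on the prefix rather than the full attribute text keeps the comparison independent of quote style, since the written file uses single quotes where the original used double quotes.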

When that resulting file is zipped back up into a docx, Word won't load it without complaining. (The file does open, but Word reports an error and loads it in Compatibility Mode, prompting the user to save over the original.)

Is there any easy way to retain the namespace listing that was present on initial load?
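
Failing a built-in option, one possible stopgap is to post-process the serialized text and put the missing declarations back on the root start tag. The sketch below is an assumption-laden illustration, not part of sxd-document: `restore_namespace_decls` is a hypothetical helper, it works purely on strings with the Rust standard library, it assumes the serializer writes `xmlns:prefix=` with no space before the `=`, that no `>` appears inside an attribute value on the root tag, and that the URIs need no escaping. Whether Word accepts the result has not been verified here.

```rust
/// Append any `xmlns:prefix` declarations from `decls` that the root start
/// tag of `xml` does not already carry. Text-level sketch, not a real parser.
fn restore_namespace_decls(xml: &str, decls: &[(&str, &str)]) -> String {
    // Skip the XML declaration (`<?xml ... ?>`) if present.
    let search_from = match xml.find("?>") {
        Some(i) if xml.trim_start().starts_with("<?xml") => i + 2,
        _ => 0,
    };
    // Locate the root start tag: first `<` up to the next `>`.
    let open = match xml[search_from..].find('<') {
        Some(i) => search_from + i,
        None => return xml.to_string(),
    };
    let close = match xml[open..].find('>') {
        Some(i) => open + i,
        None => return xml.to_string(),
    };
    // For a self-closing root (`<a/>`), insert before the `/`.
    let insert_at = if xml[open..close].ends_with('/') {
        close - 1
    } else {
        close
    };
    let tag = &xml[open..insert_at];

    // Build the text for declarations that are not already present.
    let mut extra = String::new();
    for (prefix, uri) in decls {
        let needle = format!("xmlns:{}=", prefix);
        if !tag.contains(&needle) {
            extra.push_str(&format!(" xmlns:{}=\"{}\"", prefix, uri));
        }
    }

    let mut out = String::with_capacity(xml.len() + extra.len());
    out.push_str(&xml[..insert_at]);
    out.push_str(&extra);
    out.push_str(&xml[insert_at..]);
    out
}

fn main() {
    // Shortened version of the file as written back out.
    let rewritten = "<?xml version=\"1.0\"?>\n<w:styles mc:Ignorable=\"w14\" xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main' xmlns:mc='http://schemas.openxmlformats.org/markup-compatibility/2006'></w:styles>";
    // Declarations recorded from the original file before parsing.
    let decls = [
        ("w", "http://schemas.openxmlformats.org/wordprocessingml/2006/main"),
        ("w14", "http://schemas.microsoft.com/office/word/2010/wordml"),
    ];
    println!("{}", restore_namespace_decls(rewritten, &decls));
}
```

Declarations that survived the round trip are left untouched, so only the dropped ones (here `xmlns:w14`) are appended just before the closing `>` of the root tag.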
