@robyngit
Member

The editor uses $.parseHTML() rather than $.parseXML() (or the native XML DOMParser) to parse EML. This is for historical reasons and to maintain compatibility with the rest of the codebase.

In HTML, <source> is a void element, meaning it can't contain any content and must not have a closing tag. Consequently, the <source>...</source> elements in measurement scale get mis-parsed by the EML211 model. The closing tag is ignored and the contained text ends up outside the element node, leading to malformed EML.

This issue doesn't come up often in practice because <source> elements can't be added via the UI, but it can happen when users create and upload EML via other tools, then try to edit it in the Editor.

The EML model already had a cleanUpXML() method that was meant to replace <source> elements with <sourced>, and the rest of the code base expected this behaviour (e.g. it looked for the text content of <sourced> elements to get the source text for the model). However, the replacement wasn't assigned back to xmlString, so it had no effect. It was also set to replace only the first occurrence, not all of them. Additionally, we did not revert the changes when saving back to the server, so the saved EML would contain <sourced> elements instead of <source>.
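To illustrate the two string-handling problems described above, here is a minimal sketch in plain JavaScript. The input string and variable names are hypothetical, and the use of a string pattern is an assumption for illustration; this is not the original cleanUpXML() code.

```javascript
// Hypothetical input, not taken from the MetacatUI code base.
const eml = "<a><source>one</source><source>two</source></a>";

// Problem 1: strings are immutable, so replace() returns a new string.
// If the result is not assigned back, xmlString is left unchanged.
let xmlString = eml;
xmlString.replace("<source>", "<sourced>"); // result discarded
const discarded = xmlString; // still identical to eml

// Problem 2: with a string pattern (or a regex without the g flag),
// replace() only changes the first match.
const firstOnly = eml.replace("<source>", "<sourced>");
// The second <source> and both closing tags are untouched.
```

Either problem on its own would leave <source> tags in the string handed to the HTML parser.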

This PR fixes the cleanUpXML() method to properly replace all occurrences of <source> and </source> with <sourced> and </sourced>, and adds a revertSourcedToSource() method to revert these changes before saving the EML back to the server. I also added a unit test to verify this behaviour, with multiple <source> elements.
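As a rough sketch of the round trip described above, the two steps could look like the following. The function names mirror the ones mentioned in this PR, but the bodies, signatures, and sample string are assumptions for illustration, not the code in the diff.

```javascript
// Sketch only: standalone functions that take and return a string.
// The real methods live on the EML model and may differ.

// Rename every <source>/</source> tag so the HTML parser does not
// treat it as a void element. The g flag replaces all occurrences.
function cleanUpXML(xmlString) {
  return xmlString
    .replace(/<source>/g, "<sourced>")
    .replace(/<\/source>/g, "</sourced>");
}

// Undo the rename before serialising the EML back to the server.
function revertSourcedToSource(xmlString) {
  return xmlString
    .replace(/<sourced>/g, "<source>")
    .replace(/<\/sourced>/g, "</source>");
}

// Hypothetical sample with multiple <source> elements.
const original = "<a><source>one</source><source>two</source></a>";
const cleaned = cleanUpXML(original);
const restored = revertSourcedToSource(cleaned);
```

In this sketch the round trip is only lossless if the input contains no <sourced> tags of its own, since the revert step cannot tell them apart from renamed ones.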

@robyngit robyngit linked an issue Nov 18, 2025 that may be closed by this pull request
Development

Successfully merging this pull request may close these issues.

EML corrupted when edited in MetacatUI
