CDATA scanning in XML not behaving properly #531

@akshay-kr

We use the AntiSamy plugin to parse XML that carries content inside CDATA sections. This worked until commit HtmlUnit/htmlunit-neko@49a31c0 was added to htmlunit-neko.

For example, given XML like this:

<xt:c-code xt:name="code" xt:version="1" xt:id="15ae0cc7-ded7-4a74-97b8-d66238d3c177"><xt:parameter xt:name="language">html</xt:parameter><xt:text-body><![CDATA[<div></div>]]></xt:text-body></xt:c-code>

Before this commit, scanning the CDATA part produced <![CDATA[<div></div>]]>. After this commit, the result is <![CDATA[<div]]>]]&gt;

We parse this XML, specifically the content inside the CDATA section, and store it. Later, when viewing, we extract the content inside the CDATA section and render it on the web page.
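As a baseline for comparison, here is a minimal sketch of what a standard XML parser yields for the CDATA payload of the sample. It uses only the JDK's built-in DOM parser, not AntiSamy or htmlunit-neko, so it shows the expected XML semantics rather than either library's code path. The class name `CdataCheck` and the method `extractTextBody` are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: a baseline check with the JDK DOM parser,
// not the AntiSamy / htmlunit-neko code path.
class CdataCheck {

    // Returns the text content of the first xt:text-body element.
    public static String extractTextBody(String xml) {
        Document doc;
        try {
            // The factory is not namespace-aware by default, so the
            // undeclared "xt:" prefix in the sample is treated as part
            // of the element name.
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            doc = builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        NodeList bodies = doc.getElementsByTagName("xt:text-body");
        if (bodies.getLength() == 0) {
            throw new IllegalStateException("No xt:text-body element found");
        }
        // For an element whose only child is a CDATA section, this is
        // the raw character data inside the section.
        return bodies.item(0).getTextContent();
    }

    public static void main(String[] args) {
        String xml = "<xt:c-code xt:name=\"code\" xt:version=\"1\">"
                + "<xt:parameter xt:name=\"language\">html</xt:parameter>"
                + "<xt:text-body><![CDATA[<div></div>]]></xt:text-body>"
                + "</xt:c-code>";
        System.out.println(extractTextBody(xml));
    }
}
```

With an XML-conformant parser the payload should come back intact as the literal markup inside the CDATA section, which matches the behaviour we saw before the commit.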

I have also raised an issue for this on the htmlunit-neko repo:
HtmlUnit/htmlunit-neko#125

Is this the expected behaviour going forward? Is there a way to bring back the previous behaviour for those who may be using it for XML content parsing?
