Invalid ID attribute with php8.4's \Dom\HTMLDocument #18316

@edent

Description

The following code:

<?php
$html = '<p id="example ">';
$dom = \Dom\HTMLDocument::createFromString($html, LIBXML_NOERROR, "UTF-8");
echo $dom->saveHTML();

Resulted in this output:

<html><head></head><body><p id="example "></p></body></html>

But I expected this output instead:

<html><head></head><body><p id="example"></p></body></html>

As per https://html.spec.whatwg.org/multipage/dom.html#global-attributes:the-id-attribute-2

When specified on HTML elements, the id attribute value must be unique amongst all the IDs in the element's tree and must contain at least one character. The value must not contain any ASCII whitespace.

Further detail at https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/Global_attributes/id#syntax
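The two syntactic rules quoted above (at least one character, no ASCII whitespace) can be checked in a few lines. This is only a sketch: `isValidHtmlId` is a hypothetical helper, not part of PHP or the DOM extension, and it does not check the uniqueness requirement, which needs the whole tree.

```php
<?php
// Hypothetical helper, not provided by PHP or \Dom\HTMLDocument.
// Checks only the per-value rules from the spec text quoted above:
// the value must be non-empty and must not contain ASCII whitespace
// (tab, line feed, form feed, carriage return, space).
function isValidHtmlId(string $id): bool
{
    return $id !== '' && strpbrk($id, " \t\n\f\r") === false;
}

var_dump(isValidHtmlId('example'));   // bool(true)
var_dump(isValidHtmlId('example '));  // bool(false): trailing space
var_dump(isValidHtmlId(''));          // bool(false): empty value
```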

I'd suggest trimming whitespace from IDs. It may also be sensible to do the same for other attributes. Of course, the right behaviour depends on how closely you want to preserve the input's mistakes. If the intention is to replicate the input faithfully (whether or not the original markup was valid), then please close this bug.
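In the meantime, the trimming suggested here can be done in userland after parsing. The following is a minimal sketch, assuming PHP 8.4's `\Dom\HTMLDocument` with `querySelectorAll()`. It is a workaround, not the parser's own behaviour, and it only strips leading and trailing ASCII whitespace; an ID with whitespace in the middle would still be invalid.

```php
<?php
$html = '<p id="example ">';
$dom = \Dom\HTMLDocument::createFromString($html, LIBXML_NOERROR, "UTF-8");

// Workaround (assumption: trimming is the desired fix-up).
// Visit every element that carries an id attribute and strip
// leading/trailing ASCII whitespace from its value.
foreach ($dom->querySelectorAll('[id]') as $element) {
    $trimmed = trim($element->getAttribute('id'), " \t\n\f\r");
    $element->setAttribute('id', $trimmed);
}

echo $dom->saveHTML();
```

With the input from this report, the `p` element's ID becomes `example`, which matches the expected output given above.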

PHP Version

PHP 8.4.6 (cli) (built: Apr 11 2025 02:19:14) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.4.6, Copyright (c) Zend Technologies
with Zend OPcache v8.4.6, Copyright (c), by Zend Technologies

Operating System

Pop!_OS 22.04 LTS
