nielsdos
Member

We're currently using a libxml buffer, which requires copying the buffer to zend_strings every time we want to output the string. Furthermore, its use of the system allocator instead of ZendMM makes it not count towards the memory_limit and hinders performance.

This patch adds a custom writer so that the strings are written to a smart_str instance. This uses ZendMM for improved performance and makes it possible to avoid copying the string in the common case where flush() is called with empty set to true.
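To illustrate the idea, here is a minimal, self-contained sketch in plain C. It is not the code from this patch: `sketch_buf` and the `sketch_*` functions are hypothetical stand-ins for smart_str and the real writer, and `realloc`/`malloc` stand in for ZendMM. The write callback has the same shape as libxml2's `xmlOutputWriteCallback`. The point it shows is that flushing with empty set to true can hand over the accumulated buffer without a copy, while flushing with empty set to false has to copy because the buffer must stay intact.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer standing in for smart_str. */
typedef struct {
    char  *data;
    size_t len;
    size_t cap;
} sketch_buf;

/* Write callback shaped like libxml2's xmlOutputWriteCallback:
 * appends len bytes to the buffer, returns bytes written or -1. */
static int sketch_write(void *context, const char *chunk, int len)
{
    sketch_buf *buf = context;
    assert(buf != NULL);
    if (len < 0) {
        return -1;
    }
    size_t needed = buf->len + (size_t)len + 1; /* +1 for the terminator */
    if (needed > buf->cap) {
        size_t new_cap = buf->cap ? buf->cap : 64;
        while (new_cap < needed) {
            new_cap *= 2;
        }
        char *grown = realloc(buf->data, new_cap);
        if (grown == NULL) {
            return -1;
        }
        buf->data = grown;
        buf->cap = new_cap;
    }
    memcpy(buf->data + buf->len, chunk, (size_t)len);
    buf->len += (size_t)len;
    buf->data[buf->len] = '\0';
    return len;
}

/* flush(empty = true): transfer ownership of the storage, no copy. */
static char *sketch_flush_take(sketch_buf *buf)
{
    char *out = buf->data ? buf->data : calloc(1, 1);
    buf->data = NULL;
    buf->len = 0;
    buf->cap = 0;
    return out; /* caller frees */
}

/* flush(empty = false): the buffer keeps its contents, so copy them. */
static char *sketch_flush_copy(const sketch_buf *buf)
{
    char *out = malloc(buf->len + 1);
    if (out == NULL) {
        return NULL;
    }
    if (buf->len > 0) {
        memcpy(out, buf->data, buf->len);
    }
    out[buf->len] = '\0';
    return out; /* caller frees */
}
```

In this sketch, `sketch_flush_take` returns the very pointer the writer was appending to, which corresponds to the "no copy" case described above; `sketch_flush_copy` leaves the buffer untouched and pays for a fresh allocation each time.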

Member

@Girgias Girgias left a comment

Looks sensible

@staabm
Contributor

staabm commented Sep 30, 2024

> This patch adds a custom writer such that the strings are written to a smart_str instance, using ZendMM for improved performance

Do we have an idea how much faster this is?

@nielsdos
Member Author

> > This patch adds a custom writer such that the strings are written to a smart_str instance, using ZendMM for improved performance
>
> do we have an idea how much faster this is?

It highly depends on the workload.
On a simple test like this:

<?php

$writer = XMLWriter::toMemory();
var_dump($writer);

for ($i = 0; $i < 10000; $i++) {
    xmlwriter_start_element($writer, 'foo');
    xmlwriter_write_cdata($writer, 'some cdata');
    xmlwriter_end_element($writer);
}

for ($i = 0; $i < 10000; $i++)
    $writer->flush(false);

I get:

Benchmark 1: ./sapi/cli/php x.php
  Time (mean ± σ):     148.7 ms ±   3.2 ms    [User: 144.6 ms, System: 3.8 ms]
  Range (min … max):   144.3 ms … 156.5 ms    19 runs
 
Benchmark 2: ./sapi/cli/php_old x.php
  Time (mean ± σ):     212.8 ms ±   4.5 ms    [User: 207.7 ms, System: 4.6 ms]
  Range (min … max):   203.8 ms … 220.1 ms    14 runs
 
Summary
  ./sapi/cli/php x.php ran
    1.43 ± 0.04 times faster than ./sapi/cli/php_old x.php

Further speed improvements are likely possible by switching to fast ZPP.

@nielsdos nielsdos closed this in f5e81fe Sep 30, 2024
jorgsowa pushed a commit to jorgsowa/php-src that referenced this pull request Oct 1, 2024

Closes phpGH-16120.