-
Notifications
You must be signed in to change notification settings - Fork 8k
Add str_extend() #20527
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Add str_extend() #20527
Conversation
This pre-allocates a large string, for usage with concatenations.
Users must take care to keep the refcount to 1, if they desire benefiting from this.
Note that it is generally pointless to call str_extend("", $size) (i.e. extending an empty string), given that e.g. concatenation will special case empty strings, and then use the other string.
(Which is why not a str_alloc($size), which would be pointless and thrown away during concat op.)
This has a very slight performance improvement on the general case of appending a single byte in a loop (given that zend_string_extend now uses perealloc3) of about 8%.
In particular zend_string_extend() will mostly run into the fast path of zend_mm_realloc_heap for huge allocations.
When using str_extend(), appending a single byte in a loop is 33% faster than the old baseline.
The tested loop is:
$str = str_extend("a", 1 << 26);
for ($i = 0; $i < 1 << 25; ++$i) {
$str .= "a";
}
|
I suppose that test is quite pathological under asan :-D |
|
Not sure who to ping on this - is this something that interests you @arnaud-lb ? |
|
This looks right, but this seems fragile as the user or internals can break the optimization easily (as you have pointed, I would prefer if we introduced a string builder class, or something like Javascript's typed arrays (which are essentially views of memory buffers, thus are suitable for creating a string builder or manipulating binary data). Oddly, I'm seeing smaller gains when using // x.php
$str = str_extend("a", 1 << 26);
for ($i = 0; $i < 1 << 25; ++$i) {
$str .= "a";
}// y.php
$str = "a";
for ($i = 0; $i < 1 << 25; ++$i) {
$str .= "a";
}However I do get similar results when not using I've triggered a benchmark run: https://github.com/php/php-src/actions/runs/19572237486. For 1-byte appends, using indexed string access seems faster: // z.php
$str = str_repeat("\0", 1 << 26);
for ($i = 0; $i < 1 << 25; ++$i) {
$str[$i] = "a";
} |
AWS x86_64 (c7i.24xl)
Laravel 12.2.0 demo app - 100 consecutive runs, 50 warmups, 100 requests (sec)
Symfony 2.7.0 demo app - 100 consecutive runs, 50 warmups, 100 requests (sec)
Wordpress 6.2 main page - 100 consecutive runs, 20 warmups, 20 requests (sec)
bench.php - 100 consecutive runs, 10 warmups, 2 requests (sec)
|
dstogov
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have strong opinion, but the patch needs some clarification.
| static zend_always_inline void *zend_mm_realloc_heap(zend_mm_heap *heap, void *ptr, size_t size, bool use_copy_size, size_t copy_size ZEND_FILE_LINE_DC ZEND_FILE_LINE_ORIG_DC) | ||
| #define EREALLOC_DEFAULT 0 | ||
| #define EREALLOC_COPY 1 | ||
| #define EREALLOC_NOSHRINK 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EREALLOC_COPY doesn't keep the explanation of semantic of use_copy_size.
In case use_copy_size is set, erealloc() copies only the size specified in copy_size argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EREALLOC_NOSHRINK also embraces use_copy_size mode. Probably mode should be represented as a bitset.
| return _zend_mm_alloc(heap, size ZEND_FILE_LINE_RELAY_CC ZEND_FILE_LINE_ORIG_RELAY_CC); | ||
| } else { | ||
| if (mode == EREALLOC_NOSHRINK) { | ||
| old_size = zend_mm_get_huge_block_size(heap, ptr ZEND_FILE_LINE_RELAY_CC ZEND_FILE_LINE_ORIG_RELAY_CC); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you use huge_block_size?
| /** | ||
| * @compile-time-eval | ||
| */ | ||
| function str_extend(string $string, int $size): string {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this is the best self-explainable name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. The string does not become longer, so "extend" is the incorrect word. "reserve" may be better.
This pre-allocates a large string, for usage with concatenations. Users must take care to keep the refcount to 1, if they desire benefiting from this.
Note that it is generally pointless to call str_extend("", $size) (i.e. extending an empty string), given that e.g. concatenation will special case empty strings, and then use the other string. (Which is why not a str_alloc($size), which would be pointless and thrown away during concat op.)
This has a slight performance improvement on the general case of appending a single byte in a loop (given that zend_string_extend now uses perealloc3) of about 8%. In particular zend_string_extend() will mostly run into the fast path of zend_mm_realloc_heap for huge allocations.
When using str_extend(), appending a single byte in a loop is 33% faster than the old baseline.
The tested loop is:
Specifically hyperfine (x.php being the above test script and y.php being the script, but with "a" directly instead of str_extend()):
I haven't looked at improvements / overhead outside of the synthetic case though (not quite sure what public code to test against) - I could observe some improvements in string processing code though.
Ideally this feature may allow optimizations in JIT eventually whereby lightweight appending can be done with a bare capacity counter for looped string appends, compared against the value initially passed to str_extend(), avoiding repeated string extends.