Skip to content

Conversation

@SakiTakamachi
Copy link
Member

No description provided.

@SakiTakamachi SakiTakamachi force-pushed the grapheme_levenshtein_emalloc branch from 16fc2df to e669e68 Compare May 1, 2025 12:11
@SakiTakamachi SakiTakamachi force-pushed the grapheme_levenshtein_emalloc branch from e669e68 to 92d3e78 Compare May 1, 2025 12:42
@devnexen
Copy link
Member

devnexen commented May 1, 2025

Not sure I m liking this. Not saying it won t get merged eventually though. Is there an issue with the actual code ?

@SakiTakamachi
Copy link
Member Author

@devnexen
No, there's nothing wrong with the code.
I was hoping to improve performance by reducing the number of memory allocations, since more allocations usually mean more overhead.
But on second thought, speed might not be that important in this context...

@devnexen
Copy link
Member

devnexen commented May 1, 2025

cc @youkidearitai if you want to spin your opinion too.
@SakiTakamachi if you fix the segfaults, I ll have a look later today. Cheers.

@nielsdos
Copy link
Member

nielsdos commented May 1, 2025

If you want to do this, and I doubt this matters a lot for performance, you can do p1 = safe_emalloc(strlen_2 + 1, sizeof(zend_long) * 2, 0); which is much simpler and doesn't require the 32/64 bit split.

@SakiTakamachi
Copy link
Member Author

@nielsdos
Yeah, I just measured it, and it only showed a 1–2% improvement.
I'm thinking of closing this...

@SakiTakamachi
Copy link
Member Author

SakiTakamachi commented May 1, 2025

@nielsdos

you can do p1 = safe_emalloc(strlen_2 + 1, sizeof(zend_long) * 2, 0); which is much simpler and doesn't require the 32/64 bit split.

In this case, wouldn't the maximum allocatable memory be cut in half on a 32-bit system?

edit: This is just a question out of curiosity.

@nielsdos
Copy link
Member

nielsdos commented May 1, 2025

In this case, wouldn't the maximum allocatable memory be cut in half on a 32-bit system?

The total amount of allocated bytes don't change when you do that. If it overflows it wouldn't have fit two allocation either. So how would that change the maximum?

@SakiTakamachi
Copy link
Member Author

@nielsdos
It seems I was overthinking the fact that strlen_2 is signed and ended up drawing the wrong conclusion. You're right.

@SakiTakamachi
Copy link
Member Author

No matter how many times I measure it, the improvement is negligible...
Alright, I'll close this PR. Thanks for reviewing it!

Benchmark 1: /php-dev/sapi/cli/php /mount/intl/1.php
  Time (mean ± σ):     614.4 ms ±   9.8 ms    [User: 601.0 ms, System: 7.2 ms]
  Range (min … max):   605.3 ms … 637.4 ms    10 runs
 
Benchmark 2: /master/sapi/cli/php /mount/intl/1.php
  Time (mean ± σ):     618.0 ms ±  13.8 ms    [User: 605.5 ms, System: 6.3 ms]
  Range (min … max):   605.8 ms … 648.2 ms    10 runs
 
Summary
  '/php-dev/sapi/cli/php /mount/intl/1.php' ran
    1.01 ± 0.03 times faster than '/master/sapi/cli/php /mount/intl/1.php'

@SakiTakamachi SakiTakamachi deleted the grapheme_levenshtein_emalloc branch May 1, 2025 13:37
@youkidearitai
Copy link
Contributor

@SakiTakamachi If you improve performance this function, I like improve how to use ICU library.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants