fix: use mb_strlen for multibyte characters in LineLength sniff by dmzoneill · Pull Request #3971 · squizlabs/PHP_CodeSniffer

dmzoneill · 2026-02-11T18:29:26Z

Summary

Changed Generic.Files.LineLength to use mb_strlen() instead of strlen() when calculating comment line lengths, so that multibyte UTF-8 characters are counted as characters rather than as bytes.

Changes

Replaced strlen() with mb_strlen(), passing UTF-8 as the encoding.
Replaced strrpos() with mb_strrpos(), passing UTF-8 as the encoding.
Added a fallback to the byte-based strlen() and strrpos() when the mb_* functions are not available.
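
The fallback described above could look roughly like the sketch below. This is not the code from the PR diff: the helper names lineLength() and lastSpacePosition() are hypothetical and exist only to illustrate the pattern of checking for the mbstring functions before using them.

```php
<?php
// Hypothetical helpers, not the sniff's actual code.
// Each one prefers the multibyte-aware function and falls back to the
// byte-based equivalent when the mbstring extension is not loaded.

function lineLength(string $content): int
{
    if (function_exists('mb_strlen') === true) {
        // Counts characters, assuming the content is UTF-8.
        return mb_strlen($content, 'UTF-8');
    }

    // Counts bytes; multibyte characters are overcounted here.
    return strlen($content);
}

// Returns the offset of the last space, or false when there is none.
function lastSpacePosition(string $content)
{
    if (function_exists('mb_strrpos') === true) {
        // Offset is measured in characters.
        return mb_strrpos($content, ' ', 0, 'UTF-8');
    }

    // Offset is measured in bytes.
    return strrpos($content, ' ');
}
```

Note that the two branches only agree on pure ASCII input. When the fallback is taken, lengths and offsets are byte-based again, so the original overcounting would still occur on installations without mbstring.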

Test

Tested with the reproduction case from the issue: Norwegian text containing å, æ, and ø is now reported as 32 characters instead of 35 bytes.
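
To make the byte/character gap concrete, here is a small illustration. The comment string is a made-up example, not the reproduction case from the issue, and the snippet assumes the file is saved as UTF-8 and that the mbstring extension is loaded.

```php
<?php
// Hypothetical sample line; å, æ and ø each take two bytes in UTF-8.
$comment = '// blåbærsyltetøy';

$byteLength = strlen($comment);             // 20: counts bytes
$charLength = mb_strlen($comment, 'UTF-8'); // 17: counts characters

printf("bytes: %d, characters: %d\n", $byteLength, $charLength);
```

The three-unit difference comes from the three two-byte letters, the same size of gap as the 35 versus 32 reported above.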

…squizlabs#3923) The Generic.Files.LineLength sniff was using strlen() to calculate comment line lengths, which counts bytes instead of characters. This caused false positives when comments contained multibyte UTF-8 characters like Norwegian letters (å, æ, ø). Changed to use mb_strlen() and mb_strrpos() with UTF-8 encoding when available, falling back to strlen() and strrpos() when the mb_* functions are not available.

jrfnl closed this Feb 11, 2026

dmzoneill wants to merge 1 commit into squizlabs:master from dmzoneill-forks:fix/issue-3923-multibyte-line-length
