[8.5] Add locale_is_right_to_left #527

Open: wants to merge 3 commits into base: 1.x
5 changes: 5 additions & 0 deletions src/Php85/Php85.php
@@ -13,6 +13,7 @@

/**
* @author Pierre Ambroise <[email protected]>
* @author Alexander Schranz <[email protected]>
*
* @internal
*/
@@ -33,4 +34,8 @@ public static function get_exception_handler(): ?callable

return $handler;
}

public static function locale_is_right_to_left(string $locale): bool {
return (bool) preg_match('/^(?:ar|he|fa|ur|ps|sd|ug|ckb|yi|dv|ku_arab|ku-arab)(?:[_-].*)?$/i', $locale);
Member

I don't think this is the right implementation. Writing direction is a property of the script, not of the language, and a locale may explicitly specify a script that is not the language's most likely script.
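
To make the objection concrete, here is a minimal sketch that runs the pattern from the diff against locales carrying an explicit script subtag. The helper name `regex_says_rtl` and the sample locales are illustrative assumptions, not part of the pull request. The "expected" column follows the script-based rule described in this thread (Arab is a right-to-left script, Latn is not).

```php
<?php
// Pattern copied verbatim from the proposed polyfill in the diff.
const RTL_LANGUAGE_PATTERN = '/^(?:ar|he|fa|ur|ps|sd|ug|ckb|yi|dv|ku_arab|ku-arab)(?:[_-].*)?$/i';

// Hypothetical helper: what the language-based regex answers for a locale.
function regex_says_rtl(string $locale): bool
{
    return (bool) preg_match(RTL_LANGUAGE_PATTERN, $locale);
}

// Locale => direction expected when the explicit script subtag is honoured.
$cases = [
    'ar'      => true,  // no script subtag; the likely script is Arab
    'ar_Latn' => false, // Arabic written in Latin script
    'az_Arab' => true,  // Azerbaijani written in Arabic script
];

foreach ($cases as $locale => $expected) {
    printf(
        "%-8s regex=%s expected=%s\n",
        $locale,
        var_export(regex_says_rtl($locale), true),
        var_export($expected, true)
    );
}
// ar       regex=true expected=true
// ar_Latn  regex=true expected=false
// az_Arab  regex=false expected=true
```

The last two rows are the cases a language-only check cannot get right: the script subtag overrides whatever the language would suggest.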

Author

@alexander-schranz alexander-schranz May 26, 2025

I see. Any suggestions? Sadly, the original code does not help here: unicode-org/icu@53dcbe6

  • A script is right-to-left according to the CLDR script metadata, which corresponds to whether the script's letters have Bidi_Class=R or AL.
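
As a rough illustration of the Bidi_Class criterion quoted above, the sketch below asks PHP's `IntlChar::charDirection()` for the direction class of single code points. It assumes the intl extension is loaded, which a polyfill cannot rely on, so this only shows what the rule means and is not a proposed implementation. The helper name `char_is_rtl` is hypothetical.

```php
<?php
// Requires ext-intl. Hypothetical helper, for illustration only.
// Returns true when the code point has Bidi_Class R or AL.
function char_is_rtl(int $codepoint): bool
{
    $direction = IntlChar::charDirection($codepoint);

    return $direction === IntlChar::CHAR_DIRECTION_RIGHT_TO_LEFT          // Bidi_Class=R
        || $direction === IntlChar::CHAR_DIRECTION_RIGHT_TO_LEFT_ARABIC;  // Bidi_Class=AL
}

var_dump(char_is_rtl(0x05D0)); // HEBREW LETTER ALEF, class R:  bool(true)
var_dump(char_is_rtl(0x0627)); // ARABIC LETTER ALEF, class AL: bool(true)
var_dump(char_is_rtl(0x0061)); // LATIN SMALL LETTER A, class L: bool(false)
```

Since the polyfill has no access to this data without intl, a hardcoded list of right-to-left script codes is the practical substitute.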

Member

This is what Gemini suggests. Shall we take inspiration from this snippet?

    function locale_is_right_to_left(string $locale): bool
    {
        // Define the list of RTL scripts within the function scope.
        // These are 4-letter ISO 15924 script codes.
        static $rtlScripts = [
            'Adlm', 'Arab', 'Armi', 'Hebr', 'Mani', 'Mend', 'Nkoo',
            'Phnx', 'Rohg', 'Samr', 'Syrc', 'Thaa',
        ];

        // This is a minimal, hardcoded version of the CLDR "likelySubtags" data.
        // It maps a language code to its most likely SCRIPT code.
        // This list is NOT exhaustive and is the primary weakness of this approach.
        static $languageToLikelyRtlScript = [
            'ar' => 'Arab', // Arabic
            'fa' => 'Arab', // Persian (Farsi)
            'ur' => 'Arab', // Urdu
            'ps' => 'Arab', // Pashto
            'sd' => 'Arab', // Sindhi
            'ug' => 'Arab', // Uyghur
            'ckb' => 'Arab', // Sorani Kurdish
            'he' => 'Hebr', // Hebrew
            'yi' => 'Hebr', // Yiddish
            'dv' => 'Thaa', // Dhivehi
            'nqo' => 'Nkoo', // N'Ko
        ];

        if (empty($locale)) {
            return false;
        }

        // Normalize separators and split the locale into parts.
        $localeParts = preg_split('/[_-]/', $locale);
        $language = strtolower($localeParts[0] ?? '');
        $script = null;

        // Look for an explicit script subtag (always 4 letters).
        // Skip the first part: it is the language subtag, never a script.
        foreach (array_slice($localeParts, 1) as $part) {
            if (strlen($part) === 4 && ctype_alpha($part)) {
                // Capitalize the first letter for standard format (e.g., "Arab", "Latn").
                $script = ucfirst(strtolower($part));
                break;
            }
        }

        // If no explicit script was found, infer the likely script from the map declared above.
        if ($script === null) {
            $script = $languageToLikelyRtlScript[$language] ?? null;
        }

        // If we still could not determine a script, we cannot determine the direction.
        // Every language with a known likely RTL script is already covered by the map.
        if ($script === null) {
            return false;
        }

        // Check if the determined script is in our list of RTL scripts.
        return in_array($script, $rtlScripts, true);
    }
}

}
}
4 changes: 4 additions & 0 deletions src/Php85/bootstrap.php
@@ -22,3 +22,7 @@ function get_error_handler(): ?callable { return p\Php85::get_error_handler(); }
if (!function_exists('get_exception_handler')) {
function get_exception_handler(): ?callable { return p\Php85::get_exception_handler(); }
}

if (!function_exists('locale_is_right_to_left')) {
function locale_is_right_to_left(string $locale): bool { return p\Php85::locale_is_right_to_left($locale); }
}
19 changes: 19 additions & 0 deletions tests/Php85/Php85Test.php
@@ -78,6 +78,25 @@ public static function provideHandler()
$handler = new TestHandlerInvokable();
yield [$handler, $handler];
}

public function testLocaleIsRightToLeft(): void
{
$this->assertTrue(locale_is_right_to_left('ar'));
$this->assertTrue(locale_is_right_to_left('he'));
$this->assertTrue(locale_is_right_to_left('fa'));
$this->assertTrue(locale_is_right_to_left('ur'));
$this->assertTrue(locale_is_right_to_left('ps'));
$this->assertTrue(locale_is_right_to_left('sd'));
$this->assertTrue(locale_is_right_to_left('ug'));
$this->assertTrue(locale_is_right_to_left('ckb'));
$this->assertTrue(locale_is_right_to_left('yi'));
$this->assertTrue(locale_is_right_to_left('dv'));
$this->assertTrue(locale_is_right_to_left('ku_arab'));
$this->assertTrue(locale_is_right_to_left('ku-arab'));

$this->assertFalse(locale_is_right_to_left('en'));
$this->assertFalse(locale_is_right_to_left('fr'));
}
}

class TestHandler