Closed as not planned
Description
(All tests were run with the online tool https://onlinephp.io/)
The following code:
<?php
// > This is a \PDOException message in Russian that roughly means `Server is not responding` (that is, changing the server configuration is not a solution)
$str = base64_decode('U1FMU1RBVEVbSFkwMDBdIFsyMDAyXSDP7uTq6/735e3o5SDt5SDz8fLg7e7i6+Xt7iwg8i7qLiDq7u3l9+376SDq7uzv/P7y5fAg7vLi5fDjIOfg7/Du8SDt4CDv7uTq6/735e3o5S4NCiAoU1FMOiBTRVQgRk9SRUlHTl9LRVlfQ0hFQ0tTPTA7KQ==');
// > and it looks like
// ###
// Warning: Your output contains characters that could not be displayed. Make sure you encode the output when working with special characters or binary data. [Click here for an example on how to do this](https://onlinephp.io/code/utf8-in-the-sandbox)
// SQLSTATE[HY000] [2002] ����������� �� �����������, �.�. �������� ��������� ������ ������ �� �����������.
// (SQL: SET FOREIGN_KEY_CHECKS=0;)
// ###
$mbListEncodings = mb_list_encodings();
$detect = mb_detect_encoding($str);
// PHP_VERSION_ID < 80300 -> 'UTF-8'
// PHP_VERSION_ID >= 80300 -> 'ASCII'
var_dump($detect);
$detect2 = mb_detect_encoding($str, $mbListEncodings, true);
// $detect2 = mb_detect_encoding($str, $mbListEncodings); // > Not quite the same result: without $strict = true it may return 'ASCII' when none of the results below applies, whereas with $strict = true it returns false in that case
// PHP_VERSION_ID < 80100 -> 'ISO-8859-1'
// PHP_VERSION_ID >= 80100 -> 'Windows-1252'
var_dump($detect2);
// > By accident IT WORKS HERE, but only for PHP_VERSION_ID >= 80100
array_unshift($mbListEncodings, 'CP1251');
array_unshift($mbListEncodings, 'Windows-1251');
$detect3 = mb_detect_encoding($str, $mbListEncodings, true);
// PHP_VERSION_ID < 80100 -> 'ISO-8859-1' // > !!! looks like an old bug
// PHP_VERSION_ID >= 80100 -> 'Windows-1251'
var_dump($detect3);
$cpDetectedWrong = 'Windows-1252';
$converted = mb_convert_encoding($str, 'UTF-8', $cpDetectedWrong);
$converted_b64 = base64_encode($converted);
var_dump($converted); // string(207) "SQLSTATE[HY000] [2002] Ïîäêëþ÷åíèå íå óñòàíîâëåíî, ò.ê. êîíå÷íûé êîìïüþòåð îòâåðã çàïðîñ íà ïîäêëþ÷åíèå.
// (SQL: SET FOREIGN_KEY_CHECKS=0;)"
var_dump($converted_b64); // string(276) "U1FMU1RBVEVbSFkwMDBdIFsyMDAyXSDDj8Ouw6TDqsOrw77Dt8Olw63DqMOlIMOtw6Ugw7PDscOyw6DDrcOuw6LDq8Olw63Driwgw7Iuw6ouIMOqw67DrcOlw7fDrcO7w6kgw6rDrsOsw6/DvMO+w7LDpcOwIMOuw7LDosOlw7DDoyDDp8Ogw6/DsMOuw7Egw63DoCDDr8Ouw6TDqsOrw77Dt8Olw63DqMOlLg0KIChTUUw6IFNFVCBGT1JFSUdOX0tFWV9DSEVDS1M9MDsp"
But I expected this output instead:
<?php
$detect = mb_detect_encoding($str, mb_list_encodings(), true);
var_dump($detect); // 'Windows-1251'
$cpDetectedCorrect = 'Windows-1251';
$converted = mb_convert_encoding($str, 'UTF-8', $cpDetectedCorrect);
$converted_b64 = base64_encode($converted);
var_dump($converted); // string(207) "SQLSTATE[HY000] [2002] Подключение не установлено, т.к. конечный компьютер отверг запрос на подключение.
// (SQL: SET FOREIGN_KEY_CHECKS=0;)"
var_dump($converted_b64); // string(276) "U1FMU1RBVEVbSFkwMDBdIFsyMDAyXSDQn9C+0LTQutC70Y7Rh9C10L3QuNC1INC90LUg0YPRgdGC0LDQvdC+0LLQu9C10L3Qviwg0YIu0LouINC60L7QvdC10YfQvdGL0Lkg0LrQvtC80L/RjNGO0YLQtdGAINC+0YLQstC10YDQsyDQt9Cw0L/RgNC+0YEg0L3QsCDQv9C+0LTQutC70Y7Rh9C10L3QuNC1Lg0KIChTUUw6IFNFVCBGT1JFSUdOX0tFWV9DSEVDS1M9MDsp"
I've tried using mb_check_encoding(). I've played for a few hours with mb_detect_order() and mb_list_encodings(). I've even tried splitting the known encodings into groups by first letter or slug and applying mb_convert_encoding() to each group for better detection.
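For context, here is a minimal sketch of why mb_check_encoding() alone cannot separate the two code pages. The `$bytes` sample is an assumption made for illustration: it is only the first word of the message above, written out as raw bytes. The variable names are hypothetical as well.

```php
<?php
// Assumed sample: the first word of the message above as raw single-byte data.
$bytes = "\xCF\xEE\xE4\xEA\xEB\xFE\xF7\xE5\xED\xE8\xE5";

// Ask mbstring whether the bytes are valid in each candidate encoding.
$results = [];
foreach (['UTF-8', 'Windows-1251', 'Windows-1252'] as $encoding) {
    $results[$encoding] = mb_check_encoding($bytes, $encoding);
}
var_dump($results);

// Both single-byte code pages decode the same bytes without error,
// they just map them to different characters.
$asCyrillic = mb_convert_encoding($bytes, 'UTF-8', 'Windows-1251');
$asLatin    = mb_convert_encoding($bytes, 'UTF-8', 'Windows-1252');
var_dump($asCyrillic, $asLatin);
```

Since every byte in this range is defined in both Windows-1251 and Windows-1252, a validity check passes for either one, so any choice between them can only be a guess based on how plausible the decoded text looks.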
No. It just doesn't work, and has to be fixed like this:
<?php
set_exception_handler(function ($e) {
    $phpMessage = $e->getMessage();
    if ($e instanceof \PDOException) {
        $isUtf8 = preg_match('//u', $phpMessage) === 1;
        if (! $isUtf8) {
            $isWindows = (strtoupper(substr(PHP_OS, 0, 3)) === 'WIN');
            if ($isWindows) {
                $phpMessage = mb_convert_encoding($phpMessage, 'UTF-8', 'CP1251');
            }
        }
    }
    /// ...code
});
PHP Version
PHP 8.4
Operating System
Windows 10