Commit 57a593f

[libc++] Remove obsolete locale-specific regex tests (llvm#159590)
After a recent macOS update, several of the locale-specific regex tests started failing. These tests were mainly testing two locale-specific features of regular expressions:

- A character class like `[=x=]` matches any character that is considered equivalent to `x` according to the collation rules of the current locale.
- A character class like `[[.ch.]]` matches anything that is equivalent to `ch` (whether as two letters or as a single collation element) in the current locale.

However, these tests relied on platform-specific localization data; specifically, they only worked with older macOS localization data. As the numerous XFAILs show, most mainstream platforms didn't actually pass these tests. After the macOS update, macOS itself doesn't pass them anymore either.

I looked at whether there are locales where these tests would still make sense, and I couldn't find any. I am not a localization expert, but it appears that only legacy locales like the traditional Spanish locale (which isn't commonly shipped on systems anymore) consider `[.ch.]` to be a single collation element.

Therefore, it seems that the locale-specific part of these tests is not relevant anymore, and this patch removes them. The patch also moves some tests for equivalence classes inside character classes to their non-locale-specific counterparts, since that feature was not covered there.

Finally, the lookup_collatename.pass.cpp test was fixed by removing an assertion that `ch` is a collation element in the CZ locale, which no longer seems to be the case in recent localization data (and appears to be the root cause of about half the failures in these tests).
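As a rough illustration of the two features described above, the following sketch exercises an equivalence class inside a bracket expression and a collating-element lookup in the default locale. The helper names `matches_a_m_z` and `collating_element` are hypothetical and not part of the patch; the pattern `[a[=m=]z]` is the one used by the tests this commit adds. The case-sensitive negative check from the tests (`[=M=]` against `m`) is deliberately left out here, on the assumption that it may depend on the standard library implementation.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if `text` as a whole matches [a[=m=]z],
// i.e. it is 'a', 'z', or a character equivalent to 'm'.
// The awk grammar mirrors the flag used in awk.pass.cpp.
bool matches_a_m_z(const std::string& text) {
    static const std::regex re("[a[=m=]z]", std::regex_constants::awk);
    return std::regex_match(text, re);
}

// Hypothetical helper: asks the regex traits (default-constructed, so using
// the current global locale) whether `name` names a collating element.
// An empty result means "not a collating element".
std::string collating_element(const std::string& name) {
    std::regex_traits<char> traits;
    return traits.lookup_collatename(name.begin(), name.end());
}
```

Calling `collating_element("ch")` is the kind of query the removed assertion in lookup_collatename.pass.cpp made for the CZ locale; its result depends on the platform's localization data, which is why no expected value is given for it here.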
1 parent 46ce6a0 commit 57a593f

17 files changed: 360 additions, 1023 deletions
libcxx/test/std/re/re.alg/re.alg.match/awk.locale.pass.cpp

Lines changed: 0 additions & 136 deletions
This file was deleted.

libcxx/test/std/re/re.alg/re.alg.match/awk.pass.cpp

Lines changed: 46 additions & 0 deletions
@@ -573,6 +573,29 @@ int main(int, char**)
         assert(m.position(0) == 0);
         assert(m.str(0) == s);
     }
+    {
+        std::cmatch m;
+        const char s[] = "m";
+        assert(std::regex_match(s, m,
+                      std::regex("[a[=m=]z]", std::regex_constants::awk)));
+        assert(m.size() == 1);
+        assert(!m.prefix().matched);
+        assert(m.prefix().first == s);
+        assert(m.prefix().second == m[0].first);
+        assert(!m.suffix().matched);
+        assert(m.suffix().first == m[0].second);
+        assert(m.suffix().second == m[0].second);
+        assert((std::size_t)m.length(0) == std::char_traits<char>::length(s));
+        assert(m.position(0) == 0);
+        assert(m.str(0) == s);
+    }
+    {
+        std::cmatch m;
+        const char s[] = "m";
+        assert(!std::regex_match(s, m,
+                      std::regex("[a[=M=]z]", std::regex_constants::awk)));
+        assert(m.size() == 0);
+    }
     {
         std::cmatch m;
         const char s[] = "-";
@@ -1215,6 +1238,29 @@ int main(int, char**)
         assert(m.position(0) == 0);
         assert(m.str(0) == s);
     }
+    {
+        std::wcmatch m;
+        const wchar_t s[] = L"m";
+        assert(std::regex_match(s, m, std::wregex(L"[a[=m=]z]",
+                                                 std::regex_constants::awk)));
+        assert(m.size() == 1);
+        assert(!m.prefix().matched);
+        assert(m.prefix().first == s);
+        assert(m.prefix().second == m[0].first);
+        assert(!m.suffix().matched);
+        assert(m.suffix().first == m[0].second);
+        assert(m.suffix().second == m[0].second);
+        assert((std::size_t)m.length(0) == std::char_traits<wchar_t>::length(s));
+        assert(m.position(0) == 0);
+        assert(m.str(0) == s);
+    }
+    {
+        std::wcmatch m;
+        const wchar_t s[] = L"m";
+        assert(!std::regex_match(s, m, std::wregex(L"[a[=M=]z]",
+                                                 std::regex_constants::awk)));
+        assert(m.size() == 0);
+    }
     {
         std::wcmatch m;
         const wchar_t s[] = L"-";

libcxx/test/std/re/re.alg/re.alg.match/basic.locale.pass.cpp

Lines changed: 0 additions & 125 deletions
This file was deleted.

libcxx/test/std/re/re.alg/re.alg.match/basic.pass.cpp

Lines changed: 46 additions & 0 deletions
@@ -575,6 +575,29 @@ int main(int, char**)
         assert(m.position(0) == 0);
         assert(m.str(0) == s);
     }
+    {
+        std::cmatch m;
+        const char s[] = "m";
+        assert(std::regex_match(s, m, std::regex("[a[=m=]z]",
+                                                 std::regex_constants::basic)));
+        assert(m.size() == 1);
+        assert(!m.prefix().matched);
+        assert(m.prefix().first == s);
+        assert(m.prefix().second == m[0].first);
+        assert(!m.suffix().matched);
+        assert(m.suffix().first == m[0].second);
+        assert(m.suffix().second == m[0].second);
+        assert(m.length(0) >= 0 && static_cast<std::size_t>(m.length(0)) == std::char_traits<char>::length(s));
+        assert(m.position(0) == 0);
+        assert(m.str(0) == s);
+    }
+    {
+        std::cmatch m;
+        const char s[] = "m";
+        assert(!std::regex_match(s, m, std::regex("[a[=M=]z]",
+                                                 std::regex_constants::basic)));
+        assert(m.size() == 0);
+    }
     {
         std::cmatch m;
         const char s[] = "-";
@@ -1203,6 +1226,29 @@ int main(int, char**)
         assert(m.position(0) == 0);
         assert(m.str(0) == s);
     }
+    {
+        std::wcmatch m;
+        const wchar_t s[] = L"m";
+        assert(std::regex_match(s, m, std::wregex(L"[a[=m=]z]",
+                                                 std::regex_constants::basic)));
+        assert(m.size() == 1);
+        assert(!m.prefix().matched);
+        assert(m.prefix().first == s);
+        assert(m.prefix().second == m[0].first);
+        assert(!m.suffix().matched);
+        assert(m.suffix().first == m[0].second);
+        assert(m.suffix().second == m[0].second);
+        assert(m.length(0) >= 0 && static_cast<std::size_t>(m.length(0)) == std::char_traits<wchar_t>::length(s));
+        assert(m.position(0) == 0);
+        assert(m.str(0) == s);
+    }
+    {
+        std::wcmatch m;
+        const wchar_t s[] = L"m";
+        assert(!std::regex_match(s, m, std::wregex(L"[a[=M=]z]",
+                                                 std::regex_constants::basic)));
+        assert(m.size() == 0);
+    }
     {
         std::wcmatch m;
         const wchar_t s[] = L"-";
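The narrow and wide test blocks added in these hunks repeat the same set of checks on the match results. Purely as a reading aid, those checks could be gathered into one template. This is a sketch: the name `is_clean_full_match` is hypothetical, and the patch itself keeps the assertions inline in each test.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: true when `re` matches the whole of `s` and the
// match_results object has the shape the added tests assert: exactly one
// sub-match, empty unmatched prefix and suffix, and a match that starts at
// position 0 and spans the entire input.
template <class CharT>
bool is_clean_full_match(const CharT* s, const std::basic_regex<CharT>& re) {
    std::match_results<const CharT*> m;
    if (!std::regex_match(s, m, re))
        return false;
    const std::size_t len = std::char_traits<CharT>::length(s);
    return m.size() == 1
        && !m.prefix().matched
        && m.prefix().first == s
        && m.prefix().second == m[0].first
        && !m.suffix().matched
        && m.suffix().first == m[0].second
        && m.suffix().second == m[0].second
        && m.length(0) >= 0
        && static_cast<std::size_t>(m.length(0)) == len
        && m.position(0) == 0
        && m.str(0) == s;
}
```

With such a helper, the positive narrow case would read `assert(is_clean_full_match("m", std::regex("[a[=m=]z]", std::regex_constants::basic)));`, and the wide case would use `std::wregex` with `L"..."` literals in the same way.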
