GH-48721: [C++] Add test for file creation with UTF-8 filenames #48722
pitrou merged 3 commits into apache:main
// Test that file operations work with valid UTF-8 filenames.
// On Windows, PlatformFilename::FromString() converts UTF-8 strings to wide strings.
// On Unix, filenames are treated as opaque byte strings.
std::string utf8_file_name = "test_file_한국어_😀.txt";
FYI, 한국어 is "Korean" in Korean :-)
pitrou
left a comment
Thanks for contributing this, @HyukjinKwon. Looks good in general, just one comment.
Also, can you please trim down the PR description?
I made the PR description shorter. Hopefully this one is easier to follow.
Ah, the test failure is not related to this change.
pitrou
left a comment
+1, I'll just wait for CI and then merge. Thank you, @HyukjinKwon!
After merging your PR, Conbench analyzed the 3 benchmarking runs that have been run so far on merge-commit 02e37e2. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 4 possible false positives for unstable benchmarks that are known to sometimes produce them.
Rationale for this change
Commit 4937d9f (ARROW-5102) added a TODO comment requesting a test with valid UTF-8 filenames. Later, commit eb23ea9 (ARROW-5648) introduced the UTF-8 to UTF-16 conversion logic on Windows, which should fix the issue.
Essentially we should add a test for:
arrow/cpp/src/arrow/util/io_util.cc
Lines 143 to 149 in 727106f
StringToNative()). This test complements the existing FileNameWideCharConversionRangeException test (invalid UTF-8).
What changes are included in this PR?
This PR adds the test described above.
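For readers without the diff at hand, the idea behind such a test can be sketched with the standard library alone. This is a rough illustration, not the test added in this PR: the helper name `CanCreateFileWithName` and the directory name `utf8_filename_sketch` are hypothetical, and it uses `std::filesystem` rather than Arrow's `PlatformFilename`. It assumes a POSIX-like platform where the narrow string reaches the OS as opaque bytes.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not part of Arrow): creates `file_name` inside a fresh
// temporary directory, writes a short payload, reads it back, and cleans up.
// Returns true only if every step succeeds.
bool CanCreateFileWithName(const std::string& file_name) {
  assert(!file_name.empty());

  std::error_code ec;
  const fs::path tmp = fs::temp_directory_path(ec);
  if (ec) return false;

  // Start from an empty scratch directory so earlier runs cannot interfere.
  const fs::path dir = tmp / "utf8_filename_sketch";
  fs::remove_all(dir, ec);
  fs::create_directories(dir, ec);
  if (ec) return false;

  // Assumption: on POSIX the bytes of `file_name` are used unchanged.
  // On Windows the name would first have to be converted to a wide string.
  const fs::path file_path = dir / file_name;
  const std::string payload = "data";

  bool ok = false;
  {
    std::ofstream out(file_path, std::ios::binary);
    ok = out.is_open() && static_cast<bool>(out << payload);
  }  // The stream is closed here, before the file is reopened for reading.

  std::string read_back;
  if (ok) {
    std::ifstream in(file_path, std::ios::binary);
    ok = in.is_open() && static_cast<bool>(in >> read_back) &&
         read_back == payload;
  }

  fs::remove_all(dir, ec);
  return ok;
}
```

A caller would pass the same name used in the PR, for example `CanCreateFileWithName("test_file_한국어_😀.txt")`, and expect `true` on a filesystem that accepts arbitrary UTF-8 byte sequences.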
Are these changes tested?
Yes, a unit test was added.
Are there any user-facing changes?
No, test-only.