Skip to content
Closed
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
28 changes: 5 additions & 23 deletions llvm/lib/Support/Windows/Path.inc
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@
#include "llvm/Support/ConvertUTF.h"
#include "llvm/Support/WindowsError.h"
#include <fcntl.h>
#include <filesystem>
#include <sys/stat.h>
#include <sys/types.h>

Expand Down Expand Up @@ -1373,29 +1374,10 @@ std::error_code closeFile(file_t &F) {
}

std::error_code remove_directories(const Twine &path, bool IgnoreErrors) {
// Convert to utf-16.
SmallVector<wchar_t, 128> Path16;
std::error_code EC = widenPath(path, Path16);
if (EC && !IgnoreErrors)
return EC;

// SHFileOperation() accepts a list of paths, and so must be double null-
// terminated to indicate the end of the list. The buffer is already null
// terminated, but since that null character is not considered part of the
// vector's size, pushing another one will just consume that byte. So we
// need to push 2 null terminators.
Path16.push_back(0);
Path16.push_back(0);

SHFILEOPSTRUCTW shfos = {};
shfos.wFunc = FO_DELETE;
shfos.pFrom = Path16.data();
shfos.fFlags = FOF_NO_UI;

int result = ::SHFileOperationW(&shfos);
if (result != 0 && !IgnoreErrors)
return mapWindowsError(result);
return std::error_code();
const std::filesystem::path Path(path.str());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, passing paths like this won't do the right thing.

Within LLVM, paths in the usual form with narrow chars is assumed to be UTF-8; for the existing Windows implementations here, it is converted to UTF-16 into wchar_t form with widenPath above.

However, in std::filesystem, a narrow char path is assumed to be in the system active code page (ACP). Unless the ACP is set to UTF-8, only plain ASCII path names would work correctly.

There is a std::filesystem::u8path() function which takes char input and interprets it as UTF-8, which we could use instead. That one is deprecated since C++20 though (because in C++20, there's a char8_t data type to natively signal UTF8 strings).

I guess it's simplest to just keep the existing widenPath call first and then pass the path from that to std::filesystem though, but I don't have a very strong opinion on it - u8path() might be ok.

(At least this is how MS STL and libc++ implements std::filesystem. I haven't observed libstdc++'s implementation much at all; I'm not sure how mature it is on Windows.)

Copy link
Contributor Author

@slydiman slydiman Dec 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated this patch to use widenPath() and tested it with non-ASCII symbols locally.

std::error_code EC;
std::filesystem::remove_all(Path, EC);
return IgnoreErrors ? std::error_code() : EC;
}

static void expandTildeExpr(SmallVectorImpl<char> &Path) {
Expand Down
Loading