@@ -94,31 +94,34 @@ static inline path operator+(path p1, path p2)
 
 /**
  * Convert path object to byte string. On POSIX, paths natively are byte
- * strings so this is trivial. On Windows, paths natively are Unicode, so an
- * encoding step is necessary.
+ * strings, so this is trivial. On Windows, paths natively are Unicode, so an
+ * encoding step is necessary. The inverse of \ref PathToString is \ref
+ * PathFromString. The strings returned and parsed by these functions can be
+ * used to call POSIX APIs, and for roundtrip conversion, logging, and
+ * debugging.
  *
- * The inverse of \ref PathToString is \ref PathFromString. The strings
- * returned and parsed by these functions can be used to call POSIX APIs, and
- * for roundtrip conversion, logging, and debugging. But they are not
- * guaranteed to be valid UTF-8, and are generally meant to be used internally,
- * not externally. When communicating with external programs and libraries that
- * require UTF-8, fs::path::u8string() and fs::u8path() methods can be used.
- * For other applications, if support for non UTF-8 paths is required, or if
- * higher-level JSON or XML or URI or C-style escapes are preferred, it may be
- * also be appropriate to use different path encoding functions.
- *
- * Implementation note: On Windows, the std::filesystem::path(string)
- * constructor and std::filesystem::path::string() method are not safe to use
- * here, because these methods encode the path using C++'s narrow multibyte
- * encoding, which on Windows corresponds to the current "code page", which is
- * unpredictable and typically not able to represent all valid paths. So
- * std::filesystem::path::u8string() and std::filesystem::u8path() functions
- * are used instead on Windows. On POSIX, u8string/u8path functions are not
- * safe to use because paths are not always valid UTF-8, so plain string
- * methods which do not transform the path there are used.
+ * Because \ref PathToString and \ref PathFromString functions don't specify an
+ * encoding, they are meant to be used internally, not externally. They are not
+ * appropriate to use in applications requiring UTF-8, where
+ * fs::path::u8string() and fs::u8path() methods should be used instead. Other
+ * applications could require still different encodings. For example, JSON, XML,
+ * or URI applications might prefer to use higher level escapes (\uXXXX or
+ * &XXXX; or %XX) instead of multibyte encoding. Rust, Python, Java applications
+ * may require encoding paths with their respective UTF-8 derivatives WTF-8,
+ * PEP-383, and CESU-8 (see https://en.wikipedia.org/wiki/UTF-8#Derivatives).
  */
 static inline std::string PathToString(const path& path)
 {
+    // Implementation note: On Windows, the std::filesystem::path(string)
+    // constructor and std::filesystem::path::string() method are not safe to
+    // use here, because these methods encode the path using C++'s narrow
+    // multibyte encoding, which on Windows corresponds to the current "code
+    // page", which is unpredictable and typically not able to represent all
+    // valid paths. So std::filesystem::path::u8string() and
+    // std::filesystem::u8path() functions are used instead on Windows. On
+    // POSIX, u8string/u8path functions are not safe to use because paths are
+    // not always valid UTF-8, so plain string methods which do not transform
+    // the path there are used.
 #ifdef WIN32
     return path.u8string();
 #else
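
For reference, the platform split described in the implementation note can be sketched against plain `std::filesystem`. This is a minimal illustration under stated assumptions, not the project's code: the names `PathToStringSketch`, `PathFromStringSketch`, and `CheckRoundTrip` are hypothetical, `_WIN32` stands in for the `WIN32` macro used in the diff, and the POSIX branch is assumed to use the plain string methods, as the comment says.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

// Hypothetical stand-in for PathToString; not the project's implementation.
inline std::string PathToStringSketch(const std::filesystem::path& p)
{
#ifdef _WIN32
    // Native paths are Unicode here, so encode as UTF-8 rather than relying on
    // the current code page. Copying through iterators works whether
    // u8string() yields std::string (C++17) or std::u8string (C++20).
    const auto u8 = p.u8string();
    return std::string(u8.begin(), u8.end());
#else
    // Native paths are already byte strings; return them untransformed.
    return p.string();
#endif
}

// Hypothetical stand-in for PathFromString, the inverse of the function above.
inline std::filesystem::path PathFromStringSketch(const std::string& s)
{
#ifdef _WIN32
    // Interpret the bytes as UTF-8 (u8path is deprecated in C++20 but still available).
    return std::filesystem::u8path(s);
#else
    // Store the bytes as-is, even if they are not valid UTF-8.
    return std::filesystem::path(s);
#endif
}

// Hypothetical self-check: asserts that string -> path -> string is lossless.
inline void CheckRoundTrip(const std::string& s)
{
    assert(PathToStringSketch(PathFromStringSketch(s)) == s);
}
```

On POSIX, a call such as `CheckRoundTrip("dir/\xff\xfe.dat")` should pass even though the bytes are not valid UTF-8, which is the case the comment warns `u8string`/`u8path` cannot be trusted with.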
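
The doc comment mentions that URI-style applications might prefer `%XX` escapes over a multibyte encoding. A rough sketch of what that could look like for a path byte string follows. `PercentEscapePathSketch` is a hypothetical helper that does not appear in the patch, and leaving `/` unescaped as a separator is an assumption made for illustration.

```cpp
#include <string>

// Hypothetical helper, not part of the patch: percent-escape every byte that
// is not an RFC 3986 "unreserved" character. '/' is also kept (assumption) so
// the path structure stays readable.
inline std::string PercentEscapePathSketch(const std::string& bytes)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : bytes) {
        const bool keep = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                          (c >= '0' && c <= '9') || c == '-' || c == '.' ||
                          c == '_' || c == '~' || c == '/';
        if (keep) {
            out.push_back(static_cast<char>(c));
        } else {
            // Emit the byte as %XX using its two hex digits.
            out.push_back('%');
            out.push_back(hex[c >> 4]);
            out.push_back(hex[c & 0x0F]);
        }
    }
    return out;
}
```

Because the escape operates on raw bytes, it represents non-UTF-8 paths without loss, which is why such an encoding might suit external consumers better than the internal byte string.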