1563 | 1563 | </style> |
1564 | 1564 | <meta content="Bikeshed version 4416b18d5, updated Tue Jan 2 15:52:39 2024 -0800" name="generator"> |
1565 | 1565 | <link href="https://isocpp.org/favicon.ico" rel="icon"> |
1566 | | - <meta content="13d7c997af9d4ef75ba4d5addd710fe9c58a4268" name="revision"> |
| 1566 | + <meta content="36253156092910ff21aa2d4b2a5b9a7ec762c0bf" name="revision"> |
1567 | 1567 | <style>/* Boilerplate: style-autolinks */ |
1568 | 1568 | .css.css, .property.property, .descriptor.descriptor { |
1569 | 1569 | color: var(--a-normal-text); |
@@ -2136,6 +2136,12 @@ <h2 class="heading settled" data-level="2" id="motivation"><span class="secno">2 |
2136 | 2136 | <blockquote> |
2137 | 2137 | <p>the file system treats path and file names as an opaque sequence of <code class="highlight"><c- n>WCHAR</c-></code>s</p> |
2138 | 2138 | </blockquote> |
| 2139 | + <p>This is also true on POSIX (<a data-link-type="biblio" href="#biblio-pep383" title="PEP 383 – Non-decodable Bytes in System Character Interfaces">[PEP383]</a>):</p> |
| 2140 | + <blockquote> |
| 2141 | + <p>File names, environment variables, and command line arguments are defined as |
| 2142 | +being character data in POSIX; the C APIs however allow passing arbitrary |
| 2143 | +bytes - whether these conform to a certain encoding or not.</p> |
| 2144 | + </blockquote> |
2139 | 2145 | <p>Arbitrary paths are formatted on POSIX such that there is no data loss. |
2140 | 2146 | Unfortunately this is not the case on Windows, for example:</p> |
2141 | 2147 | <pre class="language-c++ highlight"><c- k>auto</c-> <c- n>p1</c-> <c- o>=</c-> <c- n>std</c-><c- o>::</c-><c- n>filesystem</c-><c- o>::</c-><c- n>path</c-><c- p>(</c->L<c- s>"</c-><c- se>\xD800</c-><c- s>"</c-><c- p>);</c-> <c- c1>// a lone surrogate</c-> |
@@ -2186,7 +2192,22 @@ <h2 class="heading settled" data-level="3" id="proposal"><span class="secno">3. |
2186 | 2192 | <pre class="highlight"><c- s>"</c-><c- se>\xED\xA0\x81</c-><c- s>"</c-> |
2187 | 2193 | </pre> |
2188 | 2194 | </table> |
2189 | | - <p>TODO</p> |
| 2195 | + <p>At the same time, this preserves the observable behavior of <code class="highlight"><c- n>std</c-><c- o>::</c-><c- n>print</c-></code> when printing to a terminal. For example:</p> |
| 2196 | +<pre class="language-c++ highlight"><c- n>std</c-><c- o>::</c-><c- n>print</c-><c- p>(</c-><c- s>"{}</c-><c- se>\n</c-><c- s>"</c-><c- p>,</c-> <c- n>std</c-><c- o>::</c-><c- n>filesystem</c-><c- o>::</c-><c- n>path</c-><c- p>(</c->L<c- s>"</c-><c- se>\xD800</c-><c- s>"</c-><c- p>));</c-> |
| 2197 | +</pre> |
| 2198 | + <p>will still print</p> |
| 2199 | +<pre class="highlight">� |
| 2200 | +</pre> |
| 2201 | + <p>on implementations that follow the recommended practice from <a href="https://eel.is/c++draft/ostream.formatted.print">[ostream.formatted.print]</a>:</p> |
| 2202 | + <blockquote> |
| 2203 | + <p><em>Recommended practice</em>: For <code class="highlight"><c- n>vprint_unicode</c-></code>, if invoking the native Unicode |
| 2204 | +API requires transcoding, implementations should substitute invalid code |
| 2205 | +units with U+FFFD REPLACEMENT CHARACTER per the Unicode Standard, Chapter 3.9 |
| 2206 | +U+FFFD Substitution in Conversion.</p> |
| 2207 | + </blockquote> |
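The substitution described in the quoted recommended practice can be sketched as follows. This is only an illustration, not the wording or implementation from the paper or from any standard library: `to_utf8_lossy` is a hypothetical helper name, and the sketch assumes the input is a sequence of UTF-16 code units in which each unpaired surrogate counts as one invalid code unit.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper (not part of the paper): converts UTF-16 to UTF-8,
// replacing every unpaired surrogate with U+FFFD REPLACEMENT CHARACTER.
std::string to_utf8_lossy(std::u16string_view in) {
  std::string out;
  for (std::size_t i = 0; i < in.size(); ++i) {
    char32_t cp = in[i];
    if (cp >= 0xD800 && cp <= 0xDFFF) {
      // A high surrogate followed by a low surrogate forms a valid pair.
      bool paired = cp <= 0xDBFF && i + 1 < in.size() &&
                    in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF;
      if (paired) {
        cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
        ++i;  // the low surrogate has been consumed
      } else {
        cp = 0xFFFD;  // lone surrogate: substitute, losing the original value
      }
    }
    // Standard UTF-8 encoding of the resulting scalar value.
    if (cp < 0x80) {
      out += static_cast<char>(cp);
    } else if (cp < 0x800) {
      out += static_cast<char>(0xC0 | (cp >> 6));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
      out += static_cast<char>(0xE0 | (cp >> 12));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
      out += static_cast<char>(0xF0 | (cp >> 18));
      out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
  }
  return out;
}
```

With this sketch, a lone surrogate such as 0xD800 becomes the three bytes EF BF BD, which is the UTF-8 encoding of U+FFFD. The original code unit cannot be recovered from that output.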
| 2208 | + <p>WTF-8 is used to handle invalid UTF-16 in Rust (<a data-link-type="biblio" href="#biblio-rust-osstring" title="OsString Struct. The Rust Standard Library.">[RUST-OSSTRING]</a>) and in libuv, the I/O library used by
| 2209 | +Node.js (<a data-link-type="biblio" href="#biblio-libuv" title="Miscellaneous utilities. libuv Documentation.">[LIBUV]</a>). Python also handles this case, but with a different mechanism
| 2210 | +(<a data-link-type="biblio" href="#biblio-pep383" title="PEP 383 – Non-decodable Bytes in System Character Interfaces">[PEP383]</a>).</p> |
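For readers unfamiliar with WTF-8, the encoding step can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the implementation used by the paper, Rust, or libuv: `to_wtf8` is a hypothetical helper name, and the input is taken as `char16_t` code units so that the sketch does not depend on the width of `wchar_t`.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper (not part of the paper): encodes possibly ill-formed
// UTF-16 as WTF-8. Valid surrogate pairs become 4-byte sequences as in UTF-8;
// unpaired surrogates are kept and encoded as 3-byte sequences.
std::string to_wtf8(std::u16string_view in) {
  std::string out;
  for (std::size_t i = 0; i < in.size(); ++i) {
    char32_t cp = in[i];
    // Combine a high surrogate with a following low surrogate, if present.
    if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size()) {
      char32_t lo = in[i + 1];
      if (lo >= 0xDC00 && lo <= 0xDFFF) {
        cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
        ++i;  // the low surrogate has been consumed
      }
    }
    if (cp < 0x80) {
      out += static_cast<char>(cp);
    } else if (cp < 0x800) {
      out += static_cast<char>(0xC0 | (cp >> 6));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
      // Unpaired surrogates (0xD800..0xDFFF) are encoded here unchanged,
      // which is what makes the conversion lossless.
      out += static_cast<char>(0xE0 | (cp >> 12));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
      out += static_cast<char>(0xF0 | (cp >> 18));
      out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
  }
  return out;
}
```

In this sketch the only difference from an ordinary UTF-8 encoder is that an unpaired surrogate is not rejected or replaced, so the original code unit can be recovered when decoding. Well-formed UTF-16 input yields ordinary UTF-8.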
2190 | 2211 | </main> |
2191 | 2212 | <script> |
2192 | 2213 | (function() { |
@@ -2320,8 +2341,14 @@ <h2 class="heading settled" data-level="3" id="proposal"><span class="secno">3. |
2320 | 2341 | <h2 class="no-num no-ref heading settled" id="references"><span class="content">References</span><a class="self-link" href="#references"></a></h2> |
2321 | 2342 | <h3 class="no-num no-ref heading settled" id="informative"><span class="content">Informative References</span><a class="self-link" href="#informative"></a></h3> |
2322 | 2343 | <dl> |
| 2344 | + <dt id="biblio-libuv">[LIBUV] |
| 2345 | + <dd>l; et al. <a href="https://docs.libuv.org/en/v1.x/misc.html"><cite>Miscellaneous utilities. libuv Documentation.</cite></a>. URL: <a href="https://docs.libuv.org/en/v1.x/misc.html">https://docs.libuv.org/en/v1.x/misc.html</a> |
2323 | 2346 | <dt id="biblio-p2845">[P2845] |
2324 | 2347 | <dd>Victor Zverovich. <a href="https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p2845r8.html"><cite>Formatting of std::filesystem::path</cite></a>. URL: <a href="https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p2845r8.html">https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p2845r8.html</a> |
| 2348 | + <dt id="biblio-pep383">[PEP383] |
| 2349 | + <dd>Martin von Löwis. <a href="https://peps.python.org/pep-0383/"><cite>PEP 383 – Non-decodable Bytes in System Character Interfaces</cite></a>. URL: <a href="https://peps.python.org/pep-0383/">https://peps.python.org/pep-0383/</a> |
| 2350 | + <dt id="biblio-rust-osstring">[RUST-OSSTRING] |
| 2351 | + <dd>R; et al. <a href="https://doc.rust-lang.org/std/ffi/struct.OsString.html"><cite>OsString Struct. The Rust Standard Library.</cite></a>. URL: <a href="https://doc.rust-lang.org/std/ffi/struct.OsString.html">https://doc.rust-lang.org/std/ffi/struct.OsString.html</a> |
2325 | 2352 | <dt id="biblio-win32-fileio">[WIN32-FILEIO] |
2326 | 2353 | <dd>Microsoft Corporation. <a href="https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation"><cite>Maximum Path Length Limitation – Local file systems</cite></a>. URL: <a href="https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation">https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation</a> |
2327 | 2354 | <dt id="biblio-wtf">[WTF] |