|
9 | 9 | // except according to those terms.
|
10 | 10 |
|
11 | 11 | //! Windows-specific extensions to the primitives in the `std::ffi` module.
|
| 12 | +//! |
| 13 | +//! # Overview |
| 14 | +//! |
| 15 | +//! For historical reasons, the Windows API uses a form of potentially |
| 16 | +//! ill-formed UTF-16 encoding for strings. Specifically, the 16-bit |
| 17 | +//! code units in Windows strings may contain [isolated surrogate code |
| 18 | +//! points which are not paired together][ill-formed-utf-16]. The |
| 19 | +//! Unicode standard requires that surrogate code points (those in the |
| 20 | +//! range U+D800 to U+DFFF) always be *paired*, because in the UTF-16 |
| 21 | +//! encoding a *surrogate code unit pair* is used to encode a single |
| 22 | +//! character. For compatibility with code that does not enforce |
| 23 | +//! these pairings, Windows does not enforce them, either. |
| 24 | +//! |
| 25 | +//! While it is not always possible to convert such a string losslessly into |
| 26 | +//! a valid UTF-16 string (or even UTF-8), it is often desirable to be |
| 27 | +//! able to round-trip such a string from and to Windows APIs |
| 28 | +//! losslessly. For example, some Rust code may be "bridging" some |
| 29 | +//! Windows APIs together, just passing `WCHAR` strings among those |
| 30 | +//! APIs without ever really looking into the strings. |
| 31 | +//! |
| 32 | +//! If Rust code *does* need to look into those strings, it can |
| 33 | +//! convert them to valid UTF-8, possibly lossily, by substituting |
| 34 | +//! invalid sequences with U+FFFD REPLACEMENT CHARACTER, as is |
| 35 | +//! conventionally done in other Rust APIs that deal with string |
| 36 | +//! encodings. |
| 37 | +//! |
| 38 | +//! # `OsStringExt` and `OsStrExt` |
| 39 | +//! |
| 40 | +//! [`OsString`] is the Rust wrapper for owned strings in the |
| 41 | +//! preferred representation of the operating system. On Windows, |
| 42 | +//! this struct gets augmented with an implementation of the |
| 43 | +//! [`OsStringExt`] trait, which has a [`from_wide`] method. This |
| 44 | +//! lets you create an [`OsString`] from a `&[u16]` slice; typically, |
| 45 | +//! such a slice comes from a Windows API that returns `WCHAR` strings. |
| 46 | +//! |
| 47 | +//! Similarly, [`OsStr`] is the Rust wrapper for borrowed strings in the |
| 48 | +//! preferred representation of the operating system. On Windows, the |
| 49 | +//! [`OsStrExt`] trait provides the [`encode_wide`] method, which |
| 50 | +//! outputs an [`EncodeWide`] iterator. You can [`collect`] this |
| 51 | +//! iterator, for example, to obtain a `Vec<u16>`; you can later get a |
| 52 | +//! pointer to this vector's contents and feed it to Windows APIs. |
| 53 | +//! |
| 54 | +//! These traits, along with [`OsString`] and [`OsStr`], work in |
| 55 | +//! conjunction so that it is possible to **round-trip** strings from |
| 56 | +//! Windows and back, with no loss of data, even if the strings are |
| 57 | +//! ill-formed UTF-16. |
| 58 | +//! |
| 59 | +//! [ill-formed-utf-16]: https://simonsapin.github.io/wtf-8/#ill-formed-utf-16 |
| 60 | +//! [`OsString`]: ../../../ffi/struct.OsString.html |
| 61 | +//! [`OsStr`]: ../../../ffi/struct.OsStr.html |
| 62 | +//! [`OsStringExt`]: trait.OsStringExt.html |
| 63 | +//! [`OsStrExt`]: trait.OsStrExt.html |
| 64 | +//! [`EncodeWide`]: struct.EncodeWide.html |
| 65 | +//! [`from_wide`]: trait.OsStringExt.html#tymethod.from_wide |
| 66 | +//! [`encode_wide`]: trait.OsStrExt.html#tymethod.encode_wide |
| 67 | +//! [`collect`]: ../../../iter/trait.Iterator.html#method.collect |
12 | 68 |
|
13 | 69 | #![stable(feature = "rust1", since = "1.0.0")]
|
14 | 70 |
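The lossy conversion described in the Overview section of the added docs can be illustrated without any Windows-specific API. The following is a minimal sketch, not part of the patch: the helper name `wide_to_string_lossy` is hypothetical, and it relies only on `std::char::decode_utf16` and `String::from_utf16_lossy` from the standard library.

```rust
use std::char::decode_utf16;

/// Hypothetical helper: converts possibly ill-formed UTF-16 into a `String`,
/// substituting U+FFFD REPLACEMENT CHARACTER for each unpaired surrogate.
fn wide_to_string_lossy(wide: &[u16]) -> String {
    decode_utf16(wide.iter().copied())
        .map(|unit| unit.unwrap_or(char::REPLACEMENT_CHARACTER))
        .collect()
}

fn main() {
    // 'a', an unpaired high surrogate (0xD800), then 'b'.
    let ill_formed: [u16; 3] = [0x0061, 0xD800, 0x0062];
    let text = wide_to_string_lossy(&ill_formed);
    assert_eq!(text, "a\u{FFFD}b");

    // The standard library offers the same behavior directly.
    assert_eq!(String::from_utf16_lossy(&ill_formed), text);

    // A well-formed surrogate pair (encoding U+1D11E) decodes without loss.
    let well_formed: [u16; 2] = [0xD834, 0xDD1E];
    assert_eq!(wide_to_string_lossy(&well_formed), "\u{1D11E}");
}
```

Once the unpaired surrogate has been replaced, the original code unit cannot be recovered, which is why the docs distinguish this path from lossless round-tripping.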
|
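The round-trip claim in the `OsStringExt` and `OsStrExt` section could be backed by a short example along these lines. This is a sketch under stated assumptions: the function name `round_trip` is hypothetical, the real conversion compiles only on Windows, and the non-Windows variant is merely a placeholder so the snippet builds elsewhere.

```rust
/// Hypothetical helper: builds an `OsString` from wide code units and
/// encodes it back, as a caller bridging two Windows APIs might do.
#[cfg(windows)]
fn round_trip(wide: &[u16]) -> Vec<u16> {
    use std::ffi::OsString;
    use std::os::windows::ffi::{OsStrExt, OsStringExt};

    let os_string = OsString::from_wide(wide);
    os_string.encode_wide().collect()
}

/// Placeholder for other platforms (an assumption of this sketch): the
/// Windows extension traits do not exist there, so the input is copied.
#[cfg(not(windows))]
fn round_trip(wide: &[u16]) -> Vec<u16> {
    wide.to_vec()
}

fn main() {
    // 'a', an unpaired high surrogate (0xD800), then 'b': ill-formed UTF-16.
    let ill_formed: [u16; 3] = [0x0061, 0xD800, 0x0062];

    // The code units come back unchanged, including the lone surrogate.
    assert_eq!(round_trip(&ill_formed), ill_formed);
}
```

Note that the `Vec<u16>` produced by `encode_wide` carries no terminating NUL; a caller handing the buffer to an API that expects a NUL-terminated string would need to append one.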