Module overview for std::os::windows:ffi

federicomenaquintero · federicomenaquintero · commit 155b4b1c5fff · 2017-09-25T20:45:38.000-05:00
diff --git a/src/libstd/ffi/mod.rs b/src/libstd/ffi/mod.rs
@@ -92,7 +92,7 @@
 //! your code can detect errors in case the environment variable did
 //! not in fact contain valid Unicode data.
 //!
-//! * [`OsStr`] represents a borrowed reference to a string in a format that
+//! * [`OsStr`] represents a borrowed string slice in a format that
 //! can be passed to the operating system.  It can be converted into
 //! an UTF-8 Rust string slice in a similar way to `OsString`.
 //!
diff --git a/src/libstd/sys/windows/ext/ffi.rs b/src/libstd/sys/windows/ext/ffi.rs
@@ -9,6 +9,62 @@
 // except according to those terms.
 
 //! Windows-specific extensions to the primitives in the `std::ffi` module.
+//!
+//! # Overview
+//!
+//! For historical reasons, the Windows API uses a form of potentially
+//! ill-formed UTF-16 encoding for strings.  Specifically, the 16-bit
+//! code units in Windows strings may contain [isolated surrogate code
+//! points which are not paired together][ill-formed-utf-16].  The
+//! Unicode standard requires that surrogate code points (those in the
+//! range U+D800 to U+DFFF) always be *paired*, because in the UTF-16
+//! encoding a *surrogate code unit pair* is used to encode a single
+//! character.  For compatibility with code that does not enforce
+//! these pairings, Windows does not enforce them, either.
+//!
+//! While it is not always possible to convert such a string losslessly into
+//! a valid UTF-16 string (or even UTF-8), it is often desirable to be
+//! able to round-trip such a string from and to Windows APIs
+//! losslessly.  For example, some Rust code may be "bridging" some
+//! Windows APIs together, just passing `WCHAR` strings among those
+//! APIs without ever really looking into the strings.
+//!
+//! If Rust code *does* need to look into those strings, it can
+//! convert them to valid UTF-8, possibly lossily, by substituting
+//! invalid sequences with U+FFFD REPLACEMENT CHARACTER, as is
+//! conventionally done in other Rust APIs that deal with string
+//! encodings.
+//!
+//! # `OsStringExt` and `OsStrExt`
+//!
+//! [`OsString`] is the Rust wrapper for owned strings in the
+//! preferred representation of the operating system.  On Windows,
+//! this struct gets augmented with an implementation of the
+//! [`OsStringExt`] trait, which has a [`from_wide`] method.  This
+//! lets you create an [`OsString`] from a `&[u16]` slice; presumably
+//! you get such a slice out of a `WCHAR` Windows API.
+//!
+//! Similarly, [`OsStr`] is the Rust wrapper for borrowed strings from
+//! preferred representation of the operating system.  On Windows, the
+//! [`OsStrExt`] trait provides the [`encode_wide`] method, which
+//! outputs an [`EncodeWide`] iterator.  You can [`collect`] this
+//! iterator, for example, to obtain a `Vec<u16>`; you can later get a
+//! pointer to this vector's contents and feed it to Windows APIs.
+//!
+//! These traits, along with [`OsString`] and [`OsStr`], work in
+//! conjunction so that it is possible to **round-trip** strings from
+//! Windows and back, with no loss of data, even if the strings are
+//! ill-formed UTF-16.
+//!
+//! [ill-formed-utf-16]: https://simonsapin.github.io/wtf-8/#ill-formed-utf-16
+//! [`OsString`]: ../../../ffi/struct.OsString.html
+//! [`OsStr`]: ../../../ffi/struct.OsStr.html
+//! [`OsStringExt`]: trait.OsStringExt.html
+//! [`OsStrExt`]: trait.OsStrExt.html
+//! [`EncodeWide`]: struct.EncodeWide.html
+//! [`from_wide`]: trait.OsStringExt.html#tymethod.from_wide
+//! [`encode_wide`]: trait.OsStrExt.html#tymethod.encode_wide
+//! [`collect`]: ../../../iter/trait.Iterator.html#method.collect
 
 #![stable(feature = "rust1", since = "1.0.0")]