diff --git a/src/wp-includes/class-wp-mime-sniffer.php b/src/wp-includes/class-wp-mime-sniffer.php
new file mode 100644
index 0000000000000..cadca1564ef5b
--- /dev/null
+++ b/src/wp-includes/class-wp-mime-sniffer.php
@@ -0,0 +1,1114 @@
+ * > The HTTP Content-Type header field is intended to indicate the MIME type of an HTTP response.
+ * > However, many HTTP servers supply a Content-Type header field value that does not match the
+ * > actual contents of the response. Historically, web browsers have tolerated these servers by
+ * > examining the content of HTTP responses in addition to the Content-Type header field in order
+ * > to determine the effective MIME type of the response.
+ * >
+ * > Without a clear specification for how to "sniff" the MIME type, each user agent has been
+ * > forced to reverse-engineer the algorithms of other user agents in order to maintain
+ * > interoperability. Inevitably, these efforts have not been entirely successful, resulting
+ * > in divergent behaviors among user agents. In some cases, these divergent behaviors have
+ * > had security implications, as a user agent could interpret an HTTP response as a different
+ * > MIME type than the server intended.
+ * >
+ * > These security issues are most severe when an "honest" server allows potentially malicious
+ * > users to upload their own files and then serves the contents of those files with a low-privilege
+ * > MIME type. For example, if a server believes that the client will treat a contributed file as an
+ * > image (and thus treat it as benign), but a user agent believes the content to be HTML (and thus
+ * > privileged to execute any scripts contained therein), an attacker might be able to steal the
+ * > user’s authentication credentials and mount other cross-site scripting attacks. (Malicious
+ * > servers, of course, can specify an arbitrary MIME type in the Content-Type header field.)
+ * >
+ * > This document describes a content sniffing algorithm that carefully balances the compatibility
+ * > needs of user agent with the security constraints imposed by existing web content. The algorithm
+ * > originated from research conducted by Adam Barth, Juan Caballero, and Dawn Song, based on content
+ * > sniffing algorithms present in popular user agents, an extensive database of existing web content,
+ * > and metrics collected from implementations deployed to a sizable number of users.
+ * >
+ * > - https://mimesniff.spec.whatwg.org/#introduction
+ *
+ * Some MIME types are inferred from string sources, such as HTTP headers and HTML meta values. These
+ * are usually intentional declarations of a MIME type, and while not always accurate, they are meant
+ * to explicitly convey content types.
+ *
+ * Example:
+ *
+ *     $mime_type = WP_Mime_Sniffer::from_declaration( $headers['content-type'] );
+ *     if ( isset( $mime_type ) && $mime_type->is_json() ) {
+ *         echo '