@@ -1208,48 +1208,49 @@ In addition to the examples below, more examples are given in
12081208:ref:`urllib-howto`.
12091209
12101210This example gets the python.org main page and displays the first 300 bytes of
1211- it. ::
1211+ it::
12121212
12131213 >>> import urllib.request
12141214 >>> with urllib.request.urlopen('http://www.python.org/') as f:
12151215 ... print(f.read(300))
12161216 ...
1217- b'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
1218- "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n\n\n<html
1219- xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">\n\n<head>\n
1220- <meta http-equiv="content-type" content="text/html; charset=utf-8" />\n
1221- <title>Python Programming '
1217+ b'<!doctype html>\n<!--[if lt IE 7]> <html class="no-js ie6 lt-ie7 lt-ie8 lt-ie9"> <![endif]-->\n<!--[if IE 7]> <html class="no-js ie7 lt-ie8 lt-ie9"> <![endif]-->\n<!--[if IE 8]> <html class="no-js ie8 lt-ie9">
12221218
12231219Note that urlopen returns a bytes object. This is because there is no way
12241220for urlopen to automatically determine the encoding of the byte stream
12251221it receives from the HTTP server. In general, a program will decode
12261222the returned bytes object to string once it determines or guesses
12271223the appropriate encoding.
12281224
1229- The following W3C document, https://www.w3.org/International/O-charset\ , lists
1230- the various ways in which an (X)HTML or an XML document could have specified its
1225+ The following HTML spec document, https://html.spec.whatwg.org/#charset, lists
1226+ the various ways in which an HTML or an XML document could have specified its
12311227encoding information.
12321228
1229+ For additional information, see the W3C document: https://www.w3.org/International/questions/qa-html-encoding-declarations.
1230+
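The decoding step described above can be sketched offline, without a network request. This is only an illustration, not part of the patch: the helper names `charset_from_content_type` and `decode_body` are hypothetical, and the fallback to UTF-8 is an assumption rather than anything `urlopen` does for you. The sketch reads the `charset` parameter from a `Content-Type` header value using `email.message.Message`, which is the class that HTTP response headers are based on.

```python
from email.message import Message


def charset_from_content_type(content_type, default="utf-8"):
    """Return the charset named in a Content-Type value, or `default`.

    Hypothetical helper: `default` is an assumed fallback, not a rule
    taken from the urllib documentation.
    """
    msg = Message()
    if content_type:
        msg["Content-Type"] = content_type
    # get_content_charset() returns the lower-cased charset parameter,
    # or failobj when the header or the parameter is missing.
    return msg.get_content_charset(failobj=default)


def decode_body(raw, content_type, default="utf-8"):
    """Decode `raw` bytes using the declared charset, falling back to `default`."""
    charset = charset_from_content_type(content_type, default)
    try:
        return raw.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown codec name or bytes that do not match the declared charset.
        return raw.decode(default, errors="replace")


print(decode_body(b"caf\xe9", "text/html; charset=ISO-8859-1"))
```

With a live response `f`, the same lookup should be available as `f.headers.get_content_charset()`. A page may also declare its encoding only in a meta tag, which this sketch does not inspect.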
12331231As the python.org website uses *utf-8* encoding as specified in its meta tag, we
1234- will use the same for decoding the bytes object. ::
1232+ will use the same for decoding the bytes object::
12351233
12361234 >>> with urllib.request.urlopen('http://www.python.org/') as f:
12371235 ... print(f.read(100).decode('utf-8'))
12381236 ...
1239- <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
1240- "http://www.w3.org/TR/xhtml1/DTD/xhtm
1237+ <!doctype html>
1238+ <!--[if lt IE 7]> <html class="no-js ie6 lt-ie7 lt-ie8 lt-ie9"> <![endif]-->
1239+ <!-
12411240
12421241It is also possible to achieve the same result without using the
1243- :term:`context manager` approach. ::
1242+ :term:`context manager` approach::
12441243
12451244 >>> import urllib.request
12461245 >>> f = urllib.request.urlopen('http://www.python.org/')
12471246 >>> try:
12481247 ... print(f.read(100).decode('utf-8'))
12491248 ... finally:
12501249 ... f.close()
1251- <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
1252- "http://www.w3.org/TR/xhtml1/DTD/xhtm
1250+ ...
1251+ <!doctype html>
1252+ <!--[if lt IE 7]> <html class="no-js ie6 lt-ie7 lt-ie8 lt-ie9"> <![endif]-->
1253+ <!--
12531254
12541255In the following example, we are sending a data-stream to the stdin of a CGI
12551256and reading the data it returns to us. Note that this example will only work