@@ -1215,48 +1215,49 @@ In addition to the examples below, more examples are given in
12151215:ref:`urllib-howto`.
12161216
12171217This example gets the python.org main page and displays the first 300 bytes of
1218- it. ::
1218+ it::
12191219
12201220 >>> import urllib.request
12211221 >>> with urllib.request.urlopen('http://www.python.org/') as f:
12221222 ... print(f.read(300))
12231223 ...
1224- b'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
1225- "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n\n\n<html
1226- xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">\n\n<head>\n
1227- <meta http-equiv="content-type" content="text/html; charset=utf-8" />\n
1228- <title>Python Programming '
1224+ b'<!doctype html>\n<!--[if lt IE 7]> <html class="no-js ie6 lt-ie7 lt-ie8 lt-ie9"> <![endif]-->\n<!--[if IE 7]> <html class="no-js ie7 lt-ie8 lt-ie9"> <![endif]-->\n<!--[if IE 8]> <html class="no-js ie8 lt-ie9">
12291225
12301226Note that urlopen returns a bytes object. This is because there is no way
12311227for urlopen to automatically determine the encoding of the byte stream
12321228it receives from the HTTP server. In general, a program will decode
12331229the returned bytes object to string once it determines or guesses
12341230the appropriate encoding.
12351231
1236- The following W3C document, https://www.w3.org/International/O-charset\ , lists
1237- the various ways in which an (X)HTML or an XML document could have specified its
1232+ The following HTML spec document, https://html.spec.whatwg.org/#charset, lists
1233+ the various ways in which an HTML or an XML document could have specified its
12381234encoding information.
12391235
1236+ For additional information, see the W3C document: https://www.w3.org/International/questions/qa-html-encoding-declarations.
1237+
12401238As the python.org website uses *utf-8* encoding as specified in its meta tag, we
1241- will use the same for decoding the bytes object. ::
1239+ will use the same for decoding the bytes object::
12421240
12431241 >>> with urllib.request.urlopen('http://www.python.org/') as f:
12441242 ... print(f.read(100).decode('utf-8'))
12451243 ...
1246- <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
1247- "http://www.w3.org/TR/xhtml1/DTD/xhtm
1244+ <!doctype html>
1245+ <!--[if lt IE 7]> <html class="no-js ie6 lt-ie7 lt-ie8 lt-ie9"> <![endif]-->
1246+ <!-
12481247
12491248It is also possible to achieve the same result without using the
1250- :term:`context manager` approach. ::
1249+ :term:`context manager` approach::
12511250
12521251 >>> import urllib.request
12531252 >>> f = urllib.request.urlopen('http://www.python.org/')
12541253 >>> try:
12551254 ... print(f.read(100).decode('utf-8'))
12561255 ... finally:
12571256 ... f.close()
1258- <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
1259- "http://www.w3.org/TR/xhtml1/DTD/xhtm
1257+ ...
1258+ <!doctype html>
1259+ <!--[if lt IE 7]> <html class="no-js ie6 lt-ie7 lt-ie8 lt-ie9"> <![endif]-->
1260+ <!--
12601261
12611262In the following example, we are sending a data-stream to the stdin of a CGI
12621263and reading the data it returns to us. Note that this example will only work
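The examples in this hunk hard-code `'utf-8'` because python.org declares that encoding. As a complement, here is a minimal sketch of reading the encoding from the response's Content-Type header when the server declares one. The helper name `decode_body` and its `fallback` parameter are illustrative assumptions and not part of the patch. `get_content_charset()` is a method of `email.message.Message`, the base class of the `headers` object on HTTP responses returned by `urlopen`.

```python
from email.message import Message


def decode_body(body: bytes, headers: Message, fallback: str = "utf-8") -> str:
    """Decode body using the charset declared in Content-Type, if any."""
    # get_content_charset() returns None when no charset parameter is declared.
    charset = headers.get_content_charset() or fallback
    try:
        return body.decode(charset)
    except LookupError:
        # The server declared a codec name Python does not know: use the fallback.
        return body.decode(fallback)


# Offline demonstration with a hand-built header object (hypothetical values).
demo_headers = Message()
demo_headers["Content-Type"] = "text/html; charset=iso-8859-1"
demo_text = decode_body(b"caf\xe9", demo_headers)
print(demo_text)
```

With a live response this would be called as `decode_body(f.read(), f.headers)` inside the `with urllib.request.urlopen(...) as f:` block. The sketch does not inspect `<meta>` tags in the document body, so a page that declares its encoding only there falls through to the fallback.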