45 changes: 37 additions & 8 deletions Doc/library/urllib.parse.rst
@@ -37,16 +37,21 @@ URL Parsing
The URL parsing functions focus on splitting a URL string into its components,
or on combining URL components into a URL string.

.. function:: urlparse(urlstring, scheme='', allow_fragments=True)
.. function:: urlparse(url, scheme='', allow_fragments=True)

Parse a URL into six components, returning a 6-item :term:`named tuple`. This
corresponds to the general structure of a URL:
``scheme://netloc/path;parameters?query#fragment``.
Each tuple item is a string, possibly empty. The components are not broken up
into smaller parts (for example, the network location is a single string), and %
escapes are not expanded. The delimiters as shown above are not part of the
result, except for a leading slash in the *path* component, which is retained if
present. For example:

The delimiters as shown above are not part of the result, except for a leading slash in the path
component, which is retained if present.

The returned object also exposes the parts of the network location as
additional attributes: :attr:`username`, :attr:`password`, :attr:`hostname`,
and :attr:`port`.

% escapes are not decoded.

For example:

>>> from urllib.parse import urlparse
>>> o = urlparse('http://www.cwi.nl:80/%7Eguido/Python.html')
@@ -81,7 +86,7 @@ or on combining URL components into a URL string.

The *scheme* argument gives the default addressing scheme, to be
used only if the URL does not specify one. It should be the same type
(text or bytes) as *urlstring*, except that the default value ``''`` is
(text or bytes) as *url*, except that the default value ``''`` is
always allowed, and is automatically converted to ``b''`` if appropriate.
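
As a minimal sketch of the *scheme* default described above (the
``www.example.com`` URLs and the variable names are illustrative assumptions,
not part of the original text):

```python
from urllib.parse import urlparse

# The URL carries no scheme of its own, so the default is used.
relative = urlparse('//www.example.com/index.html', scheme='https')
print(relative.scheme)   # https

# A scheme present in the URL takes precedence over the default.
absolute = urlparse('http://www.example.com/index.html', scheme='https')
print(absolute.scheme)   # http

# With a bytes URL, the default '' is converted to b'' automatically.
as_bytes = urlparse(b'//www.example.com/index.html')
print(as_bytes.scheme)   # b''
```

Passing a text *scheme* other than ``''`` together with a bytes URL (or the
reverse) mixes types and should be avoided.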

If the *allow_fragments* argument is false, fragment identifiers are not
@@ -245,7 +250,31 @@ or on combining URL components into a URL string.
states that these are equivalent).


.. function:: urlsplit(urlstring, scheme='', allow_fragments=True)
.. function:: urlsplit(url, scheme='', allow_fragments=True)

Similar to :func:`urlparse`, without splitting the URL parameters.

Parse a URL into five components, returning a 5-item :term:`named tuple`.
This corresponds to the general structure of a URL:
``scheme://netloc/path?query#fragment``.

The delimiters as shown above are not part of the result, except for a
leading slash in the path component, which is retained if present.

The returned object also exposes the parts of the network location as
additional attributes: :attr:`username`, :attr:`password`, :attr:`hostname`,
and :attr:`port`.

% escapes are not decoded.
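
The following sketch illustrates the five components, the extra network
location attributes, and the fact that % escapes are left as-is. The URL,
credentials, and variable names are made up for illustration; decoding with
:func:`unquote` is shown as one possible follow-up step, not as something
:func:`urlsplit` does itself:

```python
from urllib.parse import urlsplit, unquote

parts = urlsplit(
    'https://user:secret@www.example.com:8443/%7Eguido/index.html?q=1#top'
)

# The five tuple items; delimiters are dropped except the leading slash.
print(parts.scheme)    # https
print(parts.netloc)    # user:secret@www.example.com:8443
print(parts.path)      # /%7Eguido/index.html
print(parts.query)     # q=1
print(parts.fragment)  # top

# Additional attributes derived from the network location.
print(parts.username, parts.password)  # user secret
print(parts.hostname, parts.port)      # www.example.com 8443

# % escapes are not decoded by urlsplit; decode explicitly if needed.
decoded_path = unquote(parts.path)
print(decoded_path)    # /~guido/index.html
```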

The *scheme* argument gives the default addressing scheme, to be
used only if the URL does not specify one. It should be the same type
(text or bytes) as *url*, except that the default value ``''`` is
always allowed, and is automatically converted to ``b''`` if appropriate.

If the *allow_fragments* argument is false, fragment identifiers are not
recognized. Instead, they are parsed as part of the path or query
component, and :attr:`fragment` is set to the empty string in
the return value.
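
A short sketch of the *allow_fragments* behaviour, using hypothetical
``www.example.com`` URLs chosen only for illustration:

```python
from urllib.parse import urlsplit

url = 'http://www.example.com/path?q=1#frag'

# Default: the fragment is split off into its own component.
default = urlsplit(url)
print(default.query, default.fragment)   # q=1 frag

# allow_fragments=False: '#frag' stays attached to the query.
no_frag = urlsplit(url, allow_fragments=False)
print(no_frag.query)                     # q=1#frag
print(repr(no_frag.fragment))            # ''

# Without a query, the '#...' text remains part of the path instead.
path_only = urlsplit('http://www.example.com/path#frag', allow_fragments=False)
print(path_only.path)                    # /path#frag
```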

This is similar to :func:`urlparse`, but does not split the params from the URL.
This should generally be used instead of :func:`urlparse` if the more recent URL