scrapy
diff --git a/.travis.yml b/.travis.yml
Lines changed: 5 additions & 20 deletions
diff --git a/README.rst b/README.rst
Lines changed: 4 additions & 0 deletions
diff --git a/docs/conf.py b/docs/conf.py
Lines changed: 24 additions & 1 deletion
diff --git a/docs/index.rst b/docs/index.rst
Lines changed: 1 addition & 0 deletions
diff --git a/docs/modules.rst b/docs/modules.rst
Lines changed: 0 additions & 7 deletions
diff --git a/docs/parsel.rst b/docs/parsel.rst
Lines changed: 11 additions & 19 deletions
diff --git a/docs/readme.rst b/docs/readme.rst
Lines changed: 0 additions & 2 deletions
diff --git a/docs/usage.rst b/docs/usage.rst
Lines changed: 70 additions & 52 deletions
diff --git a/parsel/csstranslator.py b/parsel/csstranslator.py
Lines changed: 1 addition & 1 deletion
@@ -8,10 +8,8 @@ matrix:
   include:
     - python: 2.7
       env: TOXENV=py27
-    - python: 2.7
+    - python: pypy
       env: TOXENV=pypy
-    - python: 2.7
-      env: TOXENV=pypy3
     - python: 3.4
       env: TOXENV=py34
     - python: 3.5
@@ -20,24 +18,11 @@ matrix:
       env: TOXENV=py36
     - python: 3.7
       env: TOXENV=py37
-      dist: xenial
-      sudo: true
+    - python: pypy3
+      env: TOXENV=pypy3
+    - python: 3.7
+      env: TOXENV=docs
 install:
-  - |
-      if [ "$TOXENV" = "pypy" ]; then
-        export PYPY_VERSION="pypy-6.0.0-linux_x86_64-portable"
-        wget "https://bitbucket.org/squeaky/portable-pypy/downloads/${PYPY_VERSION}.tar.bz2"
-        tar -jxf ${PYPY_VERSION}.tar.bz2
-        virtualenv --python="$PYPY_VERSION/bin/pypy" "$HOME/virtualenvs/$PYPY_VERSION"
-        source "$HOME/virtualenvs/$PYPY_VERSION/bin/activate"
-      fi
-      if [ "$TOXENV" = "pypy3" ]; then
-        export PYPY_VERSION="pypy3.5-6.0.0-linux_x86_64-portable"
-        wget "https://bitbucket.org/squeaky/portable-pypy/downloads/${PYPY_VERSION}.tar.bz2"
-        tar -jxf ${PYPY_VERSION}.tar.bz2
-        virtualenv --python="$PYPY_VERSION/bin/pypy3" "$HOME/virtualenvs/$PYPY_VERSION"
-        source "$HOME/virtualenvs/$PYPY_VERSION/bin/activate"
-      fi
   - pip install -U pip tox twine wheel codecov
 script: tox
 after_success:
 
@@ -21,6 +21,8 @@ and XML_ using XPath_ and CSS_ selectors, optionally combined with
 
 Find the Parsel online documentation at https://parsel.readthedocs.org.
 
+Example (`open online demo`_):
+
 .. code-block:: python
 
     >>> from parsel import Selector
@@ -42,8 +44,10 @@ Find the Parsel online documentation at https://parsel.readthedocs.org.
     http://example.com
     http://scrapy.org
 
+
 .. _CSS: https://en.wikipedia.org/wiki/Cascading_Style_Sheets
 .. _HTML: https://en.wikipedia.org/wiki/HTML
+.. _open online demo: https://colab.research.google.com/drive/149VFa6Px3wg7S3SEnUqk--TyBrKplxCN#forceEdit=true&sandboxMode=true
 .. _Python: https://www.python.org/
 .. _regular expressions: https://docs.python.org/library/re.html
 .. _XML: https://en.wikipedia.org/wiki/XML
 
@@ -40,7 +40,11 @@
 
 # Add any Sphinx extension module names here, as strings. They can be
 # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
-extensions = ['sphinx.ext.autodoc', 'sphinx.ext.viewcode']
+extensions = [
+    'sphinx.ext.autodoc',
+    'sphinx.ext.intersphinx',
+    'sphinx.ext.viewcode',
+]
 
 # Add any paths that contain templates here, relative to this directory.
 templates_path = ['_templates']
@@ -273,3 +277,22 @@
 
 # If true, do not generate a @detailmenu in the "Top" node's menu.
 #texinfo_no_detailmenu = False
+
+
+# -- Options for the InterSphinx extension ------------------------------------
+
+intersphinx_mapping = {
+    'cssselect': ('https://cssselect.readthedocs.io/en/latest', None),
+    'python': ('https://docs.python.org/3', None),
+}
+
+
+# --- Nitpicking options ------------------------------------------------------
+
+nitpicky = True
+nitpick_ignore = [
+    ('py:class', 'cssselect.xpath.GenericTranslator'),
+    ('py:class', 'cssselect.xpath.HTMLTranslator'),
+    ('py:class', 'cssselect.xpath.XPathExpr'),
+    ('py:class', 'lxml.etree.XMLParser'),
+]
@@ -15,6 +15,7 @@ Contents:
 
    installation
    usage
+   parsel
    history
 
 Indices and tables
 
@@ -1,38 +1,30 @@
-parsel package
-==============
+API reference
+=============
 
-Submodules
-----------
-
-parsel.csstranslator module
----------------------------
+parsel.csstranslator
+--------------------
 
 .. automodule:: parsel.csstranslator
     :members:
     :undoc-members:
     :show-inheritance:
 
-parsel.selector module
-----------------------
 
-.. automodule:: parsel.selector
-    :members:
-    :undoc-members:
-    :show-inheritance:
+.. _topics-selectors-ref:
 
-parsel.utils module
--------------------
+parsel.selector
+---------------
 
-.. automodule:: parsel.utils
+.. automodule:: parsel.selector
     :members:
     :undoc-members:
     :show-inheritance:
 
 
-Module contents
----------------
+parsel.utils
+------------
 
-.. automodule:: parsel
+.. automodule:: parsel.utils
     :members:
     :undoc-members:
     :show-inheritance:
@@ -4,51 +4,57 @@
 Usage
 =====
 
-Getting started
-===============
-
-If you already know how to write `CSS`_ or `XPath`_ expressions, using Parsel
-is straightforward: you just need to create a
-:class:`~parsel.selector.Selector` object for the HTML or XML text you want to
-parse, and use the available methods for selecting parts from the text and
-extracting data out of the result.
-
-Creating a :class:`~parsel.selector.Selector` object is simple::
+Create a :class:`~parsel.selector.Selector` object for the HTML or XML text
+that you want to parse::
 
     >>> from parsel import Selector
     >>> text = u"<html><body><h1>Hello, Parsel!</h1></body></html>"
-    >>> sel = Selector(text=text)
+    >>> selector = Selector(text=text)
 
-.. note::
-    One important thing to note is that if you're using Python 2,
-    make sure to use an `unicode` object for the text argument.
-    :class:`~parsel.selector.Selector` expects text to be an `unicode`
-    object in Python 2 or an `str` object in Python 3.
+.. note:: In Python 2, the ``text`` argument must be a ``unicode`` string.
 
-Once you have created the Selector object, you can use `CSS`_ or
-`XPath`_ expressions to select elements::
+Then use `CSS`_ or `XPath`_ expressions to select elements::
 
-    >>> sel.css('h1')
+    >>> selector.css('h1')
     [<Selector xpath='descendant-or-self::h1' data='<h1>Hello, Parsel!</h1>'>]
-    >>> sel.xpath('//h1')  # the same, but now with XPath
+    >>> selector.xpath('//h1')  # the same, but now with XPath
     [<Selector xpath='//h1' data='<h1>Hello, Parsel!</h1>'>]
 
 And extract data from those elements::
 
-    >>> sel.css('h1::text').get()
+    >>> selector.css('h1::text').get()
     'Hello, Parsel!'
-    >>> sel.xpath('//h1/text()').getall()
+    >>> selector.xpath('//h1/text()').getall()
     ['Hello, Parsel!']
 
+.. _CSS: http://www.w3.org/TR/selectors
+.. _XPath: http://www.w3.org/TR/xpath
+
+Learning CSS and XPath
+======================
+
+`CSS`_ is a language for applying styles to HTML documents. It defines
+selectors to associate those styles with specific HTML elements. Resources to
+learn CSS_ selectors include:
+
+-   `CSS selectors in the MDN`_
+
+-   `XPath/CSS Equivalents in Wikibooks`_
+
 `XPath`_ is a language for selecting nodes in XML documents, which can also be
-used with HTML. `CSS`_ is a language for applying styles to HTML documents. It
-defines selectors to associate those styles with specific HTML elements.
+used with HTML. Resources to learn XPath_ include:
 
-You can use either language you're more comfortable with, though you may find
-that in some specific cases `XPath`_ is more powerful than `CSS`_.
+-   `XPath Tutorial in W3Schools`_
 
-.. _XPath: http://www.w3.org/TR/xpath
-.. _CSS: http://www.w3.org/TR/selectors
+-   `XPath cheatsheet`_
+
+You can use either CSS_ or XPath_. CSS_ is usually more readable, but some
+things can only be done with XPath_.
+
+.. _CSS selectors in the MDN: https://developer.mozilla.org/en-US/docs/Learn/CSS/Building_blocks/Selectors
+.. _XPath cheatsheet: https://devhints.io/xpath
+.. _XPath Tutorial in W3Schools: https://www.w3schools.com/xml/xpath_intro.asp
+.. _XPath/CSS Equivalents in Wikibooks: https://en.wikibooks.org/wiki/XPath/CSS_Equivalents
 
 
 Using selectors
@@ -840,6 +846,29 @@ are more predictable: ``.get()`` always returns a single result,
 ``.getall()`` always returns a list of all extracted results.
 
 
+Using CSS selectors in multi-root documents
+-------------------------------------------
+
+Some webpages may have multiple root elements. It can happen, for example, when
+a webpage has broken code, such as missing closing tags.
+
+You can use XPath to determine if a page has multiple root elements::
+
+    >>> len(selector.xpath('/*')) > 1
+    True
+
+CSS selectors only work on the first root element, because the first root
+element is always used as the starting current element, and CSS selectors do
+not allow selecting parent elements (XPath’s ``..``) or elements relative to
+the document root (XPath’s ``/``).
+
+If you want to use a CSS selector that takes into account all root elements,
+you need to precede your CSS query by an XPath query that reaches all root
+elements::
+
+    selector.xpath('/*').css('<your CSS selector>')
+
+
 Command-Line Interface Tools
 ============================
 
@@ -857,27 +886,11 @@ There are third-party tools that allow using Parsel from the command line:
 .. _cURL: https://curl.haxx.se/
 
 
-.. _topics-selectors-ref:
-
-API reference
-=============
-
-Selector objects
-----------------
-
-.. autoclass:: parsel.selector.Selector
-    :members:
-
-
-SelectorList objects
---------------------
-
-.. autoclass:: parsel.selector.SelectorList
-    :members:
-
-
 .. _selector-examples-html:
 
+Examples
+========
+
 Working on HTML
 ---------------
 
@@ -936,7 +949,8 @@ Removing namespaces
 When dealing with scraping projects, it is often quite convenient to get rid of
 namespaces altogether and just work with element names, to write more
 simple/convenient XPaths. You can use the
-:meth:`Selector.remove_namespaces` method for that.
+:meth:`Selector.remove_namespaces <parsel.selector.Selector.remove_namespaces>`
+method for that.
 
 Let's show an example that illustrates this with the Python Insider blog atom feed.
 
@@ -947,10 +961,12 @@ Let's download the atom feed using `requests`_ and create a selector::
     >>> text = requests.get('https://feeds.feedburner.com/PythonInsider').text
     >>> sel = Selector(text=text, type='xml')
 
-This is how the file starts::
+This is how the file starts:
+
+.. code-block:: xml
 
     <?xml version="1.0" encoding="UTF-8"?>
-    <?xml-stylesheet ...
+    <?xml-stylesheet ... ?>
     <feed xmlns="http://www.w3.org/2005/Atom"
           xmlns:openSearch="http://a9.com/-/spec/opensearchrss/1.0/"
           xmlns:blogger="http://schemas.google.com/blogger/2008"
@@ -959,6 +975,7 @@ This is how the file starts::
           xmlns:thr="http://purl.org/syndication/thread/1.0"
           xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">
       ...
+    </feed>
 
 You can see several namespace declarations including a default
 "http://www.w3.org/2005/Atom" and another one using the "gd:" prefix for
@@ -970,8 +987,9 @@ We can try selecting all ``<link>`` objects and then see that it doesn't work
     >>> sel.xpath("//link")
     []
 
-But once we call the :meth:`Selector.remove_namespaces` method, all
-nodes can be accessed directly by their names::
+But once we call the :meth:`Selector.remove_namespaces
+<parsel.selector.Selector.remove_namespaces>` method, all nodes can be accessed
+directly by their names::
 
     >>> sel.remove_namespaces()
     >>> sel.xpath("//link")
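
Note on the namespace example in this hunk: the reason ``//link`` matches nothing is that every element in the feed lives in the default Atom namespace. The sketch below reproduces that effect with the standard library's ``xml.etree.ElementTree`` instead of Parsel, so it runs without network access. The sample feed and the ``strip_namespaces`` helper are made up for illustration, and the helper is not a description of how ``Selector.remove_namespaces`` is implemented.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical, minimal stand-in for the Atom feed used in the docs.
FEED = f"""<feed xmlns="{ATOM_NS}">
  <link rel="self" href="http://example.com/feed"/>
  <entry>
    <link rel="alternate" href="http://example.com/post"/>
  </entry>
</feed>"""

root = ET.fromstring(FEED)

# An unqualified name matches nothing, because the elements are namespaced.
plain = root.findall(".//link")

# Qualifying the name with the namespace URI finds both <link> elements.
qualified = root.findall(f".//{{{ATOM_NS}}}link")


def strip_namespaces(element):
    """Drop the '{uri}' prefix from every element tag, in place."""
    for node in element.iter():
        if isinstance(node.tag, str) and node.tag.startswith("{"):
            node.tag = node.tag.split("}", 1)[1]


strip_namespaces(root)

# After stripping, the bare element name is enough.
hrefs = [link.get("href") for link in root.findall(".//link")]
```

Stripping namespaces trades precision for convenience: names from different vocabularies can collide afterwards, which is usually acceptable for scraping.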
 
@@ -9,7 +9,7 @@
 from cssselect import HTMLTranslator as OriginalHTMLTranslator
 from cssselect.xpath import XPathExpr as OriginalXPathExpr
 from cssselect.xpath import _unicode_safe_getattr, ExpressionError
-from cssselect.parser import parse, FunctionalPseudoElement
+from cssselect.parser import FunctionalPseudoElement
 
 
 class XPathExpr(OriginalXPathExpr):