|
4 | 4 | Usage |
5 | 5 | ===== |
6 | 6 |
|
7 | | -Getting started |
8 | | -=============== |
9 | | - |
10 | | -If you already know how to write `CSS`_ or `XPath`_ expressions, using Parsel |
11 | | -is straightforward: you just need to create a |
12 | | -:class:`~parsel.selector.Selector` object for the HTML or XML text you want to |
13 | | -parse, and use the available methods for selecting parts from the text and |
14 | | -extracting data out of the result. |
15 | | - |
16 | | -Creating a :class:`~parsel.selector.Selector` object is simple:: |
| 7 | +Create a :class:`~parsel.selector.Selector` object for the HTML or XML text |
| 8 | +that you want to parse:: |
17 | 9 |
|
18 | 10 | >>> from parsel import Selector |
19 | 11 | >>> text = u"<html><body><h1>Hello, Parsel!</h1></body></html>" |
20 | | - >>> sel = Selector(text=text) |
| 12 | + >>> selector = Selector(text=text) |
21 | 13 |
|
22 | | -.. note:: |
23 | | - One important thing to note is that if you're using Python 2, |
24 | | - make sure to use an `unicode` object for the text argument. |
25 | | - :class:`~parsel.selector.Selector` expects text to be an `unicode` |
26 | | - object in Python 2 or an `str` object in Python 3. |
| 14 | +.. note:: In Python 2, the ``text`` argument must be a ``unicode`` string. |
27 | 15 |
|
28 | | -Once you have created the Selector object, you can use `CSS`_ or |
29 | | -`XPath`_ expressions to select elements:: |
| 16 | +Then use `CSS`_ or `XPath`_ expressions to select elements:: |
30 | 17 |
|
31 | | - >>> sel.css('h1') |
| 18 | + >>> selector.css('h1') |
32 | 19 | [<Selector xpath='descendant-or-self::h1' data='<h1>Hello, Parsel!</h1>'>] |
33 | | - >>> sel.xpath('//h1') # the same, but now with XPath |
| 20 | + >>> selector.xpath('//h1') # the same, but now with XPath |
34 | 21 | [<Selector xpath='//h1' data='<h1>Hello, Parsel!</h1>'>] |
35 | 22 |
|
36 | 23 | And extract data from those elements:: |
37 | 24 |
|
38 | | - >>> sel.css('h1::text').get() |
| 25 | + >>> selector.css('h1::text').get() |
39 | 26 | 'Hello, Parsel!' |
40 | | - >>> sel.xpath('//h1/text()').getall() |
| 27 | + >>> selector.xpath('//h1/text()').getall() |
41 | 28 | ['Hello, Parsel!'] |
42 | 29 |
|
| 30 | +.. _CSS: http://www.w3.org/TR/selectors |
| 31 | +.. _XPath: http://www.w3.org/TR/xpath |
| 32 | + |
| 33 | +Learning CSS and XPath |
| 34 | +====================== |
| 35 | + |
| 36 | +`CSS`_ is a language for applying styles to HTML documents. It defines |
| 37 | +selectors to associate those styles with specific HTML elements. Resources to |
| 38 | +learn CSS_ selectors include: |
| 39 | + |
| 40 | +- `CSS selectors in the MDN`_ |
| 41 | + |
| 42 | +- `XPath/CSS Equivalents in Wikibooks`_ |
| 43 | + |
43 | 44 | `XPath`_ is a language for selecting nodes in XML documents, which can also be |
44 | | -used with HTML. `CSS`_ is a language for applying styles to HTML documents. It |
45 | | -defines selectors to associate those styles with specific HTML elements. |
| 45 | +used with HTML. Resources to learn XPath_ include: |
46 | 46 |
|
47 | | -You can use either language. CSS_ is usually more readable, but some things can |
48 | | -only be done with XPath_. See `XPath/CSS Equivalents in Wikibooks`_ to compare |
49 | | -their syntax. |
| 47 | +- `XPath Tutorial in W3Schools`_ |
50 | 48 |
|
51 | | -.. _CSS: http://www.w3.org/TR/selectors |
52 | | -.. _XPath: http://www.w3.org/TR/xpath |
| 49 | +- `XPath cheatsheet`_ |
| 50 | + |
| 51 | +You can use either CSS_ or XPath_. CSS_ is usually more readable, but some |
| 52 | +things can only be done with XPath_. |
| 53 | + |
| 54 | +.. _CSS selectors in the MDN: https://developer.mozilla.org/en-US/docs/Learn/CSS/Building_blocks/Selectors |
| 55 | +.. _XPath cheatsheet: https://devhints.io/xpath |
| 56 | +.. _XPath Tutorial in W3Schools: https://www.w3schools.com/xml/xpath_intro.asp |
53 | 57 | .. _XPath/CSS Equivalents in Wikibooks: https://en.wikibooks.org/wiki/XPath/CSS_Equivalents |
54 | 58 |
|
55 | 59 |
|
|