Commit 26d5954

0.2.0: Changed from scraping to using inventory files. Closes #1.
1 parent fa0cada commit 26d5954

25 files changed (+1903, -2009 lines)

MANIFEST.in

Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
-recursive-include stdlib_list/lists *.csv
+recursive-include stdlib_list/lists *.txt
 include README.rst
 include LICENSE

README.md

Lines changed: 8 additions & 7 deletions
@@ -8,16 +8,17 @@ Listing the modules in the standard library? Wait, why on Earth would you care a
 
 Because knowing whether or not a module is part of the standard library will come in handy in [a project of mine](https://github.com/jackmaney/pypt). [And I'm not the only one](http://stackoverflow.com/questions/6463918/how-can-i-get-a-list-of-all-the-python-standard-library-modules) who would find this useful. Or, the TL;DR answer is that it's handy in situations when you're analyzing Python code and would like to find module dependencies.
 
-After googling for a way to generate a list of Python standard libraries (and looking through the answers to the previously-linked Stack Overflow question), I decided that I didn't like the existing solutions. So, I used Beautiful Soup 4 to build a scraper and scraped the list of libraries (along with a bit of metadata) from the Python Module Index.
+After googling for a way to generate a list of Python standard libraries (and looking through the answers to the previously-linked Stack Overflow question), I decided that I didn't like the existing solutions. So, I started by writing a scraper for the TOC of the Python Module Index for each of the versions of Python above.
+
+However, web scraping can be a fragile affair. Thanks to [a suggestion](https://github.com/jackmaney/python-stdlib-list/issues/1#issuecomment-86517208) by [@ncoghlan](https://github.com/ncoghlan), and some further help from [@birkenfeld](https://github.com/birkenfeld) and [@epc](https://github.com/epc), the population of the lists is now done by grabbing and parsing the Sphinx object inventory for the official Python docs of each relevant version.
 
 Usage
 -----
 
-```
->>> from stdlib_list import stdlib_list
->>> libraries = stdlib_list("2.7")
->>> libraries[:10]
-['AL', 'BaseHTTPServer', 'Bastion', 'CGIHTTPServer', 'ColorPicker', 'ConfigParser', 'Cookie', 'DEVICE', 'DocXMLRPCServer', 'EasyDialogs']
-```
+>>> from stdlib_list import stdlib_list
+>>> libraries = stdlib_list("2.7")
+>>> libraries[:10]
+['AL', 'BaseHTTPServer', 'Bastion', 'CGIHTTPServer', 'ColorPicker', 'ConfigParser', 'Cookie', 'DEVICE', 'DocXMLRPCServer', 'EasyDialogs']
 
 For more details, check out [the docs](http://python-stdlib-list.readthedocs.org/en/latest/).
+

README.rst

Lines changed: 3 additions & 1 deletion
@@ -8,7 +8,9 @@ Listing the modules in the standard library? Wait, why on Earth would you care a
 
 Because knowing whether or not a module is part of the standard library will come in handy in `a project of mine <https://github.com/jackmaney/pypt>`_. `And I'm not the only one <http://stackoverflow.com/questions/6463918/how-can-i-get-a-list-of-all-the-python-standard-library-modules>`_ who would find this useful. Or, the TL;DR answer is that it's handy in situations when you're analyzing Python code and would like to find module dependencies.
 
-After googling for a way to generate a list of Python standard libraries (and looking through the answers to the previously-linked Stack Overflow question), I decided that I didn't like the existing solutions. So, I used Beautiful Soup 4 to build a scraper and scraped the list of libraries (along with a bit of metadata) from the Python Module Index.
+After googling for a way to generate a list of Python standard libraries (and looking through the answers to the previously-linked Stack Overflow question), I decided that I didn't like the existing solutions. So, I started by writing a scraper for the TOC of the Python Module Index for each of the versions of Python above.
+
+However, web scraping can be a fragile affair. Thanks to `a suggestion <https://github.com/jackmaney/python-stdlib-list/issues/1#issuecomment-86517208>`_ by `@ncoghlan <https://github.com/ncoghlan>`_, and some further help from `@birkenfeld <https://github.com/birkenfeld>`_ and `@epc <https://github.com/epc>`_, the population of the lists is now done by grabbing and parsing the Sphinx object inventory for the official Python docs of each relevant version.
 
 Usage
 =====

docs/fetch.rst

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+Fetching The Module Lists
+=========================
+
+The lists of modules themselves are grabbed from the Sphinx object inventory (i.e. the file used by :py:mod:`sphinx.ext.intersphinx` in order to build links to existing Python modules/functions/etc). You probably shouldn't need to mess around with it. But if you want to, here you go.
+
+
+.. automodule:: stdlib_list.fetch
+:members: fetch_list
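
For orientation, here is a minimal sketch of what parsing a Sphinx object inventory involves. It is not the code in ``stdlib_list.fetch``, which this diff view does not show; the function name ``module_names`` and the hand-rolled parsing are assumptions made for illustration. A version-2 ``objects.inv`` file has four plain-text header lines followed by a zlib-compressed body with one documented object per line.

```python
import re
import zlib

# One decompressed inventory line looks like:
#   name domain:role priority uri dispname
_LINE_RE = re.compile(r"^(.+?)\s+(\S+:\S+)\s+(-?\d+)\s+(\S+)\s+(.*)$")


def module_names(inventory_bytes):
    """Return the sorted py:module names found in a version-2 inventory.

    Hypothetical helper for illustration; not part of stdlib_list.
    """
    # Skip the four header lines that precede the compressed payload.
    header_end = 0
    for _ in range(4):
        header_end = inventory_bytes.index(b"\n", header_end) + 1

    header = inventory_bytes[:header_end].decode("utf-8")
    if "Sphinx inventory version 2" not in header:
        raise ValueError("not a version-2 Sphinx inventory")

    payload = zlib.decompress(inventory_bytes[header_end:]).decode("utf-8")

    names = set()
    for line in payload.splitlines():
        match = _LINE_RE.match(line)
        # Keep only entries registered as Python modules.
        if match and match.group(2) == "py:module":
            names.add(match.group(1))
    return sorted(names)
```

The function takes the raw bytes of an inventory file, so how those bytes are downloaded is left to the caller. Working from the inventory rather than scraped HTML means the result depends on a machine-readable file instead of page layout.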

docs/index.rst

Lines changed: 3 additions & 4 deletions
@@ -13,9 +13,9 @@ Listing the modules in the standard library? Wait, why on Earth would you care a
 
 Because knowing whether or not a module is part of the standard library will come in handy in `a project of mine <https://github.com/jackmaney/pypt>`_. `And I'm not the only one <http://stackoverflow.com/questions/6463918/how-can-i-get-a-list-of-all-the-python-standard-library-modules>`_ who would find this useful. Or, the TL;DR answer is that it's handy in situations when you're analyzing Python code and would like to find module dependencies.
 
-After googling for a way to generate a list of Python standard libraries (and looking through the answers to the previously-linked Stack Overflow question), I decided that I didn't like the existing solutions.
+After googling for a way to generate a list of Python standard libraries (and looking through the answers to the previously-linked Stack Overflow question), I decided that I didn't like the existing solutions. So, I started by writing a scraper for the TOC of the Python Module Index for each of the versions of Python above.
 
-So, I decided to brute force it a bit, by putting together a scraper for the TOC of the standard library page of the official Python docs (eg `here is the page for Python 2.7 <https://docs.python.org/2/library/index.html>`_).
+However, web scraping can be a fragile affair. Thanks to `a suggestion <https://github.com/jackmaney/python-stdlib-list/issues/1#issuecomment-86517208>`_ by `@ncoghlan <https://github.com/ncoghlan>`_, and some further help from `@birkenfeld <https://github.com/birkenfeld>`_ and `@epc <https://github.com/epc>`_, the population of the lists is now done by grabbing and parsing the Sphinx object inventory for the official Python docs of each relevant version.
 
 Contents
 ========
@@ -25,8 +25,7 @@ Contents
 
 install
 usage
-metadata
-scraper
+fetch
 
 
 


docs/metadata.rst

Lines changed: 0 additions & 7 deletions
This file was deleted.

docs/scraper.rst

Lines changed: 0 additions & 7 deletions
This file was deleted.

docs/usage.rst

Lines changed: 12 additions & 0 deletions
@@ -3,5 +3,17 @@ Usage (Or: How To Get The List of Libraries)
 
 The primary function that you'll care about in this package is ``stdlib_list.stdlib_list``.
 
+In particular:
+
+::
+
+In [1]: from stdlib_list import stdlib_list
+
+In [2]: libs = stdlib_list("3.4")
+
+In [3]: libs[:6]
+Out[3]: ['__future__', '__main__', '_dummy_thread', '_thread', 'abc', 'aifc']
+
+
 .. automodule:: stdlib_list
 :members: stdlib_list
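
The README's stated use case is finding module dependencies when analyzing Python code. As a rough sketch of how the list might be applied to that, the hypothetical helper below (``third_party_imports`` is not part of this package) walks a source string with the standard ``ast`` module and reports top-level imports that are absent from a supplied list of standard library names. In practice that list would presumably come from a call such as ``stdlib_list("3.4")``; here it is a plain parameter so the sketch stays self-contained.

```python
import ast


def third_party_imports(source, stdlib_modules):
    """Return sorted top-level imported names that are not in stdlib_modules.

    Hypothetical helper for illustration; not part of stdlib_list.
    """
    known = set(stdlib_modules)
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b.c" depends on the top-level package "a".
                found.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports such as "from . import x".
            if node.level == 0 and node.module:
                found.add(node.module.split(".")[0])
    return sorted(found - known)
```

Only the top-level package name is compared, on the assumption that a submodule import such as ``os.path`` should count as standard library whenever ``os`` does.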

requirements-dev.txt

Lines changed: 0 additions & 1 deletion
@@ -1,3 +1,2 @@
 -r requirements.txt
-sphinx
 sphinx_rtd_theme

requirements.txt

Lines changed: 1 addition & 2 deletions
@@ -1,2 +1 @@
-beautifulsoup4
-requests
+sphinx
