# project-template - A template for scikit-learn extensions
project-template is a template project for scikit-learn compatible extensions.
It aids development of estimators that can be used in scikit-learn pipelines and (hyper)parameter search, while facilitating testing (including some API compliance), documentation, open source development, packaging, and continuous integration.
HTML Documentation - http://contrib.scikit-learn.org/project-template/
The package by itself comes with a single module and an estimator. Before
installing the module you will need numpy and scipy.
To install the module execute:
$ python setup.py install
or
$ pip install sklearn-template
If the installation is successful, and scikit-learn is correctly installed,
you should be able to execute the following in Python:
>>> import numpy as np
>>> from skltemplate import TemplateEstimator
>>> estimator = TemplateEstimator()
>>> estimator.fit(np.arange(10).reshape(10, 1), np.arange(10))

TemplateEstimator by itself does nothing useful, but it serves as an example
of how other estimators should be written. It also comes with its own unit
tests under template/tests, which can be run using nosetests.
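As a rough illustration of the conventions such an estimator follows, here is a minimal sketch. MeanRegressor and its shrinkage parameter are hypothetical names invented for this example and are not part of the template; the sketch assumes numpy and scikit-learn are installed.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.utils.validation import check_X_y, check_array, check_is_fitted


class MeanRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical regressor that always predicts the training mean of y.

    Parameters
    ----------
    shrinkage : float, default=1.0
        Factor applied to the learned mean before predicting.
    """

    def __init__(self, shrinkage=1.0):
        # Constructor arguments are stored unchanged so that get_params,
        # set_params and cloning work as scikit-learn expects.
        self.shrinkage = shrinkage

    def fit(self, X, y):
        # check_X_y rejects 1-D X, non-finite values and mismatched lengths.
        X, y = check_X_y(X, y)
        self.n_features_in_ = X.shape[1]
        # Attributes learned from data end with a trailing underscore.
        self.mean_ = float(np.mean(y))
        return self  # returning self allows chaining such as fit(...).predict(...)

    def predict(self, X):
        check_is_fitted(self, "mean_")
        X = check_array(X)
        return np.full(X.shape[0], self.shrinkage * self.mean_)


X = np.arange(10).reshape(10, 1)
y = np.arange(10)
model = MeanRegressor().fit(X, y)
print(model.predict(X[:3]))  # three copies of the training mean, 4.5
```

This sketch only shows the shape of the API (parameters stored in `__init__`, validation in `fit`, learned attributes with a trailing underscore); it is not claimed to pass every common check in every scikit-learn version.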
Clone the project onto your computer by executing:
$ git clone https://github.com/scikit-learn-contrib/project-template.git

You should rename the project-template folder to the name of your project.
To host the project on GitHub, visit https://github.com/new and create a new
repository. To upload your project to GitHub, execute:
$ git remote set-url origin https://github.com/username/project-name.git
$ git push origin master

You are free to modify the source as you want, but at the very least, all your
estimators should pass the check_estimator test to be scikit-learn compatible.
(If there are valid reasons your estimator cannot pass check_estimator, please
raise an issue at scikit-learn so we can make check_estimator more flexible.)
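A minimal sketch of running the common checks from Python is shown below. DummyRegressor stands in for your own estimator, and passes_common_checks is a hypothetical helper written for this example. The sketch assumes a scikit-learn version in which check_estimator accepts an estimator instance.

```python
from sklearn.dummy import DummyRegressor
from sklearn.utils.estimator_checks import check_estimator


def passes_common_checks(estimator):
    """Return True if `estimator` gets through scikit-learn's common checks."""
    try:
        check_estimator(estimator)
    except Exception as exc:  # a failing check raises, usually an AssertionError
        print(f"{type(estimator).__name__} failed a common check: {exc}")
        return False
    return True


# Replace DummyRegressor() with an instance of your own estimator.
result = passes_common_checks(DummyRegressor())
print(result)  # True only if every common check passed
```

In a test suite it is usually more informative to call check_estimator directly and let the failure propagate, since the traceback names the check that failed.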
This template is particularly useful for publishing open-source versions of algorithms that do not meet the criteria for inclusion in the core scikit-learn package (see FAQ), such as recent and unpopular developments in machine learning. However, developing using this template may also be a stepping stone to eventual inclusion in the core package.
In any case, developers should endeavor to adhere to scikit-learn's Contributor's Guide, which promotes the use of:
- algorithm-specific unit tests, in addition to check_estimator's common tests
- PEP8-compliant code
- a clearly documented API using NumpyDoc and PEP257-compliant docstrings
- references to relevant scientific literature in standard citation formats
- doctests to provide succinct usage examples
- standalone examples to illustrate the usage, model visualisation, and benefits/benchmarks of particular algorithms
- efficient code when the need for optimization is supported by benchmarks
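Two of the points above, NumpyDoc-style docstrings and doctests, can be sketched with the standard library alone. The soft_threshold function is a hypothetical example chosen for brevity and is not part of the template.

```python
import doctest


def soft_threshold(value, threshold):
    """Shrink a value toward zero by a fixed threshold.

    Parameters
    ----------
    value : float
        The number to shrink.
    threshold : float
        Non-negative amount subtracted from the magnitude of ``value``.

    Returns
    -------
    float
        ``value`` moved toward zero by ``threshold``, or 0.0 if its
        magnitude does not exceed ``threshold``.

    Examples
    --------
    >>> soft_threshold(3.0, 1.0)
    2.0
    >>> soft_threshold(-3.0, 1.0)
    -2.0
    >>> soft_threshold(0.5, 1.0)
    0.0
    """
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    magnitude = abs(value) - threshold
    if magnitude <= 0:
        return 0.0
    return magnitude if value > 0 else -magnitude


# Run the examples embedded in the docstring; nothing is printed when
# every example produces the documented output.
doctest.run_docstring_examples(
    soft_threshold, {"soft_threshold": soft_threshold}, name="soft_threshold"
)
```

Keeping the examples in the docstring means the same text serves as API documentation and as a small regression test.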
The documentation is built using sphinx.
It incorporates narrative documentation from the doc/ directory, standalone
examples from the examples/ directory, and API reference compiled from
estimator docstrings.
To build the documentation locally, ensure that you have sphinx,
sphinx-gallery and matplotlib installed by executing:
$ pip install sphinx matplotlib sphinx-gallery

The documentation contains a home page (doc/index.rst), an API
documentation page (doc/api.rst) and a page documenting the template module
(doc/template.rst). Sphinx allows you to automatically document your modules
and classes by using the autodoc directive (see template.rst). To change the
aesthetics of the docs and other parameters, edit the doc/conf.py file. For
more information, visit the Sphinx Documentation.
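For orientation, the kind of settings typically found in doc/conf.py looks roughly like the excerpt below. This is a sketch and not the template's actual configuration: the project name, the choice of extensions, and the theme are assumptions made for illustration.

```python
# doc/conf.py (illustrative excerpt, not the template's real file)
project = "my-project"  # hypothetical project name

extensions = [
    "sphinx.ext.autodoc",          # generate API docs from docstrings
    "numpydoc",                    # parse NumpyDoc sections (separate package)
    "sphinx_gallery.gen_gallery",  # build a gallery from example scripts
]

sphinx_gallery_conf = {
    "examples_dirs": "../examples",   # where the plot_*.py scripts live
    "gallery_dirs": "auto_examples",  # where the generated pages are written
}

html_theme = "alabaster"  # Sphinx's default theme; change it to alter the look
```

Paths in sphinx_gallery_conf are resolved relative to the directory containing conf.py, which is why the examples directory is given as ../examples here.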
You can also add code examples in the examples folder. All files inside
the folder of the form plot_*.py will be executed and their generated
plots will be available for viewing in the /auto_examples URL.
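A gallery script might look like the sketch below, saved under a hypothetical name such as examples/plot_damped_sine.py. sphinx-gallery expects the file to open with a docstring whose first lines form a reStructuredText title; the data and labels here are invented for illustration, and matplotlib is assumed to be installed.

```python
"""
===========================
Plotting a damped sine wave
===========================

A hypothetical gallery example: the text in this docstring becomes the
introduction shown above the generated figure.
"""
import math

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Sample a damped sine wave on a regular grid.
x = [i / 20 for i in range(200)]
y = [math.exp(-0.3 * t) * math.sin(2 * math.pi * t) for t in x]

fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_xlabel("time")
ax.set_ylabel("amplitude")
ax.set_title("Damped sine wave")
```

Figures left open at the end of the script are the ones captured for the gallery page.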
To build the documentation locally, execute:
$ cd doc
$ make html

TravisCI allows you to continuously build and test
your code from GitHub to ensure that no code-breaking changes are pushed. After
you sign up and authorize TravisCI, add your new repository to TravisCI so that
it can start building it. The .travis.yml file contains the configuration required
for Travis to build the project. You will have to update the variable MODULE
with the name of your module for Travis to test it. Once you add the project on
TravisCI, all subsequent pushes on the master branch will trigger a Travis
build. By default, the project is tested on Python 2.7 and Python 3.5.
Coveralls reports code coverage statistics of your tests on each push. Sign up on Coveralls and add your repository so that Coveralls can start monitoring it. The project already contains the required configuration for Coveralls to work. All subsequent builds after adding your project will generate a coverage report.
The project uses CircleCI to build its documentation
from the master branch and host it using GitHub Pages.
Again, you will need to sign up and authorize CircleCI. The configuration
of CircleCI is governed by the circle.yml file, which needs to be modified
if you want to set up the docs on your own website. The values to be changed
are:
| Variable | Value |
|---|---|
| USERNAME | The name of the user or organization of the repository where the project and documentation are hosted |
| DOC_REPO | The repository where the documentation will be hosted. This can be the same as the project repository |
| DOC_URL | The relative URL where the documentation will be hosted |
| EMAIL | The email address to use while pushing the documentation. This can be any valid email address |
In addition to this, you will need to grant access to the CircleCI computers
to push to your documentation repository. To do this, visit the Project Settings
page of your project in CircleCI. Select the Checkout SSH keys option and then
choose the Create and add user key option. This should grant CircleCI privileges
to push to the repository https://github.com/USERNAME/DOC_REPO/.
If all goes well, you should be able to visit the documentation of your project on
https://github.com/USERNAME/DOC_REPO/DOC_URL
Follow the instructions to add a Travis Badge,
Coveralls Badge and
CircleCI Badge to your repository's
README.
Once your work is mature enough for the general public to use it, you should
submit a Pull Request to modify scikit-learn's
related projects listing.
Please insert a brief description of your project and a link to its code
repository or PyPI page.
You may also wish to announce your work on the
scikit-learn-general mailing list.
Uploading your package to PyPI allows users to
install your package through pip. Python provides two repositories to which
you can upload your packages: the PyPI Test repository,
which is to be used for testing packages before their release, and the
PyPI repository, where you can make your
releases. You need to register a username and password with both these sites.
The usernames and passwords for the two sites need not be the same. To upload
your package through the command line, you need to store your username and
password in a file called .pypirc in your $HOME directory with the
following format:
[distutils]
index-servers =
    pypi
    pypitest

[pypi]
repository=https://pypi.python.org/pypi
username=<your-pypi-username>
password=<your-pypi-password>

[pypitest]
repository=https://testpypi.python.org/pypi
username=<your-pypitest-username>
password=<your-pypitest-password>

Make sure that all details in setup.py are up to date. To upload your package
to the Test server, execute:
python setup.py register -r pypitest
python setup.py sdist upload -r pypitest
Your package should now be visible on: https://testpypi.python.org/pypi
To install a package from the test server, execute:
pip install -i https://testpypi.python.org/pypi <package-name>
Similarly, to upload your package to the PyPI server, execute:
python setup.py register -r pypi
python setup.py sdist upload -r pypi
To install your package, execute:
pip install <package-name>
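Before running any of the upload commands above, the .pypirc layout can be sanity-checked with Python's configparser module. The sketch below parses an in-memory copy of the file with placeholder credentials (example-user and example-password are invented values) and confirms that every server listed under index-servers has a matching section.

```python
import configparser

# In-memory stand-in for ~/.pypirc; the credentials are placeholders.
PYPIRC_TEXT = """\
[distutils]
index-servers =
    pypi
    pypitest

[pypi]
repository = https://pypi.python.org/pypi
username = example-user
password = example-password

[pypitest]
repository = https://testpypi.python.org/pypi
username = example-user
password = example-password
"""

# Interpolation is disabled so that a '%' in a password is read literally.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(PYPIRC_TEXT)

# index-servers is a multi-line value: one server name per indented line.
servers = parser.get("distutils", "index-servers").split()
missing = [name for name in servers if not parser.has_section(name)]

print(servers)  # ['pypi', 'pypitest']
print(missing)  # []
```

To check a real file instead, the call to read_string can be replaced with parser.read pointed at the .pypirc path in your home directory. Note that the server names under index-servers must be indented, otherwise they are not read as part of that value.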
Thank you for cleanly contributing to the scikit-learn ecosystem!