
Commit bc87333

Merge pull request #969 from openml/develop
Prepare 11.0 release
2 parents 55b3343 + 79a6705 commit bc87333

106 files changed: +8341 -6180 lines changed


.flake8

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+[flake8]
+max-line-length = 100
+show-source = True
+select = C,E,F,W,B,T
+ignore = E203, E402, W503
+per-file-ignores =
+    *__init__.py:F401
+exclude =
+    venv
+    examples
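
To check how these settings resolve, the new `.flake8` file can be read with Python's standard `configparser`. This is an illustrative sketch only: flake8 uses its own option handling, the embedded string is a copy of the file added above, and the names `FLAKE8_CONFIG` and `as_list` are hypothetical.

```python
import configparser

# Copy of the .flake8 file added in this commit, embedded so the sketch is self-contained.
FLAKE8_CONFIG = """\
[flake8]
max-line-length = 100
show-source = True
select = C,E,F,W,B,T
ignore = E203, E402, W503
per-file-ignores =
    *__init__.py:F401
exclude =
    venv
    examples
"""


def as_list(raw):
    """Split a comma- or newline-separated option value into clean items."""
    return [item.strip() for item in raw.replace("\n", ",").split(",") if item.strip()]


parser = configparser.ConfigParser()
parser.read_string(FLAKE8_CONFIG)
section = parser["flake8"]

max_line_length = section.getint("max-line-length")
ignored = as_list(section["ignore"])
excluded = as_list(section["exclude"])
per_file_ignores = as_list(section["per-file-ignores"])

print(max_line_length, ignored, excluded, per_file_ignores)
```

The multi-line options (`per-file-ignores`, `exclude`) arrive as newline-separated values, which is why the helper treats newlines and commas alike.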

.pre-commit-config.yaml

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+repos:
+  - repo: https://github.com/psf/black
+    rev: 19.10b0
+    hooks:
+      - id: black
+        args: [--line-length=100]
+  - repo: https://github.com/pre-commit/mirrors-mypy
+    rev: v0.761
+    hooks:
+      - id: mypy
+        name: mypy openml
+        files: openml/*
+      - id: mypy
+        name: mypy tests
+        files: tests/*
+  - repo: https://gitlab.com/pycqa/flake8
+    rev: 3.8.3
+    hooks:
+      - id: flake8
+        name: flake8 openml
+        files: openml/*
+        additional_dependencies:
+          - flake8-print==3.1.4
+      - id: flake8
+        name: flake8 tests
+        files: tests/*
+        additional_dependencies:
+          - flake8-print==3.1.4

.travis.yml

Lines changed: 11 additions & 5 deletions
@@ -15,11 +15,17 @@ env:
     - TEST_DIR=/tmp/test_dir/
     - MODULE=openml
   matrix:
-    - DISTRIB="conda" PYTHON_VERSION="3.5" SKLEARN_VERSION="0.21.2"
-    - DISTRIB="conda" PYTHON_VERSION="3.6" SKLEARN_VERSION="0.21.2"
-    - DISTRIB="conda" PYTHON_VERSION="3.7" SKLEARN_VERSION="0.21.2" RUN_FLAKE8="true" SKIP_TESTS="true"
-    - DISTRIB="conda" PYTHON_VERSION="3.7" SKLEARN_VERSION="0.21.2" COVERAGE="true" DOCPUSH="true"
-    - DISTRIB="conda" PYTHON_VERSION="3.7" SKLEARN_VERSION="0.20.2"
+    - DISTRIB="conda" PYTHON_VERSION="3.6" SKLEARN_VERSION="0.23.1" COVERAGE="true" DOCPUSH="true" SKIP_TESTS="true"
+    - DISTRIB="conda" PYTHON_VERSION="3.7" SKLEARN_VERSION="0.23.1" RUN_FLAKE8="true" SKIP_TESTS="true"
+    - DISTRIB="conda" PYTHON_VERSION="3.8" SKLEARN_VERSION="0.23.1" TEST_DIST="true"
+    - DISTRIB="conda" PYTHON_VERSION="3.7" SKLEARN_VERSION="0.23.1" TEST_DIST="true"
+    - DISTRIB="conda" PYTHON_VERSION="3.6" SKLEARN_VERSION="0.23.1" TEST_DIST="true"
+    - DISTRIB="conda" PYTHON_VERSION="3.8" SKLEARN_VERSION="0.22.2" TEST_DIST="true"
+    - DISTRIB="conda" PYTHON_VERSION="3.7" SKLEARN_VERSION="0.22.2" TEST_DIST="true"
+    - DISTRIB="conda" PYTHON_VERSION="3.6" SKLEARN_VERSION="0.22.2" TEST_DIST="true"
+    - DISTRIB="conda" PYTHON_VERSION="3.7" SKLEARN_VERSION="0.21.2" TEST_DIST="true"
+    - DISTRIB="conda" PYTHON_VERSION="3.6" SKLEARN_VERSION="0.21.2" TEST_DIST="true"
+    - DISTRIB="conda" PYTHON_VERSION="3.6" SKLEARN_VERSION="0.20.2"
     # Checks for older scikit-learn versions (which also don't nicely work with
     # Python3.7)
     - DISTRIB="conda" PYTHON_VERSION="3.6" SKLEARN_VERSION="0.19.2"
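
The new matrix is easier to review when grouped by scikit-learn version. The sketch below parses a subset of the entries added above into dictionaries; it is not how Travis itself reads the file, and `MATRIX_ENTRIES`, `parse_entry` and `python_versions_by_sklearn` are hypothetical names chosen for illustration.

```python
import shlex
from collections import defaultdict

# A subset of the `env: matrix:` entries added in this commit.
MATRIX_ENTRIES = [
    'DISTRIB="conda" PYTHON_VERSION="3.8" SKLEARN_VERSION="0.23.1" TEST_DIST="true"',
    'DISTRIB="conda" PYTHON_VERSION="3.6" SKLEARN_VERSION="0.23.1" TEST_DIST="true"',
    'DISTRIB="conda" PYTHON_VERSION="3.7" SKLEARN_VERSION="0.22.2" TEST_DIST="true"',
    'DISTRIB="conda" PYTHON_VERSION="3.6" SKLEARN_VERSION="0.20.2"',
]


def parse_entry(entry):
    """Turn one `KEY="value" ...` line into a dict, dropping the shell quotes."""
    return dict(token.split("=", 1) for token in shlex.split(entry))


def python_versions_by_sklearn(entries):
    """Group the tested Python versions by scikit-learn version."""
    grouped = defaultdict(set)
    for entry in entries:
        job = parse_entry(entry)
        grouped[job["SKLEARN_VERSION"]].add(job["PYTHON_VERSION"])
    return {sklearn: sorted(pythons) for sklearn, pythons in grouped.items()}


coverage = python_versions_by_sklearn(MATRIX_ENTRIES)
print(coverage)
```

Running the same grouping over the full list would show at a glance which Python and scikit-learn combinations the matrix covers.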

CONTRIBUTING.md

Lines changed: 129 additions & 56 deletions
@@ -1,7 +1,76 @@
-How to contribute
------------------
+This document describes the workflow on how to contribute to the openml-python package.
+If you are interested in connecting a machine learning package with OpenML (i.e.
+write an openml-python extension) or want to find other ways to contribute, see [this page](https://openml.github.io/openml-python/master/contributing.html#contributing).
 
-The preferred workflow for contributing to the OpenML python connector is to
+Scope of the package
+--------------------
+
+The scope of the OpenML Python package is to provide a Python interface to
+the OpenML platform which integrates well with Python's scientific stack, most
+notably [numpy](http://www.numpy.org/), [scipy](https://www.scipy.org/) and
+[pandas](https://pandas.pydata.org/).
+To reduce opportunity costs and demonstrate the usage of the package, it also
+implements an interface to the most popular machine learning package written
+in Python, [scikit-learn](http://scikit-learn.org/stable/index.html).
+Thereby it will automatically be compatible with many machine learning
+libraries written in Python.
+
+We aim to keep the package as light-weight as possible and we will try to
+keep the number of potential installation dependencies as low as possible.
+Therefore, the connection to other machine learning libraries such as
+*pytorch*, *keras* or *tensorflow* should not be done directly inside this
+package, but in a separate package using the OpenML Python connector.
+More information on OpenML Python connectors can be found [here](https://openml.github.io/openml-python/master/contributing.html#contributing).
+
+Reporting bugs
+--------------
+We use GitHub issues to track all bugs and feature requests; feel free to
+open an issue if you have found a bug or wish to see a feature implemented.
+
+It is recommended to check that your issue complies with the
+following rules before submitting:
+
+- Verify that your issue is not being currently addressed by other
+[issues](https://github.com/openml/openml-python/issues)
+or [pull requests](https://github.com/openml/openml-python/pulls).
+
+- Please ensure all code snippets and error messages are formatted in
+appropriate code blocks.
+See [Creating and highlighting code blocks](https://help.github.com/articles/creating-and-highlighting-code-blocks).
+
+- Please include your operating system type and version number, as well
+as your Python, openml, scikit-learn, numpy, and scipy versions. This information
+can be found by running the following code snippet:
+```python
+import platform; print(platform.platform())
+import sys; print("Python", sys.version)
+import numpy; print("NumPy", numpy.__version__)
+import scipy; print("SciPy", scipy.__version__)
+import sklearn; print("Scikit-Learn", sklearn.__version__)
+import openml; print("OpenML", openml.__version__)
+```
+
+Determine what contribution to make
+-----------------------------------
+Great! You've decided you want to help out. Now what?
+All contributions should be linked to issues on the [Github issue tracker](https://github.com/openml/openml-python/issues).
+In particular for new contributors, the *good first issue* label should help you find
+issues which are suitable for beginners. Resolving these issues allow you to start
+contributing to the project without much prior knowledge. Your assistance in this area
+will be greatly appreciated by the more experienced developers as it helps free up
+their time to concentrate on other issues.
+
+If you encountered a particular part of the documentation or code that you want to improve,
+but there is no related open issue yet, open one first.
+This is important since you can first get feedback or pointers from experienced contributors.
+
+To let everyone know you are working on an issue, please leave a comment that states you will work on the issue
+(or, if you have the permission, *assign* yourself to the issue). This avoids double work!
+
+General git workflow
+--------------------
+
+The preferred workflow for contributing to openml-python is to
 fork the [main repository](https://github.com/openml/openml-python) on
 GitHub, clone, check out the branch `develop`, and develop on a new branch
 branch. Steps:
@@ -109,75 +178,79 @@ following rules before you submit a pull request:
 - If any source file is being added to the repository, please add the BSD 3-Clause license to it.
 
 
-You can also check for common programming errors with the following
-tools:
-
-- Code with good unittest **coverage** (at least 80%), check with:
-
+First install openml with its test dependencies by running
 ```bash
-$ pip install pytest pytest-cov
-$ pytest --cov=. path/to/tests_for_package
+$ pip install -e .[test]
 ```
-
-- No style warnings, check with:
-
+from the repository folder.
+Then configure pre-commit through
+```bash
+$ pre-commit install
+```
+This will install dependencies to run unit tests, as well as [pre-commit](https://pre-commit.com/).
+To run the unit tests, and check their code coverage, run:
 ```bash
-$ pip install flake8
-$ flake8 --ignore E402,W503 --show-source --max-line-length 100
+$ pytest --cov=. path/to/tests_for_package
 ```
-
-- No mypy (typing) issues, check with:
-
+Make sure your code has good unittest **coverage** (at least 80%).
+
+Pre-commit is used for various style checking and code formatting.
+Before each commit, it will automatically run:
+- [black](https://black.readthedocs.io/en/stable/) a code formatter.
+This will automatically format your code.
+Make sure to take a second look after any formatting takes place,
+if the resulting code is very bloated, consider a (small) refactor.
+*note*: If Black reformats your code, the commit will automatically be aborted.
+Make sure to add the formatted files (back) to your commit after checking them.
+- [mypy](https://mypy.readthedocs.io/en/stable/) a static type checker.
+In particular, make sure each function you work on has type hints.
+- [flake8](https://flake8.pycqa.org/en/latest/index.html) style guide enforcement.
+Almost all of the black-formatted code should automatically pass this check,
+but make sure to make adjustments if it does fail.
+
+If you want to run the pre-commit tests without doing a commit, run:
 ```bash
-$ pip install mypy
-$ mypy openml --ignore-missing-imports --follow-imports skip
+$ pre-commit run --all-files
 ```
+Make sure to do this at least once before your first commit to check your setup works.
 
-Filing bugs
------------
-We use GitHub issues to track all bugs and feature requests; feel free to
-open an issue if you have found a bug or wish to see a feature implemented.
-
-It is recommended to check that your issue complies with the
-following rules before submitting:
-
-- Verify that your issue is not being currently addressed by other
-[issues](https://github.com/openml/openml-python/issues)
-or [pull requests](https://github.com/openml/openml-python/pulls).
-
-- Please ensure all code snippets and error messages are formatted in
-appropriate code blocks.
-See [Creating and highlighting code blocks](https://help.github.com/articles/creating-and-highlighting-code-blocks).
+Executing a specific unit test can be done by specifying the module, test case, and test.
+To obtain a hierarchical list of all tests, run
 
-- Please include your operating system type and version number, as well
-as your Python, openml, scikit-learn, numpy, and scipy versions. This information
-can be found by running the following code snippet:
+```bash
+$ pytest --collect-only
+
+<Module 'tests/test_datasets/test_dataset.py'>
+  <UnitTestCase 'OpenMLDatasetTest'>
+    <TestCaseFunction 'test_dataset_format_constructor'>
+    <TestCaseFunction 'test_get_data'>
+    <TestCaseFunction 'test_get_data_rowid_and_ignore_and_target'>
+    <TestCaseFunction 'test_get_data_with_ignore_attributes'>
+    <TestCaseFunction 'test_get_data_with_rowid'>
+    <TestCaseFunction 'test_get_data_with_target'>
+  <UnitTestCase 'OpenMLDatasetTestOnTestServer'>
+    <TestCaseFunction 'test_tagging'>
+```
 
-```python
-import platform; print(platform.platform())
-import sys; print("Python", sys.version)
-import numpy; print("NumPy", numpy.__version__)
-import scipy; print("SciPy", scipy.__version__)
-import sklearn; print("Scikit-Learn", sklearn.__version__)
-import openml; print("OpenML", openml.__version__)
-```
+You may then run a specific module, test case, or unit test respectively:
+```bash
+$ pytest tests/test_datasets/test_dataset.py
+$ pytest tests/test_datasets/test_dataset.py::OpenMLDatasetTest
+$ pytest tests/test_datasets/test_dataset.py::OpenMLDatasetTest::test_get_data
+```
 
-New contributor tips
---------------------
+*NOTE*: In the case the examples build fails during the Continuous Integration test online, please
+fix the first failing example. If the first failing example switched the server from live to test
+or vice-versa, and the subsequent examples expect the other server, the ensuing examples will fail
+to be built as well.
 
-A great way to start contributing to openml-python is to pick an item
-from the list of [Good First Issues](https://github.com/openml/openml-python/labels/Good%20first%20issue)
-in the issue tracker. Resolving these issues allow you to start
-contributing to the project without much prior knowledge. Your
-assistance in this area will be greatly appreciated by the more
-experienced developers as it helps free up their time to concentrate on
-other issues.
+Happy testing!
 
 Documentation
 -------------
 
 We are glad to accept any sort of documentation: function docstrings,
-reStructuredText documents (like this one), tutorials, etc.
+reStructuredText documents, tutorials, etc.
 reStructuredText documents live in the source code repository under the
 doc/ directory.
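
The version-reporting snippet in the "Reporting bugs" section of this file stops with an `ImportError` as soon as one of the listed packages is missing. A more tolerant variant might look like the sketch below; `report_versions` is a hypothetical helper for illustration and is not part of openml-python.

```python
import importlib
import platform
import sys


def report_versions(packages=("numpy", "scipy", "sklearn", "openml")):
    """Return one line per package, tolerating packages that are not installed."""
    lines = [platform.platform(), "Python " + sys.version]
    for name in packages:
        try:
            module = importlib.import_module(name)
            # Not every package exposes __version__, so fall back to "unknown".
            version = getattr(module, "__version__", "unknown")
        except ImportError:
            version = "not installed"
        lines.append(name + " " + version)
    return lines
```

Calling `print("\n".join(report_versions()))` would then give output that can be pasted into an issue even from a partially set up environment.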

Makefile

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ all: clean inplace test
 
 clean:
 	$(PYTHON) setup.py clean
-	rm -rf dist
+	rm -rf dist openml.egg-info
 
 in: inplace # just a shortcut
 inplace:

README.md

Lines changed: 27 additions & 3 deletions
@@ -1,9 +1,14 @@
-[![License](https://img.shields.io/badge/License-BSD%203--Clause-blue.svg)](https://opensource.org/licenses/BSD-3-Clause)
+# OpenML-Python
 
 A python interface for [OpenML](http://openml.org), an online platform for open science collaboration in machine learning.
 It can be used to download or upload OpenML data such as datasets and machine learning experiment results.
-You can find the documentation on the [openml-python website](https://openml.github.io/openml-python).
-If you wish to contribute to the package, please see our [contribution guidelines](https://github.com/openml/openml-python/blob/develop/CONTRIBUTING.md).
+
+## General
+
+* [Documentation](https://openml.github.io/openml-python).
+* [Contribution guidelines](https://github.com/openml/openml-python/blob/develop/CONTRIBUTING.md).
+
+[![License](https://img.shields.io/badge/License-BSD%203--Clause-blue.svg)](https://opensource.org/licenses/BSD-3-Clause)
 
 Master branch:
 
@@ -16,3 +21,22 @@ Development branch:
 [![Build Status](https://travis-ci.org/openml/openml-python.svg?branch=develop)](https://travis-ci.org/openml/openml-python)
 [![Build status](https://ci.appveyor.com/api/projects/status/blna1eip00kdyr25/branch/develop?svg=true)](https://ci.appveyor.com/project/OpenML/openml-python/branch/develop)
 [![Coverage Status](https://coveralls.io/repos/github/openml/openml-python/badge.svg?branch=develop)](https://coveralls.io/github/openml/openml-python?branch=develop)
+
+## Citing OpenML-Python
+
+If you use OpenML-Python in a scientific publication, we would appreciate a reference to the
+following paper:
+
+[Matthias Feurer, Jan N. van Rijn, Arlind Kadra, Pieter Gijsbers, Neeratyoy Mallik, Sahithya Ravi, Andreas Müller, Joaquin Vanschoren, Frank Hutter<br/>
+**OpenML-Python: an extensible Python API for OpenML**<br/>
+*arXiv:1911.02490 [cs.LG]*](https://arxiv.org/abs/1911.02490)
+
+Bibtex entry:
+```bibtex
+@article{feurer-arxiv19a,
+author = {Matthias Feurer and Jan N. van Rijn and Arlind Kadra and Pieter Gijsbers and Neeratyoy Mallik and Sahithya Ravi and Andreas Müller and Joaquin Vanschoren and Frank Hutter},
+title = {OpenML-Python: an extensible Python API for OpenML},
+journal = {arXiv:1911.02490},
+year = {2019},
+}
+```

appveyor.yml

Lines changed: 6 additions & 4 deletions
@@ -5,10 +5,10 @@ environment:
 # CMD_IN_ENV: "cmd /E:ON /V:ON /C .\\appveyor\\scikit-learn-contrib\\run_with_env.cmd"
 
   matrix:
-    - PYTHON: "C:\\Python35-x64"
-      PYTHON_VERSION: "3.5"
+    - PYTHON: "C:\\Python3-x64"
+      PYTHON_VERSION: "3.6"
       PYTHON_ARCH: "64"
-      MINICONDA: "C:\\Miniconda35-x64"
+      MINICONDA: "C:\\Miniconda36-x64"
 
 matrix:
   fast_finish: true
@@ -35,7 +35,9 @@ install:
   # Install the build and runtime dependencies of the project.
   - "cd C:\\projects\\openml-python"
   - "pip install .[examples,test]"
-  - conda install --quiet --yes scikit-learn=0.20.0
+  - "pip install scikit-learn==0.21"
+  # Uninstall coverage, as it leads to an error on appveyor
+  - "pip uninstall -y pytest-cov"
 
 
 # Not a .NET project, we build scikit-learn in the install step instead

ci_scripts/flake8_diff.sh

Lines changed: 0 additions & 9 deletions
This file was deleted.
