Commit 949515f

Merge pull request #863 from openml/develop
Release OpenML 0.10.1
2 parents 0f36642 + 34d54d9 commit 949515f

76 files changed: +3154 -1059 lines changed


README.md

Lines changed: 6 additions & 6 deletions
@@ -1,18 +1,18 @@
 [![License](https://img.shields.io/badge/License-BSD%203--Clause-blue.svg)](https://opensource.org/licenses/BSD-3-Clause)
 
-A python interface for [OpenML](http://openml.org). You can find the documentation on the [openml-python website](https://openml.github.io/openml-python).
-
-Please commit to the right branches following the gitflow pattern:
-http://nvie.com/posts/a-successful-git-branching-model/
+A python interface for [OpenML](http://openml.org), an online platform for open science collaboration in machine learning.
+It can be used to download or upload OpenML data such as datasets and machine learning experiment results.
+You can find the documentation on the [openml-python website](https://openml.github.io/openml-python).
+If you wish to contribute to the package, please see our [contribution guidelines](https://github.com/openml/openml-python/blob/develop/CONTRIBUTING.md).
 
 Master branch:
 
 [![Build Status](https://travis-ci.org/openml/openml-python.svg?branch=master)](https://travis-ci.org/openml/openml-python)
-[![Code Health](https://landscape.io/github/openml/openml-python/master/landscape.svg)](https://landscape.io/github/openml/openml-python/master)
+[![Build status](https://ci.appveyor.com/api/projects/status/blna1eip00kdyr25?svg=true)](https://ci.appveyor.com/project/OpenML/openml-python)
 [![Coverage Status](https://coveralls.io/repos/github/openml/openml-python/badge.svg?branch=master)](https://coveralls.io/github/openml/openml-python?branch=master)
 
 Development branch:
 
 [![Build Status](https://travis-ci.org/openml/openml-python.svg?branch=develop)](https://travis-ci.org/openml/openml-python)
-[![Code Health](https://landscape.io/github/openml/openml-python/master/landscape.svg)](https://landscape.io/github/openml/openml-python/master)
+[![Build status](https://ci.appveyor.com/api/projects/status/blna1eip00kdyr25/branch/develop?svg=true)](https://ci.appveyor.com/project/OpenML/openml-python/branch/develop)
 [![Coverage Status](https://coveralls.io/repos/github/openml/openml-python/badge.svg?branch=develop)](https://coveralls.io/github/openml/openml-python?branch=develop)

appveyor.yml

Lines changed: 1 addition & 1 deletion
@@ -43,4 +43,4 @@ build: false
 
 test_script:
 - "cd C:\\projects\\openml-python"
-- "%CMD_IN_ENV% pytest -n 4 --timeout=600 --timeout-method=thread -sv --ignore='test_OpenMLDemo.py'"
+- "%CMD_IN_ENV% pytest -n 4 --timeout=600 --timeout-method=thread -sv"

ci_scripts/install.sh

Lines changed: 6 additions & 3 deletions
@@ -36,12 +36,13 @@ pip install -e '.[test]'
 python -c "import numpy; print('numpy %s' % numpy.__version__)"
 python -c "import scipy; print('scipy %s' % scipy.__version__)"
 
-if [[ "$EXAMPLES" == "true" ]]; then
-pip install -e '.[examples]'
-fi
 if [[ "$DOCTEST" == "true" ]]; then
 pip install sphinx_bootstrap_theme
 fi
+if [[ "$DOCPUSH" == "true" ]]; then
+conda install --yes gxx_linux-64 gcc_linux-64 swig
+pip install -e '.[examples,examples_unix]'
+fi
 if [[ "$COVERAGE" == "true" ]]; then
 pip install codecov pytest-cov
 fi
@@ -52,3 +53,5 @@ fi
 # Install scikit-learn last to make sure the openml package installation works
 # from a clean environment without scikit-learn.
 pip install scikit-learn==$SKLEARN_VERSION
+
+conda list

ci_scripts/test.sh

Lines changed: 1 addition & 1 deletion
@@ -28,7 +28,7 @@ run_tests() {
 PYTEST_ARGS=''
 fi
 
-pytest -n 4 --durations=20 --timeout=600 --timeout-method=thread -sv --ignore='test_OpenMLDemo.py' $PYTEST_ARGS $test_dir
+pytest -n 4 --durations=20 --timeout=600 --timeout-method=thread -sv $PYTEST_ARGS $test_dir
 }
 
 if [[ "$RUN_FLAKE8" == "true" ]]; then

doc/api.rst

Lines changed: 1 addition & 0 deletions
@@ -85,6 +85,7 @@ Modules
 
 list_evaluations
 list_evaluation_measures
+list_evaluations_setups
 
 :mod:`openml.flows`: Flow Functions
 -----------------------------------
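
Reviewer note: the changelog in this commit lists FIX #838, where `list_evaluations_setups` failed when the number of evaluations was not a multiple of 100. The sketch below is not the library's code. It only illustrates, under the assumption that IDs are processed in fixed-size batches, how a chunking helper can handle a final partial batch. The name `chunked` and the batch size are hypothetical.

```python
from typing import Iterator, List, Sequence


def chunked(ids: Sequence[int], size: int = 100) -> Iterator[List[int]]:
    """Yield consecutive batches of at most `size` ids (hypothetical helper)."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(ids), size):
        # Slicing past the end is safe, so the last batch is simply shorter
        # when len(ids) is not a multiple of `size`.
        yield list(ids[start:start + size])


# 250 ids split into batches of 100 leave a final batch of 50.
batches = list(chunked(list(range(250)), size=100))
batch_sizes = [len(batch) for batch in batches]
```

Stepping through the sequence with `range(0, len(ids), size)` avoids computing the number of batches by division, which is where off-by-one errors around remainders tend to appear.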

doc/contributing.rst

Lines changed: 68 additions & 6 deletions
@@ -21,20 +21,20 @@ you can use github's assign feature, otherwise you can just leave a comment.
 Scope of the package
 ====================
 
-The scope of the OpenML python package is to provide a python interface to
-the OpenML platform which integrates well with pythons scientific stack, most
+The scope of the OpenML Python package is to provide a Python interface to
+the OpenML platform which integrates well with Python's scientific stack, most
 notably `numpy <http://www.numpy.org/>`_ and `scipy <https://www.scipy.org/>`_.
 To reduce opportunity costs and demonstrate the usage of the package, it also
 implements an interface to the most popular machine learning package written
-in python, `scikit-learn <http://scikit-learn.org/stable/index.html>`_.
+in Python, `scikit-learn <http://scikit-learn.org/stable/index.html>`_.
 Thereby it will automatically be compatible with many machine learning
 libraries written in Python.
 
 We aim to keep the package as light-weight as possible and we will try to
 keep the number of potential installation dependencies as low as possible.
 Therefore, the connection to other machine learning libraries such as
 *pytorch*, *keras* or *tensorflow* should not be done directly inside this
-package, but in a separate package using the OpenML python connector.
+package, but in a separate package using the OpenML Python connector.
 
 .. _issues:
 
@@ -52,7 +52,7 @@ contains longer-term goals.
 How to contribute
 =================
 
-There are many ways to contribute to the development of the OpenML python
+There are many ways to contribute to the development of the OpenML Python
 connector and OpenML in general. We welcome all kinds of contributions,
 especially:
 
@@ -158,5 +158,67 @@ Happy testing!
 Connecting new machine learning libraries
 =========================================
 
-Coming soon - please stay tuned!
+Content of the Library
+~~~~~~~~~~~~~~~~~~~~~~
 
+To leverage support from the community and to tap into the potential of OpenML, interfacing
+with popular machine learning libraries is essential. However, the OpenML-Python team does
+not have the capacity to develop and maintain such interfaces on its own. For this, we
+have built an extension interface to allow others to contribute back. Building a suitable
+extension therefore requires an understanding of the current OpenML-Python support.
+
+`This example <examples/flows_and_runs_tutorial.html>`_
+shows how scikit-learn currently works with OpenML-Python as an extension. The *sklearn*
+extension packaged with the `openml-python <https://github.com/openml/openml-python>`_
+repository can be used as a template/benchmark to build the new extension.
+
+
+API
++++
+* The extension scripts must import the `openml` package and be able to interface with
+  any function from the OpenML-Python `API <api.html>`_.
+* The extension has to be defined as a Python class and must inherit from
+  :class:`openml.extensions.Extension`.
+* This class needs to have all the functions from `class Extension` overridden as required.
+* The redefined functions should have adequate and appropriate docstrings. The
+  Sklearn Extension API, :class:`openml.extensions.sklearn.SklearnExtension`,
+  is a good benchmark to follow.
+
+
+Interfacing with OpenML-Python
+++++++++++++++++++++++++++++++
+Once the new extension class has been defined, the function
+:func:`openml.extensions.register_extension` must be called to allow OpenML-Python to
+interface with the new extension.
+
+
+Hosting the library
+~~~~~~~~~~~~~~~~~~~
+
+Each extension created should be a stand-alone repository, compatible with the
+`OpenML-Python repository <https://github.com/openml/openml-python>`_.
+The extension repository should work off-the-shelf with *OpenML-Python* installed.
+
+Create a `public Github repo <https://help.github.com/en/articles/create-a-repo>`_ with
+the following directory structure:
+
+::
+
+| [repo name]
+| |-- [extension name]
+| | |-- __init__.py
+| | |-- extension.py
+| | |-- config.py (optionally)
+
+
+
+Recommended
+~~~~~~~~~~~
+* Test cases to keep the extension up to date with the `openml-python` upstream changes.
+* Documentation of the extension API, especially if any new functionality is added to OpenML-Python's
+  extension design.
+* Examples to show how the new extension interfaces and works with OpenML-Python.
+* Create a PR to add the new extension to the OpenML-Python API documentation.
+
+
+Happy contributing!

doc/index.rst

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@ Example
 # Publish the experiment on OpenML (optional, requires an API key.
 # You can get your own API key by signing up to OpenML.org)
 run.publish()
-print('View the run online: %s/run/%d' % (openml.config.server, run.run_id))
+print(f'View the run online: {openml.config.server}/run/{run.run_id}')
 
 You can find more examples in our `examples gallery <examples/index.html>`_.
 
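
Reviewer note: this hunk swaps `%` formatting for an f-string. The snippet below checks that both forms produce the same text for an integer run ID. The `server` and `run_id` values are placeholders, not taken from the commit.

```python
# Placeholder values standing in for openml.config.server and run.run_id.
server = "https://www.openml.org"
run_id = 12345

old_style = 'View the run online: %s/run/%d' % (server, run_id)
new_style = f'View the run online: {server}/run/{run_id}'

# Both forms render identically when run_id is an int.
same_output = old_style == new_style
```

The two forms differ when `run_id` is `None`: `%d` raises a `TypeError`, while the f-string renders the text `None`. F-strings also require Python 3.6 or later.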

doc/progress.rst

Lines changed: 50 additions & 0 deletions
@@ -6,8 +6,57 @@
 Changelog
 =========
 
+0.10.1
+~~~~~~
+* ADD #175: Automatically adds the docstring of scikit-learn objects to flow and its parameters.
+* ADD #737: New evaluation listing call that includes the hyperparameter settings.
+* ADD #744: It is now possible to only issue a warning and not raise an exception if the package
+  versions for a flow are not met when deserializing it.
+* ADD #783: The URL to download the predictions for a run is now stored in the run object.
+* ADD #790: Adds the uploader name and id as new filtering options for ``list_evaluations``.
+* ADD #792: New convenience function ``openml.flows.get_flow_id``.
+* ADD #861: Debug-level log information is now written to a file in the cache directory (at most 2 MB).
+* DOC #778: Introduces instructions on how to publish an extension to support other libraries
+  than scikit-learn.
+* DOC #785: The examples section is completely restructured into simple examples, advanced
+  examples and examples showcasing the use of OpenML-Python to reproduce papers which were done
+  with OpenML-Python.
+* DOC #788: New example on manually iterating through the split of a task.
+* DOC #789: Improve the usage of dataframes in the examples.
+* DOC #791: New example for the paper *Efficient and Robust Automated Machine Learning* by Feurer
+  et al. (2015).
+* DOC #803: New example for the paper *Don’t Rule Out Simple Models Prematurely:
+  A Large Scale Benchmark Comparing Linear and Non-linear Classifiers in OpenML* by Benjamin
+  Strang et al. (2018).
+* DOC #808: New example demonstrating basic use cases of a dataset.
+* DOC #810: New example demonstrating the use of benchmarking studies and suites.
+* DOC #832: New example for the paper *Scalable Hyperparameter Transfer Learning* by
+  Valerio Perrone et al. (2019).
+* DOC #834: New example showing how to plot the loss surface for a support vector machine.
+* FIX #305: Do not require the external version in the flow XML when loading an object.
+* FIX #734: Better handling of *"old"* flows.
+* FIX #736: Attach a StreamHandler to the openml logger instead of the root logger.
+* FIX #758: Fixes an error which made the client API crash when loading sparse data with
+  categorical variables.
+* FIX #779: Do not fail on corrupt pickle.
+* FIX #782: Assign the study id to the correct class attribute.
+* FIX #819: Automatically convert column names to type string when uploading a dataset.
+* FIX #820: Make ``__repr__`` work for datasets which do not have an id.
+* MAINT #796: Rename an argument to make the function ``list_evaluations`` more consistent.
+* MAINT #811: Print the full error message given by the server.
+* MAINT #828: Create base class for OpenML entity classes.
+* MAINT #829: Reduce the number of data conversion warnings.
+* MAINT #831: Warn if there's an empty flow description when publishing a flow.
+* MAINT #837: Also print the flow XML if a flow fails to validate.
+* FIX #838: Fix ``list_evaluations_setups`` to work when the number of evaluations is not a multiple of 100.
+* FIX #847: Fixes an issue where the client API would crash when trying to download a dataset
+  when there are no qualities available on the server.
+* MAINT #849: Move logic of most different ``publish`` functions into the base class.
+* MAINT #850: Remove outdated test code.
+
 0.10.0
 ~~~~~~
+
 * ADD #737: Add list_evaluations_setups to return hyperparameters along with list of evaluations.
 * FIX #261: Test server is cleared of all files uploaded during unit testing.
 * FIX #447: All files created by unit tests no longer persist in local.
@@ -25,6 +74,7 @@ Changelog
 * ADD #412: The scikit-learn extension populates the short name field for flows.
 * MAINT #726: Update examples to remove deprecation warnings from scikit-learn
 * MAINT #752: Update OpenML-Python to be compatible with sklearn 0.21
+* ADD #790: Add user ID and name to list_evaluations
 
 
 0.9.0
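
Reviewer note: two 0.10.1 changelog entries concern logging. FIX #736 attaches a StreamHandler to the `openml` logger instead of the root logger, and ADD #861 writes debug-level output to a size-limited file. The sketch below shows how both can be done with the standard `logging` module. It is not the library's code: the file name, the temporary directory, the console level, and the split of the 2 MB limit into one 1 MB file plus one 1 MB backup are assumptions.

```python
import logging
import logging.handlers
import os
import tempfile

# A named logger keeps the configuration local to the package; the root
# logger, and therefore every other library's logging, is left untouched.
openml_logger = logging.getLogger("openml")
openml_logger.setLevel(logging.DEBUG)

console = logging.StreamHandler()
console.setLevel(logging.WARNING)  # assumed console threshold
openml_logger.addHandler(console)

# Assumed location; the changelog only says "the cache directory".
log_path = os.path.join(tempfile.mkdtemp(), "openml_python.log")

# One active file plus one backup of about 1 MB each stays near 2 MB in total.
file_handler = logging.handlers.RotatingFileHandler(
    log_path, maxBytes=1_000_000, backupCount=1, delay=True
)
file_handler.setLevel(logging.DEBUG)
openml_logger.addHandler(file_handler)

# Below the console threshold, so this reaches the file handler only.
openml_logger.debug("written to the log file only")
```

Attaching handlers to the root logger would change output for every library in the process, which is presumably the behavior FIX #736 set out to avoid.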

doc/usage.rst

Lines changed: 5 additions & 5 deletions
@@ -21,11 +21,11 @@ Installation & Set up
 ~~~~~~~~~~~~~~~~~~~~~~
 
 The OpenML Python package is a connector to `OpenML <https://www.openml.org/>`_.
-It allows to use and share datasets and tasks, run
+It allows you to use and share datasets and tasks, run
 machine learning algorithms on them and then share the results online.
 
 The following tutorial gives a short introduction on how to install and set up
-the OpenML python connector, followed up by a simple example.
+the OpenML Python connector, followed up by a simple example.
 
 * `Introduction <examples/introduction_tutorial.html>`_
 
@@ -52,7 +52,7 @@ Working with tasks
 ~~~~~~~~~~~~~~~~~~
 
 You can think of a task as an experimentation protocol, describing how to apply
-a machine learning model to a dataset in a way that it is comparable with the
+a machine learning model to a dataset in a way that is comparable with the
 results of others (more on how to do that further down). Tasks are containers,
 defining which dataset to use, what kind of task we're solving (regression,
 classification, clustering, etc...) and which column to predict. Furthermore,
@@ -86,7 +86,7 @@ predictions of that run. When a run is uploaded to the server, the server
 automatically calculates several metrics which can be used to compare the
 performance of different flows to each other.
 
-So far, the OpenML python connector works only with estimator objects following
+So far, the OpenML Python connector works only with estimator objects following
 the `scikit-learn estimator API <http://scikit-learn.org/dev/developers/contributing.html#apis-of-scikit-learn-objects>`_.
 Those can be directly run on a task, and a flow will automatically be created or
 downloaded from the server if it already exists.
@@ -114,7 +114,7 @@ requirements and how to download a dataset:
 OpenML is about sharing machine learning results and the datasets they were
 obtained on. Learn how to share your datasets in the following tutorial:
 
-* `Upload a dataset <examples/create_upload_tutorial.html>`_
+* `Upload a dataset <examples/30_extended/create_upload_tutorial.html>`_
 
 ~~~~~~~~~~~~~~~~~~~~~~~
 Extending OpenML-Python

examples/20_basic/README.txt

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+Introductory Examples
+=====================
+
+Introductory examples to the usage of the OpenML python connector.
