Commit e0a77f8

make release-tag: Merge branch 'master' into stable

2 parents bb0bb0d + f93c8b1

27 files changed: +1632 -477 lines

CONTRIBUTING.rst

Lines changed: 18 additions & 16 deletions
@@ -172,24 +172,26 @@ The process of releasing a new version involves several steps combining both ``g

 1. Merge what is in ``master`` branch into ``stable`` branch.
 2. Update the version in ``setup.cfg``, ``mlblocks/__init__.py`` and ``HISTORY.md`` files.
-3. Create a new TAG pointing at the correspoding commit in ``stable`` branch.
+3. Create a new git tag pointing at the corresponding commit in ``stable`` branch.
 4. Merge the new commit from ``stable`` into ``master``.
-5. Update the version in ``setup.cfg`` and ``mlblocks/__init__.py`` to open the next
-   development interation.
+5. Update the version in ``setup.cfg`` and ``mlblocks/__init__.py``
+   to open the next development iteration.

-**Note:** Before starting the process, make sure that ``HISTORY.md`` has a section titled
-**Unreleased** with the list of changes that will be included in the new version, and that
-these changes are committed and available in ``master`` branch.
-Normally this is just a list of the Pull Requests that have been merged since the latest version.
+.. note:: Before starting the process, make sure that ``HISTORY.md`` has been updated with a new
+   entry that explains the changes that will be included in the new version.
+   Normally this is just a list of the Pull Requests that have been merged to master
+   since the last release.

-Once this is done, just run the following commands::
+Once this is done, run one of the following commands:
+
+1. If you are releasing a patch version::

-    git checkout stable
-    git merge --no-ff master        # This creates a merge commit
-    bumpversion release             # This creates a new commit and a TAG
-    git push --tags origin stable
     make release
-    git checkout master
-    git merge stable
-    bumpversion --no-tag patch
-    git push
+
+2. If you are releasing a minor version::
+
+    make release-minor
+
+3. If you are releasing a major version::
+
+    make release-major

HISTORY.md

Lines changed: 14 additions & 0 deletions
@@ -1,6 +1,20 @@
 Changelog
 =========

+0.3.1 - Pipelines Discovery
+---------------------------
+
+* Support flat hyperparameter dictionaries
+  [Issue #92](https://github.com/HDI-Project/MLBlocks/issues/92) by @csala
+* Load pipelines by name and register them as `entry_points`
+  [Issue #88](https://github.com/HDI-Project/MLBlocks/issues/88) by @csala
+* Implement partial re-fit
+  [Issue #61](https://github.com/HDI-Project/MLBlocks/issues/61) by @csala
+* Move argument parsing to MLBlock
+  [Issue #86](https://github.com/HDI-Project/MLBlocks/issues/86) by @csala
+* Allow getting intermediate outputs
+  [Issue #58](https://github.com/HDI-Project/MLBlocks/issues/58) by @csala
+
 0.3.0 - New Primitives Discovery
 --------------------------------
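
Of the entries above, the flat hyperparameter support is the one that most directly changes how pipelines are configured. A minimal sketch of what it presumably allows, assuming the flat format uses `(block name, hyperparameter name)` tuples as keys and that a `pipeline` object already exists; the block and hyperparameter names are illustrative, not taken from this commit:

```python
# Nested format: one dict per block, keyed by block name.
pipeline.set_hyperparameters({
    'sklearn.ensemble.RandomForestClassifier#1': {
        'n_estimators': 100,
    },
})

# Flat format (assumed): (block name, hyperparameter name) tuples as keys,
# which is easier to produce programmatically from a tuner.
pipeline.set_hyperparameters({
    ('sklearn.ensemble.RandomForestClassifier#1', 'n_estimators'): 100,
})
```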

Makefile

Lines changed: 5 additions & 1 deletion
@@ -98,6 +98,11 @@ fix-lint: ## fix lint issues using autoflake, autopep8, and isort
 	autopep8 --in-place --recursive --aggressive tests
 	isort --apply --atomic --recursive tests

+.PHONY: lint-docs
+lint-docs: ## check docs formatting with doc8 and pydocstyle
+	doc8 mlblocks/
+	pydocstyle mlblocks/
+

 # TEST TARGETS

@@ -122,7 +127,6 @@ coverage: ## check code coverage quickly with the default Python
 .PHONY: docs
 docs: clean-docs ## generate Sphinx HTML documentation, including API docs
 	$(MAKE) -C docs html
-	touch docs/_build/html/.nojekyll

 .PHONY: view-docs
 view-docs: docs ## view docs in browser

README.md

Lines changed: 18 additions & 3 deletions
@@ -58,11 +58,26 @@ make install
 For development, you can use `make install-develop` instead in order to install all
 the required dependencies for testing and code linting.

+## MLPrimitives
+
+In order to be usable, MLBlocks requires a compatible primitives library.
+
+The official library, required in order to follow the following MLBlocks tutorial,
+is [MLPrimitives](https://github.com/HDI-Project/MLPrimitives), which you can install
+with this command:
+
+```bash
+pip install mlprimitives
+```
+
 # Usage Example

 Below there is a short example about how to use MLBlocks to create a simple pipeline, fit it
 using demo data and use it to make predictions.

+Please make sure to have installed [MLPrimitives](https://github.com/HDI-Project/MLPrimitives)
+before following it.
+
 For advance usage and more detailed explanation about each component, please have a look
 at the [documentation](https://HDI-Project.github.io/MLBlocks)

@@ -81,10 +96,10 @@ them to the `MLPipeline` class.
 >>> pipeline = MLPipeline(primitives)
 ```

-Optionally, specific hyperparameters can be also set by specifying them in a dictionary:
+Optionally, specific initialization arguments can also be set by specifying them in a dictionary:

 ```python
->>> hyperparameters = {
+>>> init_params = {
 ...     'skimage.feature.hog': {
 ...         'multichannel': True,
 ...         'visualize': False
@@ -93,7 +108,7 @@ Optionally, specific hyperparameters can be also set by specifying them in a dic
 ...         'n_estimators': 100,
 ...     }
 ... }
->>> pipeline = MLPipeline(primitives, hyperparameters)
+>>> pipeline = MLPipeline(primitives, init_params=init_params)
 ```

 If you can see which hyperparameters a particular pipeline is using, you can do so by calling
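
Read end to end, the updated snippet presumably composes like the sketch below; the second primitive name and the data variables are assumptions for illustration, not part of this diff:

```python
from mlblocks import MLPipeline

# 'sklearn.ensemble.RandomForestClassifier' is assumed here; the diff only shows
# its 'n_estimators' argument, not the primitive name itself.
primitives = [
    'skimage.feature.hog',
    'sklearn.ensemble.RandomForestClassifier',
]

init_params = {
    'skimage.feature.hog': {
        'multichannel': True,
        'visualize': False,
    },
    'sklearn.ensemble.RandomForestClassifier': {
        'n_estimators': 100,
    },
}

pipeline = MLPipeline(primitives, init_params=init_params)

# X_train, y_train and X_test stand in for the demo image data the README
# refers to; loading them is out of scope for this sketch.
pipeline.fit(X_train, y_train)
predictions = pipeline.predict(X_test)
```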

docs/advanced_usage/adding_primitives.rst

Lines changed: 13 additions & 6 deletions
@@ -91,20 +91,27 @@ In order to make **MLBLocks** able to find the primitives defined in such a libr
 all you need to do is setting up an `Entry Point`_ in your `setup.py` script with the
 following specification:

-1. It has to be published under the name ``mlprimitives``.
-2. It has to be named exactly ``jsons_path``.
-3. It has to point at a variable that contains the path to the JSONS folder.
+1. It has to be published under the group ``mlblocks``.
+2. It has to be named exactly ``primitives``.
+3. It has to point at a variable that contains a path or a list of paths to the JSONS folder(s).

 An example of such an entry point would be::

     entry_points = {
-        'mlprimitives': [
-            'jsons_path=some_module:SOME_VARIABLE'
+        'mlblocks': [
+            'primitives=some_module:SOME_VARIABLE'
         ]
     }

 where the module `some_module` contains a variable such as::

-    SOME_VARIABLE = os.path.join(os.path.dirname(__file__), 'jsons')
+    SOME_VARIABLE = 'path/to/primitives'
+
+or::
+
+    SOME_VARIABLE = [
+        'path/to/primitives',
+        'path/to/more/primitives'
+    ]

 .. _Entry Point: https://packaging.python.org/specifications/entry-points/
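
Put together, a complete ``setup.py`` following the new specification might look roughly like the sketch below; the package and module names are made up for illustration and are not part of this commit::

    from setuptools import find_packages, setup

    setup(
        name='my-primitives-library',      # hypothetical distribution name
        version='0.1.0',
        packages=find_packages(),
        # Ship the JSON annotations inside the package so the path below
        # still resolves after installation.
        package_data={'my_primitives': ['jsons/*.json']},
        entry_points={
            'mlblocks': [
                # Group 'mlblocks', name 'primitives', pointing at a variable
                # that holds the path (or list of paths) to the JSONS folder(s).
                'primitives=my_primitives:MLBLOCKS_PRIMITIVES'
            ]
        },
    )

with ``my_primitives/__init__.py`` defining, for example::

    import os

    MLBLOCKS_PRIMITIVES = os.path.join(os.path.dirname(__file__), 'jsons')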

docs/advanced_usage/pipelines.rst

Lines changed: 82 additions & 2 deletions
@@ -86,7 +86,7 @@ This can be done by passing an extra dictionary to the MLPipeline when it is cre
             'n_estimators': 100
         }
     }
-    pipeline = MLPipeline(primitives, init_params)
+    pipeline = MLPipeline(primitives, init_params=init_params)

 This dictionary must have as keys the name of the blocks that the arguments belong to, and
 as values the dictionary that contains the argument names and their values.
@@ -271,7 +271,7 @@ Like primitives, Pipelines can also be annotated and stored as dicts or JSON fil
 the different arguments expected by the ``MLPipeline`` class, as well as the set hyperparameters
 and tunable hyperparameters.

-Representing a Pipeline as a dict
+Representing a Pipeline as a dict
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 The dict representation of an Pipeline can be obtained directly from an ``MLPipeline`` instance,
@@ -344,6 +344,86 @@ that allows loading the pipeline directly from a JSON file:

     pipeline = MLPipeline.load('pipeline.json')

+
+Intermediate Outputs and Partial Execution
+------------------------------------------
+
+Sometimes we might be interested in capturing an intermediate output within a
+pipeline execution in order to inspect it, for debugging purposes, or to reuse
+it later on in order to speed up a tuning process where the pipeline needs
+to be executed multiple times over the same data.
+
+For this, two special arguments have been included in the ``fit`` and ``predict``
+methods of an MLPipeline:
+
+output\_
+~~~~~~~~
+
+The ``output_`` argument indicates which block within the pipeline we are interested
+in taking the output values from. This, implicitly, indicates up to which block the
+pipeline needs to be executed within ``fit`` and ``predict`` before returning.
+
+The ``output_`` argument is optional, and it can be ``None``, which is the default,
+an Integer or a String.
+
+And its format is as follows:
+
+* If it is ``None`` (default), the ``fit`` method will return nothing and the
+  ``predict`` method will return the output of the last block in the pipeline.
+* If an integer is given, it is interpreted as the block index, starting on 0,
+  and the whole context after executing the specified block will be returned.
+  In case of ``fit``, this means that the outputs will be returned after fitting
+  a block and then producing it on the same data.
+* If it is a string, it can be interpreted in three ways:
+
+  * **block name**: If the string matches a block name exactly, including
+    its hash and counter number ``#n`` at the end, the whole context will be
+    returned after that block is produced.
+  * **variable_name**: If the string does not match any block name and does
+    not contain any dot character, ``'.'``, it will be considered a variable
+    name. In this case, the indicated variable will be extracted from the
+    context and returned after the last block has been produced.
+  * **block_name + variable_name**: If the complete string does not match a
+    block name but it contains at least one dot, ``'.'``, it will be split
+    in two parts on the last dot. If the first part of the string matches a
+    block name exactly, the second part of the string will be considered a
+    variable name, assuming the format ``{block_name}.{variable_name}``, and
+    the indicated variable will be extracted from the context and returned
+    after the block has been produced. Otherwise, if the extracted
+    ``block_name`` does not match a block name exactly, a ``ValueError``
+    will be raised.
+
+start\_
+~~~~~~~
+
+The ``start_`` argument indicates which block within the pipeline we are interested
+in starting the computation from when executing ``fit`` and ``predict``, allowing us
+to skip some of the initial blocks.
+
+The ``start_`` argument is optional, and it can be ``None``, which is the default,
+an Integer or a String.
+
+And its format is as follows:

+* If it is ``None``, the execution will start on the first block.
+* If it is an integer, it is interpreted as the block index.
+* If it is a string, it is expected to be the name of the block, including the counter
+  number at the end.
+
+This is especially useful when used in combination with the ``output_`` argument, as it
+effectively allows us to both capture intermediate outputs for debugging purposes and
+reuse intermediate states of the pipeline to accelerate tuning processes.
+
+An example of this situation, where we want to reuse the output of the first block, could be::
+
+    context_0 = pipeline.fit(X_train, y_train, output_=0)
+
+    # Afterwards, within the tuning loop
+    pipeline.fit(start_=1, **context_0)
+    predictions = pipeline.predict(X_test)
+    score = compute_score(y_test, predictions)
+
+
 .. _API Reference: ../api_reference.html
 .. _primitives: ../primitives.html
 .. _mlblocks.MLPipeline: ../api_reference.html#mlblocks.MLPipeline
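
As a complement to the example in the new section, the different ``output_`` forms it describes could be exercised roughly as follows; the block and variable names are hypothetical and depend on the primitives that actually make up the pipeline::

    # Integer: fit up to the first block (index 0) and return the whole context.
    context_0 = pipeline.fit(X_train, y_train, output_=0)

    # Variable name: run the full pipeline but return a single context variable
    # ('X' is a hypothetical variable name).
    features = pipeline.predict(X_test, output_='X')

    # Block name + variable name: return one variable right after a given block,
    # using the '{block_name}.{variable_name}' format described above
    # ('skimage.feature.hog#1' is a hypothetical block name).
    hog_features = pipeline.predict(X_test, output_='skimage.feature.hog#1.X')

    # Afterwards, resume execution from the second block, reusing the stored context.
    pipeline.fit(start_=1, **context_0)
    predictions = pipeline.predict(X_test)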

docs/api/mlblocks.discovery.rst

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+mlblocks.discovery
+==================
+
+.. automodule:: mlblocks.discovery
+   :members:

docs/api/mlblocks.primitives.rst

Lines changed: 0 additions & 5 deletions
This file was deleted.

docs/changelog.rst

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-.. include:: ../HISTORY.md
+.. mdinclude:: ../HISTORY.md

docs/conf.py

Lines changed: 9 additions & 14 deletions
@@ -18,18 +18,9 @@
 # relative to the documentation root, use os.path.abspath to make it
 # absolute, like shown here.

-import os
-import sys
-
 import sphinx_rtd_theme  # For read the docs theme
-from recommonmark.parser import CommonMarkParser
-# from recommonmark.transform import AutoStructify
-
-# sys.path.insert(0, os.path.abspath('..'))

 import mlblocks
-#
-# mlblocks.add_primitives_path('../mlblocks_primitives')

 # -- General configuration ---------------------------------------------

@@ -40,13 +31,21 @@
 # Add any Sphinx extension module names here, as strings. They can be
 # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
 extensions = [
-    'sphinx.ext.napoleon',
+    'm2r',
+    'sphinx.ext.autodoc',
     'sphinx.ext.githubpages',
+    'sphinx.ext.viewcode',
+    'sphinx.ext.napoleon',
     'sphinx.ext.graphviz',
     'IPython.sphinxext.ipython_console_highlighting',
     'IPython.sphinxext.ipython_directive',
+    'autodocsumm',
 ]

+autodoc_default_options = {
+    'autosummary': True,
+}
+
 ipython_execlines = ["import pandas as pd", "pd.set_option('display.width', 1000000)"]

 # Add any paths that contain templates here, relative to this directory.
@@ -56,10 +55,6 @@
 # You can specify multiple suffix as a list of string:
 source_suffix = ['.rst', '.md', '.ipynb']

-source_parsers = {
-    '.md': CommonMarkParser,
-}
-
 # The master toctree document.
 master_doc = 'index'
