Commit e952b51

ADD example for sequential execution of model and ensemble fitting
1 parent 5e4686a commit e952b51

7 files changed, with 20 additions and 11 deletions

autosklearn/ensemble_builder.py

Lines changed: 2 additions & 1 deletion
@@ -194,7 +194,8 @@ def main(self):
                                       predictions.shape[1])
 
                 except Exception as e:
-                    self.logger.warning('Error loading %s: %s', basename, e)
+                    self.logger.warning('Error loading %s: %s - %s',
+                                        basename, type(e), e)
                     score = -1
 
                 model_names_to_scores[model_name] = score
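
As a rough illustration of why the extra `type(e)` argument helps: some exceptions have an empty string representation, so the old message ends right after the colon. The sketch below uses only the standard `logging` module; the logger name and the file name are hypothetical and not taken from auto-sklearn.

```python
import io
import logging

# Capture log output in memory so both message formats can be compared.
stream = io.StringIO()
logger = logging.getLogger("ensemble_builder_sketch")  # hypothetical logger name
logger.setLevel(logging.WARNING)
logger.propagate = False
logger.addHandler(logging.StreamHandler(stream))

basename = "predictions_ensemble_1_00001.npy"  # hypothetical file name

try:
    raise EOFError()  # an exception whose str() is empty
except Exception as e:
    # Old format: only str(e), which is empty here.
    logger.warning('Error loading %s: %s', basename, e)
    # New format: the exception type is logged as well.
    logger.warning('Error loading %s: %s - %s', basename, type(e), e)

old_line, new_line = stream.getvalue().splitlines()
print(repr(old_line))
print(repr(new_line))
```

With the old format the log gives no hint of what went wrong, while the new format at least names the exception class.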

autosklearn/estimators.py

Lines changed: 2 additions & 1 deletion
@@ -256,7 +256,8 @@ def fit(self, *args, **kwargs):
     def fit_ensemble(self, task=None, metric=None, precision='32',
                      dataset_name=None, ensemble_nbest=None,
                      ensemble_size=None):
-        self._automl = self.build_automl()
+        if self._automl is None:
+            self._automl = self.build_automl()
         return self._automl.fit_ensemble(task, metric, precision,
                                          dataset_name, ensemble_nbest,
                                          ensemble_size)
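
The effect of the guard can be illustrated with a minimal sketch. `FakeAutoML` and `Estimator` below are hypothetical stand-ins, not the auto-sklearn classes, and the sketch assumes that `fit()` is what normally creates `_automl`. It only shows that `fit_ensemble()` reuses an existing object instead of replacing it.

```python
class FakeAutoML:
    """Hypothetical stand-in for the object returned by build_automl()."""

    instances = 0  # counts how many backend objects were ever created

    def __init__(self):
        FakeAutoML.instances += 1
        self.models = []

    def fit(self):
        self.models.append("model")  # pretend one model was trained

    def fit_ensemble(self):
        return len(self.models)  # pretend the ensemble uses all trained models


class Estimator:
    """Hypothetical wrapper mirroring the lazy-initialisation guard."""

    def __init__(self):
        self._automl = None

    def build_automl(self):
        return FakeAutoML()

    def fit(self):
        self._automl = self.build_automl()
        self._automl.fit()
        return self

    def fit_ensemble(self):
        # Only build a new backend if fit() has not created one already;
        # otherwise the models trained by fit() would be thrown away.
        if self._automl is None:
            self._automl = self.build_automl()
        return self._automl.fit_ensemble()


est = Estimator().fit()
n_models = est.fit_ensemble()
print(n_models, FakeAutoML.instances)
```

Without the `is None` check, the second call would create a fresh backend and the ensemble step would no longer see the models trained by `fit()`.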

doc/manual.rst

Lines changed: 14 additions & 0 deletions
@@ -23,6 +23,20 @@ model by writing it to disk after every iteration. At the beginning of each
 iteration, SMAC loads all newly found data points. An example can be found in
 the example directory.
 
+In its default mode, auto-sklearn already uses two cores. The first one is
+used for model building, the second for building an ensemble every time a new
+machine learning model has finished training. The file
+`example_sequential.py` in the example directory describes how to run these
+tasks sequentially to use only a single core at a time.
+
+Furthermore, depending on the installation of scikit-learn and numpy,
+the model building procedure may use up to all cores. Such behaviour is
+unintended by auto-sklearn and is most likely due to numpy being installed
+from `pypi` as a binary wheel (`see here
+<http://scikit-learn-general.narkive.com/44ywvAHA/binary-wheel-packages-for-linux-are-coming>`_).
+Executing `export OPENBLAS_NUM_THREADS=1` should disable such behaviours and
+make numpy only use a single core at a time.
+
 Model persistence
 *****************
 
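
Where exporting the variable in the shell is inconvenient, the same setting can presumably be applied from Python, assuming it is set before numpy is first imported in the process (OpenBLAS reads the variable when the library is loaded). The sketch below uses only the standard library and checks that a child process inherits the value; it does not import numpy itself.

```python
import os
import subprocess
import sys

# Set the variable before any import of numpy in this process.
os.environ["OPENBLAS_NUM_THREADS"] = "1"

# Child processes started from here inherit the environment, so worker
# processes would see the same limit.
child_value = subprocess.check_output(
    [sys.executable, "-c",
     "import os; print(os.environ['OPENBLAS_NUM_THREADS'])"],
    universal_newlines=True,
).strip()
print(child_value)

# import numpy as np  # would go here, after the variable is set
```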

example/example_crossvalidation.py

Lines changed: 0 additions & 2 deletions
@@ -1,6 +1,4 @@
 # -*- encoding: utf-8 -*-
-from __future__ import print_function
-
 import sklearn.datasets
 import numpy as np
 

example/example_holdout.py

Lines changed: 0 additions & 3 deletions
@@ -1,9 +1,6 @@
-# -*- encoding: utf-8 -*-
-from __future__ import print_function
 from operator import itemgetter
 
 import numpy as np
-import pandas as pd
 import sklearn.datasets
 import sklearn.metrics
 

example/example_parallel.py

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-import logging
+# -*- encoding: utf-8 -*-
 import multiprocessing
 
 import numpy as np

example/example_regression.py

Lines changed: 1 addition & 3 deletions
@@ -1,6 +1,4 @@
 # -*- encoding: utf-8 -*-
-from __future__ import print_function
-
 import numpy as np
 import sklearn.datasets
 import sklearn.metrics
@@ -16,7 +14,7 @@ def main():
     np.random.shuffle(indices)
     X_train, X_test, y_train, y_test = train_test_split(X, y)
     automl = autosklearn.regression.AutoSklearnRegressor(
-        time_left_for_this_task=60, per_run_time_limit=30,
+        time_left_for_this_task=600,
         tmp_folder='/tmp/autoslearn_regression_example_tmp',
         output_folder='/tmp/autosklearn_regression_example_out')
     automl.fit(X_train, y_train, dataset_name='boston')
