Commit 49c4617

ADD example on how to use feat_type
1 parent e87e812 commit 49c4617

3 files changed: 60 additions, 2 deletions
autosklearn/estimators.py

Lines changed: 1 addition & 1 deletion
@@ -388,7 +388,7 @@ def fit(self, X, y,
             List of str of `len(X.shape[1])` describing the attribute type.
             Possible types are `Categorical` and `Numerical`. `Categorical`
             attributes will be automatically One-Hot encoded. The values
-            used for a categorical attribute must be integers, obtainde for
+            used for a categorical attribute must be integers, obtained for
             example by `sklearn.preprocessing.LabelEncoder
             <http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelEncoder.html>`_.
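
The docstring above requires categorical attributes to be integer-coded before they are passed to `fit`. A minimal sketch of that preparation step follows, assuming a small hypothetical array `raw` whose first column holds string labels; the data, the `categorical_columns` list, and the variable names are illustrative and not part of the commit. The type strings use the lowercase spelling from `example/example_feature_types.py`, while the docstring spells them capitalized.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data: column 0 is a string-valued category, column 1 is numeric.
raw = np.array([
    ["red", 1.5],
    ["green", 0.3],
    ["red", 2.2],
    ["blue", 0.9],
], dtype=object)

# Which columns are categorical is assumed to be known for this illustration.
categorical_columns = [0]

X = np.empty(raw.shape, dtype=float)
for col in range(raw.shape[1]):
    if col in categorical_columns:
        # LabelEncoder maps each distinct label to an integer in 0..n_classes-1.
        X[:, col] = LabelEncoder().fit_transform(raw[:, col])
    else:
        X[:, col] = raw[:, col].astype(float)

# One entry per column of X, as the docstring requires.
feat_type = ["categorical" if col in categorical_columns else "numerical"
             for col in range(X.shape[1])]
```

The resulting `X` and `feat_type` could then be passed as `fit(X, y, feat_type=feat_type)`, as the example script in this commit does with the OpenML data.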

doc/manual.rst

Lines changed: 2 additions & 1 deletion
@@ -20,6 +20,7 @@ aspects of its usage:
 * `Parallel usage <https://github.com/automl/auto-sklearn/blob/master/example/example_parallel.py>`_
 * `Sequential usage <https://github.com/automl/auto-sklearn/blob/master/example/example_sequential.py>`_
 * `Regression <https://github.com/automl/auto-sklearn/blob/master/example/example_regression.py>`_
+* `Continuous and Categorical Data <https://github.com/automl/auto-sklearn/blob/master/example/example_feature_types.py>`_
 
 Time and memory limits
 ======================
@@ -64,7 +65,7 @@ For a full list please have a look at the source code (in `autosklearn/pipeline/
 * `Regressors <https://github.com/automl/auto-sklearn/tree/master/autosklearn/pipeline/components/regression>`_
 * `Preprocessors <https://github.com/automl/auto-sklearn/tree/master/autosklearn/pipeline/components/feature_preprocessing>`_
 
-Turning of preprocessing
+Turning off preprocessing
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
 Preprocessing in *auto-sklearn* is divided into data preprocessing and

example/example_feature_types.py

Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
+# -*- encoding: utf-8 -*-
+import sklearn.model_selection
+import sklearn.datasets
+import sklearn.metrics
+
+import autosklearn.classification
+
+try:
+    import openml
+except ImportError:
+    print("#"*80 + """
+    To run this example you need to install openml-python:
+
+    git+https://github.com/renatopp/liac-arff
+    # OpenML is currently not on pypi, use an old version to not depend on
+    # scikit-learn 0.18
+    requests
+    xmltodict
+    git+https://github.com/renatopp/liac-arff
+    git+https://github.com/openml/""" +
+    "openml-python@0b9009b0436fda77d9f7c701bd116aff4158d5e1\n""" +
+    "#"*80)
+    raise
+
+
+def main():
+    # Load adult dataset from openml.org, see https://www.openml.org/t/2117
+    openml.config.apikey = '610344db6388d9ba34f6db45a3cf71de'
+
+    task = openml.tasks.get_task(2117)
+    train_indices, test_indices = task.get_train_test_split_indices()
+    X, y = task.get_X_and_y()
+
+    X_train = X[train_indices]
+    y_train = y[train_indices]
+    X_test = X[test_indices]
+    y_test = y[test_indices]
+
+    dataset = task.get_dataset()
+    _, _, categorical_indicator = dataset.\
+        get_data(target=task.target_name, return_categorical_indicator=True)
+
+    # Create feature type list from openml.org indicator and run autosklearn
+    feat_type = ['categorical' if ci else 'numerical'
+                 for ci in categorical_indicator]
+
+    cls = autosklearn.classification.\
+        AutoSklearnClassifier(time_left_for_this_task=120,
+                              per_run_time_limit=30)
+    cls.fit(X_train, y_train, feat_type=feat_type)
+
+    predictions = cls.predict(X_test)
+    print("Accuracy score", sklearn.metrics.accuracy_score(y_test, predictions))
+
+
+if __name__ == "__main__":
+    main()
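
The list comprehension in the example turns a boolean per-column indicator into the `feat_type` list, and the `fit` docstring states that this list needs one entry per column of `X`. A small sketch of the same conversion with an explicit length check is shown here; `make_feat_type` is a hypothetical helper written for illustration, not a function from auto-sklearn or openml-python.

```python
import numpy as np


def make_feat_type(categorical_indicator, n_features):
    """Hypothetical helper: build a feat_type list and check its length.

    categorical_indicator: iterable of booleans, True for categorical columns.
    n_features: number of columns in the data matrix, i.e. X.shape[1].
    """
    feat_type = ['categorical' if ci else 'numerical'
                 for ci in categorical_indicator]
    if len(feat_type) != n_features:
        raise ValueError(
            "feat_type has %d entries but X has %d columns"
            % (len(feat_type), n_features))
    return feat_type


# Illustrative data only: three columns, the first and last categorical.
X = np.zeros((5, 3))
feat_type = make_feat_type([True, False, True], X.shape[1])
print(feat_type)
```

Checking the length up front is a design choice of this sketch; it surfaces a mismatch between the indicator and the data before any fitting time is spent.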
