
Reproducibility issue #43

@janezlapajne

Description


Hello,

I noticed that results are not reproducible when using the library, i.e. the sklearn drop-in replacement classes produce slightly different results on each run.

For example, when using:

from autofeat import AutoFeatClassifier

features_engineer = AutoFeatClassifier()
features_engineer.fit_transform(data_train.data, data_train.target.value)

it will calculate (or select) different features each time.
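For reference, a minimal way to see the nondeterminism (the load_wine dataset and the column comparison are only illustrative stand-ins, my actual data is different):

from sklearn.datasets import load_wine
from autofeat import AutoFeatClassifier

# illustrative stand-in for my data
data_train = load_wine(as_frame=True)

# two identical calls, no seeding
df1 = AutoFeatClassifier().fit_transform(data_train.data, data_train.target)
df2 = AutoFeatClassifier().fit_transform(data_train.data, data_train.target)

# prints engineered columns that appear in only one of the two results
print(set(df1.columns) ^ set(df2.columns))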

I temporarily fixed the issue above by using:

import random
import numpy as np

random.seed(seed)
np.random.seed(seed)

so that the outputs produced by AutoFeatClassifier stay constant across runs.
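Put together, the workaround looks roughly like this (the seed value is arbitrary and data_train is the illustrative dataset from above; the important part seems to be re-seeding immediately before the fit, so that no earlier code advances the global RNG state):

import random
import numpy as np
from autofeat import AutoFeatClassifier

seed = 42  # arbitrary fixed value

# re-seed right before fitting so nothing else consumes random numbers in between
random.seed(seed)
np.random.seed(seed)

features_engineer = AutoFeatClassifier()
df_train_transformed = features_engineer.fit_transform(data_train.data, data_train.target)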

However, when I tried using the following:

from autofeat import FeatureSelector

selector = FeatureSelector(verbose=self.verbose, problem_type="classification", featsel_runs=5)
selector.fit_transform(df_indices, target)

the seed-setting trick above did not have the desired effect: the selected features still change between runs...
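My guess is that the individual feature selection runs are executed in separate worker processes (joblib?), which would not inherit the globally seeded numpy state. If so, something along these lines might help; note that n_jobs=1 is an assumption on my part, I have not verified that FeatureSelector actually exposes that parameter:

import random
import numpy as np
from autofeat import FeatureSelector

seed = 42

random.seed(seed)
np.random.seed(seed)

# assumption: n_jobs=1 keeps all featsel runs in this process,
# so the global numpy seed set above actually applies to them
selector = FeatureSelector(verbose=0, problem_type="classification", featsel_runs=5, n_jobs=1)
selected = selector.fit_transform(df_indices, target)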

Is there an easy fix for this? Randomness must be introduced somewhere in the source.
