01 Apr 01:09

chukarsten

v0.49.0

Enhancements

Added use_covariates parameter to ARIMARegressor #3407
AutoMLSearch will set use_covariates to False for ARIMA when dataset is large #3407
Add ability to retrieve the logical types passed to a component in the graph via get_component_input_logical_types #3428
Add ability to get logical types passed to the last component via last_component_input_logical_types #3428

Fixes

Fix conda build after PR 3407 #3429

Changes

Moved model understanding metrics from graph.py into a separate file #3417
Unpin click dependency #3420
For IterativeAlgorithm, put time series algorithms first #3407
Use prophet-prebuilt to install prophet in extras #3407

Breaking Changes

Moved model understanding metrics from graph.py to metrics.py #3417

28 Mar 23:27

chukarsten

v0.48.0

v0.48.0 Mar. 28, 2022

Enhancements

Replaced pipeline_parameters and custom_hyperparameters with search_parameters in AutoMLSearch #3373
Add support for oversampling in time series classification problems #3387
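
For code that still builds the two older dictionaries, the consolidation into search_parameters can be sketched roughly as follows. This is a minimal illustration, not EvalML code: merge_search_parameters is a hypothetical helper, the plain list stands in for whatever range object the search expects, and raising on a conflict is an assumption made here to avoid guessing at precedence.

```python
from copy import deepcopy


def merge_search_parameters(pipeline_parameters=None, custom_hyperparameters=None):
    """Combine the two pre-0.48.0 dictionaries into a single dictionary.

    Hypothetical helper for illustration only; not an EvalML function.
    Both inputs are assumed to be keyed by component name.
    """
    merged = deepcopy(pipeline_parameters or {})
    for component, params in (custom_hyperparameters or {}).items():
        target = merged.setdefault(component, {})
        for name, value in params.items():
            if name in target:
                # Assumption: refuse to guess which dictionary should win.
                raise ValueError(
                    f"{component}.{name} is set in both dictionaries; pick one"
                )
            target[name] = deepcopy(value)
    return merged


# Fixed value for one component, a candidate range (stand-in list) for another.
old_pipeline_parameters = {"Imputer": {"numeric_impute_strategy": "median"}}
old_custom_hyperparameters = {
    "Random Forest Classifier": {"n_estimators": [50, 100, 200]}
}

search_parameters = merge_search_parameters(
    old_pipeline_parameters, old_custom_hyperparameters
)
```

The resulting dictionary would then be passed as the single search_parameters argument in place of the two removed ones.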

Fixes

Fixed TimeSeriesFeaturizer to make it deterministic when creating and choosing columns #3384
Fixed bug where Email/URL features with missing values would cause the imputer to error out #3388

Changes

Update maintainers to add Frank #3382
Allow woodwork version 0.14.0 to be installed #3381
Moved partial dependence functions from graph.py to a separate file #3404
Pin click at 8.0.4 due to incompatibility with black #3413

Documentation Changes

Added automl user guide section covering search algorithms #3394
Updated broken links and automated broken link detection #3398
Upgraded nbconvert #3402, #3411

Testing Changes

Updated scheduled workflows to only run on Alteryx owned repos #3395
Exclude documentation versions other than latest from broken link check #3401

Breaking Changes

Moved partial dependence functions from graph.py to partial_dependence.py #3404

17 Mar 20:55

chukarsten

v0.47.0

v0.47.0 Mar. 17, 2022

Enhancements

Added TimeSeriesFeaturizer into ARIMA-based pipelines #3313
Added caching capability for ensemble training during AutoMLSearch #3257
Added new error code for zero unique values in NoVarianceDataCheck #3372
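
The difference between a column with zero unique values and one with a single unique value can be pictured with the sketch below. It is plain Python and does not use EvalML: variance_status and the returned strings are hypothetical names chosen for illustration, and treating None and NaN as missing is an assumption.

```python
import math


def unique_non_null(values):
    """Return the distinct values, ignoring None and float NaN."""
    return {
        v
        for v in values
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    }


def variance_status(values):
    """Classify a column by how many distinct non-null values it holds.

    The labels are illustrative, not EvalML's actual error codes.
    """
    n_unique = len(unique_non_null(values))
    if n_unique == 0:
        return "zero_unique_values"  # e.g. an all-null column
    if n_unique == 1:
        return "one_unique_value"  # constant column
    return "ok"


all_null = variance_status([None, float("nan"), None])
constant = variance_status([3, 3, 3])
varied = variance_status([1, 2, 2])
```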

Fixes

Fixed get_pipelines to reset pipeline threshold for binary cases #3360

Changes

Update maintainers #3365

Documentation Changes

Fixed documentation links to point to correct pages #3358

Testing Changes

Checkout main branch in build_conda_pkg job #3375

03 Mar 19:01

chukarsten

v0.46.0

v0.46.0 Mar. 3, 2022

Enhancements

Added test_size parameter to ClassImbalanceDataCheck #3341
Make target optional for NoVarianceDataCheck #3339

Changes

Removed python_version<3.9 environment marker from sktime dependency #3332
Updated DatetimeFormatDataCheck to return all messages and not return early if NaNs are detected #3354

Documentation Changes

Added in-line tabs and copy-paste functionality to documentation, overhauled Install page #3353

18 Feb 21:12

chukarsten

v0.45.0

v0.45.0 Feb. 18, 2022

Enhancements

Added support for pandas >= 1.4.0 #3324
Standardized feature importance for estimators #3305
Replaced usage of private method with Woodwork's public get_subset_schema method #3325

Fixes

Changes

Added an is_cv property to the datasplitters used #3297
Changed SimpleImputer to ignore Natural Language columns #3324
Added drop NaN component to some time series pipelines #3310

Documentation Changes

Update README.md with Alteryx link #3319
Added formatting to the AutoML user guide to shorten result outputs #3328

Testing Changes

Add auto approve dependency workflow schedule for every 30 mins #3312

04 Feb 19:30

chukarsten

v0.44.0

v0.44.0 Feb. 4, 2022

Enhancements

Updated DefaultAlgorithm to also limit estimator usage for long-running multiclass problems #3099
Added make_pipeline_from_data_check_output() utility method #3277
Added more specific data check errors to DatetimeFormatDataCheck #3288

Fixes

Updated the binary classification pipeline's optimize_thresholds method to use Nelder-Mead #3280
Fixed bug where feature importance on time series pipelines only showed 0 for time index #3285

Changes

Removed DateTimeNaNDataCheck and NaturalLanguageNaNDataCheck in favor of NullDataCheck #3260
Drop support for Python 3.7 #3291
Updated minimum version of woodwork to v0.12.0 #3290

Documentation Changes

Update documentation and docstring for validate_holdout_datasets for time series problems #3278
Fixed mistake in documentation where wrong objective was used for calculating percent-better-than-baseline #3285

Testing Changes

Breaking Changes

Removed DateTimeNaNDataCheck and NaturalLanguageNaNDataCheck in favor of NullDataCheck #3260
Dropped support for Python 3.7 #3291

25 Jan 19:31

angela97lin

v0.43.0

v0.43.0 Jan. 25, 2022

Enhancements

Updated new NullDataCheck to return a warning and suggest an action to impute columns with null values #3197
Updated make_pipeline_from_actions to handle null column imputation #3237
Updated data check actions API to return options instead of actions and add functionality to suggest and take action on columns with null values #3182

Fixes

Fixed categorical data leaking into non-categorical sub-pipelines in DefaultAlgorithm #3209
Fixed Python 3.9 installation for prophet by updating pmdarima version in requirements #3268
Allowed DateTime columns to pass through PerColumnImputer without breaking #3267

Changes

Updated DataCheck validate() output to return a dictionary instead of list for actions #3142
Updated DataCheck validate() API to use the new DataCheckActionOption class instead of DataCheckAction #3152
Uncapped numba version and removed it from requirements #3263
Renamed HighlyNullDataCheck to NullDataCheck #3197
Updated data check validate() output to return a list of warnings and errors instead of a dictionary #3244
Capped pandas at < 1.4.0 #3274
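
Callers that used to read separate warning and error entries from the validate() dictionary now receive one flat list. A rough sketch of regrouping such a list is shown below; it assumes each message is a dictionary with a "level" key set to "warning" or "error", and both split_by_level and the sample messages are made up for illustration.

```python
def split_by_level(messages):
    """Split a flat list of data check messages into warnings and errors.

    Assumes each message is a dict carrying a "level" key; messages with
    any other level are ignored. Hypothetical helper, not an EvalML function.
    """
    warning_msgs = [m for m in messages if m.get("level") == "warning"]
    error_msgs = [m for m in messages if m.get("level") == "error"]
    return warning_msgs, error_msgs


# Made-up messages in the assumed shape.
sample_messages = [
    {"level": "warning", "message": "Column 'a' has many null values"},
    {"level": "error", "message": "Target has no variance"},
    {"level": "warning", "message": "Column 'b' has many null values"},
]

warning_msgs, error_msgs = split_by_level(sample_messages)
```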

Testing Changes

Bumped minimum IPython version to 7.16.3 in test-requirements.txt based on dependabot feedback #3269

Breaking Changes

Renamed HighlyNullDataCheck to NullDataCheck #3197
Updated data check validate() output to return a list of warnings and errors instead of a dictionary. See the Data Check or Data Check Actions pages (under User Guide) for examples. #3244
Removed impute_all and default_impute_strategy parameters from the PerColumnImputer #3267
Updated PerColumnImputer such that columns not specified in impute_strategies dict will not be imputed anymore #3267
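
The new PerColumnImputer behavior, where only the columns named in impute_strategies are touched, can be illustrated with the toy below. It works on plain lists rather than DataFrames and is not the component's implementation: impute_columns is a hypothetical function, None marks a missing value, and the nested {"impute_strategy": ...} layout is an assumption about how strategies are specified.

```python
from statistics import mean, median


def impute_columns(table, impute_strategies):
    """Fill missing values only in columns listed in impute_strategies.

    table: dict mapping column name to a list of values (None = missing).
    impute_strategies: dict mapping column name to {"impute_strategy": name}.
    Toy illustration; not EvalML's PerColumnImputer.
    """
    strategies = {"mean": mean, "median": median}
    result = {}
    for column, values in table.items():
        spec = impute_strategies.get(column)
        if spec is None:
            # Column not specified: pass it through unchanged.
            result[column] = list(values)
            continue
        observed = [v for v in values if v is not None]
        fill = strategies[spec["impute_strategy"]](observed)
        result[column] = [fill if v is None else v for v in values]
    return result


table = {"a": [1.0, None, 3.0], "b": [None, 2.0, 4.0]}
imputed = impute_columns(table, {"a": {"impute_strategy": "mean"}})
```

Column "a" is filled with its mean, while "b" keeps its missing value because no strategy was given for it.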

20 Jan 17:01

chukarsten

v0.42.0

v0.42.0 Jan. 20, 2022

Enhancements

Required training and test data to be separated by gap + 1 units, as verified by time_index, for time series problems #3208
Added support for boolean features for ARIMARegressor #3187
Updated dependency bot workflow to remove outdated description and add new configuration to delete branches automatically #3212
Added n_obs and n_splits to TimeSeriesParametersDataCheck error details #3246
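
The gap + 1 separation requirement can be pictured with a small check on the time index alone. The sketch below is an assumption-laden illustration, not EvalML's implementation: separated_by_gap is a hypothetical function, both indexes are assumed sorted, and the data is assumed to have a fixed frequency given as a timedelta.

```python
from datetime import datetime, timedelta


def separated_by_gap(train_index, holdout_index, gap, unit):
    """Return True if the holdout data starts gap + 1 units after training ends.

    train_index, holdout_index: sorted lists of datetime values.
    unit: timedelta for one step of the (assumed regular) frequency.
    """
    expected_start = train_index[-1] + (gap + 1) * unit
    return holdout_index[0] == expected_start


day = timedelta(days=1)
train_index = [datetime(2022, 1, 1) + i * day for i in range(5)]  # Jan 1 to Jan 5

# With gap=2, the holdout data is expected to start on Jan 8.
starts_jan_8 = [datetime(2022, 1, 8) + i * day for i in range(3)]
starts_jan_6 = [datetime(2022, 1, 6) + i * day for i in range(3)]

ok_with_gap_2 = separated_by_gap(train_index, starts_jan_8, gap=2, unit=day)
bad_with_gap_2 = separated_by_gap(train_index, starts_jan_6, gap=2, unit=day)
ok_with_gap_0 = separated_by_gap(train_index, starts_jan_6, gap=0, unit=day)
```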

Fixes

Fixed classification pipelines to only accept target data with the appropriate number of classes #3185
Added support for time series in DefaultAlgorithm #3177
Standardized names of featurization components #3192
Removed empty cell in text_input.ipynb #3234
Removed potential prediction explanations failure when pipelines predicted a class with probability 1 #3221
Dropped NaNs before partial dependence grid generation #3235
Allowed prediction explanations to be json-serializable #3262
Fixed bug where InvalidTargetDataCheck would not check time series regression targets #3251
Fixed bug in are_datasets_separated_by_gap_time_index #3256

Changes

Raised lowest compatible numpy version to 1.21.0 to address security concerns #3207
Changed the default objective to MedianAE from R2 for time series regression #3205
Removed all-nan Unknown to Double logical conversion in infer_feature_types #3196
Checking the validity of holdout data for time series problems can be performed by calling pipelines.utils.validate_holdout_datasets prior to calling predict #3208
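
For readers unfamiliar with the new default objective, median absolute error is the median of the absolute differences between actual and predicted values. The standard-library sketch below shows the definition only; it is not EvalML's MedianAE class, and the sample values are made up.

```python
from statistics import median


def median_absolute_error(y_true, y_pred):
    """Median of |actual - predicted| over paired observations."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return median(abs(t - p) for t, p in zip(y_true, y_pred))


# Made-up values: one large miss barely moves the median error.
y_true = [10, 12, 14, 100]
y_pred = [11, 12, 13, 20]
score = median_absolute_error(y_true, y_pred)
```

Because the score is a median, a single badly missed observation has little effect on it, which is one reason the metric is often preferred for noisy series.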

Documentation Changes

Testing Changes

Breaking Changes

Renamed DateTime Featurization Component to DateTime Featurizer and Natural Language Featurization Component to Natural Language Featurizer #3192

10 Jan 16:54

chukarsten

v0.41.0

v0.41.0 Jan. 10, 2022

Enhancements

Added string support for DataCheckActionCode #3167
Added DataCheckActionOption class #3134
Add issue templates for bugs, feature requests and documentation improvements for GitHub #3199

Fixes

Fix bug where prediction explanations class_name was shown as float for boolean targets #3179
Fixed bug in nightly linux tests #3189

Changes

Removed usage of scikit-learn's LabelEncoder in favor of ours #3161
Removed nullable types checking from infer_feature_types #3156
Fixed mean_cv_data and validation_score values in AutoMLSearch.rankings to reflect cv score or NaN when appropriate #3162

Documentation Changes

Testing Changes

Add workflow to auto-merge dependency PRs if status checks pass #3184

22 Dec 23:30

angela97lin

v0.40.0

v0.40.0 Dec. 22, 2021

❄️ ☃️ Happy holidays! ☃️ ❄️

Enhancements

Added TimeSeriesSplittingDataCheck to DefaultDataChecks to verify adequate class representation in time series classification problems #3141
Added the ability to accept serialized features and skip computation in DFSTransformer #3106
Added support for known-in-advance features #3149

Fixes

Fixed error caused when tuning threshold for time series binary classification #3140

Changes

TimeSeriesParametersDataCheck was added to DefaultDataChecks for time series problems #3139
Renamed date_index to time_index in problem_configuration for time series problems #3137
Updated nlp-primitives minimum version to 2.1.0 #3166
Updated minimum version of woodwork to v0.11.0 #3171
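
Existing problem_configuration dictionaries that still use the old key can be updated mechanically. The sketch below is one way to do that, assuming the configuration is a plain dictionary; rename_date_index is a hypothetical helper and the sample configuration values are illustrative.

```python
def rename_date_index(problem_configuration):
    """Return a copy of the configuration with date_index renamed to time_index.

    Hypothetical migration helper; not an EvalML function.
    """
    config = dict(problem_configuration)
    if "date_index" in config:
        if "time_index" in config:
            raise ValueError("both date_index and time_index are present")
        config["time_index"] = config.pop("date_index")
    return config


old_config = {"date_index": "date", "gap": 0, "max_delay": 7, "forecast_horizon": 3}
new_config = rename_date_index(old_config)
```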

Documentation Changes

Added comments to provide clarity on doctests #3155

Testing Changes

Parameterized tests in test_datasets.py #3145