@@ -210,6 +210,32 @@ attribute.

Note this does not work together with the ``default=True`` or ``sparse=True`` arguments to the mapper.

+ Dropping columns explicitly
+ *******************************
+
+ Sometimes a specific column or a list of columns needs to be dropped.
+ For this purpose, pass the ``drop_cols`` argument to ``DataFrameMapper``.
+ Its default value is ``None``.
+
+ >>> mapper_df = DataFrameMapper([
+ ...     ('pet', sklearn.preprocessing.LabelBinarizer()),
+ ...     (['children'], sklearn.preprocessing.StandardScaler())
+ ... ], drop_cols=['salary'])
+
+ Now running ``fit_transform`` will run the transformations on 'pet' and 'children' and drop the 'salary' column:
+
+ >>> np.round(mapper_df.fit_transform(data.copy()), 1)
+ array([[ 1. , 0. , 0. , 0.2],
+        [ 0. , 1. , 0. , 1.9],
+        [ 0. , 1. , 0. , -0.6],
+        [ 0. , 0. , 1. , -0.6],
+        [ 1. , 0. , 0. , -1.5],
+        [ 0. , 1. , 0. , -0.6],
+        [ 1. , 0. , 0. , 1. ],
+        [ 0. , 0. , 1. , 0.2]])
+
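As a rough mental model of the option described above (an assumption about the intended semantics, not the library's actual code), ``drop_cols`` can be read as a filter on the columns that the feature list does not name. The helper ``passthrough_columns`` below is hypothetical and uses plain Python lists in place of a data frame:

```python
def passthrough_columns(all_columns, selected, drop_cols=None):
    """Return the columns that are neither transformed nor explicitly dropped.

    Hypothetical helper for illustration only; it is not part of sklearn-pandas.
    """
    dropped = set(drop_cols or [])
    chosen = set(selected)
    # Keep the original column order while filtering.
    return [c for c in all_columns if c not in chosen and c not in dropped]


columns = ['pet', 'children', 'salary']
selected = ['pet', 'children']

print(passthrough_columns(columns, selected))              # ['salary']
print(passthrough_columns(columns, selected, ['salary']))  # []
```

Under this reading, the option should matter mainly when unselected columns would otherwise be kept in the output; when they are discarded anyway, naming them in ``drop_cols`` only makes the intent explicit.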
+ Transformations may require multiple input columns. In these
+
Transform Multiple Columns
**************************

@@ -395,7 +421,7 @@ The stacking of the sparse features is done without ever densifying them.


Using ``NumericalTransformer``
- ****************************
+ ***********************************

While you can use ``FunctionTransformer`` to generate arbitrary transformers, it can present serialization issues
when pickling. Use ``NumericalTransformer`` instead, which takes the function name as a string parameter and hence
@@ -419,8 +445,15 @@ can be easily serialized.

Changelog
---------
+ 2.0.1 (2020-09-07)
+ ******************
+
+ * Added an option to explicitly drop columns.
+
+
2.0.0 (2020-08-01)
******************
+
* Deprecated support for Python < 3.6.
* Deprecated support for old versions of scikit-learn, pandas and numpy. Please check setup.py for the minimum requirements.
* Removed CategoricalImputer, cross_val_score and GridSearchCV. All this functionality now exists as part of
@@ -430,32 +463,39 @@ Changelog
* Added ``NumericalTransformer`` for common numerical transformations. Currently it implements log and log1p
  transformations.
* Added prefix and suffix options. See examples above. These are usually helpful when using gen_features.
+ * Added the ``drop_cols`` argument to DataFrameMapper. This can be used to explicitly drop columns.


1.8.0 (2018-12-01)
******************
+
* Add ``FunctionTransformer`` class (#117).
* Fix column names derivation for dataframes with multi-index or non-string
  columns (#166).
* Change behaviour of DataFrameMapper's fit_transform method to invoke each underlying transformer's
  native fit_transform if implemented. (#150)

+
1.7.0 (2018-08-15)
******************
+
* Fix issues with unicode names in ``get_names`` (#160).
* Update to build using ``numpy==1.14`` and ``python==3.6`` (#154).
* Add ``strategy`` and ``fill_value`` parameters to ``CategoricalImputer`` to allow imputing
  with values other than the mode (#144), (#161).
* Preserve input data types when no transform is supplied (#138).

+
1.6.0 (2017-10-28)
******************
+
* Add column name to exception during fit/transform (#110).
* Add ``gen_features`` helper function to help generate the same transformation for multiple columns (#126).


1.5.0 (2017-06-24)
******************
+
* Allow inputting a dataframe/series per group of columns.
* Also get feature names from ``estimator.get_feature_names()`` if present.
* Attempt to derive feature names from individual transformers when applying a
@@ -466,6 +506,7 @@ Changelog

1.4.0 (2017-05-13)
******************
+
* Allow specifying a custom name (alias) for transformed columns (#83).
* Capture the generated names of output columns in the ``transformed_names_`` attribute (#78).
* Add ``CategoricalImputer`` that replaces null-like values with the mode
@@ -543,3 +584,4 @@ Other contributors:
* Timothy Sweetser (@hacktuarial)
* Vitaley Zaretskey (@vzaretsk)
* Zac Stewart (@zacstewart)
+ * Parul Singh (@paro1234)