Commit 47a8b7a

solegalli and NicoGalli authored
add transformer to combine features with reference variable (#226)
* Initial commit issue #68 - Class Sketch
* Issue #68 initial commit
* Issue #68 - Docstring, unit tests Draft and other yerbas.
* #68 - code refactor and unit tests
* #68 - formatting and MyPy errors fixes
* #68 MyPy errors on tests directory fixes
* #68 - Circle CI error fixes
* fix grammar, rename new variable
* expands docstrings
* fix line lenght and indentation
* remove whitespace
* fixes whitespaces in docstrings
* edits jupyter notebook
* remove checks and fix typos
* change wording in docs

Co-authored-by: NicoGalli <[email protected]>
1 parent 6fcdb40 commit 47a8b7a

8 files changed, +1659 -2 lines changed

README.md

Lines changed: 1 addition & 0 deletions
@@ -90,6 +90,7 @@ More resources will be added as they appear online!
 
 ### Variable Combinations:
 * MathematicalCombination
+* CombineWithReferenceFeature
 
 ### Feature Selection:
 * DropFeatures
Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
+CombineWithReferenceFeature
+===========================
+
+API Reference
+-------------
+
+.. autoclass:: feature_engine.creation.CombineWithReferenceFeature
+    :members:
+
+
+Example
+-------
+
+CombineWithReferenceFeature() combines a group of variables with a group of reference
+variables utilizing basic mathematical operations (subtraction, division, addition and
+multiplication), returning one or more additional features in the dataframe as a result.
+
+In this example, we subtract 2 variables from the house prices dataset.
+
+.. code:: python
+
+    import pandas as pd
+    from sklearn.model_selection import train_test_split
+
+    from feature_engine.creation import CombineWithReferenceFeature
+
+    data = pd.read_csv('houseprice.csv').fillna(0)
+
+    X_train, X_test, y_train, y_test = train_test_split(
+        data.drop(['Id', 'SalePrice'], axis=1),
+        data['SalePrice'],
+        test_size=0.3,
+        random_state=0
+    )
+
+    combinator = CombineWithReferenceFeature(
+        variables_to_combine=['LotArea'],
+        reference_variables=['LotFrontage'],
+        operations=['sub'],
+        new_variables_names=['LotPartial']
+    )
+
+    combinator.fit(X_train, y_train)
+    X_train_ = combinator.transform(X_train)
+
+    print(X_train_[["LotPartial", "LotFrontage", "LotArea"]].head())
+
+.. code:: python
+
+          LotPartial  LotFrontage  LotArea
+    64        9375.0          0.0     9375
+    682       2887.0          0.0     2887
+    960       7157.0         50.0     7207
+    1384      9000.0         60.0     9060
+    1100      8340.0         60.0     8400
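
To see what the documented example computes without installing anything, here is a minimal plain-Python sketch of the combine-with-reference idea. It is not the feature_engine implementation: the function `combine_with_reference`, the `OPERATIONS` table, and the `<variable>_<operation>_<reference>` naming of new columns are assumptions made for illustration only. The sample values are the five rows printed in the example output.

```python
import operator

# Map the operation names used in the docs to plain Python functions.
OPERATIONS = {
    "sub": operator.sub,
    "div": operator.truediv,
    "add": operator.add,
    "mul": operator.mul,
}


def combine_with_reference(columns, variables_to_combine,
                           reference_variables, operations):
    """Return a copy of `columns` (a dict of column name -> list of values)
    with one new column per (operation, reference, variable) triple."""
    result = dict(columns)
    for op_name in operations:
        func = OPERATIONS[op_name]
        for reference in reference_variables:
            for variable in variables_to_combine:
                # Hypothetical naming scheme, chosen for this sketch only.
                new_name = f"{variable}_{op_name}_{reference}"
                result[new_name] = [
                    func(v, r)
                    for v, r in zip(columns[variable], columns[reference])
                ]
    return result


# The five rows shown in the documented output.
data = {
    "LotArea": [9375, 2887, 7207, 9060, 8400],
    "LotFrontage": [0.0, 0.0, 50.0, 60.0, 60.0],
}

combined = combine_with_reference(
    data,
    variables_to_combine=["LotArea"],
    reference_variables=["LotFrontage"],
    operations=["sub"],
)
print(combined["LotArea_sub_LotFrontage"])
```

The subtraction reproduces the first column of the documented output (LotArea minus LotFrontage). With `operations=["div"]` this sketch would raise `ZeroDivisionError` on the rows where LotFrontage is 0.0, which come from `fillna(0)`. How the library handles that case is not stated in this commit.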

docs/creation/index.rst

Lines changed: 2 additions & 1 deletion
@@ -9,4 +9,5 @@ various mathematical and other methods.
 .. toctree::
    :maxdepth: 2
 
-   MathematicalCombination
+   MathematicalCombination
+   CombineWithReferenceFeature

docs/index.rst

Lines changed: 1 addition & 0 deletions
@@ -149,6 +149,7 @@ Mathematical Combination:
 ~~~~~~~~~~~~~~~~~~~~~~~~~
 
 - :doc:`creation/MathematicalCombination`: creates new variables by combining features with mathematical operations
+- :doc:`creation/CombineWithReferenceFeature`: creates variables with reference features through mathematical operations
 
 Feature Selection:
 ~~~~~~~~~~~~~~~~~~
