-AutoNormalize is a Python library for automated datatable normalization, intended for use with [Feature Tools](https://github.com/Featuretools/featuretools). AutoNormalize allows you to build an `EntitySet` from a single denormalized table and generate features for machine learning.
+AutoNormalize is a Python library for automated datatable normalization, intended for use with [Featuretools](https://github.com/Featuretools/featuretools). AutoNormalize allows you to build an `EntitySet` from a single denormalized table and generate features for machine learning.
-`df` (pd.Dataframe) : the dataframe containing data
+*`df` (pd.Dataframe) : the dataframe containing data

-`accuracy` (0 < float <= 1.00; default = 0.98) : the accuracy threshold required in order to conclude a dependency (i.e. with accuracy = 0.98, 0.98 of the rows must hold true the dependency LHS --> RHS)
+*`accuracy` (0 < float <= 1.00; default = 0.98) : the accuracy threshold required in order to conclude a dependency (i.e. with accuracy = 0.98, 0.98 of the rows must hold true the dependency LHS --> RHS)

-`index` (str, optional) : name of column that is intended index of df
+*`index` (str, optional) : name of column that is intended index of df

-`name` (str, optional) : the name of created EntitySet
+*`name` (str, optional) : the name of created EntitySet

-`time_index` (str, optional) : name of time column in the dataframe.
+*`time_index` (str, optional) : name of time column in the dataframe.

-Returns:
+**Returns:**

-`entityset` (ft.EntitySet) : created entity set
+*`entityset` (ft.EntitySet) : created entity set

<br />
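
To make the `accuracy` threshold concrete, here is a minimal sketch of one way to measure how well a dependency LHS --> RHS holds. It is not taken from AutoNormalize: the helper `dependency_accuracy` and the sample rows are hypothetical, and the sketch uses plain dictionaries instead of a DataFrame so that it runs without extra packages. It assumes that, for each LHS value, the rows agreeing with the most common RHS value count as satisfying the dependency.

```python
from collections import Counter, defaultdict


def dependency_accuracy(rows, lhs, rhs):
    """Fraction of rows consistent with the dependency lhs --> rhs.

    Hypothetical helper for illustration only; the library may measure
    accuracy differently.
    """
    if not rows:
        return 1.0
    # Count RHS values separately for every distinct LHS value.
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[col] for col in lhs)
        groups[key][row[rhs]] += 1
    # Within each group, only the most common RHS value is treated as correct.
    kept = sum(counter.most_common(1)[0][1] for counter in groups.values())
    return kept / len(rows)


rows = [
    {"zip": "02139", "city": "Cambridge"},
    {"zip": "02139", "city": "Cambridge"},
    {"zip": "02139", "city": "Cambrige"},  # misspelling violates zip --> city
    {"zip": "10001", "city": "New York"},
]
score = dependency_accuracy(rows, ["zip"], "city")
print(score)          # 0.75
print(score >= 0.98)  # False: rejected at the default threshold
```

Under this reading, a lower `accuracy` tolerates more noisy rows before a dependency is rejected.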

+#### `find_dependencies`
+
```shell
find_dependencies(df, accuracy=0.98, index=None)
```
Finds dependencies within dataframe with the DFD search algorithm.

-Returns:
+**Returns:**

-`dependencies` (Dependencies) : the dependencies found in the data within the contraints provided
+*`dependencies` (Dependencies) : the dependencies found in the data within the contraints provided

<br />
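
As a rough illustration of the kind of result a dependency search produces, the sketch below exhaustively tests every ordered pair of columns for an exact dependency. This is a brute-force stand-in, not the DFD search algorithm, and it ignores the `accuracy` parameter. The names `holds` and `naive_dependencies` and the sample rows are hypothetical.

```python
from itertools import permutations


def holds(rows, lhs, rhs):
    """True if every value of column lhs maps to exactly one value of rhs."""
    seen = {}
    for row in rows:
        # setdefault records the first RHS value seen for this LHS value.
        if seen.setdefault(row[lhs], row[rhs]) != row[rhs]:
            return False
    return True


def naive_dependencies(rows, columns):
    """Test all ordered column pairs (brute force, not DFD)."""
    return [
        (lhs, rhs)
        for lhs, rhs in permutations(columns, 2)
        if holds(rows, lhs, rhs)
    ]


rows = [
    {"order_id": 1, "customer": "ann", "city": "Boston"},
    {"order_id": 2, "customer": "ann", "city": "Boston"},
    {"order_id": 3, "customer": "bob", "city": "Boston"},
]
deps = naive_dependencies(rows, ["order_id", "customer", "city"])
for lhs, rhs in deps:
    print(f"{lhs} --> {rhs}")
```

The exhaustive approach only considers single-column left-hand sides and scales poorly with the number of columns, which is the problem a dedicated search algorithm is meant to address.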

+#### `normalize_dataframe`
+
```shell
normalize_dataframe(df, dependencies)
```
@@ -65,20 +71,20 @@ Normalizes dataframe based on the dependencies given. Keys for the newly created
2) has "id" in some form in the name of an attribute
3) has attribute furthest to left in the table
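
The tie-breaking order in the list above can be sketched as a sort key. Only rules 2) and 3) appear in this excerpt, so the sketch applies just those two. The function `choose_key` and the column names are hypothetical, and the substring test for "id" is an assumed reading of "in some form".

```python
def choose_key(candidates, columns):
    """Pick one key from candidate keys (each a list of column names).

    Hypothetical helper: prefers keys with "id" in an attribute name,
    then keys whose leftmost attribute appears earliest in the table.
    """
    def rank(key):
        # Crude substring test; it would also match names such as "width".
        has_id = any("id" in attr.lower() for attr in key)
        leftmost = min(columns.index(attr) for attr in key)
        # False sorts before True, so keys with "id" come first.
        return (not has_id, leftmost)

    return min(candidates, key=rank)


columns = ["name", "customer_id", "email"]
print(choose_key([["email"], ["customer_id"]], columns))  # ['customer_id']
print(choose_key([["email"], ["name"]], columns))         # ['name']
```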

-Returns:x
-
-`new_dfs` (list[pd.DataFrame]) : list of new dataframes
+**Returns:**
+*`new_dfs` (list[pd.DataFrame]) : list of new dataframes
Creates a normalized EntitySet from dataframe based on the dependencies given. Keys are chosen in the same fashion as for `normalize_dataframe`, and a new index will be created if any key has more than a single attribute.
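
One way to picture the new index for a multi-attribute key is a surrogate integer column, as in the sketch below. This is an assumption about the general idea, not the library's code: `add_surrogate_index`, the column name `city_state_id`, and the sample rows are all hypothetical.

```python
def add_surrogate_index(rows, key, index_name):
    """Give each distinct combination of key columns one integer id.

    Returns new row dicts (the input is not modified) and the mapping
    from key tuples to ids.
    """
    ids = {}
    indexed = []
    for row in rows:
        composite = tuple(row[col] for col in key)
        # Reuse the id for a combination seen before, else assign the next one.
        new_id = ids.setdefault(composite, len(ids))
        indexed.append({index_name: new_id, **row})
    return indexed, ids


rows = [
    {"city": "Boston", "state": "MA"},
    {"city": "Springfield", "state": "MA"},
    {"city": "Springfield", "state": "IL"},
    {"city": "Boston", "state": "MA"},
]
indexed, ids = add_surrogate_index(rows, ["city", "state"], "city_state_id")
print([row["city_state_id"] for row in indexed])  # [0, 1, 2, 0]
```

A single-column index of this kind lets other tables reference the key through one column instead of several.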