You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
"In MissingnessEncoder, elements have type $(col_type). The supported types are `Char`, `AbstractString`, and `Number`"
3
+
VALID_TYPES_NEW_VAL_ME(possible_col_type) =
4
+
"In MissingnessEncoder, label_for_missing keys have type $(possible_col_type). The supported types are `Char`, `AbstractString`, and `Number`"
5
+
COLLISION_NEW_VAL_ME(value) =
6
+
"In MissingnessEncoder, label_for_missing specifies new feature name $(value). However, this name already exists in one of the features. Please respecify label_for_missing."
dictionary where the possible values for keys are the types in `Char`, `AbstractString`, and `Number` and where each value
108
+
signifies the new level to map into given a column raw super type. By default, if the raw type of the column subtypes `AbstractString`
109
+
then missing values will be replaced with `"missing"` and if the raw type subtypes `Char` then the new value is `'m'`
110
+
and if the raw type subtypes `Number` then the new value is the lowest value in the column - 1.
111
+
112
+
# Operations
113
+
114
+
- `transform(mach, Xnew)`: Apply cardinality reduction to selected `Multiclass` or `OrderedFactor` features of `Xnew` specified by hyper-parameters, and
115
+
return the new table. Features that are neither `Multiclass` nor `OrderedFactor`
116
+
are always left unchanged.
117
+
118
+
# Fitted parameters
119
+
120
+
The fields of `fitted_params(mach)` are:
121
+
122
+
- `label_for_missing_given_feature`: A dictionary that for each column, maps `missing` into some value according to `label_for_missing`
123
+
124
+
# Report
125
+
126
+
The fields of `report(mach)` are:
127
+
128
+
- `encoded_features`: The subset of the categorical features of X that were encoded
129
+
130
+
# Examples
131
+
132
+
```julia
133
+
import StatsBase.proportionmap
134
+
using MLJ
135
+
136
+
# Define a table with missing values
137
+
Xm = (
138
+
A = categorical(["Ben", "John", missing, missing, "Mary", "John", missing]),
139
+
B = [1.85, 1.67, missing, missing, 1.5, 1.67, missing],
0 commit comments