
Commit 647621a

✅ Replace ScientificTypes.SupportedTypes with its definition
1 parent 8212fd5 commit 647621a

3 files changed: 10 additions and 9 deletions

src/transformers/cardinality_reducer/cardinality_reducer.jl

Lines changed: 6 additions & 5 deletions
@@ -7,7 +7,7 @@ include("errors.jl")
 Fit a transformer that maps any level of a categorical column that occurs with
 frequency < `min_frequency` into a new level (e.g., "Other"). This is useful when some categorical columns have
 high cardinality and many levels are infrequent. This assumes that the categorical columns have raw
-types that are in `ScientificTypes.SupportedTypes` (e.g., Number, AbstractString, Char).
+types that are in `Union{Char, AbstractString, Number}`.

 # Arguments

@@ -19,7 +19,7 @@ types that are in `ScientificTypes.SupportedTypes` (e.g., Number, AbstractString
 - `min_frequency::Real=3`: Any level of a categorical column that occurs with frequency < `min_frequency` will be mapped to a new level. Could be
 an integer or a float which decides whether raw counts or normalized frequencies are used.
 - `label_for_infrequent=Dict{<:Type, <:Any}()= Dict( AbstractString => "Other", Char => 'O', )`: A
-dictionary where the possible values for keys are the types in `ScientificTypes.SupportedTypes` and each value signifies
+dictionary where the possible values for keys are the types in `Union{Char, AbstractString, Number}` and each value signifies
 the new level to map into given a column raw super type. By default, if the raw type of the column subtypes `AbstractString`
 then the new value is `"Other"` and if the raw type subtypes `Char` then the new value is `'O'`
 and if the raw type subtypes `Number` then the new value is the lowest value in the column - 1.
@@ -41,6 +41,7 @@ function cardinality_reducer_fit(
 Char => 'O',
 ),
 )
+supportedtypes = Union{Char, AbstractString, Number}

 # 1. Define column mapper
 function feature_mapper(col, name)
@@ -50,13 +51,13 @@ function cardinality_reducer_fit(

 # Ensure column type is valid (can't test because never occurs)
 # Converting array elements to strings before wrapping in a `CategoricalArray`, as...
-if !(col_type <: ScientificTypes.SupportedTypes)
+if !(col_type <: supportedtypes)
 throw(ArgumentError(UNSUPPORTED_COL_TYPE(col_type)))
 end

 # Ensure label_for_infrequent keys are valid types
 for possible_col_type in keys(label_for_infrequent)
-if !(possible_col_type in union_types(ScientificTypes.SupportedTypes))
+if !(possible_col_type in union_types(supportedtypes))
 throw(ArgumentError(VALID_TYPES_NEW_VAL(possible_col_type)))
 end
 end
@@ -70,7 +71,7 @@ function cardinality_reducer_fit(

 # Get ancestor type of column
 elgrandtype = nothing
-for allowed_type in union_types(ScientificTypes.SupportedTypes)
+for allowed_type in union_types(supportedtypes)
 if col_type <: allowed_type
 elgrandtype = allowed_type
 break
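
Note on the hunk above: the loop walks the members of the union and keeps the first one that the column's raw type subtypes. The following is a minimal, self-contained sketch of that lookup. The `union_types` definition here is an assumed stand-in (the package's own helper is not shown in this commit), and `SUPPORTED` and `ancestor_type` are hypothetical names used only for illustration.

```julia
# Assumed stand-in for the `union_types` helper referenced in the diff:
# recursively unpack a Union into a tuple of its member types.
union_types(T::Union) = (T.a, union_types(T.b)...)
union_types(T::Type) = (T,)

# Same union that the commit inlines as `supportedtypes`.
const SUPPORTED = Union{Char, AbstractString, Number}

# Hypothetical helper: return the member of `SUPPORTED` that `col_type`
# subtypes, or `nothing` if the type is unsupported.
function ancestor_type(col_type::Type)
    for allowed_type in union_types(SUPPORTED)
        col_type <: allowed_type && return allowed_type
    end
    return nothing
end

ancestor_type(Int)     # Number
ancestor_type(String)  # AbstractString
ancestor_type(Symbol)  # nothing
```

Because `Char`, `AbstractString`, and `Number` share no subtypes, the result does not depend on the order in which the union's members are visited.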

src/transformers/cardinality_reducer/errors.jl

Lines changed: 2 additions & 2 deletions
@@ -1,7 +1,7 @@
 UNSUPPORTED_COL_TYPE(col_type) =
-"In CardinalityReducer, elements have type $(col_type). The supported types are $(ScientificTypes.SupportedTypes)"
+"In CardinalityReducer, elements have type $(col_type). The supported types are `Union{Char, Number, AbstractString}`"
 VALID_TYPES_NEW_VAL(possible_col_type) =
-"In CardinalityReducer, label_for_infrequent keys have type $(possible_col_type). The supported types are $(ScientificTypes.SupportedTypes)"
+"In CardinalityReducer, label_for_infrequent keys have type $(possible_col_type). The supported types are `Union{Char, Number, AbstractString}`"
 COLLISION_NEW_VAL(value) =
 "In CardinalityReducer, label_for_infrequent specifies new column name $(value). However, this name already exists in one of the columns. Please respecify label_for_infrequent."
 UNSPECIFIED_COL_TYPE(col_type, label_for_infrequent) =

src/transformers/cardinality_reducer/interface_mlj.jl

Lines changed: 2 additions & 2 deletions
@@ -87,7 +87,7 @@ $(MMI.doc_header(CardinalityReducer))
 `CardinalityReducer` maps any level of a categorical column that occurs with
 frequency < `min_frequency` into a new level (e.g., "Other"). This is useful when some categorical columns have
 high cardinality and many levels are infrequent. This assumes that the categorical columns have raw
-types that are in `ScientificTypes.SupportedTypes` (e.g., Number, AbstractString, Char).
+types that are in `Union{AbstractString, Char, Number}`.


 # Training data
@@ -112,7 +112,7 @@ Train the machine using `fit!(mach, rows=...)`.
 - `min_frequency::Real=3`: Any level of a categorical column that occurs with frequency < `min_frequency` will be mapped to a new level. Could be
 an integer or a float which decides whether raw counts or normalized frequencies are used.
 - `label_for_infrequent=Dict{<:Type, <:Any}()= Dict( AbstractString => "Other", Char => 'O', )`: A
-dictionary where the possible values for keys are the types in `ScientificTypes.SupportedTypes` and each value signifies
+dictionary where the possible values for keys are the types in `Union{Char, AbstractString, Number}` and each value signifies
 the new level to map into given a column raw super type. By default, if the raw type of the column subtypes `AbstractString`
 then the new value is `"Other"` and if the raw type subtypes `Char` then the new value is `'O'`
 and if the raw type subtypes `Number` then the new value is the lowest value in the column - 1.
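
Note on the `min_frequency` docstring above: it says an integer threshold is compared against raw counts and a float threshold against normalized frequencies. The sketch below shows one way that rule could be read. It is not the package's implementation, and `infrequent_levels` is a hypothetical helper introduced only for this illustration.

```julia
# Hypothetical helper, not part of the package: list the levels of `col`
# that fall below `min_frequency` under the documented rule.
function infrequent_levels(col::AbstractVector, min_frequency::Real)
    # Count occurrences of each level.
    counts = Dict{eltype(col), Int}()
    for x in col
        counts[x] = get(counts, x, 0) + 1
    end
    n = length(col)
    # Integer threshold: compare raw counts.
    # Float threshold: compare proportions of the column length.
    freq(c) = min_frequency isa Integer ? c : c / n
    return sort!([lvl for (lvl, c) in counts if freq(c) < min_frequency])
end

col = ["a", "a", "a", "b", "b", "c"]
infrequent_levels(col, 3)    # ["b", "c"]: counts 2 and 1 are below 3
infrequent_levels(col, 0.2)  # ["c"]: 1/6 is below 0.2
```

Under this reading, the levels returned are the ones that would be mapped to the `label_for_infrequent` value for the column's raw type.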
