Commit d4f456f

DOC replace nominal by categorical when needed
1 parent d676f94 commit d4f456f

7 files changed, with 20 additions and 18 deletions

doc/metrics.rst

Lines changed: 1 addition & 1 deletion
@@ -87,7 +87,7 @@ Value Difference Metric
 The class :class:`~imblearn.metrics.pairwise.ValueDifferenceMetric` is
 implementing the Value Difference Metric proposed in
 :cite:`stanfill1986toward`. This measure is used to compute the proximity
-of two samples composed of only nominal values.
+of two samples composed of only categorical values.

 Given a single feature, categories with similar correlation with the target
 vector will be considered closer. Let's give an example to illustrate this

doc/over_sampling.rst

Lines changed: 3 additions & 3 deletions
@@ -212,9 +212,9 @@ columns are belonging to the same categories originally presented without any
 other extra interpolation.

 However, :class:`SMOTENC` is only working when data is a mixed of numerical and
-categorical features. If data are made of only nominal categorical data, one
-can use the :class:`SMOTEN` variant :cite:`chawla2002smote`. The algorithm
-changes in two ways:
+categorical features. If data are made of only categorical data, one can use
+the :class:`SMOTEN` variant :cite:`chawla2002smote`. The algorithm changes in
+two ways:

 * the nearest neighbors search does not rely on the Euclidean distance. Indeed,
 the value difference metric (VDM) also implemented in the class

doc/whats_new/v0.8.rst

Lines changed: 3 additions & 2 deletions
@@ -16,11 +16,12 @@ New features
 :pr:`780` by :user:`Aurélien Massiot <AurelienMassiot>`.

 - Add the class :class:`imblearn.metrics.pairwise.ValueDifferenceMetric` to
-compute pairwise distances between samples containing only nominal values.
+compute pairwise distances between samples containing only categorical
+values.
 :pr:`796` by :user:`Guillaume Lemaitre <glemaitre>`.

 - Add the class :class:`imblearn.over_sampling.SMOTEN` to over-sample data
-only containing nominal categorical features.
+only containing categorical features.
 :pr:`802` by :user:`Guillaume Lemaitre <glemaitre>`.

 Enhancements

imblearn/metrics/pairwise.py

Lines changed: 3 additions & 3 deletions
@@ -14,9 +14,9 @@
 class ValueDifferenceMetric(BaseEstimator):
 r"""Class implementing the Value Difference Metric.

-This metric computes the distance between samples containing only nominal
-features. The distance between feature values of two samples
-is defined as:
+This metric computes the distance between samples containing only
+categorical features. The distance between feature values of two samples is
+defined as:

 .. math::
 \delta(x, y) = \sum_{c=1}^{C} |p(c|x_{f}) - p(c|y_{f})|^{k} \ ,
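
Reviewer note: the formula in the docstring above can be checked by hand on a toy feature. The following is a minimal sketch in plain Python, not the code in `imblearn/metrics/pairwise.py`; the helper names `conditional_class_probs` and `vdm_delta` and the toy data are hypothetical and only illustrate the per-feature term :math:`\delta(x, y)`, assuming :math:`p(c|x_{f})` is estimated from class counts.

```python
from collections import Counter, defaultdict


def conditional_class_probs(feature_values, targets):
    """Estimate p(c | x_f) for each category of one feature (hypothetical helper)."""
    counts = defaultdict(Counter)
    for value, target in zip(feature_values, targets):
        counts[value][target] += 1
    classes = sorted(set(targets))
    # One probability per class, in sorted class order, for every category.
    return {
        value: [class_counts[c] / sum(class_counts.values()) for c in classes]
        for value, class_counts in counts.items()
    }


def vdm_delta(probs, x, y, k=1):
    """delta(x, y) = sum over classes c of |p(c|x_f) - p(c|y_f)| ** k."""
    return sum(abs(px - py) ** k for px, py in zip(probs[x], probs[y]))


# Toy data (an assumption for illustration): one categorical feature, binary target.
feature = ["a", "a", "b", "b", "c"]
target = [0, 0, 0, 1, 1]
probs = conditional_class_probs(feature, target)

for pair in [("a", "b"), ("a", "c"), ("b", "c")]:
    print(pair, vdm_delta(probs, *pair))
```

Categories "a" and "c" are associated with opposite classes, so they end up farther apart than "a" and "b", which matches the statement in `doc/metrics.rst` that categories with similar correlation with the target are considered closer.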

imblearn/over_sampling/_adasyn.py

Lines changed: 1 addition & 1 deletion
@@ -52,7 +52,7 @@ class ADASYN(BaseOverSampler):

 SMOTENC : Over-sample using SMOTE for continuous and categorical features.

-SMOTEN : Over-sample using the SMOTE variant specifically for nominal
+SMOTEN : Over-sample using the SMOTE variant specifically for categorical
 features only.

 SVMSMOTE : Over-sample using SVM-SMOTE variant.

imblearn/over_sampling/_random_over_sampler.py

Lines changed: 1 addition & 1 deletion
@@ -76,7 +76,7 @@ class RandomOverSampler(BaseOverSampler):

 SMOTENC : Over-sample using SMOTE for continuous and categorical features.

-SMOTEN : Over-sample using the SMOTE variant specifically for nominal
+SMOTEN : Over-sample using the SMOTE variant specifically for categorical
 features only.

 SVMSMOTE : Over-sample using SVM-SMOTE variant.

imblearn/over_sampling/_smote.py

Lines changed: 8 additions & 7 deletions
@@ -450,7 +450,7 @@ class SVMSMOTE(BaseSMOTE):

 SMOTENC : Over-sample using SMOTE for continuous and categorical features.

-SMOTEN : Over-sample using the SMOTE variant specifically for nominal
+SMOTEN : Over-sample using the SMOTE variant specifically for categorical
 features only.

 BorderlineSMOTE : Over-sample using Borderline-SMOTE.
@@ -648,7 +648,7 @@ class SMOTE(BaseSMOTE):
 --------
 SMOTENC : Over-sample using SMOTE for continuous and categorical features.

-SMOTEN : Over-sample using the SMOTE variant specifically for nominal
+SMOTEN : Over-sample using the SMOTE variant specifically for categorical
 features only.

 BorderlineSMOTE : Over-sample using the borderline-SMOTE variant.
@@ -743,7 +743,7 @@ def _fit_resample(self, X, y):
 class SMOTENC(SMOTE):
 """Synthetic Minority Over-sampling Technique for Nominal and Continuous.

-Unlike :class:`SMOTE`, SMOTE-NC for dataset containing continuous and
+Unlike :class:`SMOTE`, SMOTE-NC for dataset containing numerical and
 categorical features. However, it is not designed to work with only
 categorical features.

@@ -774,7 +774,7 @@ class SMOTENC(SMOTE):
 --------
 SMOTE : Over-sample using SMOTE.

-SMOTEN : Over-sample using the SMOTE variant specifically for nominal
+SMOTEN : Over-sample using the SMOTE variant specifically for categorical
 features only.

 SVMSMOTE : Over-sample using SVM-SMOTE variant.
@@ -1068,7 +1068,7 @@ class KMeansSMOTE(BaseSMOTE):

 SMOTENC : Over-sample using SMOTE for continuous and categorical features.

-SMOTEN : Over-sample using the SMOTE variant specifically for nominal
+SMOTEN : Over-sample using the SMOTE variant specifically for categorical
 features only.

 SVMSMOTE : Over-sample using SVM-SMOTE variant.
@@ -1272,9 +1272,10 @@ def _fit_resample(self, X, y):
 random_state=_random_state_docstring,
 )
 class SMOTEN(SMOTE):
-"""Perform SMOTE over-sampling for nominal categorical features only.
+"""Synthetic Minority Over-sampling Technique for Nominal.

-This method is refered as SMOTEN in [1]_.
+This method is refered as SMOTEN in [1]_. It expects that the data to
+resample are only made of categorical features.

 Read more in the :ref:`User Guide <smote_adasyn>`.
