Commit ef11cc8

Author: Guillaume Lemaitre
Message: Finish to update the doc
1 parent fbeb485 commit ef11cc8

48 files changed: 381 additions & 418 deletions

doc/api.rst

Lines changed: 20 additions & 16 deletions
@@ -15,20 +15,21 @@ Under-sampling methods

 Classes
 -------
+.. currentmodule:: unbalanced_dataset

 .. autosummary::
    :toctree: generated/

-   unbalanced_dataset.under_sampling.ClusterCentroids
-   unbalanced_dataset.under_sampling.CondensedNearestNeighbour
-   unbalanced_dataset.under_sampling.EditedNearestNeighbours
-   unbalanced_dataset.under_sampling.RepeatedEditedNearestNeighbours
-   unbalanced_dataset.under_sampling.InstanceHardnessThreshold
-   unbalanced_dataset.under_sampling.NearMiss
-   unbalanced_dataset.under_sampling.NeighbourhoodCleaningRule
-   unbalanced_dataset.under_sampling.OneSidedSelection
-   unbalanced_dataset.under_sampling.RandomUnderSampler
-   unbalanced_dataset.under_sampling.TomekLinks
+   under_sampling.ClusterCentroids
+   under_sampling.CondensedNearestNeighbour
+   under_sampling.EditedNearestNeighbours
+   under_sampling.RepeatedEditedNearestNeighbours
+   under_sampling.InstanceHardnessThreshold
+   under_sampling.NearMiss
+   under_sampling.NeighbourhoodCleaningRule
+   under_sampling.OneSidedSelection
+   under_sampling.RandomUnderSampler
+   under_sampling.TomekLinks


 .. _over_sampling_ref:
@@ -42,12 +43,13 @@ Over-sampling methods

 Classes
 -------
+.. currentmodule:: unbalanced_dataset

 .. autosummary::
    :toctree: generated/

-   unbalanced_dataset.over_sampling.RandomOverSampler
-   unbalanced_dataset.over_sampling.SMOTE
+   over_sampling.RandomOverSampler
+   over_sampling.SMOTE


 .. _combine_ref:
@@ -61,12 +63,13 @@ Combination of over- and under-sampling methods

 Classes
 -------
+.. currentmodule:: unbalanced_dataset

 .. autosummary::
    :toctree: generated/

-   unbalanced_dataset.combine.SMOTEENN
-   unbalanced_dataset.combine.SMOTETomek
+   combine.SMOTEENN
+   combine.SMOTETomek

 .. _ensemble_ref:

@@ -79,9 +82,10 @@ Ensemble methods

 Classes
 -------
+.. currentmodule:: unbalanced_dataset

 .. autosummary::
    :toctree: generated/

-   unbalanced_dataset.ensemble.BalanceCascade
-   unbalanced_dataset.ensemble.EasyEnsemble
+   ensemble.BalanceCascade
+   ensemble.EasyEnsemble

examples/combine/README.txt

Lines changed: 3 additions & 3 deletions
@@ -1,6 +1,6 @@
 .. _combine_examples:

-Example using combine class methods
-===================================
+Examples using combine class methods
+====================================

-Combine examples.
+Combine methods mixed over- and under-sampling methods. Generally SMOTE is used for over-sampling while some cleaning methods (i.e., ENN and Tomek links) are used to under-sample.
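
Reviewer note: the new README text mentions Tomek links as one of the cleaning steps applied after SMOTE. As a rough illustration only, the sketch below detects Tomek links in pure Python, taking a Tomek link to be a pair of samples from different classes that are each other's nearest neighbour. The helper names `nearest_neighbour` and `tomek_links` are hypothetical and are not part of `unbalanced_dataset`; which member of each pair a cleaning step removes is left open here.

```python
import math


def nearest_neighbour(i, X):
    """Return the index of the sample closest to X[i], excluding i itself."""
    return min((j for j in range(len(X)) if j != i),
               key=lambda j: math.dist(X[i], X[j]))


def tomek_links(X, y):
    """Return index pairs (i, j), i < j, that form Tomek links.

    Hypothetical helper: a pair is reported when the two samples carry
    different labels and each is the other's nearest neighbour.
    """
    links = []
    for i in range(len(X)):
        j = nearest_neighbour(i, X)
        if i < j and y[i] != y[j] and nearest_neighbour(j, X) == i:
            links.append((i, j))
    return links


# Toy 1-D data: samples 1 and 2 sit on the class boundary.
X = [[0.0], [1.0], [1.2], [3.0], [3.1]]
y = [0, 0, 1, 1, 1]
links = tomek_links(X, y)
print(links)
```

On this toy data the only reported pair is the one straddling the class boundary; samples 3 and 4 are mutual nearest neighbours but share a label, so they are not a link.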

examples/combine/plot_smote_enn.py

Lines changed: 3 additions & 3 deletions
@@ -3,7 +3,7 @@
 SMOTE + ENN
 ===========

-An illustration of the random SMOTE + ENN method.
+An illustration of the SMOTE + ENN method.

 """

@@ -33,9 +33,9 @@
 # Fit and transform x to visualise inside a 2D feature space
 X_vis = pca.fit_transform(X)

-# Apply the random under-sampling
+# Apply SMOTE + ENN
 sm = SMOTEENN()
-X_resampled, y_resampled = sm.fit_transform(X, y)
+X_resampled, y_resampled = sm.fit_sample(X, y)
 X_res_vis = pca.transform(X_resampled)

 # Two subplots, unpack the axes array immediately

examples/combine/plot_smote_tomek.py

Lines changed: 3 additions & 3 deletions
@@ -3,7 +3,7 @@
 SMOTE + Tomek
 =============

-An illustration of the random SMOTE + Tomek method.
+An illustration of the SMOTE + Tomek method.

 """

@@ -33,9 +33,9 @@
 # Fit and transform x to visualise inside a 2D feature space
 X_vis = pca.fit_transform(X)

-# Apply the random under-sampling
+# Apply SMOTE + Tomek links
 sm = SMOTETomek()
-X_resampled, y_resampled = sm.fit_transform(X, y)
+X_resampled, y_resampled = sm.fit_sample(X, y)
 X_res_vis = pca.transform(X_resampled)

 # Two subplots, unpack the axes array immediately

examples/ensemble/README.txt

Lines changed: 4 additions & 2 deletions
@@ -1,6 +1,8 @@
 .. _ensemble_examples:

 Example using ensemble class methods
-===================================
+====================================

-Ensemble examples.
+Under-sampling methods implies that samples of the majority class are lost during the balancing procedure.
+Ensemble methods offer an alternative to use most of the samples.
+In fact, an ensemble of balanced sets is created and used to later train any classifier.
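
Reviewer note: the README text above describes building an ensemble of balanced sets so that more of the majority class gets used. A minimal sketch of that idea for a binary problem, in pure Python, might look as follows. The function `balanced_subsets` and its parameters are hypothetical and do not reproduce the `EasyEnsemble` or `BalanceCascade` implementations; each subset simply pairs all minority samples with a fresh random draw of equally many majority samples.

```python
import random
from collections import Counter


def balanced_subsets(X, y, n_subsets=3, seed=0):
    """Build several balanced (X, y) subsets from a binary imbalanced set.

    Hypothetical helper: every subset keeps all minority samples and adds
    a random draw (without replacement) of as many majority samples.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    minority = min(counts, key=counts.get)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    maj_idx = [i for i, label in enumerate(y) if label != minority]
    subsets = []
    for _ in range(n_subsets):
        chosen = min_idx + rng.sample(maj_idx, len(min_idx))
        subsets.append(([X[i] for i in chosen], [y[i] for i in chosen]))
    return subsets


# Toy data: six majority samples (label 0), two minority samples (label 1).
X = [[0.0], [0.1], [0.2], [0.3], [0.4], [0.5], [2.0], [2.1]]
y = [0, 0, 0, 0, 0, 0, 1, 1]
subsets = balanced_subsets(X, y)
for X_sub, y_sub in subsets:
    print(Counter(y_sub))
```

Returning a list of subsets mirrors the loop `for X_res in X_resampled:` in the ensemble example scripts, where the resampled output is iterated set by set.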

examples/ensemble/plot_balance_cascade.py

Lines changed: 3 additions & 3 deletions
@@ -3,7 +3,7 @@
 Balance cascade
 ===============

-An illustration of the balance cascade method.
+An illustration of the balance cascade ensemble method.

 """

@@ -34,9 +34,9 @@
 # Fit and transform x to visualise inside a 2D feature space
 X_vis = pca.fit_transform(X)

-# Apply the random under-sampling
+# Apply Balance Cascade method
 bc = BalanceCascade()
-X_resampled, y_resampled = bc.fit_transform(X, y)
+X_resampled, y_resampled = bc.fit_sample(X, y)
 X_res_vis = []
 for X_res in X_resampled:
     X_res_vis.append(pca.transform(X_res))

examples/ensemble/plot_easy_ensemble.py

Lines changed: 2 additions & 2 deletions
@@ -34,9 +34,9 @@
 # Fit and transform x to visualise inside a 2D feature space
 X_vis = pca.fit_transform(X)

-# Apply the random under-sampling
+# Apply Easy Ensemble
 ee = EasyEnsemble()
-X_resampled, y_resampled = ee.fit_transform(X, y)
+X_resampled, y_resampled = ee.fit_sample(X, y)
 X_res_vis = []
 for X_res in X_resampled:
     X_res_vis.append(pca.transform(X_res))

examples/over-sampling/README.txt

Lines changed: 1 addition & 1 deletion
@@ -3,4 +3,4 @@
 Example using over-sampling class methods
 =========================================

-Over-sampling examples.
+Data balancing can be performed by over-sampling such that new samples are generated in the minority class to reach a given balancing ratio.
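
Reviewer note: the sentence added above describes over-sampling as generating minority samples until a balancing ratio is reached. The sketch below shows the simplest variant, random over-sampling by duplication, in pure Python. The function `random_over_sample` and its `ratio` parameter are hypothetical and are not the `RandomOverSampler` API; `ratio` is assumed here to mean the target minority-to-majority count.

```python
import random
from collections import Counter


def random_over_sample(X, y, ratio=1.0, seed=0):
    """Duplicate random minority samples until each minority class
    holds int(ratio * n_majority) samples (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(y)
    majority = max(counts, key=counts.get)
    target = int(ratio * counts[majority])
    X_res, y_res = list(X), list(y)
    for label, count in counts.items():
        if label == majority:
            continue
        # Pool of original samples of this class to draw duplicates from.
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(max(0, target - count)):
            X_res.append(rng.choice(pool))
            y_res.append(label)
    return X_res, y_res


# Toy data: four majority samples (label 0), one minority sample (label 1).
X = [[0.0, 0.1], [0.2, 0.3], [0.4, 0.5], [0.6, 0.7], [5.0, 5.1]]
y = [0, 0, 0, 0, 1]
X_resampled, y_resampled = random_over_sample(X, y)
print(Counter(y_resampled))
```

The original samples are kept in place and duplicates are appended, so the first `len(X)` rows of the output match the input. SMOTE differs in that it interpolates new points instead of duplicating existing ones.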

examples/over-sampling/plot_random_over_sampling.py

Lines changed: 2 additions & 2 deletions
@@ -33,9 +33,9 @@
 # Fit and transform x to visualise inside a 2D feature space
 X_vis = pca.fit_transform(X)

-# Apply the random under-sampling
+# Apply the random over-sampling
 ros = RandomOverSampler()
-X_resampled, y_resampled = ros.fit_transform(X, y)
+X_resampled, y_resampled = ros.fit_sample(X, y)
 X_res_vis = pca.transform(X_resampled)

 # Two subplots, unpack the axes array immediately

examples/over-sampling/plot_smote.py

Lines changed: 2 additions & 2 deletions
@@ -33,9 +33,9 @@
 # Fit and transform x to visualise inside a 2D feature space
 X_vis = pca.fit_transform(X)

-# Apply the random under-sampling
+# Apply regular SMOTE
 sm = SMOTE(kind='regular')
-X_resampled, y_resampled = sm.fit_transform(X, y)
+X_resampled, y_resampled = sm.fit_sample(X, y)
 X_res_vis = pca.transform(X_resampled)

 # Two subplots, unpack the axes array immediately
