3. [bSMOTE(1 & 2) - Borderline SMOTE of types 1 and 2](#ref9)
+ 4. [SVM SMOTE - Support Vectors SMOTE](#ref10)

* Over-sampling followed by under-sampling
- 1. [SMOTE + Tomek links](ref12)
- 2. [SMOTE + ENN](ref11)
+ 1. [SMOTE + Tomek links](#ref12)
+ 2. [SMOTE + ENN](#ref11)

* Ensemble sampling
- 1. [EasyEnsemble](ref13)
- 2. [BalanceCascade](ref13)
+ 1. [EasyEnsemble](#ref13)
+ 2. [BalanceCascade](#ref13)

The different algorithms are presented in the [following notebook](https://github.com/fmfn/UnbalancedDataset/blob/master/examples/plot_unbalanced_dataset.ipynb).

@@ -94,27 +94,15 @@ References:
-----------

<a name="ref1"></a>[1]: I. Tomek, [“Two modifications of CNN,”](http://sci2s.ugr.es/keel/pdf/algorithm/articulo/1976-Tomek-IEEETSMC(2).pdf) In Systems, Man, and Cybernetics, IEEE Transactions on, vol. 6, pp. 769-772, 1976.
-
<a name="ref2"></a>[2]: I. Mani, J. Zhang, [“kNN approach to unbalanced data distributions: a case study involving information extraction,”](http://web0.site.uottawa.ca:4321/~nat/Workshop2003/jzhang.pdf) In Proceedings of workshop on learning from imbalanced datasets, 2003.
-
<a name="ref3"></a>[3]: P. Hart, [“The condensed nearest neighbor rule,”](http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=1054155&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D1054155) In Information Theory, IEEE Transactions on, vol. 14(3), pp. 515-516, 1968.
-
<a name="ref4"></a>[4]: M. Kubat, S. Matwin, [“Addressing the curse of imbalanced training sets: one-sided selection,”](http://sci2s.ugr.es/keel/pdf/algorithm/congreso/kubat97addressing.pdf) In ICML, vol. 97, pp. 179-186, 1997.
-
<a name="ref5"></a>[5]: J. Laurikkala, [“Improving identification of difficult small classes by balancing class distribution,”](http://sci2s.ugr.es/keel/pdf/algorithm/congreso/2001-Laurikkala-LNCS.pdf) Springer Berlin Heidelberg, 2001.
-
<a name="ref6"></a>[6]: D. Wilson, [“Asymptotic Properties of Nearest Neighbor Rules Using Edited Data,”](http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=4309137&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D4309137) In IEEE Transactions on Systems, Man, and Cybernetics, vol. 2(3), pp. 408-421, 1972.
-
<a name="ref7"></a>[7]: M. R. Smith, T. Martinez, C. Giraud-Carrier, [“An instance level analysis of data complexity,”](http://axon.cs.byu.edu/papers/smith.ml2013.pdf) Machine Learning, vol. 95(2), pp. 225-256, 2014.
-
<a name="ref8"></a>[8]: N. V. Chawla, K. W. Bowyer, L. O. Hall, W. P. Kegelmeyer, [“SMOTE: synthetic minority over-sampling technique,”](https://www.jair.org/media/953/live-953-2037-jair.pdf) Journal of Artificial Intelligence Research, vol. 16, pp. 321-357, 2002.
-
<a name="ref9"></a>[9]: H. Han, W.-Y. Wang, B.-H. Mao, [“Borderline-SMOTE: a new over-sampling method in imbalanced data sets learning,”](http://sci2s.ugr.es/keel/keel-dataset/pdfs/2005-Han-LNCS.pdf) Advances in intelligent computing, pp. 878-887, 2005.
-
<a name="ref10"></a>[10]: H. M. Nguyen, E. W. Cooper, K. Kamei, [“Borderline over-sampling for imbalanced data classification,”](https://www.google.fr/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&ved=0CDAQFjABahUKEwjH7qqamr_HAhWLthoKHUr0BIo&url=http%3A%2F%2Fousar.lib.okayama-u.ac.jp%2Ffile%2F19617%2FIWCIA2009_A1005.pdf&ei=a7zZVYeNDIvtasrok9AI&usg=AFQjCNHoQ6oC_dH1M1IncBP0ZAaKj8a8Cw&sig2=lh32CHGjs5WBqxa_l0ylbg) International Journal of Knowledge Engineering and Soft Data Paradigms, vol. 3(1), pp. 4-21, 2011.
-
<a name="ref11"></a>[11]: G. Batista, R. C. Prati, M. C. Monard, [“A study of the behavior of several methods for balancing machine learning training data,”](http://www.sigkdd.org/sites/default/files/issues/6-1-2004-06/batista.pdf) ACM SIGKDD Explorations Newsletter, vol. 6(1), pp. 20-29, 2004.
-
<a name="ref12"></a>[12]: G. Batista, B. Bazzan, M. Monard, [“Balancing Training Data for Automated Annotation of Keywords: a Case Study,”](http://www.icmc.usp.br/~gbatista/files/wob2003.pdf) In WOB, pp. 10-18, 2003.
-
<a name="ref13"></a>[13]: X. Y. Liu, J. Wu and Z. H. Zhou, [“Exploratory Undersampling for Class-Imbalance Learning,”](http://cse.seu.edu.cn/people/xyliu/publication/tsmcb09.pdf) In IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 39, no. 2, pp. 539-550, April 2009.