A hierarchical classification model for class-imbalanced data

doi:10.3969/j.issn.0253-2778.2015.01.010

Abstract

Traditional machine learning methods perform poorly on class-imbalanced data. This paper therefore proposes a hierarchical classification model for class-imbalanced data. Using AdaBoost as its base classifier, the model is formulated mathematically in terms of the features and false positive rates of the classifier, and it is shown that the parameters of the hierarchical classification model can be calculated. First, a hierarchical classification tree is adopted as the model structure. Next, the classification cost of the hierarchical classification tree model is derived, together with a quantitative mathematical description of the features of each layer. Finally, the classification cost is converted into an optimization problem, and a procedure for solving that problem is given. Results of the hierarchical classification are also presented. Experiments on UCI datasets show that the proposed method achieves higher AUC and F-measure than many existing class-imbalanced learning methods.
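The two evaluation criteria named in the abstract, AUC and F-measure, can be illustrated with a minimal sketch. It assumes the balanced F-measure (F1) and the pairwise-ranking definition of AUC; the paper may use a different variant. The function names, the toy labels and scores, and the 0.5 decision threshold are all hypothetical and are not taken from the paper.

```python
def f_measure(y_true, y_pred, positive=1):
    """Balanced F-measure (F1) for the positive (minority) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores, positive=1):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 2 minority (positive) examples out of 10.
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]
scores = [0.9, 0.1, 0.2, 0.3, 0.8, 0.1, 0.2, 0.05, 0.4, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy)                   # 0.8
print(f_measure(y_true, y_pred))  # 0.5
print(auc(y_true, scores))        # 0.9375
```

On this toy data accuracy is 0.8 even though only one of the two minority examples is detected, whereas the F-measure is 0.5. This gap is the usual reason such criteria are preferred over accuracy on class-imbalanced data.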

Key words: machine learning, class-imbalanced, hierarchical classification, feature, evaluation criteria

CLC Number: TP391

SHI Peibei, LIU Guiquan, WANG Zhong, WEI Bing. A hierarchical classification model for class-imbalanced data[J]. Journal of University of Science and Technology of China, 2015, 45(1): 61-68.
