Journal of University of Science and Technology of China ›› 2018, Vol. 48 ›› Issue (1): 65-74. DOI: 10.3969/j.issn.0253-2778.2018.01.009

• Original Paper •

A calibrated label ranking method based on naive Bayes

ZHANG Qilong, DENG Weibin, HU Feng, QU Yuan, HU Zongrong

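  1. Chongqing Key Laboratory of Computational Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065
  • Received: 2017-05-22 Revised: 2017-06-23 Online: 2018-01-01 Published: 2018-01-01
  • Corresponding author: HU Feng
  • About the author: ZHANG Qilong, male, born in 1989, master's student. Research interests: computational intelligence and data mining. E-mail: 814150638@qq.com
  • Funding:
    Supported by the National Natural Science Foundation of China (61473001, 71071045, 71131002).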

A calibrated label ranking method based on naive Bayes

ZHANG Qilong, DENG Weibin, HU Feng, QU Yuan, HU Zongrong   

  1. Chongqing Key Laboratory of Computational Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  • Received: 2017-05-22 Revised: 2017-06-23 Online: 2018-01-01 Published: 2018-01-01

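Abstract: The traditional calibrated label ranking (CLR) algorithm transforms the multi-label problem using pairwise label associations in order to predict results. Its calibration is produced by comparison on the basis of the binary relevance (BR) algorithm, so its predictions depend to some extent on the BR results, and the algorithm therefore has certain limitations when predicting on some datasets. To better distinguish relevant labels from irrelevant ones, a calibration method for the label boundary region is proposed: labels lying on the boundary between the relevant and the irrelevant labels are further corrected using Bayesian probability, which improves classification accuracy in the boundary region. The calibrated label ranking method based on naive Bayes (NBCLRM) is compared with seven traditional methods, including calibrated label ranking. Experimental results show that the proposed algorithm not only allows the prediction results to be tuned as required by modifying the thresholds ε and μ, but also effectively improves the performance of traditional multi-label learning methods.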

Key words: data mining, naive Bayes, calibrated label ranking algorithm, multi-label learning algorithm

Abstract: The traditional calibrated label ranking (CLR) algorithm uses pairwise label associations to transform the problem and predict results. Its calibration is achieved by comparison on the basis of binary relevance (BR), so its predictions depend to some extent on the results of BR, which limits its performance on some datasets. To better distinguish between relevant and irrelevant labels, a method is presented for calibrating the label boundary region: labels on the boundary between the relevant and the irrelevant labels are further corrected using Bayesian probability, thereby improving classification accuracy in the boundary region. The proposed CLR method based on naive Bayes (NBCLRM) is compared with seven traditional methods, including calibrated label ranking. Experimental results show that the proposed algorithm can not only adjust prediction results by modifying the thresholds ε and μ, but also effectively improve the performance of traditional multi-label learning methods.

Key words: data mining, Naive Bayes, calibrated label ranking, multi-label learning algorithm

CLC number: