Journal of University of Science and Technology of China, 2020, Vol. 50, Issue (8): 1110-1115. DOI: 10.3969/j.issn.0253-2778.2020.08.010

• Original Article •

Research on outlier detection algorithm of XmR control chart

CHEN Lifang, WANG Rongjie, LIU Yunqing, ZHOU Xu

  1. College of Science, North China University of Science and Technology, Tangshan 063000, China; 2. Hebei Key Laboratory of Data Science and Application, Tangshan 063000, China
  • Received: 2020-04-29 Revised: 2020-08-06 Accepted: 2020-08-06 Online: 2020-08-31 Published: 2020-08-06
  • Corresponding author: ZHOU Xu
  • About the author: CHEN Lifang, female, born in 1973, PhD, professor. Research interests: machine learning, intelligent computing, data mining. E-mail: hblg_clf@163.com
  • Funding:
    Supported by the Natural Science Foundation of Hebei Province (F2014209086).

Abstract: To address the cumbersome computation and long running time of the isolation forest outlier detection method, an outlier detection algorithm based on the XmR control chart is proposed. For a sample attribute, the mean of the individual values, the moving ranges, and the mean of the moving ranges are computed; these give the control limits and centerlines of the X chart and the mR chart, on which the individual attribute values of the samples are also plotted. The sample indices of the points that fall outside the limits in the X chart are combined, by set union, with the indices of the points that fall outside the limits in the mR chart plus 1, and the corresponding samples are deleted from the data. The data with the outliers removed are then fed to the CART, random forest, and support vector machine algorithms for experimental validation. The results show that the proposed method is faster and more accurate than the isolation forest method, which offers a new line of research for outlier detection.

Key words: XmR control chart, outlier detection, control limit, centerline
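The detection step described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The abstract does not state which control-limit constants are used, so the sketch assumes the conventional XmR values (2.66 for the X chart, 3.267 for the upper limit of the mR chart, with a lower mR limit of 0). The function names `xmr_outlier_indices` and `remove_outliers` and the parameters `x_factor` and `mr_factor` are hypothetical, and indices are 0-based.

```python
def xmr_outlier_indices(values, x_factor=2.66, mr_factor=3.267):
    """Return sorted 0-based indices flagged by the X chart or the mR chart.

    x_factor and mr_factor are the conventional XmR constants; they are an
    assumption here, since the abstract does not give the constants used.
    """
    x = [float(v) for v in values]
    if len(x) < 2:
        raise ValueError("need at least two samples to form a moving range")

    # moving_ranges[k] = |x[k+1] - x[k]|, so there are len(x) - 1 of them.
    moving_ranges = [abs(b - a) for a, b in zip(x, x[1:])]

    x_bar = sum(x) / len(x)                            # centerline of the X chart
    mr_bar = sum(moving_ranges) / len(moving_ranges)   # centerline of the mR chart

    x_ucl = x_bar + x_factor * mr_bar   # upper control limit, X chart
    x_lcl = x_bar - x_factor * mr_bar   # lower control limit, X chart
    mr_ucl = mr_factor * mr_bar         # upper control limit, mR chart (lower limit is 0)

    # Samples whose individual value lies outside the X chart limits.
    x_out = {i for i, v in enumerate(x) if v > x_ucl or v < x_lcl}

    # Moving ranges outside the mR limit; "+ 1" maps range k to sample k + 1,
    # following the "index plus 1" rule in the abstract.
    mr_out = {k + 1 for k, r in enumerate(moving_ranges) if r > mr_ucl}

    return sorted(x_out | mr_out)


def remove_outliers(values, **kwargs):
    """Return a copy of values with the flagged samples deleted."""
    flagged = set(xmr_outlier_indices(values, **kwargs))
    return [v for i, v in enumerate(values) if i not in flagged]
```

For a series with a single spike, such as twenty readings near 10 with one value of 25 at index 9, this sketch flags index 9 from the X chart and indices 9 and 10 from the mR chart, because both the jump up to the spike and the jump back down exceed the mR limit. The sample that follows a spike is therefore removed as well, which is a direct consequence of the "plus 1" rule.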

CLC number: