Journal of University of Science and Technology of China ›› 2019, Vol. 49 ›› Issue (7): 555-563. DOI: 10.3969/j.issn.0253-2778.2019.07.005

• Original Paper •

AnomalyDetect: An online anomaly detection algorithm based on Euclidean distance

霍文君   

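  1. College of Electronic and Information Engineering, Tongji University, Shanghai 200092; 2. School of Data Science and Engineering, East China Normal University, Shanghai 200062
  • Received: 2018-09-25  Revised: 2018-12-04  Publication date: 2019-07-31  Release date: 2019-07-31
  • Corresponding author: WANG Wei (王伟)
  • About the author: HUO Wenjun (霍文君), female, born in 1994, master's student. Research interest: distributed computing. E-mail: huowj@tongji.edu.cn
  • Funding:
    Supported by the National Natural Science Foundation of China (61672384) and the Fundamental Research Funds for the Central Universities, Tongji University (0800219373).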

AnomalyDetect: An online distance-based anomaly detection algorithm

HUO Wenjun   

  1. Department of Computer Science and Engineering, Tongji University, Shanghai 200092, China; 2. School of Data Science and Engineering, East China Normal University, Shanghai 200062, China
  • Received: 2018-09-25  Revised: 2018-12-04  Online: 2019-07-31  Published: 2019-07-31

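Abstract: Anomaly detection is a key technique in data mining with wide applications in computing and the Internet, including network security, image recognition, and intelligent operations. Intelligent operations in particular has developed rapidly in recent years. Existing anomaly detection algorithms suffer from low accuracy, offline operation, and an inability to update automatically. This paper therefore studies real anomaly detection problems in the intelligent operations setting, with the goal of building a high-accuracy, online, general-purpose anomaly detection algorithm. Building on existing time series anomaly detection algorithms, it proposes a new online anomaly detection algorithm based on Euclidean distance. Experiments on real operations time series data show that the algorithm detects anomalous data in streaming time series quickly and accurately in real time, which verifies its effectiveness.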

Keywords: anomaly detection, time series, online algorithm, Euclidean distance, intelligent operations

Abstract: Anomaly detection is a key challenge in data mining, with a wide range of applications in the Internet field, including network security, image recognition and intelligent operation. Intelligent operation in particular has made great progress in recent years. Existing anomaly detection algorithms suffer from problems such as low accuracy and an inability to update automatically. This paper studies anomaly detection in the context of intelligent operation, where there is a practical need for high-accuracy, online and universal anomaly detection algorithms. Building on existing algorithms, an online distance-based anomaly detection algorithm is proposed. Experiments on the Yahoo Webscope S5 dataset show that the algorithm detects anomalies successfully, and a comparative study with several anomaly detectors verifies the effectiveness of the proposed algorithm.

Key words: anomaly detection, time series, online algorithm, Euclidean distance, intelligent operation