Journal of University of Science and Technology of China ›› 2016, Vol. 46 ›› Issue (3): 247-252. DOI: 10.3969/j.issn.0253-2778.2016.03.010

• Original Paper •

A taxi anomalous trajectory detection algorithm based on offline mining of trajectory big data and online real-time monitoring

韩博洋,汪兆洋,金蓓弘   

  1. 中国科学院软件所软件工程技术研究开发中心,北京 100190
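  • Received: 2015-09-12 Revised: 2015-12-29 Accepted: 2015-12-29 Online: 2015-12-29 Published: 2015-12-29
  • Corresponding author: JIN Beihong
  • About the author: HAN Boyang, male, born in 1993, master's student. Research field: data mining. E-mail: hanboyang15@otcaix.iscas.ac.cn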

An anomaly detection algorithm for taxis based on trajectory data mining and online real-time monitoring

HAN Boyang, WANG Zhaoyang, JIN Beihong   

  1. Research and Development Center of Software Engineering and Technology, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2015-09-12 Revised: 2015-12-29 Accepted: 2015-12-29 Online: 2015-12-29 Published: 2015-12-29

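Abstract (Chinese version): Taking the prevention of fraudulent taxi detours as an example, this paper proposes an anomalous trajectory detection algorithm that combines offline mining of taxi GPS spatio-temporal trajectory data with online real-time detection, so that detection results are fed back quickly. First, the road network map is partitioned into numbered grid cells, the commonly used representation of a trajectory as a sequence of GPS points is optimized with the Pathlet method, and each trajectory is converted into a Pathlet sequence through processing such as matching and completion. Then, Pathlet sequences are obtained from a large volume of historical taxi data and clustered to yield K classes of normal trajectories between each origin and destination. When a real-time trajectory is to be checked, it is matched against the K classes of normal trajectories: only the edit distance between two Pathlet sequences needs to be computed, and reasonable thresholds, set with both the temporal and spatial dimensions in mind, decide whether an anomaly is raised. Finally, extensive experiments on a real data set of taxi GPS trajectories in the Beijing area from March to May 2011, together with comparisons against related work, confirm the effectiveness and efficiency of the proposed algorithm.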
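
The online matching step can be illustrated with a minimal sketch. It assumes that a trajectory is already a list of integer Pathlet IDs, that the edit distance is the standard Levenshtein distance, and that the spatial and temporal checks are combined with a simple OR. The function names, the normalization by sequence length, and the threshold values `dist_threshold` and `time_ratio` are hypothetical; the abstract does not state how the paper scores or combines them.

```python
from typing import List, Sequence


def edit_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Levenshtein distance between two sequences of Pathlet IDs."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute x by y
        prev = curr
    return prev[-1]


def is_anomalous(route: Sequence[int],
                 normal_routes: List[Sequence[int]],
                 elapsed_s: float,
                 typical_s: float,
                 dist_threshold: float = 0.5,   # assumed value
                 time_ratio: float = 1.5) -> bool:  # assumed value
    """Flag a route that is far from every normal route or far too slow."""
    # Spatial check: smallest length-normalized edit distance to any
    # of the K normal routes for this origin-destination pair.
    best = min(edit_distance(route, n) / max(len(route), len(n), 1)
               for n in normal_routes)
    # Temporal check: elapsed time relative to a typical travel time.
    return best > dist_threshold or elapsed_s > time_ratio * typical_s
```

For example, `is_anomalous([1, 7, 8, 9, 3], [[1, 2, 3]], 600, 600)` compares a five-Pathlet route with one normal three-Pathlet route; the edit distance is 3, the normalized score is 0.6, and the route is flagged under the assumed threshold of 0.5.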

Keywords (Chinese version): GPS trajectory, anomalous trajectory detection, Pathlet method, spatio-temporal data mining

Abstract: Taking the prevention of taxi fraud as a motivating example, an anomalous spatio-temporal trajectory detection method that combines offline mining and online detection is proposed. The city road map is partitioned into a grid by longitude and latitude, and taxi trajectories are expressed as Pathlet sequences instead of the traditional GPS point sequences. Then, for each origin-destination pair, K classes of normal sequences are obtained by clustering historical data. Incoming online GPS data are transformed into Pathlet sequences and matched against the K classes of normal sequences; the distance between sequences is computed and scored. This distance, together with spatial and temporal factors, forms the criterion for determining whether a taxi trajectory is anomalous. Finally, experiments on real taxi GPS data collected in the Beijing area from March to May 2011 indicate that the proposed method detects anomalous trajectories online effectively and efficiently.
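
The grid partitioning step can be illustrated with a minimal sketch. It assumes a rectangular bounding box and a fixed cell size in degrees; the bounds, the cell size, the grid dimensions, and the function names `cell_id` and `to_cell_sequence` are illustrative choices, not values from the paper. The sketch covers only grid numbering of GPS points, not the Pathlet construction itself.

```python
import math
from typing import Iterable, List, Optional, Tuple

# Assumed grid: origin at the south-west corner, row-major numbering.
LON_MIN, LAT_MIN = 116.0, 39.6   # illustrative bounds, not from the paper
CELL_DEG = 0.01                  # illustrative cell size in degrees
N_COLS, N_ROWS = 80, 60          # illustrative grid dimensions


def cell_id(lon: float, lat: float) -> Optional[int]:
    """Return the number of the grid cell containing a point, or None
    if the point lies outside the grid."""
    col = math.floor((lon - LON_MIN) / CELL_DEG)
    row = math.floor((lat - LAT_MIN) / CELL_DEG)
    if 0 <= col < N_COLS and 0 <= row < N_ROWS:
        return row * N_COLS + col
    return None


def to_cell_sequence(points: Iterable[Tuple[float, float]]) -> List[int]:
    """Map (lon, lat) points to cell numbers, dropping points outside
    the grid and collapsing consecutive repeats of the same cell."""
    seq: List[int] = []
    for lon, lat in points:
        cid = cell_id(lon, lat)
        if cid is not None and (not seq or seq[-1] != cid):
            seq.append(cid)
    return seq
```

Collapsing consecutive repeats keeps the sequence length tied to the route taken rather than to the GPS sampling rate, which is one plausible reason to prefer a cell-level representation over raw GPS points.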

Key words: GPS trajectory, anomalous trajectory detection, pathlet method, spatio-temporal data mining
