Journal of University of Science and Technology of China ›› 2016, Vol. 46 ›› Issue (1): 36-46. DOI: 10.3969/j.issn.0253-2778.2016.01.006

• Original Paper •

MCDS: Large-scale mobile communication data computation on just a PC

LIU Zhipeng

  1. School of Computer Science and Technology, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
  • Received: 2015-08-27 Revised: 2015-09-29 Accepted: 2015-09-29 Online: 2015-09-29 Published: 2015-09-29
  • About the author: LIU Zhipeng, male, born in 1980, PhD, lecturer. Research field: data mining. E-mail: liuzhipengcs@139.com
  • Supported by:
    the National Natural Science Foundation of China (61473001, 71071045, 71131002) and the Anhui University Youth Scientific Research Fund (33050054).

Abstract: Mobile data is characterized by high volume, variety, velocity, and value. Mobile communication data is an important kind of mobile data. Studying how to store and access it efficiently, and on that basis carrying out mobile data mining research more effectively, is of great practical significance. At present, parallel data mining techniques are widely accepted as the mainstream approach, but they require costly hardware, and the code of parallel algorithms is difficult to debug and optimize. To address this, MCDS (mobile communication data processing system), a system for computing on large-scale mobile communication data on a single PC, is proposed. MCDS is based on GraphChi and improves on it in three respects: the data format, the sharding mechanism, and the mechanism for swapping data shards in and out of memory (the memory replacement algorithm). Experimental results verify the effectiveness of MCDS, which provides a practical and feasible experimental environment for mobile data mining.

Key words: mobile data, mobile data mining, mobile communication data, MCDS
