  1. School of Engineering Science, University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2016-08-28 Revised: 2016-12-08 Published: 2017-04-30 Online: 2017-04-30
Bursty topic detection method for microblog based on influence from user behaviors

WAN Yue, SUI Jie   

  1. School of Engineering Science, University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2016-08-28 Revised: 2016-12-08 Online: 2017-04-30 Published: 2017-04-30

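Abstract (translated from Chinese): Because microblog data has temporal characteristics and carries features of users' social network behavior, a momentum signal enhancement model is proposed to detect bursty microblog topics effectively. Traditional models ignore both the changes in microblog data and the influence of users' social behavior, so an influence factor and a hot energy factor are proposed for the first time to correct the momentum model. To obtain the influence factor, the exponentially accumulated effect of the differences between the current data and the data within a given period before the current time point is computed and used as the measure of influence, reflecting the importance of the term frequency in that interval. The influence factor is used to correct the term frequency series, from which the MACD indicator is obtained. Because users' social behavior strongly affects topics, a hot energy factor is then proposed to correct the MACD indicator. When the model satisfies the indicator threshold, the feature word is listed as a bursty feature word. Finally, the K-means clustering algorithm groups and merges the feature words to obtain bursty topics. Experimental results show that the model reaches a precision of 81.82% and performs well.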

Key words (translated from Chinese): bursty topic, signal enhancement, hot energy, influence, momentum, clustering
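
The momentum indicator named in the abstract can be illustrated with a minimal sketch. It computes only a plain MACD-style value (short-span exponential moving average minus long-span exponential moving average) over one word's term frequency series and flags the time slots where that value exceeds a threshold. The spans, the threshold, and the function names `ema`, `macd`, and `bursty_slots` are hypothetical choices for illustration. The influence factor and hot energy factor are not included, because the abstract does not give their formulas.

```python
def ema(series, span):
    """Exponential moving average with smoothing factor 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    out = []
    for x in series:
        # The first value seeds the average; later values are blended in.
        out.append(x if not out else alpha * x + (1 - alpha) * out[-1])
    return out


def macd(series, short_span=3, long_span=6):
    """Short-span EMA minus long-span EMA (spans are assumed, not from the paper)."""
    short = ema(series, short_span)
    long_ = ema(series, long_span)
    return [s - l for s, l in zip(short, long_)]


def bursty_slots(series, threshold, short_span=3, long_span=6):
    """Indices of time slots whose MACD-style value exceeds the threshold."""
    values = macd(series, short_span, long_span)
    return [t for t, m in enumerate(values) if m > threshold]


# Hypothetical term frequency series for one word: flat, then a sudden spike.
frequencies = [2, 2, 2, 2, 2, 20]
print(bursty_slots(frequencies, threshold=1.0))
```

A flat series yields values near zero, while a sudden rise pushes the short-span average above the long-span one, which is the behavior a burst detector of this kind relies on.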

Abstract: Social networks are becoming increasingly popular, letting people post anything at any time. Because of their huge user communities, the volume of social network data grows every day, which makes extracting knowledge from it difficult. Because microblog data has time-related characteristics and social network behavior attributes, a momentum signal enhancement model is proposed to detect bursty microblog topics effectively. An influence factor and a hot energy factor are introduced to improve the momentum model. The influence factor uses the data before the current point but within a given period to calculate the

Key words: bursty topic, momentum, signal enhancement, hot energy, influence
