Journal of University of Science and Technology of China

• Original Paper

Community detection based on spectral clustering with node attributes

TANG Fengqin (1,2), DING Wenwen

  1. School of Mathematical Sciences, Huaibei Normal University, Huaibei 235000, China
  2. School of Mathematics and Statistics, Lanzhou University, Lanzhou 730000, China
  • Received: 2017-10-28 Revised: 2018-01-03 Online: 2018-02-28 Published: 2018-02-28
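  • Contact: TANG Fengqin
  • About author: TANG Fengqin (corresponding author), female, born in 1983, PhD candidate/lecturer. Research field: statistical machine learning. E-mail: tfq05@163.com
  • Supported by:
    the National Natural Science Foundation of China (11301236), the Natural Science Foundation of Anhui Province (1608085QG169), and the Key Natural Science Research Project of Anhui Provincial Universities (KJ2017A377, KJ2017A376).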

Abstract: A community detection algorithm based on spectral clustering (SCSA) that combines structural information with node attribute information is proposed. First, SCSA converts the node-attributed graph into a weighted graph in which the edge weights are measured by attribute similarity. Spectral clustering is then applied to the weighted graph. SCSA partitions a network with node attributes into K communities whose nodes are not only well connected but also have similar attributes. Not all attributes are useful in the clustering process, and irrelevant attributes add noise that can lower the accuracy of community detection. To address this issue, an attribute weight self-adjustment mechanism is embedded into spectral clustering to improve the quality of community detection. Numerical experiments demonstrate the effectiveness of the proposed algorithm.

Key words: community detection, spectral clustering, stochastic block model, normalized mutual information
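
The two-step pipeline described in the abstract (reweight edges by attribute similarity, then run spectral clustering on the weighted graph) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the Gaussian similarity kernel, the `bandwidth` parameter, the normalized variant of spectral clustering, the small k-means helper, and all function names are assumptions, since the abstract does not specify them. The attribute weight self-adjustment mechanism is not included.

```python
import numpy as np


def attribute_weighted_adjacency(adjacency, attributes, bandwidth=1.0):
    """Reweight each existing edge by an attribute similarity.

    Assumption: a Gaussian kernel on Euclidean distance; the paper's
    actual similarity measure is not given in the abstract.
    """
    sq_norms = np.sum(attributes ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * attributes @ attributes.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    similarity = np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    return adjacency * similarity  # non-edges keep weight 0


def spectral_embedding(weights, k):
    """Rows of the top-k eigenvectors of D^{-1/2} W D^{-1/2}, scaled to unit length."""
    degrees = weights.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(degrees, 1e-12))
    normalized = inv_sqrt[:, None] * weights * inv_sqrt[None, :]
    _, eigenvectors = np.linalg.eigh(normalized)  # eigenvalues in ascending order
    embedding = eigenvectors[:, -k:]
    row_norms = np.linalg.norm(embedding, axis=1, keepdims=True)
    return embedding / np.maximum(row_norms, 1e-12)


def kmeans(points, k, n_iter=100):
    """Plain Lloyd iterations with a deterministic farthest-point start."""
    centers = [points[0]]
    for _ in range(1, k):
        dists = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(dists)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(dists, axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def detect_communities(adjacency, attributes, k, bandwidth=1.0):
    """Step 1: build the weighted graph. Step 2: cluster its spectral embedding."""
    weights = attribute_weighted_adjacency(adjacency, attributes, bandwidth)
    return kmeans(spectral_embedding(weights, k), k)


# Toy network: two 4-node cliques joined by one bridge edge (3, 4),
# with a single node attribute that differs between the two groups.
A = np.zeros((8, 8))
for group in (range(0, 4), range(4, 8)):
    for i in group:
        for j in group:
            if i != j:
                A[i, j] = 1.0
A[3, 4] = A[4, 3] = 1.0
X = np.array([[0.0], [0.1], [0.2], [0.1], [3.0], [3.1], [2.9], [3.0]])

labels = detect_communities(A, X, k=2)
```

In this toy case the bridge edge joins nodes with dissimilar attributes, so its weight becomes small and the two cliques are recovered as separate communities. With real data, the kernel and its bandwidth would need to be chosen to match the attribute types.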