Journal of University of Science and Technology of China ›› 2017, Vol. 47 ›› Issue (4): 311-319. DOI: 10.3969/j.issn.0253-2778.2017.04.005

• Original Paper •

Application of sparse spectral clustering algorithm in high-dimensional data

XU Xueli, ZHAO Xuejing   

  1. School of Mathematics and Statistics, Lanzhou University, Lanzhou 730000, China
  • Received: 2016-08-28 Revised: 2016-12-08 Online: 2017-04-30 Published: 2017-04-30

Abstract: A new sparse spectral clustering algorithm, high-dimensional sparse spectral clustering based on partitioning around medoids (HSSPAM), is proposed. It exploits the sparsity of the similarity matrix in computation and the advantages of the PAM algorithm over K-means. To reduce or even eliminate the impact of the “curse of dimensionality” on high-dimensional data processing, the high correlation filter (HCF) and principal component analysis (PCA) are also investigated as dimension-reduction steps in the algorithm. On real high-dimensional gene data, the proposed method achieves higher precision and more stable clustering results than the comparison algorithms considered in this paper under several clustering evaluation criteria.
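
As an illustration of the partitioning-around-medoids step only, the following is a minimal sketch of PAM on a precomputed dissimilarity matrix, using the classic BUILD and SWAP phases. It is not the authors' HSSPAM implementation: the function names `total_cost` and `pam`, the toy points, and the use of Euclidean distance are assumptions made for this example. In a spectral clustering pipeline, a routine of this kind would presumably be applied to the rows of the spectral embedding in place of K-means.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def total_cost(D, medoids):
    """Sum over all points of the dissimilarity to the nearest medoid."""
    return sum(min(D[i][m] for m in medoids) for i in range(len(D)))


def pam(D, k):
    """Partition around medoids on a precomputed dissimilarity matrix D.

    Returns the medoid indices and, for each point, the position (0..k-1)
    of its nearest medoid. Hypothetical helper written for illustration.
    """
    n = len(D)

    # BUILD phase: greedily add the point that lowers the total cost the most.
    medoids = []
    while len(medoids) < k:
        candidates = [c for c in range(n) if c not in medoids]
        best = min(candidates, key=lambda c: total_cost(D, medoids + [c]))
        medoids.append(best)

    # SWAP phase: replace a medoid by a non-medoid whenever that strictly
    # lowers the total cost; stop when a full pass yields no improvement.
    cost = total_cost(D, medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for c in range(n):
                if c in medoids:
                    continue
                trial = medoids[:pos] + [c] + medoids[pos + 1:]
                trial_cost = total_cost(D, trial)
                if trial_cost < cost:
                    medoids, cost, improved = trial, trial_cost, True

    # Assign every point to its nearest medoid.
    labels = [min(range(k), key=lambda j: D[i][medoids[j]]) for i in range(n)]
    return medoids, labels


# Toy data (an assumption for this sketch): two well-separated groups.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.0), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
D = [[dist(p, q) for q in points] for p in points]
medoids, labels = pam(D, k=2)
```

Because the medoids are always actual data points and the cost uses dissimilarities directly rather than squared distances to a mean, this kind of procedure tends to be less sensitive to outliers than K-means, which is the usual motivation for preferring PAM.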

Key words: clustering of high-dimensional data, sparse spectral clustering algorithm, dimension-reduction technique, block diagonal matrix, clustering evaluation index
