Journal of University of Science and Technology of China ›› 2018, Vol. 48 ›› Issue (6): 477-485. DOI: 10.3969/j.issn.0253-2778.2018.06.006

• Original Paper •

Split and merge algorithm for Gaussian mixture model based on KS test

JIANG Shuoran, CHEN Yarui, QIN Zhifei, YANG Jucheng   

  1. Tianjin University of Science & Technology, College of Computer Science and Information Engineering, Tianjin 300457
  • Received: 2017-09-20 Revised: 2018-04-10 Accepted: 2018-04-10 Online: 2018-06-30 Published: 2018-04-10

Abstract: A Gaussian mixture model is a linear combination of a finite number of independent Gaussian models. Estimating the number of components is an important research problem. One class of algorithms, based on the minimum description length, determines the number of components by splitting and merging components during the iterations. Traditional algorithms use entropy ratio, KL divergence, and model similarity as split and merge criteria. However, entropy ratio and KL divergence may cause excessive splitting because they are overly sensitive to sparse or concave models, and model similarity may cause excessive merging because it cannot assess how well the merged model fits a Gaussian. Over the iterations of the algorithm, these excessive split and merge operations may cause oscillations. To address these problems, a split and merge algorithm for Gaussian mixture models based on the KS test is proposed, with entropy ratio and the KS test used as split criteria and model similarity and the KS test as merge criteria. The algorithm is able to prevent excessive splitting and merging, as validated by experiments on seven datasets.
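
As an illustration of the role the KS test plays in the criteria described above, the following is a minimal sketch, not the authors' implementation. It assumes one-dimensional data and covers only the KS part of the criteria (entropy ratio and model similarity are omitted). The function names `ks_statistic`, `looks_gaussian`, `should_split` and `can_merge` are hypothetical, and the threshold `1.36 / sqrt(n)` is the usual asymptotic 5% critical value for a fully specified null distribution, which is conservative here because the mean and standard deviation are estimated from the same sample.

```python
import math
import random


def normal_cdf(x, mu, sigma):
    """CDF of the normal distribution N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic(sample):
    """KS distance between the sample and a Gaussian fitted to that sample."""
    n = len(sample)
    mu = sum(sample) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in sample) / (n - 1))
    d = 0.0
    for i, x in enumerate(sorted(sample)):
        f = normal_cdf(x, mu, sigma)
        # Compare the fitted CDF with the empirical CDF just before and at x.
        d = max(d, f - i / n, (i + 1) / n - f)
    return d


def looks_gaussian(sample, coeff=1.36):
    """Assumed decision rule: accept when the KS distance < coeff / sqrt(n)."""
    return ks_statistic(sample) < coeff / math.sqrt(len(sample))


def should_split(component_points):
    """Hypothetical split check: split a component whose data fail the KS test."""
    return not looks_gaussian(component_points)


def can_merge(points_a, points_b):
    """Hypothetical merge check: merge only if the pooled data pass the KS test."""
    return looks_gaussian(points_a + points_b)


if __name__ == "__main__":
    rng = random.Random(0)
    left = [rng.gauss(0.0, 1.0) for _ in range(500)]
    right = [rng.gauss(8.0, 1.0) for _ in range(500)]
    print("split the pooled bimodal cluster:", should_split(left + right))
    print("merge the two distant clusters:", can_merge(left, right))
```

In this sketch, a component covering two well-separated clusters fails the test and is flagged for splitting, while merging is allowed only when the pooled points still resemble a single Gaussian; this is the kind of guard against excessive splitting and merging that the abstract attributes to the KS test.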

Key words: Gaussian mixture model, minimum description length, entropy ratio, KS test
