Journal of University of Science and Technology of China ›› 2017, Vol. 47 ›› Issue (7): 575-582. DOI: 10.3969/j.issn.0253-2778.2017.07.005

• Original Paper •

Adaptive ensemble classification algorithm for data streams based on information entropy

SUN Yange, WANG Zhihai, YUAN Jidong, BAI Yang   

  1. School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China;
  2. School of Computer and Information Technology, Xinyang Normal University, Xinyang 464000, China
  • Received: 2016-08-28  Revised: 2016-12-08  Online: 2017-07-31  Published: 2017-07-31

Abstract: Processing streaming data imposes new requirements: limited memory, short processing time, and a single pass over incoming instances. Most approaches to concept drift in the literature address only gradual or abrupt drift and do not handle recurring concepts. Motivated by this challenge, an ensemble with internal change detection is proposed that improves performance by exploiting recurring concepts. It maintains a pool of classifiers, adding and removing classifiers dynamically in response to the change detector. The algorithm uses a two-window change detection model in which the Jensen-Shannon divergence measures the distance between the distributions of two consecutive windows. When a change is detected, a repository of stored historical concepts is checked for reuse. The proposed algorithm is compared experimentally with state-of-the-art algorithms on synthetic and real datasets. The results show that it is suitable for different types of drift as well as for static environments.
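
The two-window check described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the choice of what the distribution is taken over (here, the discrete values observed in each window, such as class labels), and the threshold value of 0.1 are all assumptions made for illustration. The abstract does not specify them.

```python
import math
from collections import Counter


def empirical_distribution(window):
    """Relative frequency of each discrete value in a non-empty window."""
    counts = Counter(window)
    n = len(window)
    return {value: count / n for value, count in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2, so the result lies in [0, 1])
    between two discrete distributions given as {value: probability} dicts."""
    keys = set(p) | set(q)
    # Mixture distribution M = (P + Q) / 2
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        # Kullback-Leibler divergence KL(a || m); terms with a(k) = 0 contribute 0
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def detect_change(previous_window, current_window, threshold=0.1):
    """Flag a change when the divergence between two consecutive windows
    exceeds a threshold (the threshold value is an illustrative assumption)."""
    p = empirical_distribution(previous_window)
    q = empirical_distribution(current_window)
    return js_divergence(p, q) > threshold
```

In this sketch, two windows with the same value frequencies give a divergence of 0 and no change is flagged, while two windows with no values in common give the maximum divergence of 1. How the full algorithm then consults the repository of stored concepts is outside the scope of this example.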

Key words: data streams, concept drift, ensemble classifier, entropy, recurring concepts
