Journal of University of Science and Technology of China, 2017, Vol. 47, Issue (10): 799-807. DOI: 10.3969/j.issn.0253-2778.2017.10.001

• Original Paper •

Ensemble max-pooling: Is only the maximum activation useful when pooling?

ZHANG Hao, WU Jianxin

  1. National Key Laboratory for Novel Software Technology, Nanjing 210023, China;
  2. Department of Computer Science and Technology, Nanjing University, Nanjing 210023, China
  • Received: 2017-05-22  Revised: 2017-06-24  Online: 2017-10-31  Published: 2017-10-31

Abstract: The pooling layer in a convolutional neural network performs subsampling based on the local-correlation principle: it reduces the data size while retaining useful information, which improves generalization and effectively enlarges the receptive field. Classical max-pooling follows a winner-take-all strategy, which can sometimes hurt the generalization of the network. A simple and effective pooling method named ensemble max-pooling is introduced, which can replace the pooling layer in conventional convolutional neural networks. In each pooling region, ensemble max-pooling drops the neuron with the maximum activation with probability p and outputs the neuron with the second-largest activation instead. Ensemble max-pooling can be viewed as an ensemble of many basic underlying networks, and it can also be viewed as classical max-pooling applied to an input with local distortions. The results achieved are better than those of classical max-pooling and other related pooling approaches. DFN-MR is derived from ResNet; compared with ResNet, it contains more basic underlying networks and avoids very deep networks. Keeping the other hyperparameters unchanged and replacing each convolutional layer in DFN-MR with a tandem form, i.e., a combination of an ensemble max-pooling layer and a convolutional layer with stride 1, is shown to deliver significant gains in performance.
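To make the pooling rule concrete, the following is a minimal PyTorch-style sketch of the operation described in the abstract: in every pooling window the maximum activation is dropped with probability p and the second-largest activation is output instead. The function name ensemble_max_pool2d, the choice of F.unfold for extracting windows, and the fallback to plain max-pooling at test time are assumptions for illustration, not details taken from the paper.

import torch
import torch.nn.functional as F

def ensemble_max_pool2d(x, kernel_size=2, stride=2, p=0.5, training=True):
    """Sketch of ensemble max-pooling (hypothetical helper, not the authors' code).

    x: input tensor of shape (N, C, H, W).
    With probability p the maximum in each window is dropped and the
    second-largest activation is returned; otherwise the maximum is kept.
    Test-time behaviour (plain max-pooling) is an assumption.
    """
    n, c, h, w = x.shape
    # Extract pooling windows: (N, C * k * k, L), L = number of windows.
    windows = F.unfold(x, kernel_size=kernel_size, stride=stride)
    windows = windows.view(n, c, kernel_size * kernel_size, -1)

    # Largest and second-largest activation in every window.
    top2, _ = windows.topk(2, dim=2)              # (N, C, 2, L)
    largest, second = top2[:, :, 0, :], top2[:, :, 1, :]

    if training:
        # Bernoulli mask: 1 means "drop the maximum, use the runner-up".
        drop = (torch.rand_like(largest) < p).float()
        out = drop * second + (1.0 - drop) * largest
    else:
        out = largest

    out_h = (h - kernel_size) // stride + 1
    out_w = (w - kernel_size) // stride + 1
    return out.view(n, c, out_h, out_w)

Under the same assumptions, the tandem form mentioned for DFN-MR could be sketched as an ensemble max-pooling step followed by a stride-1 convolution, e.g. applying ensemble_max_pool2d and then torch.nn.Conv2d(in_channels, out_channels, 3, stride=1, padding=1), so that downsampling is performed by the pooling step rather than by a strided convolution.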

Key words: convolutional neural network, pooling layer, network ensemble, data augmentation
