Journal of University of Science and Technology of China, 2017, Vol. 47, Issue (10): 799-807. DOI: 10.3969/j.issn.0253-2778.2017.10.001

• Original Paper •

Ensemble max-pooling: Is only the maximum activation useful when pooling?

ZHANG Hao, WU Jianxin

  1. National Key Laboratory for Novel Software Technology, Nanjing 210023, China;
  2. Department of Computer Science and Technology, Nanjing University, Nanjing 210023, China
  • Received: 2017-05-22  Revised: 2017-06-24  Online: 2017-10-31  Published: 2017-10-31

Abstract: The pooling layer in a convolutional neural network performs subsampling based on the local-correlation principle: it reduces the data size while retaining useful information, which improves generalization and effectively enlarges the receptive field. Classical max-pooling follows a winner-take-all strategy, which can sometimes hurt the generalization of the network. A simple and effective pooling method named ensemble max-pooling is introduced, which can replace the pooling layer in conventional convolutional neural networks. In each pooling region, ensemble max-pooling drops the neuron with the maximum activation with probability p and outputs the neuron with the second-largest activation instead. Ensemble max-pooling can be viewed as an ensemble of many basic underlying networks, and it can also be viewed as classical max-pooling applied to an input with local distortions. The results achieved are better than those of classical max-pooling and other related pooling approaches. DFN-MR is derived from ResNet; compared with ResNet, it contains more basic underlying networks and avoids very deep networks. Keeping the other hyperparameters unchanged and replacing each convolutional layer in DFN-MR with a tandem form, i.e., a combination of an ensemble max-pooling layer and a convolutional layer with stride 1, is shown to deliver significant gains in performance.
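To make the pooling rule concrete, the following is a minimal PyTorch-style sketch of the operation described in the abstract: in every pooling window the maximum activation is dropped with probability p and the second-largest activation is output instead. The function name ensemble_max_pool2d, the choice of F.unfold for extracting windows, and the fallback to plain max-pooling at test time are assumptions for illustration, not details taken from the paper.

import torch
import torch.nn.functional as F

def ensemble_max_pool2d(x, kernel_size=2, stride=2, p=0.5, training=True):
    """Sketch of ensemble max-pooling (hypothetical helper, not the authors' code).

    x: input tensor of shape (N, C, H, W).
    With probability p the maximum in each window is dropped and the
    second-largest activation is returned; otherwise the maximum is kept.
    Test-time behaviour (plain max-pooling) is an assumption.
    """
    n, c, h, w = x.shape
    # Extract pooling windows: (N, C * k * k, L), L = number of windows.
    windows = F.unfold(x, kernel_size=kernel_size, stride=stride)
    windows = windows.view(n, c, kernel_size * kernel_size, -1)

    # Largest and second-largest activation in every window.
    top2, _ = windows.topk(2, dim=2)              # (N, C, 2, L)
    largest, second = top2[:, :, 0, :], top2[:, :, 1, :]

    if training:
        # Bernoulli mask: 1 means "drop the maximum, use the runner-up".
        drop = (torch.rand_like(largest) < p).float()
        out = drop * second + (1.0 - drop) * largest
    else:
        out = largest

    out_h = (h - kernel_size) // stride + 1
    out_w = (w - kernel_size) // stride + 1
    return out.view(n, c, out_h, out_w)

Under the same assumptions, the tandem form mentioned for DFN-MR could be sketched as an ensemble max-pooling step followed by a stride-1 convolution, e.g. applying ensemble_max_pool2d and then torch.nn.Conv2d(in_channels, out_channels, 3, stride=1, padding=1), so that downsampling is performed by the pooling step rather than by a strided convolution.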

Key words: convolutional neural network, pooling layer, network ensemble, data augmentation
