Journal of University of Science and Technology of China ›› 2017, Vol. 47 ›› Issue (10): 799-807. DOI: 10.3969/j.issn.0253-2778.2017.10.001

• Original Article •

Ensemble max-pooling: Is only the maximum activation useful when pooling?

ZHANG Hao, WU Jianxin

  1. National Key Laboratory for Novel Software Technology, Nanjing 210023, China;
  2. Department of Computer Science and Technology, Nanjing University, Nanjing 210023, China

  • Received: 2017-05-22  Revised: 2017-06-24  Online: 2017-10-31  Published: 2017-10-31
  • Corresponding author: WU Jianxin
  • About the first author: ZHANG Hao, male, born in 1990, PhD candidate. Research interests: machine learning and computer vision. E-mail: zhangh@lamda.nju.edu.cn


Abstract: The pooling layer in a convolutional neural network performs subsampling based on the principle of local correlation, reducing the amount of data while retaining useful information, which helps improve generalization; at the same time, pooling effectively enlarges the receptive field. Classical max-pooling adopts a winner-take-all strategy, which can sometimes harm the generalization ability of the network. To address this, a simple and effective pooling method named ensemble max-pooling is proposed as a replacement for the pooling layers in conventional convolutional neural networks. In each local pooling region, ensemble max-pooling drops the neuron with the maximum activation with probability p and instead outputs the neuron with the second-largest activation. Ensemble max-pooling can be viewed as an ensemble of many underlying base networks, or equivalently as classical max-pooling applied to an input that has undergone certain local distortions. Experimental results show that ensemble max-pooling achieves better performance than classical pooling and other related pooling methods. DFN-MR is a derivative of ResNet, a recent mainstream architecture; compared with ResNet, DFN-MR contains more underlying base networks while avoiding extremely deep networks. With all other hyperparameters unchanged, replacing each stride-2 convolutional layer in DFN-MR with an ensemble max-pooling layer followed by a stride-1 convolutional layer significantly improves performance.
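The pooling rule described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `ensemble_max_pool2d`, its arguments, and the fallback to plain max-pooling at inference time (the abstract does not specify test-time behavior) are all assumptions made here.

```python
import numpy as np

def ensemble_max_pool2d(x, p, k=2, training=True, rng=None):
    """Ensemble max-pooling over non-overlapping k x k regions.

    x: array of shape (C, H, W), with H and W divisible by k.
    With probability p, the maximum activation in a region is dropped
    and the second-largest activation is output instead; otherwise the
    region behaves like classical max-pooling.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    C, H, W = x.shape
    # Split the feature map into (C, H//k, W//k, k*k) pooling regions.
    r = x.reshape(C, H // k, k, W // k, k).transpose(0, 1, 3, 2, 4)
    r = r.reshape(C, H // k, W // k, k * k)
    s = np.sort(r, axis=-1)
    top1, top2 = s[..., -1], s[..., -2]  # largest / second-largest per region
    if not training:
        return top1  # assumption: plain max-pooling at inference time
    drop = rng.random(top1.shape) < p    # drop the max with probability p
    return np.where(drop, top2, top1)
```

With p = 0 this reduces to classical max-pooling, and with p = 1 every region outputs its second-largest activation, which matches the two extremes of the description above.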

Key words: convolutional neural network, pooling layer, network ensemble, data augmentation
