Journal of University of Science and Technology of China ›› 2017, Vol. 47 ›› Issue (10): 799-807. DOI: 10.3969/j.issn.0253-2778.2017.10.001

• Original Article •

Ensemble max-pooling: Is only the maximum activation useful when pooling?

ZHANG Hao, WU Jianxin

  1. National Key Laboratory for Novel Software Technology, Nanjing 210023, China;
  2. Department of Computer Science and Technology, Nanjing University, Nanjing 210023, China

  • Received: 2017-05-22  Revised: 2017-06-24  Online: 2017-10-31  Published: 2017-10-31
  • Corresponding author: WU Jianxin
  • About the first author: ZHANG Hao, male, born in 1990, PhD candidate. Research interests: machine learning and computer vision. E-mail: zhangh@lamda.nju.edu.cn


Abstract: The pooling layer in a convolutional neural network performs subsampling based on the principle of local correlation, reducing the amount of data while retaining useful information, which helps improve generalization; at the same time, pooling effectively enlarges the receptive field. Classical max-pooling adopts a winner-take-all strategy, which can sometimes harm the generalization ability of the network. To address this, a simple and effective pooling method named ensemble max-pooling is proposed as a replacement for the pooling layers in conventional convolutional neural networks. In each local pooling region, ensemble max-pooling drops the neuron with the maximum activation with probability p and instead outputs the neuron with the second-largest activation. Ensemble max-pooling can be viewed as an ensemble of many underlying base networks, or equivalently as classical max-pooling applied to an input that has undergone certain local distortions. Experimental results show that ensemble max-pooling achieves better performance than classical pooling and other related pooling methods. DFN-MR is a derivative of ResNet, a recent mainstream architecture; compared with ResNet, DFN-MR contains more underlying base networks while avoiding extremely deep networks. With all other hyperparameters unchanged, replacing each stride-2 convolutional layer in DFN-MR with an ensemble max-pooling layer followed by a stride-1 convolutional layer significantly improves performance.
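The pooling rule described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `ensemble_max_pool2d`, its arguments, and the fallback to plain max-pooling at inference time (the abstract does not specify test-time behavior) are all assumptions made here.

```python
import numpy as np

def ensemble_max_pool2d(x, p, k=2, training=True, rng=None):
    """Ensemble max-pooling over non-overlapping k x k regions.

    x: array of shape (C, H, W), with H and W divisible by k.
    With probability p, the maximum activation in a region is dropped
    and the second-largest activation is output instead; otherwise the
    region behaves like classical max-pooling.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    C, H, W = x.shape
    # Split the feature map into (C, H//k, W//k, k*k) pooling regions.
    r = x.reshape(C, H // k, k, W // k, k).transpose(0, 1, 3, 2, 4)
    r = r.reshape(C, H // k, W // k, k * k)
    s = np.sort(r, axis=-1)
    top1, top2 = s[..., -1], s[..., -2]  # largest / second-largest per region
    if not training:
        return top1  # assumption: plain max-pooling at inference time
    drop = rng.random(top1.shape) < p    # drop the max with probability p
    return np.where(drop, top2, top1)
```

With p = 0 this reduces to classical max-pooling, and with p = 1 every region outputs its second-largest activation, which matches the two extremes of the description above.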

Key words: convolutional neural network, pooling layer, network ensemble, data augmentation
