Journal of University of Science and Technology of China ›› 2020, Vol. 50 ›› Issue (8): 1170-1180. DOI: 10.3969/j.issn.0253-2778.2020.08.018

• Research Articles •

MAEA-DeepLab: A semantic segmentation network with multi-feature attention effective aggregation module

ZHAO Liu, LU Jun, LIU Yang

  1. College of Computer Science and Technology, Heilongjiang University, Harbin 150080, China; 2. Key Laboratory of Database and Parallel Computing of Heilongjiang Province (Heilongjiang University), Harbin 150080, China
  • Received: 2020-07-11  Revised: 2020-08-04  Accepted: 2020-08-04  Online: 2020-08-31  Published: 2020-08-04
  • Corresponding author: LU Jun
  • About the first author: ZHAO Liu, male, born in 1995, master's student. Research interests: deep learning and computer vision. E-mail: 2181411@s.hlju.edu.cn



Abstract: To keep the cost of network training low by greatly reducing computational complexity while maintaining high accuracy, a semantic segmentation network with a multi-feature attention effective aggregation (MAEA) module, MAEA-DeepLab, is proposed. The encoder backbone adopts low-resolution feature maps at a downsampling stride of 16 to obtain high-level features. Through the MAEA module, the decoder makes full use of a spatial attention mechanism over these features to effectively aggregate multiple features and obtain high-resolution features with strong semantic representation, which improves the decoder's ability to recover important detail information and achieves high-precision segmentation. The multiply-adds of MAEA-DeepLab amount to 943.02 B, only 30.9% of the DeepLabV3+ architecture, greatly reducing the computational complexity. Without pre-training on the COCO dataset and using only two RTX 2080 Ti GPUs, the architecture was benchmarked on the test sets of the PASCAL VOC 2012 and Cityscapes datasets, reaching mIoU scores of 87.5% and 79.9%, respectively. The experimental results show that MAEA-DeepLab achieves good semantic segmentation accuracy at low computational cost.
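No code accompanies this page; purely as an illustration of the decoder behaviour the abstract describes (a spatial attention map used to aggregate high-resolution low-level features with upsampled stride-16 high-level features), the following minimal PyTorch sketch shows one way such an aggregation module could look. The class name, channel widths, layer choices, and the attention-weighted fusion rule are assumptions for illustration, not the authors' MAEA implementation.

```python
# Minimal sketch (not the authors' code) of spatial-attention-based
# aggregation of two feature maps in an encoder-decoder segmentation network.
# Channel sizes, layers, and the fusion rule are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialAttentionAggregation(nn.Module):
    def __init__(self, low_ch, high_ch, out_ch):
        super().__init__()
        # project both feature maps to a common channel width
        self.low_proj = nn.Conv2d(low_ch, out_ch, kernel_size=1, bias=False)
        self.high_proj = nn.Conv2d(high_ch, out_ch, kernel_size=1, bias=False)
        # predict a per-pixel attention map from the concatenated features
        self.attn = nn.Sequential(
            nn.Conv2d(2 * out_ch, out_ch, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, low_feat, high_feat):
        # upsample the stride-16 high-level features to the low-level resolution
        high_feat = F.interpolate(high_feat, size=low_feat.shape[-2:],
                                  mode="bilinear", align_corners=False)
        low = self.low_proj(low_feat)
        high = self.high_proj(high_feat)
        a = self.attn(torch.cat([low, high], dim=1))  # (N, 1, H, W) weights
        # spatially weighted fusion: keep low-level detail where attention is high
        return a * low + (1.0 - a) * high

# Toy usage: 1/4-resolution low-level features, 1/16-resolution high-level features.
low = torch.randn(1, 256, 128, 128)
high = torch.randn(1, 2048, 32, 32)
fused = SpatialAttentionAggregation(256, 2048, 256)(low, high)
print(fused.shape)  # torch.Size([1, 256, 128, 128])
```

The per-pixel weighting is one plausible reading of how spatial attention can let a decoder recover fine details at low cost, since only lightweight convolutions are added on top of the existing feature maps.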

Key words: semantic segmentation, encoder-decoder, MAEA-DeepLab, spatial attention
