Journal of University of Science and Technology of China ›› 2019, Vol. 49 ›› Issue (2): 105-111. DOI: 10.3969/j.issn.0253-2778.2019.02.004

• Original Paper •

  • Corresponding author: PENG Yuqing
  • About the author: PENG Yuqing (corresponding author), female, PhD, professor. Research interests: data mining, image processing, and multimodal information fusion. E-mail: pengyuqing@scse.hebut.edu.cn
  • Supported by: the Youth Fund of the Hebei Provincial Department of Education (QN2017314), the Key Program of the Natural Science Foundation of Hebei Province (F2016202144), and the General Program of the Natural Science Foundation of Hebei Province (F2017202145).

Facial expression recognition based on fusion of deep learning and Dense SIFT

PENG Yuqing   

  1. School of Computer Science and Software, Hebei University of Technology, Tianjin 300400, China
  • Received:2019-06-15 Revised:2019-09-18 Online:2019-02-28 Published:2019-02-28

Abstract: To achieve accurate and efficient facial expression recognition, a hybrid model is proposed that fuses a convolutional neural network with Dense SIFT features. The network used in the hybrid model improves on MobileNet, a depthwise-separable convolutional neural network. Building on the separation of channel-wise convolution (depthwise convolution) and spatial convolution (pointwise convolution), multi-scale kernels are applied in the pointwise part of the MobileNet structure, ensuring that rich, fine-grained features are extracted and making the network better suited to facial expression features; ideas from the DenseNet structure are also introduced to improve network performance. Exploiting the rich descriptive power of Dense SIFT's 128-dimensional descriptors, these features are fused with the improved MobileNet at the fully connected layer: an Eltwise layer compares the elements of the fully connected layers and takes the element-wise maximum, ensuring that the fused features are diverse and more representative. On the FER2013 and JAFFE facial expression datasets, the hybrid model achieves recognition rates of 73.2% and 96.5%, respectively.

Key words: hybrid model, MobileNet, depthwise separable, multi-scale convolution, Dense SIFT

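The Eltwise MAX fusion step described in the abstract — comparing the CNN branch and the Dense SIFT branch element by element at the fully connected layer and keeping the maximum — can be sketched in a few lines. This is a minimal NumPy illustration, not the paper's implementation: the function name `eltwise_max` and the toy 128-dimensional branch outputs are assumptions for the example (the Dense SIFT descriptor is 128-d; the CNN branch is assumed to have been projected to the same width before fusion).

```python
import numpy as np

def eltwise_max(cnn_fc, sift_fc):
    """Eltwise MAX fusion: element-wise maximum of two equal-length
    fully-connected feature vectors, keeping the stronger response
    from either branch at each position."""
    cnn_fc = np.asarray(cnn_fc, dtype=float)
    sift_fc = np.asarray(sift_fc, dtype=float)
    if cnn_fc.shape != sift_fc.shape:
        raise ValueError("branches must be projected to the same length before fusion")
    return np.maximum(cnn_fc, sift_fc)

# Toy 128-dimensional branch outputs for illustration.
rng = np.random.default_rng(0)
cnn_branch = rng.standard_normal(128)
sift_branch = rng.standard_normal(128)
fused = eltwise_max(cnn_branch, sift_branch)
```

Because MAX is taken per element rather than concatenating the branches, the fused vector stays the same width as each branch while still reflecting whichever branch responded more strongly at each position.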