Journal of University of Science and Technology of China ›› 2019, Vol. 49 ›› Issue (1): 8-14. DOI: 10.3969/j.issn.0253-2778.2019.01.002


A multi-domain sentiment classification model based on sample filtering and transfer learning

QU Zhaowei   

  1. School of Computer Science and Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Received: 2018-05-29 Revised: 2018-09-18 Online: 2019-01-31 Published: 2019-01-31

Abstract: Most sentiment classification models are trained and tested on a single dataset. However, model parameters obtained by training on one dataset are not well suited to another dataset, so the model does not generalize. This paper proposes a multi-domain sentiment classification model (MDSC). By combining sample filtering with transfer learning, the trained model can be applied to different datasets in multiple domains, which makes it more widely applicable and extensible. Specifically, a document is first mapped to a domain distribution, which serves as a bridge between domain classification and sentiment classification, and sentiment classification is then performed. To make the model more generic, representative data samples must be selected: MDSC constructs a domain-independent sentiment lexicon and uses it to filter the sentences within each document, yielding a high-quality training dataset. In addition, to improve classification accuracy and reduce training time, parameter-based transfer learning with neural networks is used to obtain the document embeddings used for classification. Extensive experiments on datasets covering 15 different domains show that the proposed model outperforms traditional models when applied to multi-domain datasets.

Key words: sentiment classification, sample filtering, transfer learning, sentiment lexicon, neural network
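
The sample-filtering step described in the abstract, which keeps only those sentences of a document that match a domain-independent sentiment lexicon, might look roughly like the sketch below. This is not the authors' implementation. The tiny lexicon, the naive sentence splitter, and the names `SENTIMENT_LEXICON`, `split_sentences`, `filter_sentences`, and `min_hits` are all assumptions made for illustration, since the abstract does not specify how the lexicon is built or how matching is done.

```python
import re

# Hypothetical, hand-picked lexicon for illustration only; the paper's
# actual domain-independent lexicon is not given in the abstract.
SENTIMENT_LEXICON = {
    "good", "great", "excellent", "love",
    "bad", "poor", "terrible", "hate",
}


def split_sentences(document):
    """Split on '.', '!' and '?' (a naive splitter, assumed for this sketch)."""
    return [s.strip() for s in re.split(r"[.!?]+", document) if s.strip()]


def filter_sentences(document, lexicon=SENTIMENT_LEXICON, min_hits=1):
    """Keep sentences that contain at least `min_hits` lexicon words."""
    kept = []
    for sentence in split_sentences(document):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        hits = sum(1 for token in tokens if token in lexicon)
        if hits >= min_hits:
            kept.append(sentence)
    return kept


doc = (
    "I bought this camera last week. The picture quality is excellent! "
    "Shipping took five days. The battery is terrible."
)
filtered = filter_sentences(doc)
for sentence in filtered:
    print(sentence)
```

In this toy example the two purely factual sentences are dropped and the two opinion-bearing sentences are kept. A real lexicon would presumably be far larger, and the filtered sentences would then feed the embedding and classification stages.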