Journal of University of Science and Technology of China ›› 2017, Vol. 47 ›› Issue (1): 63-69. DOI: 10.3969/j.issn.0253-2778.2017.01.009

Improving emotion expression extraction in Chinese microblogs via new words detection

WAN Qi   

  1. College of Computer Science, Sichuan University, Chengdu 610065, China; 2. College of Mathematics, Physics and Information Engineering, Zhejiang Normal University, Jinhua 321004, China
  • Received: 2016-03-01 Revised: 2016-09-17 Online: 2017-01-31 Published: 2017-01-31

Abstract: Emotion expression extraction is an important task in fine-grained sentiment mining. Existing methods are not effective at this task on Chinese microblogs, because microblog text contains many new words and non-standard words. This paper finds that a large number of these new words occur inside the emotion expressions of Chinese microblog text. A combined extraction model based on conditional random fields (CRF) is proposed, which incorporates new word detection into the extraction task to improve on the original approach. The experimental results show that new word detection correlates well with emotion expression extraction from Chinese microblogs, and that the F1 score increases by more than 2% on both the movie-domain and the open-domain Chinese microblog data sets.

Key words: sentiment analysis, new word detection, conditional random field, information extraction
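
As a rough illustration of how a new word signal might be fed into a CRF-style sequence labeler, the sketch below builds per-token feature dictionaries that include a new-word flag. It is not the model from the paper: the feature names, the `new_words` lexicon, the BIO label scheme with the `EMO` tag, and the toy sentence are all assumptions made for this example, and the abstract does not specify how the two tasks are actually combined.

```python
from typing import Dict, List, Set


def token_features(tokens: List[str], i: int, new_words: Set[str]) -> Dict[str, object]:
    """Build features for token i (hypothetical feature set, not the paper's)."""
    tok = tokens[i]
    return {
        "token": tok,
        # Assumed new-word signal: membership in a detected new-word lexicon.
        "is_new_word": tok in new_words,
        "prev_token": tokens[i - 1] if i > 0 else "<BOS>",
        "next_token": tokens[i + 1] if i < len(tokens) - 1 else "<EOS>",
        "prev_is_new_word": i > 0 and tokens[i - 1] in new_words,
    }


def sentence_features(tokens: List[str], new_words: Set[str]) -> List[Dict[str, object]]:
    """Feature dictionaries for every token of one segmented sentence."""
    return [token_features(tokens, i, new_words) for i in range(len(tokens))]


# Toy segmented microblog sentence; "给力" is internet slang used here as the
# assumed output of a new word detector.
tokens = ["这", "电影", "给力"]
new_words = {"给力"}
features = sentence_features(tokens, new_words)

# Assumed BIO labels marking the emotion expression span.
labels = ["O", "O", "B-EMO"]
```

Feature dictionaries paired with label sequences in this shape are what a typical linear-chain CRF toolkit takes as training input, so the new-word flag becomes one more observation the model can weigh when deciding whether a token starts an emotion expression.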