Sentiment classification of Chinese cosmetic reviews based on integration of collocations and concepts
ISSN: 0264-0473
Article publication date: 16 December 2019
Issue publication date: 19 March 2020
Abstract
Purpose
This paper proposes a novel approach that integrates collocations and domain concepts for sentiment classification of Chinese cosmetic word of mouth (WOM). Most sentiment analysis approaches aggregate sentiment scores from each unigram or bigram. However, not every unigram or bigram in a WOM document carries sentiment, whereas Chinese collocations carry the main sentiments of WOM. By working with collocations, this paper reduces document dimensionality and improves sentiment classification.
Design/methodology/approach
This paper builds two contextual lexicons, one for feature words and one for sentiment words. Based on these contextual lexicons, this paper uses association rules and mutual information to build candidate Chinese collocation sets. It then applies preference vector modelling as the vector representation approach to capture the relationship between Chinese collocations and their associated concepts.
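As a rough illustration of the collocation-building step, the sketch below scores feature-word/sentiment-word pairs by pointwise mutual information over document-level co-occurrence, after discarding pairs below a minimum support count (a simple stand-in for an association-rule support threshold). It is not the authors' implementation: the function name `collocation_pmi`, the `min_support` parameter, the toy reviews and the two small lexicons are all assumptions made for this example.

```python
import math
from collections import Counter
from itertools import product


def collocation_pmi(reviews, feature_words, sentiment_words, min_support=2):
    """Score (feature, sentiment) pairs by PMI over document co-occurrence.

    reviews: list of token lists, one per review (already word-segmented).
    Pairs that co-occur in fewer than min_support reviews are dropped.
    """
    n = len(reviews)
    word_count = Counter()  # number of reviews containing each lexicon word
    pair_count = Counter()  # number of reviews containing both words of a pair
    for tokens in reviews:
        present = set(tokens)
        feats = present & feature_words
        sents = present & sentiment_words
        word_count.update(feats | sents)
        pair_count.update(product(feats, sents))

    scores = {}
    for (feat, sent), count in pair_count.items():
        if count < min_support:
            continue
        p_pair = count / n
        p_feat = word_count[feat] / n
        p_sent = word_count[sent] / n
        scores[(feat, sent)] = math.log2(p_pair / (p_feat * p_sent))
    return scores


# Hypothetical toy data: 皮肤 = skin, 香味 = scent, 滋润 = moisturising, 刺鼻 = pungent
reviews = [
    ["皮肤", "滋润"],
    ["皮肤", "滋润", "香味"],
    ["香味", "刺鼻"],
    ["香味", "刺鼻", "皮肤"],
]
feature_words = {"皮肤", "香味"}
sentiment_words = {"滋润", "刺鼻"}

scores = collocation_pmi(reviews, feature_words, sentiment_words)
for pair, pmi in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(pmi, 3))
```

With this toy data only the two pairs that co-occur in at least two reviews survive the support filter. A real pipeline would also need Chinese word segmentation and possibly a co-occurrence window narrower than the whole review.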
Findings
This paper compares the proposed preference vector models with benchmarks, using three classification techniques (i.e. support vector machine, J48 decision tree and multilayer perceptron). According to the experimental results, the proposed models outperform all benchmarks in terms of accuracy.
Originality/value
This paper focuses on Chinese collocations and proposes a novel research approach for sentiment classification. The Chinese collocations used in this paper are adaptable to the content and domain. Finally, this paper integrates collocations with the preference vector modelling approach, which not only achieves better sentiment classification performance for Chinese WOM documents but also avoids the curse of dimensionality.
Acknowledgements
This work was supported in part by the Ministry of Science and Technology of Taiwan under Grants MOST 106-2410-H-033-014-MY2 and 104-2410-H-033-039-MY2.
Citation
Hung, C. and Cao, Y.-X. (2020), "Sentiment classification of Chinese cosmetic reviews based on integration of collocations and concepts", The Electronic Library, Vol. 38 No. 1, pp. 155-169. https://doi.org/10.1108/EL-04-2019-0093
Publisher
Emerald Publishing Limited
Copyright © 2019, Emerald Publishing Limited