Abstract
Detecting and extracting features from online reviews is both important and challenging, especially when domain knowledge is not explicitly available. Moreover, opinions about the same feature of a product or service are frequently expressed in different lexical forms. In this paper, we present a novel framework that automatically detects, extracts, and aggregates semantically related features of reviewed products and services. Our model uses sentence-level syntactic and lexical information to detect candidate feature words, and corpus-level co-occurrence statistics to group semantically similar features, which yields high-precision feature detection. This high-precision grouping gives our model a distinct advantage over state-of-the-art approaches such as double propagation: it produces short, succinct sets of features, whereas existing approaches can generate thousands. We evaluate our model on online reviews from two unrelated domains, restaurants and cameras, to verify its domain independence. In these evaluations, our model outperformed existing state-of-the-art probabilistic models.
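To make the grouping idea concrete, the following is a minimal sketch of how candidate feature words might be clustered from corpus-level co-occurrence statistics. It is not the method of the paper: the abstract does not specify the similarity measure or the clustering procedure, so cosine similarity over sentence-level context counts and a simple threshold-based merge are assumptions made here for illustration. The function names (`cooccurrence_vectors`, `cosine`, `group_features`), the `threshold` parameter, and the toy sentences are all hypothetical, and the candidate words are assumed to have been detected already.

```python
from collections import Counter, defaultdict
from itertools import combinations
import math


def cooccurrence_vectors(sentences, candidates):
    """Count, for each candidate word, the other words sharing a sentence with it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for token in tokens:
            if token in candidates:
                for context in tokens:
                    if context != token:
                        vectors[token][context] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_features(sentences, candidates, threshold=0.5):
    """Merge candidates whose context vectors are similar (assumed threshold)."""
    vectors = cooccurrence_vectors(sentences, candidates)
    parent = {c: c for c in candidates}  # union-find forest over candidates

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in combinations(sorted(candidates), 2):
        if cosine(vectors[a], vectors[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for c in candidates:
        groups[find(c)].add(c)
    return sorted(sorted(g) for g in groups.values())


# Toy reviews from two domains (invented for illustration only).
reviews = [
    "the waiter was friendly and attentive",
    "the staff was friendly and attentive",
    "the lens is sharp and fast",
    "the zoom is sharp and fast",
]
print(group_features(reviews, {"waiter", "staff", "lens", "zoom"}))
# prints [['lens', 'zoom'], ['staff', 'waiter']]
```

In this toy corpus, "waiter" and "staff" share identical contexts and are merged, while "waiter" and "lens" share only function words (similarity 0.4) and stay apart. A real system would likely filter stop words and use a more robust association measure.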
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bhattarai, A., Niraula, N., Rus, V., Lin, KI. (2012). A Domain Independent Framework to Extract and Aggregate Analogous Features in Online Reviews. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2012. Lecture Notes in Computer Science, vol 7181. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28604-9_46
DOI: https://doi.org/10.1007/978-3-642-28604-9_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28603-2
Online ISBN: 978-3-642-28604-9
eBook Packages: Computer Science, Computer Science (R0)