Sense-Aware Semantic Analysis: A Multi-Prototype Word Representation Model Using Wikipedia

Authors

  • Zhaohui Wu The Pennsylvania State University
  • C. Giles The Pennsylvania State University

DOI:

https://doi.org/10.1609/aaai.v29i1.9496

Keywords:

Sense-aware Semantic Analysis, Multi-prototype Word Representation, Wikipedia

Abstract

Human languages are naturally ambiguous, which makes it difficult to automatically understand the semantics of text. Most vector space models (VSMs) treat all occurrences of a word as the same and build a single vector to represent its meaning, which fails to capture ambiguity. We present sense-aware semantic analysis (SaSA), a multi-prototype VSM for word representation based on Wikipedia, which can account for homonymy and polysemy. The "sense-specific" prototypes of a word are produced by clustering Wikipedia pages based on both local and global contexts of the word in Wikipedia. Experimental evaluations on semantic relatedness, for both isolated words and words in sentential contexts, and on word sense induction demonstrate its effectiveness.
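
The multi-prototype idea in the abstract can be illustrated with a minimal sketch. This is not the authors' SaSA implementation: it assumes toy sentences in place of Wikipedia pages, bag-of-words vectors in place of the paper's local and global context features, and a simple k-means-style loop with cosine similarity in place of the paper's clustering procedure. All names (CONTEXTS, cluster_contexts, pick_sense) are hypothetical.

```python
from collections import Counter
import math

# Toy contexts for the ambiguous word "bank" (assumed data, not from the paper).
CONTEXTS = [
    "bank loan interest money account",
    "river bank water shore fishing",
    "bank deposit money interest rate",
    "muddy river bank water flood",
]

def bow(text):
    """Bag-of-words vector: token -> count."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def centroid(vectors):
    """Sum of member vectors; cosine ignores scale, so no averaging is needed."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return total

def cluster_contexts(contexts, seeds, iterations=5):
    """Group contexts into len(seeds) clusters; each centroid is one prototype."""
    vectors = [bow(c) for c in contexts]
    centers = [vectors[i] for i in seeds]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each context to its most similar prototype.
        labels = [
            max(range(len(centers)), key=lambda k: cosine(v, centers[k]))
            for v in vectors
        ]
        # Recompute each prototype from its current members.
        for k in range(len(centers)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:
                centers[k] = centroid(members)
    return labels, centers

def pick_sense(sentence, prototypes):
    """Index of the prototype most similar to the sentence's context."""
    v = bow(sentence)
    return max(range(len(prototypes)), key=lambda k: cosine(v, prototypes[k]))

labels, prototypes = cluster_contexts(CONTEXTS, seeds=[0, 1])
finance_sense = pick_sense("money in the bank", prototypes)
river_sense = pick_sense("water by the river bank", prototypes)
```

Here each cluster centroid plays the role of one sense-specific prototype, and pick_sense selects the prototype that best matches a sentential context before any relatedness score would be computed. A single-prototype model would instead merge all four contexts into one vector.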

Published

2015-02-19

How to Cite

Wu, Z., & Giles, C. (2015). Sense-Aware Semantic Analysis: A Multi-Prototype Word Representation Model Using Wikipedia. Proceedings of the AAAI Conference on Artificial Intelligence, 29(1). https://doi.org/10.1609/aaai.v29i1.9496

Section

Main Track: NLP and Knowledge Representation