Abstract
In an earlier article published in this journal (“Concept Mover’s Distance”, 2019), we proposed a method for measuring concept engagement in texts that uses word embeddings to find the minimum cost necessary for words in an observed document to “travel” to words in a “pseudo-document” consisting only of words denoting a concept of interest. One potential limitation we noted is that, because words associated with opposing concepts will be located close to one another in the embedding space, documents will likely have similar closeness to starkly opposing concepts (e.g., “life” and “death”). Using aggregate vector differences between antonym pairs to extract a direction in the semantic space pointing toward a pole of the binary opposition (following “The Geometry of Culture,” American Sociological Review, 2019), we illustrate how CMD can be used to measure a document’s engagement with binary concepts.
Notes
Which we could get by subtracting the cosine similarity between “bowling” and “rich” (\(-0.754\)) from the cosine similarity between “bowling” and “poor” (0.962), and dividing by two.
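Written out with the similarities reported here, this gives:

\[
\frac{0.962 - (-0.754)}{2} = \frac{1.716}{2} = 0.858
\]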
See [20, pp. 296–299] and [11] for more detailed discussions of the underlying algorithm. Several teams have developed computationally efficient methods of solving the transportation problem, and our method now incorporates the linear-complexity relaxed word mover’s distance [2], as implemented in the text2vec package [19].
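As a rough illustration of that relaxation, the following is a minimal base-R sketch of the RWMD lower bound described in [11]; it is not the text2vec implementation itself, and it assumes embeddings is a matrix with one unit-normalized row per vocabulary word.

rwmd <- function(doc_a, doc_b, embeddings) {
  # keep only tokens that have a vector in the embedding vocabulary
  doc_a <- doc_a[doc_a %in% rownames(embeddings)]
  doc_b <- doc_b[doc_b %in% rownames(embeddings)]
  # normalized bag-of-words weights
  w_a <- table(doc_a) / length(doc_a)
  w_b <- table(doc_b) / length(doc_b)
  # cosine distances between the two vocabularies (rows are unit length)
  d <- 1 - tcrossprod(embeddings[names(w_a), , drop = FALSE],
                      embeddings[names(w_b), , drop = FALSE])
  # each word ships all of its mass to its nearest counterpart;
  # the larger of the two one-sided relaxations is the tighter lower bound
  max(sum(as.numeric(w_a) * apply(d, 1, min)),
      sum(as.numeric(w_b) * apply(d, 2, min)))
}
# e.g., rwmd(doc_tokens, c("death"), my_embeddings) for a one-word "death" pseudo-document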
References
Arseniev-Koehler, A., & Foster, J. (2020). Machine learning as a model for cultural learning: Teaching an algorithm what it means to be fat. SocArXiv. https://osf.io/preprints/socarxiv/c9yj3/.
Atasu, K., Parnell, T., Dünner, C., Sifalakis, M., Pozidis, H., Vasileiadis, V., et al. (2017). Linear-complexity relaxed word mover's distance with GPU acceleration. In J.-Y. Nie, Z. Obradovic, T. Suzumura, R. Ghosh, R. Nambiar, C. Wang, et al. (Eds.), 2017 IEEE international conference on big data (pp. 889–896). Boston: IEEE.
Bolukbasi, T., Chang, K.-W., Zou, J., Saligrama, V., & Kalai, A. (2016). Quantifying and reducing stereotypes in word embeddings. arXiv. https://arxiv.org/abs/1606.06121.
Bolukbasi, T., Chang, K.-W., Zou, J. Y., Saligrama, V., & Kalai, A. T. (2016). Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, & R. Garnett (Eds.), Advances in neural information processing systems (Vol. 29, pp. 4349–4357). Curran Associates Inc.
Caliskan, A., Bryson, J. J., & Narayanan, A. (2017). Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334), 183–186.
Ethayarajh, K., Duvenaud, D., & Hirst, G. (2019). Understanding undesirable word embedding associations. arXiv. https://arxiv.org/abs/1908.06361.
Garg, N., Schiebinger, L., Jurafsky, D., & Zou, J. (2018). Word embeddings quantify 100 years of gender and ethnic stereotypes. Proceedings of the National Academy of Sciences of the United States of America, 115(16), E3635–E3644.
Goldberg, A. (2011). Mapping shared understandings using relational class analysis: The case of the cultural omnivore reexamined. American Journal of Sociology, 116(5), 1397–1436.
Kassambara, A. (2020). ggpubr: ‘ggplot2’ based publication ready plots. R package version 0.2.5. https://cran.r-project.org/web/packages/ggpubr/ggpubr.pdf. Accessed 11 June 2020.
Kozlowski, A. C., Taddy, M., & Evans, J. A. (2019). The geometry of culture: Analyzing the meanings of class through word embeddings. American Sociological Review, 84(5), 905–949.
Kusner, M., Sun, Y., Kolkin, N., & Weinberger, K. (2015). From word embeddings to document distances. In: International conference on machine learning (pp. 957–966).
Lakoff, G. (2010). Moral politics: How liberals and conservatives think. Chicago: University of Chicago Press.
Larsen, A. B. L., Sønderby, S. K., Larochelle, H., & Winther, O. (2016). Autoencoding beyond pixels using a learned similarity metric. In M. F. Balcan & K. Q. Weinberger (Eds.), Proceedings of the 33rd international conference on machine learning (pp. 1558–1566). New York: ACM.
Makrai, M., Nemeskey, D., & Kornai, A. (2013). Applicative structure in vector space models. In A. Allauzen, H. Larochelle, C. Manning, & R. Socher (Eds.), Proceedings of the workshop on continuous vector space models and their compositionality (pp. 59–63). Sofia, Bulgaria: ACL.
Mikolov, T., Yih, W.-T., & Zweig, G. (2013). Linguistic regularities in continuous space word representations. In Proceedings of the 2013 conference of the North American chapter of the Association for Computational Linguistics: Human language technologies (pp. 746–751). ACL.
Project Gutenberg. (2020). https://www.gutenberg.org/wiki/Main_Page.
Rubner, Y., Tomasi, C., & Guibas, L. J. (1998). A metric for distributions with applications to image databases. In Sixth international conference on computer vision (IEEE Cat. No. 98CH36271) (pp. 59–66). IEEE.
Sahlgren, M. (2008). The distributional hypothesis. Italian Journal of Linguistics, 20(1), 33–53.
Selivanov, D., Bickel, M., & Wang, Q. (2020) text2vec: Modern text mining framework for R. R package version 0.6. https://cran.r-project.org/web/packages/text2vec/text2vec.pdf. Accessed 11 June 2020.
Stoltz, D. S., & Taylor, M. A. (2019). Concept mover’s distance: measuring concept engagement via word embeddings in texts. Journal of Computational Social Science, 2(2), 293–313.
Venables, W. N., & Ripley, B. D. (2002). Modern Applied Statistics with S (4th ed.). New York: Springer. (ISBN 0-387-95457-0).
Wickham, H. (2016). ggplot2: Elegant Graphics for Data Analysis. New York: Springer.
Woolley, J. T., & Peters, G. (2008). The American presidency project. Santa Barbara. Available from: http://www.presidency.ucsb.edu/ws.
Acknowledgements
A replication repository for this paper can be found at: https://github.com/Marshall-Soc/cmd_geometry.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix: Procedures for deriving a semantic direction
Deriving a semantic direction in an embedding space is a specific kind of relation extraction or induction. As such, there are many viable procedures one could use to find the pole of a binary concept in an embedding space. First, the simplest method would involve changing the order of operations used by Kozlowski et al. [10]: average the vectors for the words on each pole and then take the difference between these two averages. Arseniev-Koehler and Foster [1] refer to this method as the “Larsen method” following [13, p. 5]. Kozlowski et al. [10, p. 943 fn8] state that the Larsen method produced “nearly identical results” to theirs.
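For concreteness, a minimal R sketch of this first approach follows; the pole term lists and the pre-loaded matrix my_embeddings (one row per word) are purely illustrative.

get_direction_larsen <- function(pole_a, pole_b, embeddings) {
  # average the vectors anchoring each pole, then take the difference;
  # the resulting vector points toward pole_a
  colMeans(embeddings[pole_a, , drop = FALSE]) -
    colMeans(embeddings[pole_b, , drop = FALSE])
}
rich_poor <- get_direction_larsen(
  pole_a = c("rich", "wealthy", "affluent"),
  pole_b = c("poor", "impoverished", "destitute"),
  embeddings = my_embeddings
)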
Second, Arseniev-Koehler and Foster [1] compare the Larsen method to one used in Bolukbasi et al. [3, pp. 42–43], which entails taking the vector offset of each antonym pair through subtraction and then dividing that offset by its Euclidean norm (see also [6]). Arseniev-Koehler and Foster find that the results are similar, but that the Larsen method is more accurate than this Bolukbasi method.
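A sketch of this second approach, assuming the unit-length pair offsets are then averaged into a single direction (pairs is a two-column character matrix of antonym pairs; both it and my_embeddings are illustrative):

get_direction_norm_offsets <- function(pairs, embeddings) {
  offsets <- t(apply(pairs, 1, function(p) {
    v <- embeddings[p[1], ] - embeddings[p[2], ]  # offset of one antonym pair
    v / sqrt(sum(v^2))                            # divide by its Euclidean norm
  }))
  colMeans(offsets)  # aggregate the normalized offsets across pairs
}
# e.g., pairs <- rbind(c("rich", "poor"), c("wealthy", "impoverished"))
rich_poor <- get_direction_norm_offsets(pairs, my_embeddings)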
Third, Bolukbasi et al. [4] offer an additional method that likewise takes the differences between antonym pairs (they specifically use gendered terms), but then uses principal component analysis to find a suitable aggregate direction from the resulting vector differences.
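Following that description, a sketch of the third approach, with the first principal component of the stacked pair offsets serving as the aggregate direction:

get_direction_pca <- function(pairs, embeddings) {
  offsets <- t(apply(pairs, 1, function(p) embeddings[p[1], ] - embeddings[p[2], ]))
  # first principal component of the (centered) pair offsets
  prcomp(offsets)$rotation[, 1]
}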
Finally, for completeness, there is another procedure, which involves measuring individual target words’ associations with antonym pairs. This procedure does not, however, define a semantic direction against which any word could be compared, and thus cannot be used directly with CMD. Caliskan et al. [5] incorporate this approach into a measure of gender bias in target terms, a technique they refer to as the Word-Embedding Association Test (WEAT). This entails first picking a target term, such as “wrench” or “boat.” Then one takes the mean of this target term’s distances to female-typed words, such as “girl,” “woman,” or “lady.” Next, one takes the mean of this same term’s distances to male-typed words, such as “boy,” “man,” and “gentleman.” Finally, the analyst subtracts the first mean from the second to arrive at a single measure of how strongly associated the target term is with either side of the binary (see also [7]).
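A sketch of this last procedure for a single target term, computed here with cosine similarities between unit-normalized vectors rather than distances (as in Caliskan et al. [5]), so that positive values indicate a more male-typed term:

weat_association <- function(target, male_terms, female_terms, embeddings) {
  # mean cosine similarity of the target to each gendered word set
  sims <- function(words) {
    as.numeric(embeddings[words, , drop = FALSE] %*% embeddings[target, ])
  }
  mean(sims(male_terms)) - mean(sims(female_terms))
}
weat_association(
  target       = "wrench",
  male_terms   = c("boy", "man", "gentleman"),
  female_terms = c("girl", "woman", "lady"),
  embeddings   = my_embeddings
)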
About this article
Cite this article
Taylor, M.A., Stoltz, D.S. Integrating semantic directions with concept mover’s distance to measure binary concept engagement. J Comput Soc Sc 4, 231–242 (2021). https://doi.org/10.1007/s42001-020-00075-8