Robust Evaluation of Language–Brain Encoding Experiments

Beinborn, Lisa; Abnar, Samira; Choenni, Rochelle

doi:10.1007/978-3-031-24337-0_4


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13451)

Included in the following conference series:

International Conference on Computational Linguistics and Intelligent Text Processing


Abstract

Language–brain encoding experiments evaluate the ability of language models to predict brain responses elicited by language stimuli. The evaluation scenarios for this task have not yet been standardized, which makes it difficult to compare and interpret results. We perform a series of evaluation experiments with a consistent encoding setup and compute the results for multiple fMRI datasets. In addition, we test the sensitivity of the evaluation measures to randomized data and analyze the effect of voxel selection methods. Our experimental framework is publicly available to make modelling decisions more transparent and to support reproducibility for future comparisons.

The experiments were conducted in 2018, when all three authors were employed at the Institute of Logic, Language and Computation at the University of Amsterdam. The paper was presented in 2019. Since then, language modeling has progressed immensely. Experimental standards for robust, comparable, and reproducible evaluation, in particular for interpreting language–brain encoding results against reasonable random permutation baselines, still need to be further developed and more widely adopted.
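To make the setup described in the abstract concrete, the following is a minimal sketch of a linear encoding experiment with a randomized baseline. It is not the authors' framework: the ridge regression, the Pearson-based score, the training-pair shuffle used as a baseline, and all function and variable names are assumptions chosen for illustration, and the data are synthetic stand-ins for stimulus representations and voxel responses.

```python
import numpy as np


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression from stimulus features X to voxel responses Y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)


def voxelwise_pearson(Y_true, Y_pred):
    """Pearson correlation between measured and predicted responses, one value per voxel."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    denom = np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0)
    # Guard against constant columns, which would give a zero denominator.
    return (yt * yp).sum(axis=0) / np.where(denom == 0, 1.0, denom)


def encode_and_score(X_train, Y_train, X_test, Y_test, alpha=1.0):
    """Fit the encoding model on the training split and score it on the test split."""
    W = fit_ridge(X_train, Y_train, alpha)
    return voxelwise_pearson(Y_test, X_test @ W)


# Synthetic data (assumption): 20-dimensional stimulus features, 30 voxels.
rng = np.random.default_rng(0)
n_train, n_test, n_features, n_voxels = 200, 50, 20, 30
X = rng.standard_normal((n_train + n_test, n_features))
W_true = rng.standard_normal((n_features, n_voxels))
Y = X @ W_true + rng.standard_normal((n_train + n_test, n_voxels))
X_train, X_test = X[:n_train], X[n_train:]
Y_train, Y_test = Y[:n_train], Y[n_train:]

# Score with the true stimulus-response pairing.
real_scores = encode_and_score(X_train, Y_train, X_test, Y_test)

# Randomized baseline: break the pairing between stimuli and responses in training.
perm = rng.permutation(n_train)
baseline_scores = encode_and_score(X_train[perm], Y_train, X_test, Y_test)

print("mean voxel-wise r (real):    ", real_scores.mean())
print("mean voxel-wise r (shuffled):", baseline_scores.mean())
```

An evaluation measure that is sensitive to the stimulus–response mapping should separate the two conditions clearly; if the shuffled baseline scores close to the real model, the measure is probably not informative.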

Notes

1. The code is available at https://github.com/beinborn/brain-lang.
2. Whether a linear model is a plausible choice is debatable. We use it here for comparison with previous work.
3. https://github.com/allenai/allennlp/blob/master/tutorials/how_to/elmo.md.

Acknowledgements

The work presented here was funded by the Netherlands Organisation for Scientific Research (NWO), through a Gravitation Grant 024.001.006 to the Language in Interaction Consortium.

Author information

Authors and Affiliations

Vrije Universiteit Amsterdam, Amsterdam, Netherlands
Lisa Beinborn
University of Amsterdam, Amsterdam, Netherlands
Samira Abnar & Rochelle Choenni

Corresponding author

Correspondence to Lisa Beinborn.

Editor information

Editors and Affiliations

Instituto Politécnico Nacional, Mexico City, Mexico
Alexander Gelbukh

Appendix

Table 6. Voxel-wise results for cross-validation when taking the sum over voxels. The results are averaged over all folds and all subjects. The results for the random language model are given in parentheses. The results in this table are hard to interpret, and we discourage the use of the sum as an accumulation method.
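One possible reason why summed scores are hard to interpret can be illustrated with a small sketch. The caption does not spell out the reasoning, so this is an assumption: a sum over voxels grows with the number of voxels, whereas a mean does not. The `accumulate` helper and the synthetic per-voxel scores below are hypothetical and are not taken from the paper's code.

```python
import numpy as np


def accumulate(voxel_scores, method="mean"):
    """Collapse per-voxel scores into a single number (hypothetical helper)."""
    scores = np.asarray(voxel_scores, dtype=float)
    if method == "mean":
        return float(scores.mean())
    if method == "sum":
        return float(scores.sum())
    raise ValueError(f"unknown accumulation method: {method}")


# Two synthetic voxel sets with the same per-voxel score distribution
# but different sizes (assumption: 500 vs. 5000 voxels).
rng = np.random.default_rng(0)
small_set = rng.normal(loc=0.10, scale=0.05, size=500)
large_set = rng.normal(loc=0.10, scale=0.05, size=5000)

mean_small = accumulate(small_set, "mean")
mean_large = accumulate(large_set, "mean")
sum_small = accumulate(small_set, "sum")
sum_large = accumulate(large_set, "sum")

# The means stay comparable across voxel counts; the sums do not.
print("mean:", mean_small, mean_large)
print("sum: ", sum_small, sum_large)
```

Under this assumption, a summed score mostly reflects how many voxels were selected, which makes comparisons across subjects or voxel selection methods difficult.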

About this paper

Cite this paper

Beinborn, L., Abnar, S., Choenni, R. (2023). Robust Evaluation of Language–Brain Encoding Experiments. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2019. Lecture Notes in Computer Science, vol 13451. Springer, Cham. https://doi.org/10.1007/978-3-031-24337-0_4

DOI: https://doi.org/10.1007/978-3-031-24337-0_4
Published: 26 February 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-24336-3
Online ISBN: 978-3-031-24337-0
eBook Packages: Computer Science, Computer Science (R0)
