Abstract
This paper explores a simple method for obtaining contextual word representations. Recently, random sentence representations obtained from echo state networks (ESNs) were shown to achieve near state-of-the-art results on several sequence classification tasks. We explore a similar direction for a sequence labeling task, specifically named entity recognition (NER): the idea is to use the reservoir states of an ESN as contextual word embeddings by feeding pre-trained word embeddings into the network. Experimental results show that our approach achieves competitive accuracy and faster training times compared to state-of-the-art methods. In addition, we provide an empirical evaluation of the hyper-parameters that influence this performance.
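The core idea of the abstract can be sketched in a few lines: a reservoir with fixed random weights maps each pre-trained word vector to a hidden state that depends on the preceding words, and those states serve as contextual embeddings. The sketch below is a minimal illustration, not the authors' exact configuration; the reservoir size, spectral radius, and input scaling are assumed values, and a real NER setup would feed the states into a trained classifier.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_reservoir(n_in, n_res, spectral_radius=0.9, input_scale=1.0):
    # Random, untrained weights: the defining property of an ESN.
    W_in = rng.uniform(-input_scale, input_scale, (n_res, n_in))
    W = rng.uniform(-1.0, 1.0, (n_res, n_res))
    # Rescale the recurrent matrix so its spectral radius equals the
    # target value (< 1), a standard echo-state-property heuristic.
    W *= spectral_radius / max(abs(np.linalg.eigvals(W)))
    return W_in, W

def contextual_embeddings(word_vectors, W_in, W):
    # Drive the reservoir with pre-trained word vectors; each hidden
    # state is a context-dependent representation of the current word.
    x = np.zeros(W.shape[0])
    states = []
    for v in word_vectors:
        x = np.tanh(W_in @ v + W @ x)
        states.append(x.copy())
    return np.stack(states)

# Toy usage: a 3-word "sentence" with 50-d embeddings, 200-unit reservoir.
W_in, W = make_reservoir(n_in=50, n_res=200)
sentence = rng.normal(size=(3, 50))
states = contextual_embeddings(sentence, W_in, W)
print(states.shape)  # (3, 200)
```

Because only the readout on top of these states would be trained, training reduces to fitting a simple classifier, which is where the faster training times come from.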
R. Ramamurthy and R. Stenzel—Contributed equally.
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Ramamurthy, R., Stenzel, R., Sifa, R., Ladi, A., Bauckhage, C. (2019). Echo State Networks for Named Entity Recognition. In: Tetko, I., Kůrková, V., Karpov, P., Theis, F. (eds) Artificial Neural Networks and Machine Learning – ICANN 2019: Workshop and Special Sessions. ICANN 2019. Lecture Notes in Computer Science(), vol 11731. Springer, Cham. https://doi.org/10.1007/978-3-030-30493-5_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30492-8
Online ISBN: 978-3-030-30493-5
eBook Packages: Computer Science (R0)