Abstract
One of the crowning achievements of Yaacov Choueka’s illustrious career has been his guidance of the Bar-Ilan Responsa project from a fledgling research project to a major enterprise awarded the Israel Prize in 2008. Much of the early work on the Responsa project ultimately proved to be foundational in the now burgeoning area of information retrieval, the science of searching large digitized corpora for information. In this paper, I will very briefly review some of the project’s achievements and discuss some of the directions the project might consider in order to meet ongoing challenges. (For an insider’s detailed review of the project’s achievements and challenges, see Choueka 1990.)
The Responsa project was initiated by Aviezri Fraenkel in 1963, well before massive searchable text corpora became commonplace. To appreciate the challenges faced by researchers involved with the Responsa project in those early days, it is instructive to compare the corpus to the best-known corpus of the time, namely, the Brown corpus developed at Brown University (Kucera & Francis 1967).
References
Adler, M., Elhadad, M.: An Unsupervised Morpheme-Based HMM for Hebrew Morphological Disambiguation. In: ACL 2006 (2006)
Attar, R., Fraenkel, A.S.: Local Feedback in Full-Text Retrieval Systems. J. ACM 24(3), 397–417 (1977)
Attar, R., Choueka, Y., Dershowitz, N., Fraenkel, A.S.: KEDMA - Linguistic Tools for Retrieval Systems. J. ACM 25(1), 52–66 (1978)
Baharad, E., Goldberger, J., Koppel, M., Nitzan, S.: Beyond Condorcet: Optimal Judgment Aggregation Using Voting Records (2008) (submitted for publication)
Bar-Haim, R., Sima’an, K., Winter, Y.: Part-of-Speech Tagging of Modern Hebrew Text. Natural Language Engineering 14(2), 223–251 (2008)
Brin, S., Page, L.: The Anatomy of a Large-Scale Hypertextual Web Search Engine. Computer Networks 30, 107–117 (1998)
Choueka, Y.: Looking for needles in a haystack or locating interesting expressions in large textual databases. In: Proc. of the RIAO International Conference on User-Oriented Content-Based Text and Image Handling, pp. 609–623 (1988)
Choueka, Y.: RESPONSA - A full-text system with linguistic components for large corpora. In: Quemada, B., Zampolli, A. (eds.) Computational Lexicology and Lexicography, pp. 181–217. Giardini Editions, Pisa (1990)
Choueka, Y., Fraenkel, A.S., Klein, S.T.: Compression of Concordances in Full-Text Retrieval Systems. In: SIGIR 1988, pp. 597–612 (1988)
Choueka, Y., Fraenkel, A., Perl, Y.: Polynomial Construction of Optimal Prefix Tables for Text Compression. In: Proc. of 19th Allerton Conference on Communication, Control and Computing, pp. 762–768 (1981)
Choueka, Y., Klein, S.T., Neuwitz, E.: Automatic Retrieval of Frequent Idiomatic and Collocational Expressions in a Large Corpus. ALLC Journal 4, 34–38 (1983)
Choueka, Y., Lusignan, S.: Disambiguation by short context. Computers and the Humanities 19(3), 147–157 (1985)
Choueka, Y., Neeman, Y.: Nakdan-Text. Tel-Aviv, C.E.T. (1995)
Choueka, Y., Shapiro, M.: Machine analysis of Hebrew morphology: potentialities and achievements (Hebrew), Leshonenu (Journal of the Academy of Hebrew Language) 27, 354–372 (1964)
Dagan, I.: Contextual Word Similarity. In: Dale, R., Moisl, H., Somers, H.L. (eds.) Handbook of Natural Language Processing. CRC Press (2000)
Garfield, E.: Citation Analysis as a Tool in Journal Evaluation. Science 178(4060), 471–479 (1972)
HaCohen-Kerner, Y., Kass, A., Peretz, A.: Baseline Methods for Automatic Disambiguation of Abbreviations in Jewish Law Documents. In: Vicedo, J.L., Martínez-Barco, P., Muñoz, R., Saiz Noeda, M. (eds.) EsTAL 2004. LNCS (LNAI), vol. 3230, pp. 58–69. Springer, Heidelberg (2004)
HaCohen-Kerner, Y., Kass, A., Peretz, A.: Combined One Sense Disambiguation of Abbreviations. In: Proc. of ACL (Companion Volume), pp. 61–64 (2008a)
HaCohen-Kerner, Y., Mughaz, D., Beck, H., Elchai, Y.: Words As Classifiers of Documents According to their Historical Period and the Ethnic Origin of their Authors. Cybernetics and Systems 39(3), 213–228 (2008a)
Hanani, S.: Feedback by Local Clustering in a Full-text Online Information Retrieval System. Unpublished M.Sc. Thesis, Bar-Ilan University (1987)
Kleinberg, J.: Authoritative sources in a hyperlinked environment. Journal of the ACM 46(5), 604–632 (1999)
Koppel, M., Mughaz, D., Akiva, N.: New Methods for Attribution of Rabbinic Literature. Hebrew Linguistics: A Journal for Hebrew Descriptive, Computational and Applied Linguistics 57, 5–18 (2006)
Kucera, H., Francis, W.N.: Computational Analysis of Present-day American English. Brown University Press, Providence (1967)
Lin, D.: Automatic Retrieval and Clustering of Similar Words. In: COLING-ACL 1998, pp. 768–774 (1998)
Mughaz, D.: Classification of Hebrew texts according to style. M.Sc. thesis (in Hebrew), Bar-Ilan University, Ramat-Gan, Israel (2003)
Rabinowitz, R.: Performance Improvement of the Information Retrieval Systems Based on Utilization of the References Included in the Retrieved Documents. Unpublished M.Sc. Thesis, Bar-Ilan University (1986)
Salton, G., Wong, A., Yang, C.S.: A Vector Space Model for Automatic Indexing. Commun. ACM 18(11), 613–620 (1975)
Wintner, S.: Hebrew computational linguistics: Past and future. Artificial Intelligence Review 21(2), 113–138 (2004)
Yu, H., Kim, W., Hatzivassiloglou, V., Wilbur, W.J.: Using MEDLINE as a knowledge source for disambiguating abbreviations and acronyms in full-text biomedical journal articles. Journal of Biomedical Informatics 40(2), 150–159 (2007)
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Koppel, M. (2014). The Responsa Project: Some Promising Future Directions. In: Dershowitz, N., Nissan, E. (eds) Language, Culture, Computation. Computing of the Humanities, Law, and Narratives. Lecture Notes in Computer Science, vol 8002. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45324-3_1
DOI: https://doi.org/10.1007/978-3-642-45324-3_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-45323-6
Online ISBN: 978-3-642-45324-3
eBook Packages: Computer Science, Computer Science (R0)