Abstract
Retrieving relevant legal documents quickly from very large datasets is essential for the proper functioning of justice and legislative institutions. Nevertheless, the legacy systems currently used by these institutions in Brazil are usually outdated and require a great deal of manual work. Legal Information Retrieval focuses on building new methods to handle the large volume of legal texts and to retrieve relevant information from them. Relevance Feedback, an important aspect of information retrieval systems, uses information given by the user to improve document retrieval for a specific request. However, extending its use to other queries is a difficult task. One possible approach is to use Relevance Feedback information from past, similar queries. In this paper, we propose Ulysses-RFSQ, a method based on this approach. It gives a bonus to documents marked as relevant for similar queries and uses this bonus to update the ranking produced by a relevance-score-based Information Retrieval algorithm, which measures the similarity between the query text and the documents to be retrieved. Due to the lack of available datasets containing relevance information for similar queries, we used a corpus of legislative requests from the Brazilian Chamber of Deputies; because these requests are in most cases redundant, the corpus allows the proposed method to be assessed. According to the experimental results, adding the Relevance Feedback bonus to the document scores improved the Recall@20 of a BM25 algorithm by almost 3% on the legal dataset used.
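The re-ranking idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the bonus formula, so everything specific here is an assumption made for illustration: the additive bonus proportional to query similarity, the `similarity_threshold` and `bonus_weight` parameters, and the function names `rerank_with_feedback_bonus` and `recall_at_k` are hypothetical and are not taken from Ulysses-RFSQ itself. The base scores stand in for the output of any relevance-score-based ranker such as BM25.

```python
from typing import Dict, List, Set, Tuple


def rerank_with_feedback_bonus(
    base_scores: Dict[str, float],
    past_feedback: List[Tuple[float, Set[str]]],
    similarity_threshold: float = 0.5,
    bonus_weight: float = 1.0,
) -> List[str]:
    """Re-rank documents by adding a bonus for past relevance feedback.

    base_scores: document id -> score from a base ranker (e.g. BM25).
    past_feedback: one (similarity, relevant_doc_ids) pair per past query,
        where similarity compares that past query to the current one.
    The additive, similarity-weighted bonus is an assumption of this sketch.
    """
    bonus = {doc_id: 0.0 for doc_id in base_scores}
    for similarity, relevant_docs in past_feedback:
        # Ignore past queries that are not similar enough to the current one.
        if similarity < similarity_threshold:
            continue
        for doc_id in relevant_docs:
            if doc_id in bonus:
                bonus[doc_id] += bonus_weight * similarity

    final = {doc_id: base_scores[doc_id] + bonus[doc_id] for doc_id in base_scores}
    # Sort by descending final score; break ties by document id for determinism.
    return sorted(final, key=lambda doc_id: (-final[doc_id], doc_id))


def recall_at_k(ranking: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in ranking[:k] if doc_id in relevant)
    return hits / len(relevant)


# Toy data (invented for illustration only).
base_scores = {"d1": 3.0, "d2": 2.5, "d3": 1.0, "d4": 0.5}
past_feedback = [(0.9, {"d3"}), (0.2, {"d4"})]  # second query is too dissimilar
relevant_now = {"d1", "d3"}

baseline = sorted(base_scores, key=lambda d: (-base_scores[d], d))
reranked = rerank_with_feedback_bonus(base_scores, past_feedback, bonus_weight=2.0)

print("baseline Recall@2:", recall_at_k(baseline, relevant_now, 2))
print("reranked Recall@2:", recall_at_k(reranked, relevant_now, 2))
```

In this toy case the document marked relevant for a highly similar past query moves up the ranking, while feedback from the dissimilar query is ignored. How the real method measures query similarity and sizes the bonus is not stated in the abstract.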
Notes
- 5. Legislative Consulting Job Request and Monitoring System - SisConle.
- 6. Legislative Information System - SiLeg.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Vitório, D., Souza, E., Martins, L., da Silva, N.F.F., de Leon Ferreira de Carvalho, A.C.P., Oliveira, A.L.I. (2022). Ulysses-RFSQ: A Novel Method to Improve Legal Information Retrieval Based on Relevance Feedback. In: Xavier-Junior, J.C., Rios, R.A. (eds) Intelligent Systems. BRACIS 2022. Lecture Notes in Computer Science, vol 13653. Springer, Cham. https://doi.org/10.1007/978-3-031-21686-2_6
DOI: https://doi.org/10.1007/978-3-031-21686-2_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-21685-5
Online ISBN: 978-3-031-21686-2
eBook Packages: Computer Science (R0)