Abstract
Lexical Answer Type (LAT) prediction is an essential part of question classification. It assigns lexical answer types to a question in order to narrow the search space and improve the classifier's performance. LAT prediction is challenging in the biomedical domain because it is a multi-label classification problem: a question may have more than one label. In this paper, we employ the Label Powerset method to transform the multi-label classification problem into a multi-class classification problem. We then introduce a random forest based mechanism that partitions the features into used (important) and unused (unimportant) sets with corresponding weights. Furthermore, assuming that the unimportant features are not useless, we employ principal component analysis to extract information from the unused feature set. An experimental study on the BioMedLAT dataset, which combines these two types of features, demonstrates our method's potential.
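To make the Label Powerset step concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It treats every distinct set of labels as one class, so a multi-label problem becomes a multi-class one. The function names and the toy labels ("gene", "protein", "disease") are assumptions chosen for illustration and are not taken from the BioMedLAT dataset.

```python
from typing import Dict, FrozenSet, List, Sequence


def label_powerset_fit(label_sets: Sequence[Sequence[str]]) -> Dict[FrozenSet[str], int]:
    """Assign one class id to each distinct label set, in order of first appearance."""
    mapping: Dict[FrozenSet[str], int] = {}
    for labels in label_sets:
        key = frozenset(labels)  # label order inside a set does not matter
        if key not in mapping:
            mapping[key] = len(mapping)
    return mapping


def label_powerset_transform(
    label_sets: Sequence[Sequence[str]], mapping: Dict[FrozenSet[str], int]
) -> List[int]:
    """Replace each label set by its single class id."""
    return [mapping[frozenset(labels)] for labels in label_sets]


def label_powerset_inverse(
    class_ids: Sequence[int], mapping: Dict[FrozenSet[str], int]
) -> List[List[str]]:
    """Map predicted class ids back to (sorted) label sets."""
    inverse = {class_id: labels for labels, class_id in mapping.items()}
    return [sorted(inverse[class_id]) for class_id in class_ids]


# Hypothetical toy annotations: one list of answer-type labels per question.
toy_labels = [["gene", "protein"], ["disease"], ["protein", "gene"], ["gene"]]

mapping = label_powerset_fit(toy_labels)
class_ids = label_powerset_transform(toy_labels, mapping)
decoded = label_powerset_inverse(class_ids, mapping)

print(class_ids)
print(decoded)
```

Any ordinary multi-class classifier can then be trained on `class_ids`. One consequence of this design is that a label combination never seen in training cannot be predicted, since it has no class id.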
Notes
- 1. Our source code is available at https://github.com/Romainpkq/LATPrediction.
Acknowledgement
This work was partially supported by State Key Laboratory of Software Development Environment of China (No. SKLSDE-2019ZX-16).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Peng, K., Rong, W., Li, C., Hu, J., Xiong, Z. (2020). Weight Aware Feature Enriched Biomedical Lexical Answer Type Prediction. In: Yang, H., Pasupa, K., Leung, A.CS., Kwok, J.T., Chan, J.H., King, I. (eds) Neural Information Processing. ICONIP 2020. Lecture Notes in Computer Science, vol 12534. Springer, Cham. https://doi.org/10.1007/978-3-030-63836-8_6
DOI: https://doi.org/10.1007/978-3-030-63836-8_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63835-1
Online ISBN: 978-3-030-63836-8
eBook Packages: Computer Science, Computer Science (R0)