Abstract
Feature selection for case representation is an essential phase of Case-Based Reasoning (CBR) system development. (Semi-)automating the feature selection process can ease the knowledge engineering effort. This paper explores the feature importance provided by XGBoost models as a basis for creating CBR systems. We use Patient-Reported Outcome Measurements (PROMs) on low back pain from the selfBACK project in our experiments. PROMs are a valuable source of information that captures physical, emotional, and social aspects of well-being from the perspective of the patients. Leveraging the analytical capabilities of machine learning methods and data science techniques to exploit PROMs has the potential to improve decision making. This paper presents a two-fold approach to feature selection, applied to our dataset, that combines statistical strength with data-driven knowledge modelling in CBR, and compares it with permutation feature selection using an XGBoost regressor. Furthermore, we compare the performance of the CBR models built with the selected features against two machine learning algorithms for predicting different PROMs.
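To illustrate the general idea behind permutation-based feature importance mentioned above, the following is a minimal sketch and not the authors' implementation. It assumes synthetic data with two hypothetical features (`informative_feature`, `noise_feature`) and swaps in a simple k-nearest-neighbour regressor as a stand-in for the XGBoost regressor used in the paper, so that the example runs with the Python standard library alone. A feature's importance is estimated as the average increase in held-out error when that feature's values are shuffled.

```python
import random


def knn_predict(cases_X, cases_y, query, k=5):
    """Average the outcomes of the k stored cases closest to the query."""
    order = sorted(
        range(len(cases_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(cases_X[i], query)),
    )
    return sum(cases_y[i] for i in order[:k]) / k


def mean_squared_error(cases_X, cases_y, X, y):
    """Mean squared error of the k-NN stand-in model on the rows in X."""
    errors = [(knn_predict(cases_X, cases_y, x) - t) ** 2 for x, t in zip(X, y)]
    return sum(errors) / len(errors)


def permutation_importance(cases_X, cases_y, X, y, n_repeats=5, seed=0):
    """Mean increase in error after shuffling each feature column of X."""
    rng = random.Random(seed)
    baseline = mean_squared_error(cases_X, cases_y, X, y)
    scores = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the target
            permuted = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted_error = mean_squared_error(cases_X, cases_y, permuted, y)
            increases.append(permuted_error - baseline)
        scores.append(sum(increases) / n_repeats)
    return scores


# Synthetic data (an assumption for illustration, not selfBACK PROMs):
# the target depends on the first feature only; the second is pure noise.
data_rng = random.Random(42)


def make_rows(n):
    X = [[data_rng.random(), data_rng.random()] for _ in range(n)]
    y = [3.0 * row[0] + data_rng.gauss(0.0, 0.1) for row in X]
    return X, y


train_X, train_y = make_rows(200)
test_X, test_y = make_rows(100)

importances = permutation_importance(train_X, train_y, test_X, test_y)
for name, score in zip(["informative_feature", "noise_feature"], importances):
    print(f"{name}: {score:.3f}")
```

Features whose shuffling barely changes the error are candidates for exclusion from the case representation. With a fitted XGBoost model, the same procedure could be applied by replacing the k-NN predictor with the model's prediction function.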
Acknowledgement
The work has been conducted as part of the selfBACK and Back-UP projects, which have received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 689043 and No 777090.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Verma, D., Bach, K., Mork, P.J. (2021). Using Automated Feature Selection for Building Case-Based Reasoning Systems: An Example from Patient-Reported Outcome Measurements. In: Bramer, M., Ellis, R. (eds) Artificial Intelligence XXXVIII. SGAI-AI 2021. Lecture Notes in Computer Science, vol 13101. Springer, Cham. https://doi.org/10.1007/978-3-030-91100-3_23
DOI: https://doi.org/10.1007/978-3-030-91100-3_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-91099-0
Online ISBN: 978-3-030-91100-3
eBook Packages: Computer Science, Computer Science (R0)