{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T19:14:47Z","timestamp":1740165287887,"version":"3.37.3"},"reference-count":29,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2018,12,28]],"date-time":"2018-12-28T00:00:00Z","timestamp":1545955200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IJGI"],"abstract":"Mainstream machine learning approaches to predictive analytics consistently prove their ability to perform well using a variety of datasets, although the task of identifying an optimally-performing machine learning approach for any given dataset becomes much less intuitive. Methods such as ensemble and transformation modeling have been developed to improve upon individual base learners and datasets with large degrees of variance. Despite the increased generalizability and flexibility of ensemble approaches, the cost often involves sacrificing inference for predictive ability. This paper introduces an alternative approach to ensemble modeling, combining the predictive ability of an ensemble framework with localized model construction through the incorporation of cluster analysis as a pre-processing technique. The workflow not only outperforms independent base learners and comparative ensemble methods, but also preserves local inferential capability by manipulating cluster parameters and maintaining interpretable relative importance values and non-transformed coefficients for the overall consideration of variable importance. 
This paper demonstrates the ensemble technique on a dataset to estimate rates of health insurance coverage across the state of Missouri, where the cluster pre-processing assists in understanding both local and global variable importance and interactions when predicting high concentration areas of low health insurance coverage based on demographic, socioeconomic, and geospatial variables.","DOI":"10.3390\/ijgi8010013","type":"journal-article","created":{"date-parts":[[2018,12,28]],"date-time":"2018-12-28T16:52:42Z","timestamp":1546015962000},"page":"13","source":"Crossref","is-referenced-by-count":11,"title":["A Cluster-Based Machine Learning Ensemble Approach for Geospatial Data: Estimation of Health Insurance Status in Missouri"],"prefix":"10.3390","volume":"8","author":[{"given":"Erik","family":"Mueller","sequence":"first","affiliation":[{"name":"Integrated & Applied Sciences: Bioinformatics & Geospatial Biology, College of Arts and Sciences, Saint Louis University, St. Louis, MO 63103, USA"}]},{"given":"J. S. On\u00e9simo","family":"Sandoval","sequence":"additional","affiliation":[{"name":"Department of Sociology and Anthropology, College of Arts and Sciences, Saint Louis University, St. Louis, MO 63103, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2261-3794","authenticated-orcid":false,"given":"Srikanth","family":"Mudigonda","sequence":"additional","affiliation":[{"name":"School for Professional Studies, Saint Louis University, St. Louis, MO 63103, USA"}]},{"given":"Michael","family":"Elliott","sequence":"additional","affiliation":[{"name":"Department of Epidemiology and Biostatistics, College for Public Health and Social Justice, Saint Louis University, St. Louis, MO 63103, USA"}]}],"member":"1968","published-online":{"date-parts":[[2018,12,28]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Waller, L.A., and Gotway, C.A. (2004). 
Applied Spatial Statistics for Public Health Data, John Wiley & Sons.","DOI":"10.1002\/0471662682"},{"key":"ref_2","unstructured":"Trivedi, S., Pardos, Z.A., and Heffernan, N.T. (arXiv, 2015). The utility of clustering in prediction tasks, arXiv."},{"key":"ref_3","unstructured":"Trivedi, S., Pardos, Z.A., and Heffernan, N.T. (July, January 28). Clustering students to generate an ensemble to improve standard test score predictions. Proceedings of the International Conference on Artificial Intelligence in Education, Auckland, New Zealand."},{"key":"ref_4","unstructured":"Trivedi, S., Pardos, Z., S\u00e1rk\u00f6zy, G., and Heffernan, N. (2011, January 6\u20138). Spectral clustering in educational data mining. Proceedings of the 4th International Conference on Educational Data Mining, Eindhoven, The Netherlands."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"67","DOI":"10.1109\/4235.585893","article-title":"No free lunch theorems for optimization","volume":"1","author":"Wolpert","year":"1997","journal-title":"IEEE Trans. Evol. Comput."},{"key":"ref_6","unstructured":"Alpaydin, E. (2014). Introduction to Machine Learning, MIT Press."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Kuncheva, L.I. (2004). Combining Pattern Classifiers: Methods and Algorithms, John Wiley & Sons.","DOI":"10.1002\/0471660264"},{"key":"ref_8","unstructured":"USCB (2018, October 17). American Community Survey (ACS), Available online: https:\/\/www.census.gov\/programs-surveys\/acs\/."},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"12866","DOI":"10.3390\/ijerph111212866","article-title":"The public health exposome: A population-based, exposure science approach to health disparities research","volume":"11","author":"Juarez","year":"2014","journal-title":"Int. J. Environ. Res. Public Health"},{"key":"ref_10","unstructured":"USCB (2017). The National Map: Transportation."},{"key":"ref_11","unstructured":"USCB (2018). 
Topologically Integrated Geographic Encoding and Referencing Datasets."},{"key":"ref_12","unstructured":"Missouri Department of Health and Senior Services (2015). Missouri DHSS Bureau of Health Care Analysis and Data Dissemination."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"1640","DOI":"10.1377\/hlthaff.24.6.1640","article-title":"Legal status and health insurance among immigrants","volume":"24","author":"Goldman","year":"2005","journal-title":"Health Aff."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"2105","DOI":"10.2105\/AJPH.93.12.2105","article-title":"The association of race, socioeconomic status, and health insurance status with the prevalence of overweight among children and adolescents","volume":"93","author":"Haas","year":"2003","journal-title":"Am. J. Public Health"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"917","DOI":"10.1136\/bmj.317.7163.917","article-title":"Income distribution, socioeconomic status, and self rated health in the United States: Multilevel analysis","volume":"317","author":"Kennedy","year":"1998","journal-title":"BMJ"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"1398","DOI":"10.1377\/hlthaff.2012.1426","article-title":"Low-socioeconomic-status enrollees in high-deductible plans reduced high-severity emergency care","volume":"32","author":"Wharam","year":"2013","journal-title":"Health Aff."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"135","DOI":"10.1111\/j.1475-6773.2005.00346.x","article-title":"The effects of geography and spatial behavior on health care utilization among the residents of a rural region","volume":"40","author":"Arcury","year":"2005","journal-title":"Health Serv. 
Res."},{"key":"ref_18","first-page":"1142","article-title":"Patient perspectives on primary health care in rural communities: Effects of geography on access, continuity and efficiency","volume":"9","author":"Regan","year":"2009","journal-title":"Rural Remote Health"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"182","DOI":"10.1016\/S0749-3797(01)00349-X","article-title":"Are rural residents less likely to obtain recommended preventive healthcare services?","volume":"21","author":"Casey","year":"2001","journal-title":"Am. J. Prev. Med."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"234","DOI":"10.2307\/143141","article-title":"A computer movie simulating urban growth in the Detroit region","volume":"46","author":"Tobler","year":"1970","journal-title":"Econ. Geogr."},{"key":"ref_21","unstructured":"Meyer, D., Dimitriadou, E., Hornik, K., Weingessel, A., and Leisch, F. (2015). e1071: Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien, R Foundation. R Package Version 1.6-7."},{"key":"ref_22","unstructured":"Breiman, L. (2018, October 10). randomForest: Breiman and Cutler\u2019s Random Forests for Classification and Regression. Available online: https:\/\/www.stat.berkeley.edu\/~breiman\/RandomForests\/."},{"key":"ref_23","unstructured":"Ridgeway, G. (2010). Gbm: Generalized Boosted Regression Models, R Foundation. R Package Version 1.6-3.1."},{"key":"ref_24","unstructured":"Friedman, J., Hastie, T., Simon, N., and Tibshirani, R. (2016). Lasso and Elastic-Net Regularized Generalized Linear Models, R Foundation. R-Package Version 2.0-5."},{"key":"ref_25","unstructured":"Mevik, B.-H., Wehrens, R., and Liland, K.H. (2011). Pls: Partial Least Squares and Principal Component Regression, R Foundation. R Package Version."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"James, G., Witten, D., Hastie, T., and Tibshirani, R. (2013). 
An Introduction to Statistical Learning, Springer.","DOI":"10.1007\/978-1-4614-7138-7"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Shalev-Shwartz, S., and Ben-David, S. (2014). Understanding Machine Learning: From Theory to Algorithms, Cambridge University Press.","DOI":"10.1017\/CBO9781107298019"},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"881","DOI":"10.1109\/TPAMI.2002.1017616","article-title":"An efficient k-means clustering algorithm: Analysis and implementation","volume":"24","author":"Kanungo","year":"2002","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_29","first-page":"1","article-title":"Package \u2018NbClust\u2019","volume":"61","author":"Charrad","year":"2014","journal-title":"J. Stat. Softw."}],"container-title":["ISPRS International Journal of Geo-Information"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2220-9964\/8\/1\/13\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,6,15]],"date-time":"2024-06-15T07:45:05Z","timestamp":1718437505000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2220-9964\/8\/1\/13"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2018,12,28]]},"references-count":29,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2019,1]]}},"alternative-id":["ijgi8010013"],"URL":"https:\/\/doi.org\/10.3390\/ijgi8010013","relation":{},"ISSN":["2220-9964"],"issn-type":[{"type":"electronic","value":"2220-9964"}],"subject":[],"published":{"date-parts":[[2018,12,28]]}}}