Abstract
Species’ potential distribution modelling is the process of building a representation of the fundamental ecological requirements of a species and extrapolating these requirements into a geographical region. The importance of being able to predict species’ distributions is currently highlighted by issues such as global climate change, public health problems caused by disease vectors, and anthropogenic impacts that can lead to massive species extinction. Several computational approaches can be used to generate potential distribution models, each achieving optimal results under different conditions. However, existing software packages for this purpose typically implement a single algorithm, and each package presents a new learning curve to the user. Whenever new software is developed for species’ potential distribution modelling, significant effort is duplicated because many feature requirements are shared between packages. Additionally, data preparation and comparison between algorithms become difficult when using separate software applications, since each application has different data input and output capabilities. This paper describes a generic approach for building a single computing framework capable of handling different data formats and multiple algorithms that can be used in potential distribution modelling. The ideas described in this paper have been implemented in a free and open source software package called openModeller. The main concepts of species’ potential distribution modelling are also explained, and an example use case illustrates potential distribution maps generated by the framework.
Acknowledgements
The openModeller framework was originally developed by CRIA with support from FAPESP during the speciesLink project. After it was released as a free and open source software package, other projects and institutions started to collaborate. The University of Kansas Natural History Museum & Biodiversity Research Center, the University of Reading, and individual contributors helped to significantly further develop the framework. At the beginning of 2005, openModeller received additional support from FAPESP as part of a new thematic project to be carried out by three Brazilian institutions: CRIA, INPE and Poli/USP.
Cite this article
de Souza Muñoz, M.E., De Giovanni, R., de Siqueira, M.F. et al. openModeller: a generic approach to species’ potential distribution modelling. Geoinformatica 15, 111–135 (2011). https://doi.org/10.1007/s10707-009-0090-7
DOI: https://doi.org/10.1007/s10707-009-0090-7