Abstract
Many systems (manufacturing, environmental, health, etc.) generate counts (or rates) of events that are monitored to detect changes. Modern data complement event counts with many additional measurements (geographic, demographic, and others) that comprise high-dimensional attributes. This raises an important challenge: detecting a change that occurs only within a region, initially unspecified, defined by these attributes. Current methods for handling the attribute information struggle with high-dimensional data. Our approach transforms the problem into one of supervised learning, so that properties of an appropriate learner can be described. Rather than using error rates, we generate a signal (of a system change) from an appropriate feature selection algorithm. A measure of statistical significance is included to control false alarms. Results on simulated examples are provided.
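The transformation to supervised learning can be illustrated with a minimal sketch. It is not the algorithm of the paper. It assumes that reference-period events are labeled 0 and current-period events are labeled 1, that a single-split decision stump (Gini impurity reduction) stands in for the feature selection algorithm, and that a label-permutation test stands in for the measure of statistical significance. The names stump_gain and change_signal and the simulated data are hypothetical.

```python
import random


def stump_gain(values, labels):
    """Largest Gini impurity reduction from one threshold split on one attribute."""
    n = len(values)
    pairs = sorted(zip(values, labels))
    total_pos = sum(labels)

    def gini(pos, cnt):
        if cnt == 0:
            return 0.0
        p = pos / cnt
        return 2.0 * p * (1.0 - p)

    parent = gini(total_pos, n)
    best = 0.0
    left_pos = 0
    for i in range(1, n):
        left_pos += pairs[i - 1][1]
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold fits between equal values
        left = gini(left_pos, i)
        right = gini(total_pos - left_pos, n - i)
        gain = parent - (i * left + (n - i) * right) / n
        best = max(best, gain)
    return best


def change_signal(reference, current, n_perm=200, seed=0):
    """Return one (score, p_value) pair per attribute.

    A small p-value suggests the attribute separates current events from
    reference events better than expected by chance (assumed signal rule).
    """
    rng = random.Random(seed)
    rows = reference + current
    labels = [0] * len(reference) + [1] * len(current)
    n_attr = len(rows[0])
    columns = [[row[j] for row in rows] for j in range(n_attr)]
    observed = [stump_gain(col, labels) for col in columns]

    # Permutation null: shuffling the labels removes any real change.
    exceed = [0] * n_attr
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        for j, col in enumerate(columns):
            if stump_gain(col, shuffled) >= observed[j]:
                exceed[j] += 1
    p_values = [(e + 1) / (n_perm + 1) for e in exceed]
    return list(zip(observed, p_values))


# Simulated example (assumed): three attributes on [0, 1]; the current
# period gains extra events only where attribute 0 exceeds 0.7.
data_rng = random.Random(1)
reference = [[data_rng.random() for _ in range(3)] for _ in range(300)]
current = [[data_rng.random() for _ in range(3)] for _ in range(200)]
current += [[data_rng.uniform(0.7, 1.0), data_rng.random(), data_rng.random()]
            for _ in range(100)]

result = change_signal(reference, current, n_perm=100)
for j, (score, p_value) in enumerate(result):
    print(f"attribute {j}: score={score:.4f}, p-value={p_value:.3f}")
```

In this sketch the signal comes from attribute scores and not from the classifier's error rate, and the attributes with small p-values indicate which dimensions define the changed region. A single stump scores each attribute in isolation, so regions defined jointly by several attributes would need a richer learner such as a tree ensemble.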
This material is based upon work supported by the National Science Foundation under Grant 0743160 and the Office of Naval Research under Grant N000140910656.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Dávila, S., Runger, G., Tuv, E. (2011). High-Dimensional Surveillance. In: Honkela, T., Duch, W., Girolami, M., Kaski, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2011. ICANN 2011. Lecture Notes in Computer Science, vol 6792. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21738-8_32
DOI: https://doi.org/10.1007/978-3-642-21738-8_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21737-1
Online ISBN: 978-3-642-21738-8
eBook Packages: Computer Science, Computer Science (R0)