Abstract
Bayesian Networks (BNs) have recently been employed to solve meteorological problems. In this paper, the application of BNs to mining a real-world weather dataset is described. The dataset discriminates between “wet fog” instances and “other weather conditions” instances, and it contains many missing values. Therefore, BNs were employed not only for classifying instances but also for filling in missing data. In addition, the Markov Blanket concept was employed to select relevant attributes. The efficacy of BNs in performing these tasks was assessed by means of several experiments. In summary, more convincing results were obtained by taking advantage of the fact that BNs can classify instances containing missing values directly, i.e. without a data preparation step. In addition, the attributes selected by means of the Markov Blanket yield a simpler, faster, and equally accurate classifier.
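The two ideas highlighted in the abstract can be illustrated with a short, hypothetical sketch. The example below uses the pgmpy Python library, invented attribute names (“humidity”, “wind”, “temp_drop”, “fog”), toy data, and a hand-crafted structure; it is not the authors’ implementation and only stands in for the real weather dataset and the learned network. It shows (i) attribute selection via the Markov Blanket of the class node and (ii) classification of an instance with missing attribute values by simply omitting them from the evidence.

```python
# Hypothetical sketch (not the authors' code): Markov Blanket attribute
# selection and classification under missing values with a Bayesian Network.
import pandas as pd
from pgmpy.models import BayesianNetwork
from pgmpy.estimators import MaximumLikelihoodEstimator
from pgmpy.inference import VariableElimination

# Toy discretized weather data; the real dataset and its attributes differ.
data = pd.DataFrame({
    "humidity":  ["high", "high", "low", "low", "high", "low"],
    "wind":      ["calm", "calm", "strong", "calm", "strong", "calm"],
    "temp_drop": ["yes", "no", "no", "yes", "no", "yes"],
    "fog":       ["wet_fog", "other", "other", "wet_fog", "other", "wet_fog"],
})

# Assumed (hand-crafted) structure; in the paper the structure is learned from data.
model = BayesianNetwork([("humidity", "fog"), ("temp_drop", "fog"), ("fog", "wind")])
model.fit(data, estimator=MaximumLikelihoodEstimator)

# (i) Attribute selection: keep only the Markov Blanket of the class node.
print("Markov Blanket of 'fog':", model.get_markov_blanket("fog"))

# (ii) Classification with a missing value: 'wind' and 'temp_drop' are
# unobserved, so they are left out of the evidence and marginalized over.
infer = VariableElimination(model)
posterior = infer.query(variables=["fog"], evidence={"humidity": "high"})
print(posterior)
```

Leaving an unobserved attribute out of the evidence causes the inference engine to marginalize over it, which is what allows a BN to classify incomplete instances without a separate imputation step.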
Copyright information
© 2006 Springer-Verlag London Limited
Cite this paper
Hruschka, E.R., Hruschka, E.R., Ebecken, N.F.F. (2006). Applying Bayesian Networks for Meteorological Data Mining. In: Macintosh, A., Ellis, R., Allen, T. (eds) Applications and Innovations in Intelligent Systems XIII. SGAI 2005. Springer, London. https://doi.org/10.1007/1-84628-224-1_10
DOI: https://doi.org/10.1007/1-84628-224-1_10
Publisher Name: Springer, London
Print ISBN: 978-1-84628-223-2
Online ISBN: 978-1-84628-224-9