Abstract
In this paper, we introduce an uncertain data mining model for knowledge discovery in chemical databases. We aim to discover relationships between molecule characteristics and properties using uncertain data mining tools. Specifically, we intend to predict the Critical Micelle Concentration (CMC) property from a molecule's characteristics. To do so, we develop a likelihood-based belief function modelling approach to construct an evidential database. A mining process is then developed to discover valid association rules, and the prediction is performed using an association rule fusion technique. Experiments were conducted on a real-world chemical database. Performance analysis showed better prediction outcomes for the proposed approach than for several methods from the literature.
Notes
1. An amphiphilic molecule is a chemical compound possessing both hydrophilic (water-loving, polar) and lipophilic (fat-loving) properties.
2. Each subset \(A \subseteq \varTheta\) (that is, each element of \(2^{\varTheta }\)) fulfilling \(m(A)>0\) is called a focal element.
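The notion of a focal element can be illustrated with a minimal sketch. The frame of discernment, its three CMC levels, and the mass values below are hypothetical and chosen only for illustration; they are not taken from the paper's data, and the helper names `powerset` and `focal_elements` are likewise assumptions.

```python
from itertools import combinations


def powerset(frame):
    """Return all subsets of the frame of discernment as frozensets."""
    items = sorted(frame)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def focal_elements(mass):
    """Return the subsets A whose mass m(A) is strictly positive."""
    return {subset for subset, value in mass.items() if value > 0}


# Hypothetical frame of discernment: three illustrative CMC levels.
theta = frozenset({"low", "medium", "high"})

# Hypothetical mass function over 2^Theta; unlisted subsets keep mass 0.
mass = {subset: 0.0 for subset in powerset(theta)}
mass[frozenset({"low"})] = 0.6            # precise evidence
mass[frozenset({"low", "medium"})] = 0.3  # imprecise evidence
mass[theta] = 0.1                         # total ignorance

# A mass function must sum to 1 over the power set.
assert abs(sum(mass.values()) - 1.0) < 1e-9

focal = focal_elements(mass)
for subset in sorted(focal, key=len):
    print(sorted(subset), mass[subset])
```

Of the eight subsets of this frame, only the three with strictly positive mass are focal elements; the empty set and the remaining subsets carry no mass.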
Acknowledgement
This work was performed in partnership with the SAS PIVERT, within the frame of the French Institute for the Energy Transition (Institut pour la Transition Energétique, ITE) P.I.V.E.R.T. (www.institut-pivert.com), selected as an Investment for the Future (“Investissements d’Avenir”). This work was supported, as part of the Investments for the Future, by the French Government under the reference ANR-001-01.
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Samet, A. et al. (2016). Predictive Model Based on the Evidence Theory for Assessing Critical Micelle Concentration Property. In: Carvalho, J., Lesot, MJ., Kaymak, U., Vieira, S., Bouchon-Meunier, B., Yager, R. (eds) Information Processing and Management of Uncertainty in Knowledge-Based Systems. IPMU 2016. Communications in Computer and Information Science, vol 610. Springer, Cham. https://doi.org/10.1007/978-3-319-40596-4_43
DOI: https://doi.org/10.1007/978-3-319-40596-4_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-40595-7
Online ISBN: 978-3-319-40596-4
eBook Packages: Computer Science, Computer Science (R0)