Abstract
Association rule mining is a classic and important research topic in data mining and has been widely applied in practice. Most of its time and memory cost comes from its first step, frequent itemset mining. With the development of cloud computing, outsourcing this task to third-party service providers saves effort in system development, deployment, and operation. Outsourcing, however, introduces risk, because it is difficult to verify the results that these services return. In this paper, we focus on verifying the integrity of the results returned by outsourced services. We propose a metamorphic-based method that is lightweight and does not require complicated processing. The key point of our method is the construction of a set of metamorphic relations (MRs). Through analysis and experiments, we show that our approach delivers satisfactory results.
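To make the idea of a metamorphic relation concrete, the following is a minimal sketch rather than the construction used in the paper. It assumes two illustrative relations that hold for any correct frequent itemset miner working with a relative support threshold: reordering the transactions, or appending a second copy of the whole database, must leave the set of frequent itemsets unchanged. The function names and the brute-force `frequent_itemsets` miner, which stands in for the outsourced service, are hypothetical.

```python
from collections import Counter
from itertools import combinations
import random


def frequent_itemsets(transactions, min_support):
    """Hypothetical stand-in for the outsourced miner (brute force).

    Returns {itemset: relative support} for every itemset whose
    relative support is at least min_support.
    """
    n = len(transactions)
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def check_permutation_mr(miner, transactions, min_support, seed=0):
    """Assumed MR: shuffling transaction order keeps the frequent itemsets."""
    shuffled = list(transactions)
    random.Random(seed).shuffle(shuffled)
    source = set(miner(transactions, min_support))
    follow_up = set(miner(shuffled, min_support))
    return source == follow_up


def check_duplication_mr(miner, transactions, min_support):
    """Assumed MR: duplicating the database keeps the frequent itemsets,
    because every relative support value is unchanged."""
    doubled = list(transactions) + list(transactions)
    source = set(miner(transactions, min_support))
    follow_up = set(miner(doubled, min_support))
    return source == follow_up


if __name__ == "__main__":
    data = [["a", "b"], ["a", "b"], ["a", "c"], ["b", "c"], ["c"]]
    print(check_permutation_mr(frequent_itemsets, data, 0.4))
    print(check_duplication_mr(frequent_itemsets, data, 0.4))
```

A violated relation signals that the returned result cannot be trusted, while a satisfied one does not prove correctness. The client never needs the true answer, only the consistency of two service responses.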
This work is supported by the National Key R&D Program of China (2018YFB1003901), the National Key Basic Research and Development Program of China (973 Program 2014CB340702), and the National Natural Science Foundation of China (61572375, 61772263).
Notes
1. Strictly speaking, Apriori covers both frequent itemset (FI) mining and association rule mining. Because FI mining consumes most of the resources, we focus on this step only.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, J., Xie, X., Zhang, Z. (2018). How Reliable Is Your Outsourcing Service for Data Mining? A Metamorphic Method for Verifying the Result Integrity. In: Bu, L., Xiong, Y. (eds) Software Analysis, Testing, and Evolution. SATE 2018. Lecture Notes in Computer Science, vol 11293. Springer, Cham. https://doi.org/10.1007/978-3-030-04272-1_8
DOI: https://doi.org/10.1007/978-3-030-04272-1_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04271-4
Online ISBN: 978-3-030-04272-1
eBook Packages: Computer Science, Computer Science (R0)