Abstract
As Internet applications generate ever larger data sets with more features, intrusion detection in existing sequential computing environments has become difficult in terms of computational complexity, storage efficiency, and obtaining optimized classification solutions. Combining a parallel computing model with a nature-inspired feature selection technique, this paper proposes a Hadoop Based Parallel Binary Bat Algorithm for efficient feature selection and classification, with the aim of an optimized detection rate. The MapReduce programming model of Hadoop reduces computational complexity, the Parallel Binary Bat algorithm optimizes the selection of prominent features, and a parallel Naïve Bayes classifier provides cost-effective classification. Experimental results show that the proposed methods outperform sequential computing approaches on massive data and significantly reduce the computational complexity of both feature selection and classification in big data applications.
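To make the division of labour in the abstract concrete, the following is a minimal sketch of how Naïve Bayes training can be phrased as a map step and a reduce step over data partitions, restricted to the features switched on in a binary mask such as the one a binary bat would propose. It is not the authors' implementation: it runs in a single Python process rather than on Hadoop, the function names (`map_counts`, `reduce_counts`, `train`, `predict`), the Laplace smoothing, and the toy records are all assumptions made for illustration.

```python
import math
from collections import Counter
from functools import reduce

def map_counts(partition, mask):
    """Map step (hypothetical): count classes and masked feature values in one partition."""
    counts = Counter()
    for features, label in partition:
        counts[("class", label)] += 1
        for i, keep in enumerate(mask):
            if keep:  # only features selected by the binary mask are counted
                counts[("feat", i, features[i], label)] += 1
    return counts

def reduce_counts(a, b):
    """Reduce step (hypothetical): merge partial counts from two partitions."""
    return a + b

def train(partitions, mask):
    """Run the map step on every partition and fold the results together."""
    return reduce(reduce_counts, (map_counts(p, mask) for p in partitions), Counter())

def predict(counts, mask, features, labels, values_per_feature=2):
    """Pick the label with the highest Laplace-smoothed log posterior."""
    total = sum(counts[("class", c)] for c in labels)
    best, best_score = None, -math.inf
    for c in labels:
        n_c = counts[("class", c)]
        score = math.log((n_c + 1) / (total + len(labels)))
        for i, keep in enumerate(mask):
            if keep:
                n = counts[("feat", i, features[i], c)]
                score += math.log((n + 1) / (n_c + values_per_feature))
        if score > best_score:
            best, best_score = c, score
    return best

# Toy, made-up records: (binary feature tuple, label), split into two partitions.
partitions = [
    [((1, 0, 1), "attack"), ((1, 1, 1), "attack")],
    [((0, 0, 0), "normal"), ((0, 1, 0), "normal")],
]
mask = [1, 0, 1]  # candidate feature subset: keep features 0 and 2, drop feature 1
model = train(partitions, mask)
print(predict(model, mask, (1, 0, 1), ["attack", "normal"]))
```

Because the partial counts are simply added, the partitions can be processed independently and in any order, which is the property that makes this kind of training a natural fit for MapReduce. In a feature selection loop, a score such as the detection rate of `predict` on held-out data could serve as the fitness of each candidate mask; that coupling is likewise an assumption here, not a detail given in the abstract.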
Acknowledgments
The authors would like to thank all anonymous reviewers for their constructive and insightful suggestions to improve this paper.
Cite this article
Natesan, P., Rajalaxmi, R.R., Gowrison, G. et al. Hadoop Based Parallel Binary Bat Algorithm for Network Intrusion Detection. Int J Parallel Prog 45, 1194–1213 (2017). https://doi.org/10.1007/s10766-016-0456-z
DOI: https://doi.org/10.1007/s10766-016-0456-z