Abstract
Intrusion detection systems (IDSs) play an important role in the security of computer networks. One of the main challenges for IDSs is the analysis of high-dimensional input data, and feature selection is one way to overcome this problem. This paper presents a hybrid feature selection method that combines the binary gravitational search algorithm (BGSA) with mutual information (MI) to improve the efficiency of standard BGSA as a feature selection algorithm. The proposed method, called MI-BGSA, uses BGSA as a wrapper-based feature selection method to perform global search. MI is integrated into BGSA as a filter-based method that computes the feature–feature and feature–class mutual information in order to prune the feature subset. This strategy favors features that have the least redundancy with the already selected features and the most relevance to the target class. A two-objective fitness function, based on maximizing the detection rate and minimizing the false positive rate, controls the search direction of the standard BGSA. Experimental results on the NSL-KDD dataset show that the proposed method reduces the feature space dramatically. Moreover, the proposed algorithm finds a better subset of features and achieves higher accuracy and detection rate than some standard wrapper-based and filter-based feature selection methods.
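The two ingredients named in the abstract, MI-based pruning and a detection-rate/false-positive-rate fitness, can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so everything specific here is an assumption made for illustration: the relevance-minus-mean-redundancy score, the weight `beta`, the weighted-sum form of the fitness, the weight `w`, and all function names. Features are assumed to be already discretized.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def mi_score(candidate, selected, labels, beta=1.0):
    """Assumed score: feature-class relevance minus mean feature-feature redundancy.

    `beta` is a hypothetical trade-off weight, not a value from the paper.
    """
    relevance = mutual_information(candidate, labels)
    if not selected:
        return relevance
    redundancy = sum(mutual_information(candidate, s) for s in selected) / len(selected)
    return relevance - beta * redundancy


def fitness(tp, fn, fp, tn, w=0.5):
    """Assumed weighted-sum fitness: reward detection rate, penalize false positives.

    `w` is a hypothetical weight; the paper's actual combination may differ.
    """
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return w * detection_rate + (1.0 - w) * (1.0 - false_positive_rate)


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    informative = [0, 0, 1, 1]   # perfectly predicts the class
    unrelated = [0, 1, 0, 1]     # independent of the class
    print(mi_score(informative, [unrelated], labels))
    print(mi_score(informative, [informative], labels))
    print(fitness(tp=90, fn=10, fp=5, tn=95))
```

In this toy setup, a candidate that duplicates an already selected feature loses its score to the redundancy term, which is the pruning behavior the abstract describes. A wrapper such as BGSA would then rank candidate subsets by a fitness computed from a classifier's confusion counts.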
Ethics declarations
Conflict of interest
The authors declare that there is no potential conflict of interest in this work.
Additional information
Communicated by V. Loia.
Cite this article
Bostani, H., Sheikhan, M. Hybrid of binary gravitational search algorithm and mutual information for feature selection in intrusion detection systems. Soft Comput 21, 2307–2324 (2017). https://doi.org/10.1007/s00500-015-1942-8
DOI: https://doi.org/10.1007/s00500-015-1942-8