Enhancing the Discrimination Ability of a Gas Sensor Array Based on a Novel Feature Selection and Fusion Framework

1 College of Electronic and Information Engineering, Southwest University, Chongqing 400715, China
2 High Tech Department, China International Engineering Consulting Corporation, Beijing 100048, China
3 Westa College, Southwest University, Chongqing 400715, China
* Author to whom correspondence should be addressed.
Sensors 2018, 18(6), 1909; https://doi.org/10.3390/s18061909
Submission received: 27 May 2018 / Revised: 6 June 2018 / Accepted: 9 June 2018 / Published: 12 June 2018
(This article belongs to the Section Chemical Sensors)

Abstract

In this paper, a novel feature selection and fusion framework is proposed to enhance the discrimination ability of gas sensor arrays for odor identification. First, we put forward an efficient feature selection method based on separability and dissimilarity, which determines the order in which features are selected as the dimension of the selected feature subset grows. Second, the K-nearest neighbor (KNN) classifier is applied to determine the dimensions of the optimal feature subsets for the different types of features. Finally, for feature fusion, we propose a classification dominance fusion strategy that weights each base feature according to its classification performance. Experimental results on two datasets show that the recognition rates on Dataset I and Dataset II reach 97.5% and 80.11%, respectively, when k = 1 for the KNN classifier and the distance metric is the correlation distance (COR), which demonstrates the superiority of the proposed feature selection and fusion framework in representing signal features. The proposed feature selection method effectively selects feature subsets that are conducive to classification, while the feature fusion framework fuses various features that describe different characteristics of the sensor signals, enhancing the discrimination ability of gas sensors and, to a certain extent, suppressing the drift effect.

1. Introduction

An artificial olfactory system (AOS), also known as a machine olfactory system or electronic nose (E-nose), is designed to imitate the biological olfactory system based on the principles of bionics. Nowadays, it has become a major innovation in the field of gas detection technology, owing to advantages such as real-time operation, non-invasiveness, ease of operation, and low cost. However, there are still some deficiencies in AOS, such as susceptibility to the environment, an inability to directly distinguish mixed gases, and drift over time. Since the concept of the E-nose was put forward in 1994 [1], the perception and judgment process of bionic olfactory information, as well as its related applications, have been of wide concern to scholars in related fields [2,3,4,5,6].
On the one hand, it is observed that some features of chemical sensors may not be necessary, and only a subset of the original features contributes to the classification when deploying a gas sensor array for a specific application [7]. Meanwhile, the cross-sensitivity of a sensor array has both merits and demerits. Specifically, cross-sensitivity is conducive to the detection of various gases when the number of sensors is limited. However, it also leads to redundancies and interferences. The sensor array may produce redundant, incomplete, imprecise, and inconsistent information, and the presence of these irrelevant features increases the dimensionality of the feature space, which may reduce the accuracy of pattern recognition. Robust features can describe the characteristics of sensor signals effectively, and the performance of classifiers can be improved by using a subset of features instead of the whole set. This calls for a systematic or structured approach to selecting the optimal subset of sensors to enhance the performance of the overall system.
Generally, feature selection algorithms can be roughly divided into two major categories [8]. The filter approach removes redundant features according to a certain figure of merit in a preprocessing step, for example, by using the Mahalanobis distance between response distributions to evaluate sensor array configurations [9]. It requires less computation and can find feature sets that are applicable to models with different requirements. The wrapper approach selects feature subsets by training models on candidate subsets and using their predictive accuracy as the selection criterion, so it is more accurate for a given calibration model [10]. Using classifiers (such as the SVM) to select the optimal features of different sensors for enhancing the classification of compounds is a typical wrapper approach [11]. Both approaches are widely used in the feature selection of gas sensor arrays [12,13,14,15]. However, the existing approaches still have disadvantages: the filter approach is not accurate enough for different classification models, while the wrapper approach requires more computational resources and, owing to its specificity, may not retain its accuracy in other settings.
Many previous studies have improved on these two basic methods in different research fields. The minimal-redundancy maximal-relevance (mRMR) method is based on the filter approach and selects relevant and nonredundant features according to a mutual information criterion [16]. Sequential forward selection (SFS) has been used for the evaluation of breath alcohol measurement [17]; it starts with an empty subset and sequentially adds the features that most improve the score of the feature subset. Conversely, sequential backward selection (SBS) removes feature elements sequentially, starting from the full set, and has been used to assess the odor of automobile interior components [18]. In addition, other improved feature selection methods have been widely applied in many areas, including cloud computing [21], identifying different kinds of meat [20], and social image annotation [19], but such improvements remain scarce in the E-nose area. Moreover, the improvements mentioned above cannot simultaneously reduce the feature dimension and select features efficiently.
On the other hand, it is well known that a single feature cannot fully reflect the characteristics of sensor signals, which leads to low classification accuracy. Thus, a fusion strategy [22] is an advisable choice to improve the prediction accuracy of a gas sensor array. The fusion strategy synthesizes signals from different sources to obtain a better model representation. The purpose of data fusion is to combine information obtained from multiple sources by different strategies, which can potentially achieve a better description and enhance the classification accuracy [23]. Many studies have reported that combining the features of the E-nose, E-tongue, or E-eye improves the performance of a gas sensor array for odor identification. Hong et al. [24] described the use of four fusion approaches for an E-nose and an E-tongue to distinguish cherry tomato juice from adulteration, and demonstrated that exploiting the perceptual knowledge from both E-sensors performs better than using the E-nose or E-tongue individually. Buratti et al. [25] proposed an effective mid-level data fusion method to discuss the applicability of the E-nose, E-eye, and E-tongue for the quality decay assessment and characterization of olive oil, which demonstrated the ability to classify samples and greatly improved the KNN classification model. Rodriguez-Mendez et al. [26] proposed a method of combining the correlations between the chemical parameters from an E-nose and an E-tongue associated with the oxygen and polyphenolic composition of red wines, which significantly improved the quality of the predictions. In Ref. [23], a multilevel fusion framework of the E-nose and E-tongue was proposed, which improves tea quality prediction accuracies through decision fusion and feature fusion. However, the aforementioned conventional feature fusion strategies focus on simply and directly combining the original features from different instruments into one feature matrix, which only increases the dimensions of the feature matrix without taking into account the contribution of each kind of feature to the final classification. Recently, Dang et al. [27] proposed a weighted fusion framework in logarithmic form, which concentrated on the contribution of each feature; however, owing to the particularity of logarithms, its way of calculating the weights cannot properly reflect the classification accuracy of each feature.
In this paper, we present a feature selection and fusion framework, and the merits of this paper include the following.
(1)
We propose a feature selection method, which couples the filter and wrapper strategies, to evaluate the subfeatures of a gas sensor array using two indicators, i.e., separability and dissimilarity, as well as the KNN classifier, for effectively describing the characteristics of different odors under the premise of reducing the data redundancy as much as possible.
(2)
We propose a weighted feature fusion framework combining information according to a classification dominance strategy, for achieving better description of odor and increasing the accuracy of final classification.
(3)
The novel feature selection and feature fusion framework can not only improve the recognition rate of a gas sensor array, but also greatly suppress the negative effects of sensor drift effect on gas identification.
The rest of this paper is organized as follows: the methodology of the proposed feature selection and fusion framework is described in Section 2; the datasets are briefly introduced in Section 3; the experimental results are presented in Section 4; finally, conclusions are drawn in Section 5.

2. Methodology

In this section, the novel feature selection and fusion framework of the E-nose is described, which contains three parts: first, the separability and dissimilarity of the different features are calculated to determine the order of feature selection; second, a classifier is used to determine the optimal dimension of the feature subsets; finally, the selected features are fused by weighted voting according to a classification dominance strategy to obtain the gas classification results. The flow chart of this feature selection and fusion framework is shown in Figure 1.

2.1. Feature Selection

To make the information provided by an E-nose represent the characteristics of different odors more clearly and contribute to the classification, a new feature selection method is defined based on the separability and the dissimilarity. First, we introduce the principle of the class separability criterion in Section 2.1.1. Then, to eliminate redundant features, we define a dissimilarity criterion in Section 2.1.2. Finally, the feature selection algorithm is given in Section 2.1.3.

2.1.1. Separability Index

A good classification rate can be achieved if a feature produces a distinct scent fingerprint in the feature space for different gases. That is to say, if the scent fingerprints contain well-separable information, the pattern recognition algorithm can easily identify them. On the contrary, the classification performance will not improve if all the features carry such poor information that the odor classes cannot be correctly identified.
Suppose $K$ is the number of samples for each class of gases, $M$ is the number of dimensions of the original feature matrix, and $N$ is the number of classes of gases. $X_{mn}(i)$ denotes the feature of the $i$-th ($i = 1, 2, \ldots, K$) sample of the $m$-th ($m = 1, 2, \ldots, M$) dimension of the feature matrix (denoted as $f_m$) for the $n$-th ($n = 1, 2, \ldots, N$) gas (denoted as $G_n$). Thus, the mean vector $\mu_{mn}$ for feature $f_m$ and gas $G_n$ is

$$\mu_{mn} = \frac{1}{K}\sum_{i=1}^{K} X_{mn}(i). \qquad (1)$$

The Euclidean distance between each sample of each dimension of the feature matrix for each class of gases and the mean vector can be written as

$$d_{mn}(i) = \left\| X_{mn}(i) - \mu_{mn} \right\|. \qquad (2)$$

The mean and variance of $d_{mn}(i)$ for feature $f_m$ and gas $G_n$ are defined as Equations (3) and (4):

$$\mu_{d_{mn}} = \frac{1}{K}\sum_{i=1}^{K} d_{mn}(i), \qquad (3)$$

$$\sigma_{d_{mn}}^{2} = \frac{1}{K-1}\sum_{i=1}^{K} \left( d_{mn}(i) - \mu_{d_{mn}} \right)^{2}. \qquad (4)$$

Then $\sigma_{m1}^{2}$, defined in Equation (5) as the average of $\sigma_{d_{mn}}^{2}$ over the $N$ gases, is a measure of the within-class scatter for feature $f_m$:

$$\sigma_{m1}^{2} = \frac{1}{N}\sum_{n=1}^{N} \sigma_{d_{mn}}^{2}. \qquad (5)$$

Define the sample mean vector over the $N$ classes of gases as follows:

$$\mu_{m} = \frac{1}{N}\sum_{n=1}^{N} \mu_{mn}. \qquad (6)$$

The Euclidean distance from $\mu_{mn}$ to the overall mean vector $\mu_{m}$ is

$$d_{mn} = \left\| \mu_{mn} - \mu_{m} \right\|. \qquad (7)$$

The mean and variance of $d_{mn}$ are

$$\mu_{d_{m}} = \frac{1}{N}\sum_{n=1}^{N} d_{mn}, \qquad (8)$$

$$\sigma_{m2}^{2} = \frac{1}{N-1}\sum_{n=1}^{N} \left( d_{mn} - \mu_{d_{m}} \right)^{2}. \qquad (9)$$

Here, $\sigma_{m2}^{2}$ is a measure of the between-class scatter for feature $f_m$.

Finally, we define the ratio $\sigma_{m2}^{2} / \sigma_{m1}^{2}$ as the class separability index (SI):

$$SI(f_m) = \sigma_{m2}^{2} / \sigma_{m1}^{2}. \qquad (10)$$

Hence, $SI(f_m)$ describes the class-separability capability of the feature to be selected. A larger $SI(f_m)$ means that the feature contains more separability information.
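As an illustration only (not the authors' released code), the separability index of Equation (10) can be computed with NumPy as sketched below; the feature matrix X (samples x feature dimensions) and the label vector y are hypothetical names.

```python
import numpy as np

def separability_index(X, y):
    """SI of Eq. (10) for every feature dimension; a larger value means more separable."""
    classes = np.unique(y)
    M = X.shape[1]
    si = np.zeros(M)
    for m in range(M):
        mu_mn = np.array([X[y == c, m].mean() for c in classes])    # Eq. (1)
        # within-class scatter, Eqs. (2)-(5): variance of the sample-to-class-mean distances
        within = [np.abs(X[y == c, m] - mu_mn[k]).var(ddof=1)
                  for k, c in enumerate(classes)]
        sigma_m1_sq = np.mean(within)                                # Eq. (5)
        # between-class scatter, Eqs. (6)-(9): variance of the class-mean-to-overall-mean distances
        d_between = np.abs(mu_mn - mu_mn.mean())                     # Eqs. (6)-(7)
        sigma_m2_sq = d_between.var(ddof=1)                          # Eqs. (8)-(9)
        si[m] = sigma_m2_sq / sigma_m1_sq                            # Eq. (10)
    return si
```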

2.1.2. Dissimilarity Index

We can acquire little additional information if different features are similar for all the gases. Therefore, for the feature subset to be optimal, each selected feature must not only have good separability but also contribute diverse and non-redundant information, so we define a dissimilarity index (DI) for all the features as follows:

$$DI(f_i, f_j) = 1 - \left| \rho(f_i, f_j) \right|, \quad i, j = 1, 2, \ldots, M, \qquad (11)$$

where $\rho(f_i, f_j)$ is the correlation coefficient between features $f_i$ and $f_j$. A larger $DI(f_i, f_j)$ means that less information is shared between the two features, i.e., the selected features carry more additional but less redundant information for classification.
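A minimal NumPy sketch of Equation (11) (illustrative only; X is again a samples x dimensions matrix):

```python
import numpy as np

def dissimilarity_matrix(X):
    """DI of Eq. (11) for every pair of feature dimensions (the matrix visualized in Figure 3)."""
    rho = np.corrcoef(X, rowvar=False)   # M x M Pearson correlation coefficients
    return 1.0 - np.abs(rho)             # larger value = less shared information
```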

2.1.3. Feature Selection Algorithm

The purpose of the class separability and dissimilarity criteria is to choose the optimal feature subsets for classification. The larger both the separability and the dissimilarity are, the more the selected features enhance the classification capability. The steps are described in detail in Algorithm 1 below.
Algorithm 1. Feature Selection
Input:
  Original feature matrix $X_M$ with $M$-dimensional features.
Output:
  Selected feature subset $S$ with $D$-dimensional features ($D = 1, 2, \ldots, M$).
Procedure:
  1: $D = 1$. Compute $SI(f_i)$ ($i = 1, 2, \ldots, M$) for each dimension of the original feature matrix and record $score1 = SI(f_i)$. Choose the feature with the largest $score1$ as the first element of the optimal feature subset $S$. The remaining feature elements form $X_{M-D}$.
  2: do
  Step 1: $D = D + 1$. Choose a feature element from $X_{M-D}$ in turn and combine it with $S$ into a new candidate subset $X_T$; all subsets $X_T$ make up a new feature matrix $X'$. Compute the class separability index of each candidate subset in $X'$ as $\overline{SI} = \frac{2}{D}\sum_{i=1}^{D} SI(f_i)$.
  Step 2: For each candidate subset formed in Step 1, there are $t = \binom{D}{2}$ pairwise combinations of its elements; compute the average dissimilarity $\overline{DI}$ over these pairs.
  Step 3: For each candidate subset, compute $score2 = \overline{SI} + \overline{DI}$, which reflects how appropriate the feature subset is.
  Step 4: Put the feature element with the largest $score2$ into $S$ and update the remaining feature elements $X_{M-D}$.
  Step 5: Input the selected feature subset $S$ with $D$-dimensional features into the classifier to obtain the classification accuracy $accuracy(D)$ of the $D$-dimensional subset.
  End: repeat Steps 1 to 5 until the number of selected elements $D$ reaches $M$.
  3: Choose the best classification accuracy among $accuracy(D)$ ($D = 1, 2, \ldots, M$) as the final accuracy of this kind of feature after feature selection. If $accuracy(i) = accuracy(j)$ with $i < j$ ($i, j = 1, 2, \ldots, M$), the smaller dimension $i$ is taken as the optimal feature dimension.
Return: $S = \{s_1, s_2, \ldots, s_M\}$.
Note: A larger $score2$ means the feature is more beneficial to the classification performance.
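The following is a condensed sketch of Algorithm 1, not the published implementation. It assumes the separability_index and dissimilarity_matrix helpers sketched above, a 1-NN classifier with correlation distance as used in Section 4, the 2/D weighting of the mean SI as printed in Step 1, and a held-out set for the accuracy sweep of Step 5.

```python
import numpy as np
from itertools import combinations
from sklearn.neighbors import KNeighborsClassifier

def select_features(X_train, y_train, X_test, y_test):
    """Greedy selection order (Steps 1-4) plus the accuracy sweep of Step 5."""
    M = X_train.shape[1]
    si = separability_index(X_train, y_train)
    di = dissimilarity_matrix(X_train)
    order = [int(np.argmax(si))]                        # item 1: the largest single SI starts the subset
    while len(order) < M:                               # Steps 1-4: grow the subset greedily
        remaining = [f for f in range(M) if f not in order]
        scores = []
        for f in remaining:
            cand = order + [f]
            D = len(cand)
            si_bar = (2.0 / D) * si[cand].sum()         # mean SI of the candidate subset
            di_bar = np.mean([di[i, j] for i, j in combinations(cand, 2)])
            scores.append(si_bar + di_bar)              # score2 = SI_bar + DI_bar
        order.append(remaining[int(np.argmax(scores))])
    # Step 5: 1-NN with correlation distance on every prefix of the selection order
    # (the correlation distance needs at least two dimensions, so D starts at 2 here)
    accuracy = {}
    for D in range(2, M + 1):
        cols = order[:D]
        knn = KNeighborsClassifier(n_neighbors=1, metric='correlation', algorithm='brute')
        knn.fit(X_train[:, cols], y_train)
        accuracy[D] = knn.score(X_test[:, cols], y_test)
    best_D = max(accuracy, key=accuracy.get)            # ties resolve to the smaller D
    return order[:best_D], accuracy[best_D]
```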

2.2. Feature Fusion Framework

Suppose that there are $L$ kinds of features and $N$ types of samples. Each kind of feature makes decisions according to its prediction accuracy on the test data. First, each kind of feature is used as the input of the classifier, and $L$ classification accuracy rates are obtained and denoted as $a = [a_1, a_2, \ldots, a_L]$. The importance weight of each kind of feature, $w = [w_1, w_2, \ldots, w_L]$, is calculated by Equation (15):

$$w_i = \frac{a_i}{\sum_{i=1}^{L} a_i}, \qquad (15)$$

where $w_i$ ($i = 1, 2, \ldots, L$) denotes the importance weight of the $i$-th kind of feature.

For each sample, the prediction outputs of the classifier for the $L$ kinds of features can be written as

$$\delta = [\delta_1, \delta_2, \ldots, \delta_i, \ldots, \delta_L]^T, \qquad (16)$$

where $\delta_i \in \{1, 2, \ldots, N\}$ can be transformed into a binary encoding. If the prediction result of the $i$-th kind of feature is $\delta_i = 1$, it is encoded as $\delta_i^{binary} = [\underbrace{1\ 0\ \cdots\ 0}_{N\ \text{elements}}]^T$. Similarly, if $\delta_i = 2$, it is encoded as $\delta_i^{binary} = [\underbrace{0\ 1\ \cdots\ 0}_{N\ \text{elements}}]^T$. By analogy, if $\delta_i = c$, its binary encoding $\delta_i^{binary}$ is a vector with $N$ elements whose $c$-th element equals 1 and whose other elements are 0. Thus, we can obtain

$$\delta^{binary} = [\delta_1^{binary}, \delta_2^{binary}, \ldots, \delta_i^{binary}, \ldots, \delta_L^{binary}]^T. \qquad (17)$$

The weighted feature fusion framework based on the classification dominance fusion strategy leverages the classification rates of the base features and makes a final decision according to Equations (15) and (16):

$$fusion = [fusion_1, fusion_2, \ldots, fusion_j, \ldots, fusion_N] = w \cdot \delta^{binary}, \qquad (18)$$

where $fusion_j$ ($j = 1, 2, \ldots, N$) is the fusion score of the $j$-th class. Each class has its own fusion score, and the class label of a sample is predicted by the maximum fusion score:

$$predict\_label = \arg\max_{j} \left( fusion_1, fusion_2, \ldots, fusion_j, \ldots, fusion_N \right). \qquad (19)$$
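A minimal NumPy sketch of Equations (15)-(19) (illustrative variable names, not the authors' code): accuracies holds the L per-feature-type classification rates, and predictions is the L x n_samples matrix of per-feature-type class labels in 1..N.

```python
import numpy as np

def fuse_predictions(accuracies, predictions, n_classes):
    """Classification dominance fusion: weighted one-hot votes, then argmax per sample."""
    a = np.asarray(accuracies, dtype=float)
    w = a / a.sum()                                       # Eq. (15): importance weights
    preds = np.asarray(predictions)                       # shape (L, n_samples), Eq. (16)
    L, n_samples = preds.shape
    fused = np.empty(n_samples, dtype=int)
    for s in range(n_samples):
        delta_binary = np.zeros((L, n_classes))           # Eq. (17): one-hot row per feature type
        delta_binary[np.arange(L), preds[:, s] - 1] = 1.0
        scores = w @ delta_binary                         # Eq. (18): fusion score for each class
        fused[s] = int(np.argmax(scores)) + 1             # Eq. (19): predicted class label
    return fused
```

For example, with accuracies [0.9, 0.6] the weights are [0.6, 0.4]; if the two feature types predict classes 1 and 2 for a sample, the fused scores are [0.6, 0.4] and class 1 is returned.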
All the computations involved in this paper are implemented in the E-nose software system and Matlab R2015b (Mathworks, Natick, MA, USA). The K-nearest neighbor (KNN) algorithm is used as the classifier; it classifies samples based on the closest training examples in the feature space. KNN has several proven advantages. One is that it is effective in reducing misclassification when the number of samples in the training dataset is large. Meanwhile, KNN can easily handle multiclass recognition problems, especially when the number of classes is three or more. More importantly, KNN is superior to many other supervised learning methods, such as the support vector machine (SVM) and neural network (NN), whose parameter optimization is time-consuming, whereas KNN demands only a few parameters to tune for achieving excellent classification accuracy: the value of k and the distance metric [28,29,30].

3. Description of Experimental Data

In this paper, two different datasets of gas sensor arrays are utilized, and here is a brief description of the materials and gas sensor array to make the paper self-contained.

3.1. Dataset I

A sensor array containing fourteen metal oxide sensors (TGS800, TGS813, TGS816, TGS822, TGS825, TGS826, TGS2600, TGS2602, TGS2620, WSP2111, MQ135, MQ138, QS-01, and SP3S-AQ2) and one electrochemical general air quality sensor (AQ) produced by Dart Sensors Ltd. (Exeter, UK) was applied to detect four types of rat wounds (uninfected, and infected by Staphylococcus aureus, Pseudomonas aeruginosa, and Escherichia coli, respectively). The details of the samples and experiments are presented in a previous publication [31]. The specific distribution of the data is shown in Table 1.
Seven kinds of features were extracted from the original response curves and their transform domains: maximum value (MV), the DC component and first order harmonic component of the coefficients of fast Fourier transformation (FFT), and the approximation coefficients of discrete wavelet transformation (DWT) based on wavelets Db1, Db2, Db3, Db4, and Db5, respectively. The structures of the feature matrix are shown in Table 2.
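As a hedged illustration of these seven feature types (assuming NumPy and the PyWavelets package; the exact FFT coefficient form and DWT decomposition depth used in the original work are not restated here, so coefficient magnitudes and a single-level transform are shown only as placeholders), one sensor's response curve could be processed as follows.

```python
import numpy as np
import pywt  # PyWavelets, assumed available for the discrete wavelet transform

def extract_features(curve):
    """MV, FFT (DC + first harmonic) and Db1-Db5 approximation coefficients of one response curve."""
    feats = {'MV': np.array([np.max(curve)])}     # maximum value of the response
    spectrum = np.fft.rfft(curve)
    feats['FFT'] = np.abs(spectrum[:2])           # DC component and first-order harmonic (magnitudes)
    for name in ('db1', 'db2', 'db3', 'db4', 'db5'):
        cA, _ = pywt.dwt(curve, name)             # single-level approximation coefficients
        feats[name.capitalize()] = cA
    return feats
```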

3.2. Dataset II

As the second dataset, we used a large gas sensor array dataset with a long-term drift effect over 36 months, publicly released in the UCI Machine Learning Repository [32]. The data contain 13,910 measurements from an E-nose system with 16 gas sensors (TGS2600, TGS2602, TGS2610, and TGS2620, four of each) from Figaro Engineering Inc. (Tianjin, China). All sensors are used to detect six kinds of pure gaseous substances at distinct concentration levels, including ethanol, ethylene, ammonia, acetaldehyde, acetone, and toluene. The concentration ranges of the six kinds of gases are shown in Table 3. Eight kinds of features extracted from the original response data of each sensor make up a 128-dimensional feature vector (16 sensors × 8 features). In total, 10 batches of sensor features were collected at different time intervals. The details of the sensor batches are presented in Table 4.
During the gas injection phase, the resistance of the sensor will increase with the growth trend gradually slowing down, and the response will gradually decrease with the declining trend gradually slowing down during the cleaning phase. Therefore, we can use the maximum/minimum value of the exponential moving average (EMA) to reflect the growth/declining trend of the sensor signals [32], and the EMA is defined as:
$$y[k] = (1 - \alpha)\, y[k-1] + \alpha \left( R[k] - R[k-1] \right), \qquad (20)$$

where $\alpha$ is a scalar smoothing parameter between 0 and 1, while $y[k]$ and $R[k]$ are the EMA and the sensor response at time $k$, respectively.
Three different values of $\alpha$ ($\alpha = 0.1$, $\alpha = 0.01$, $\alpha = 0.001$) were used in the formula to obtain three maximum/minimum values of the EMA for the increasing and decreasing stages, defined as EMAi1, EMAi2, EMAi3 and EMAd1, EMAd2, EMAd3, respectively. In addition, two further features are used: a steady-state feature (DR), defined as the difference between the maximal resistance change and the baseline, and its normalized version (NDR), expressed as the ratio of the maximal resistance to the baseline value. In order to show that the feature selection and fusion framework can enhance the discrimination ability of a gas sensor array with drift effect, batches 1 to 9 are used as the training set, while batch 10 is used as the test set.
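A short sketch of the EMA transform and the eight per-sensor features described above (illustrative only: R is one sensor's sampled response, the first sample is taken as the baseline, and DR/NDR follow the verbal definitions given in the text).

```python
import numpy as np

def drift_features(R, alphas=(0.1, 0.01, 0.001)):
    """EMAi/EMAd features from the EMA recursion of Eq. (20), plus the steady-state DR and NDR."""
    R = np.asarray(R, dtype=float)
    feats = {}
    for alpha in alphas:
        y = np.zeros_like(R)
        for k in range(1, len(R)):
            y[k] = (1 - alpha) * y[k - 1] + alpha * (R[k] - R[k - 1])   # Eq. (20)
        feats[f'EMAi({alpha})'] = y.max()    # maximum EMA, rising (gas injection) stage
        feats[f'EMAd({alpha})'] = y.min()    # minimum EMA, decaying (cleaning) stage
    baseline = R[0]                          # assumption: the first sample is the baseline
    feats['DR'] = R.max() - baseline         # maximal resistance change relative to the baseline
    feats['NDR'] = R.max() / baseline        # ratio of the maximal resistance to the baseline
    return feats
```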

4. Results and Discussion

In this work, the values of k to be tested are {1, 3, 5, 7, 9}, and the distance metrics are the Euclidean distance (EU), cityblock distance (CB), cosine distance (COS), and correlation distance (COR) [28,33].
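An illustrative scikit-learn sweep over these settings (the original experiments were run in Matlab; the metric strings map to EU, CB, COS, and COR):

```python
from sklearn.neighbors import KNeighborsClassifier

def sweep_knn(X_train, y_train, X_test, y_test):
    """Accuracy for every (distance metric, k) pair examined in Tables 5 and 6."""
    metrics = {'EU': 'euclidean', 'CB': 'cityblock', 'COS': 'cosine', 'COR': 'correlation'}
    results = {}
    for label, metric in metrics.items():
        for k in (1, 3, 5, 7, 9):
            knn = KNeighborsClassifier(n_neighbors=k, metric=metric, algorithm='brute')
            knn.fit(X_train, y_train)
            results[(label, k)] = knn.score(X_test, y_test)
    return results
```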

4.1. The Optimal Value of k and the Distance Metrics

First of all, in order to determine the optimal value of k and the distance metric for the different features of the two datasets, we compare different values of k and distance metrics in the KNN classifier without feature selection, with each kind of feature tested individually. Table 5 and Table 6 list the classification accuracies for different values of k and distance metrics using the KNN classifier for Dataset I and Dataset II, respectively. In Table 5, it is observed that the classification rate generally decreases as the value of k increases. Compared with the MV feature, the recognition results of the features from the transform domains are greatly improved. As for the distance metric, COS and COR generally perform better than EU and CB. With COS and COR, the performances of Db1, Db2, Db3, and Db4 reach above 90%. For COR in particular, the best performance, obtained by Db1, is better than that with COS, reaching 93.75% when k = 1, whereas COS achieves 90.00%. For Dataset II, shown in Table 6, the variation of classification accuracy with different values of k and distance metrics is inconspicuous, and the performances of all features are poor. This indicates that the sensor drift effect seriously affects the stability of the sensor outputs and finally deteriorates the performance of the classifier. In general, EU and CB measure the absolute distance between points in the high-dimensional space, which is directly related to the position coordinates of each point, i.e., the values of the elements of each feature. EU and CB reflect the absolute difference of the individual numerical values of features, so they are usually used to analyze differences in the numerical values of different dimensions. COS and COR measure the angle between vectors and emphasize the difference in the directions of the vectors rather than their positions. For the classification of gas sensor array signals, the patterns of distinct odors are mainly reflected in the relative directions among different features rather than in their absolute values. Therefore, COS and COR are superior to EU and CB. To exhibit the effect of the feature selection and fusion framework, COS and COR, with k = 1, are used in the following experiments.

4.2. Separability Index and Dissimilarity Matrix

Among all the features of an E-nose, not all are sensitive to the target gases. In order to remove the features that are not helpful for classification, the proposed feature selection method is applied in this section. The MV of Dataset I and the DR of Dataset II are taken as examples to illustrate the process of feature selection. Figure 2 shows the class separability indices of the 15 MV features and the 16 DR features. It is obvious that sensor 4 for MV and sensor 13 for DR have the highest separability, while sensor 8 for MV and sensor 9 for DR have the lowest. Hence, if the selection were based only on separability, the feature of sensor 4 for MV and the feature of sensor 13 for DR would be chosen first for the final subset.
The dissimilarity values for all pairwise combinations of the MV features of Dataset I and the DR features of Dataset II, as computed by Equation (11), are shown in Figure 3. Colors biased towards red/blue tones denote higher/lower values of the dissimilarity. Note that the distributions of values in Figure 3 are symmetric with respect to swapping the two features. We can see clearly that the pair of MV features with the highest dissimilarity is (6, 8), whereas for the DR features the most dissimilar pair is (4, 10). Hence, if two subfeatures were chosen based only on dissimilarity, the feature subset {6, 8} of MV and the feature subset {4, 10} of DR would be chosen on the basis of the color map.

4.3. Optimal Numbers of Different Kinds of Features after Selection

To obtain the optimal numbers of different features after feature selection, the classification accuracies achieved by the KNN classifier with the selected feature subsets for Dataset I and Dataset II are computed. Figure 4 shows the best classification accuracies of the seven features of Dataset I, while Figure 5 shows the best classification accuracies of the eight features of Dataset II, when COS is used as the distance metric and k = 1. It can be clearly observed that the recognition rate rises rapidly when the number of selected feature dimensions put into the KNN classifier is small. Generally, the recognition rate increases with the increasing number of selected features. However, for the vast majority of the features, using all the features does not yield the optimal recognition rate. This means that not all the features are beneficial for the identification. Therefore, the number of selected features corresponding to the best classification accuracy is regarded as the optimal number of the subset after feature selection.
We can obtain the optimal numbers of different kinds of features with COS and COR distance metrics when k = 1 for both Datasets, which are given in Table 7 and Table 8.

4.4. Comparison of Classification Accuracies with and without Feature Selection

Table 9 lists the classification results of Dataset I with/without feature selection, and Table 10 lists the classification results of Dataset II with/without feature selection. It can be concluded from Table 9 and Table 10 that the classification accuracy with the proposed feature selection is obviously improved compared with that without feature selection. For Dataset I, Db1 and Db2 achieve the best classification accuracy, 96.25%, when the distance metric is COR with feature selection. For Dataset II, the classification accuracies of DR and EMAi3 exceed 70% when the distance metric is COR, whereas for COS only DR exceeds 70%. Hence, COR is used as the optimal distance metric in the following feature fusion framework.

4.5. Results of Feature Fusion

In order to compare the performance of different identification methods, we applied two control methods: (1) the conventional feature fusion strategy, which simply and directly combines the original features; (2) the proposed feature fusion method without feature selection. Figure 6 and Figure 7 show the results of the three methods for the two datasets, respectively. For both datasets, the proposed feature selection and fusion method obtains the best classification accuracy. In particular, for Dataset I, the conventional feature fusion method, which integrates all the features without feature selection, only reaches an accuracy of 77.5%, much lower than the other two methods. The conventional fusion method does not consider the importance of each kind of feature, although the features contribute differently to the identification. The proposed feature fusion method enhances the effects of "good" features and suppresses the effects of "bad" features, which improves the performance. The proposed feature selection and fusion method obtains the highest accuracy of 97.5% among the three methods. This demonstrates that redundant features deteriorate the performance of the gas sensor array, and that the proposed feature selection method can eliminate redundant and irrelevant features and finally enhance the performance of the gas sensor array. For Dataset II, we can see clearly in Figure 7 that the proposed feature fusion method with selected features obviously improves the performance, achieving an accuracy of 80.11%. The classification of each kind of gas is significantly improved compared with the conventional method. This means that the proposed feature selection and fusion method can effectively compensate for the drift effect and enhance the discrimination ability of the gas sensor array.

5. Conclusions

In this paper, efforts are made to improve the discrimination ability of a gas sensor array using a new feature selection and feature fusion framework. The feature selection integrates the filter and the wrapper approaches, and the feature fusion emphasizes the classification dominance fusion strategy based on the classification rates of the base features. Compared with the original features, the selected features can better represent the E-nose signal characteristics. Based on the proposed framework, the E-nose performs markedly better than both using all the base features without selection and the conventional feature fusion method. The experimental results show that the classification rates of the gases are excellent, and are improved after feature selection compared with the results without feature selection. The feature selection method selects more relevant but less redundant feature elements, which are beneficial to gas identification. Furthermore, compared with the conventional direct fusion method, the fusion method proposed in this paper performs better on gas classification, since the conventional fusion method directly integrates "good" features as well as "bad" features into one blended feature matrix, without considering the different discrimination ability of each kind of feature. The blended feature matrix may contain redundancies between feature elements, which are not conducive to distinguishing different kinds of gases. Moreover, the discrimination ability of a base feature is the most intuitive indicator of its contribution to the final discrimination at the decision level. This indicates the superiority of the proposed feature selection and fusion framework in enhancing E-nose performance, and it also indicates that the proposed framework can be successfully used to overcome the long-term drift effect of a gas sensor array.

Author Contributions

Conceptualization, J.Y.; Methodology, C.D. and J.Y.; Software, C.D. and J.Y.; Validation, C.D., K.L. and J.Y.; Formal Analysis, C.Y., D.S. and B.Y.; Investigation, K.L. and J.Y.; Resources, K.L.; Data Curation, C.D. and J.Y.; Writing-Original Draft Preparation, C.D.; Writing-Review & Editing, D.S., B.Y., S.Y. and Z.H.; Supervision, J.Y.; Project Administration, J.Y.; Funding Acquisition, J.Y.

Funding

The work was supported by Fundamental Research Funds for the Central Universities (Grant No. XDJK2017C073), National Natural Science Foundation of China (Grant Nos. 61672436, 61571372), Undergraduate Students Science and Technology Innovation Fund Project of Southwest University (Grant No. 20171803005).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Gardner, J.W.; Bartlett, P.N. A brief history of electronic nose. Sens. Actuators B 1994, 18, 211–215.
  2. Natale, C.D.; Macagnano, A.; Martinelli, E.; Paolesse, R.; Proietti, E.; D'Amico, A. The evaluation of quality of post-harvest oranges and apples by means of an electronic nose. Sens. Actuators B Chem. 2001, 78, 26–31.
  3. Stuetz, R.M.; Fenner, R.A.; Engin, G. Characterisation of wastewater using an electronic nose. Water Res. 1999, 33, 442–452.
  4. Campagnoli, A.; Cheli, F.; Polidori, C.; Zaninelli, M.; Zecca, O.; Savoini, G.; Pinotti, L.; Dell'Orto, V. Use of the Electronic Nose as a Screening Tool for the Recognition of Durum Wheat Naturally Contaminated by Deoxynivalenol: A Preliminary Approach. Sensors 2011, 11, 4899–4916.
  5. Sahgal, N.; Magan, N. Fungal volatile fingerprints: Discrimination between dermatophyte species and strains by means of an electronic nose. Sens. Actuators B Chem. 2008, 131, 117–120.
  6. Gębicki, J.; Szulczyński, B. Discrimination of selected fungi species based on their odour profile using prototypes of electronic nose instruments. Measurement 2017, 116, 307–318.
  7. Wang, X.R.; Lizier, J.T.; Berna, A.Z.; Bravo, F.G.; Trowell, S.C. Human breath-print identification by E-nose, using information-theoretic feature selection prior to classification. Sens. Actuators B Chem. 2015, 217, 165–174.
  8. Kohavi, R.; John, G.H. Wrappers for feature subset selection. Artif. Intell. 1997, 97, 273–324.
  9. Muezzinoglu, M.K.; Vergara, A.; Huerta, R.; Rabinovich, M.I. A sensor conditioning principle for odor identification. Sens. Actuators B Chem. 2010, 146, 472–476.
  10. Johnson, K.J.; Rosepehrsson, S.L. Sensor Array Design for Complex Sensing Tasks. Annu. Rev. Anal. Chem. 2015, 8, 287.
  11. Nowotny, T.; Berna, A.Z.; Binions, R.; Trowell, S. Optimal feature selection for classifying a large set of chemicals using metal oxide sensors. Sens. Actuators B Chem. 2013, 187, 471–480.
  12. Yan, K.; Zhang, D. Feature selection and analysis on correlated gas sensor data with recursive feature elimination. Sens. Actuators B Chem. 2015, 212, 353–363.
  13. Shahid, A.; Choi, J.H.; Rana, A.U.H.S.; Kim, H.S. Least Squares Neural Network-Based Wireless E-Nose System Using an SnO2 Sensor Array. Sensors 2018, 18, 1446.
  14. Licen, S.; Barbieri, G.; Fabbris, A. Odor Control Map: Self Organizing Map built from electronic nose signals and integrated by different instrumental and sensorial data to obtain an assessment tool for real environmental scenarios. Sens. Actuators B Chem. 2018, 263, 476–485.
  15. Lippolis, V.; Cervellieri, S.; Damascelli, A. Rapid prediction of deoxynivalenol contamination in wheat bran by MOS-based electronic nose and characterization of the relevant pattern of volatile compounds. J. Sci. Food Agric. 2018.
  16. Peng, H.; Long, F.; Ding, C. Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans. Pattern Anal. Mach. Intell. 2005, 27, 1226–1238.
  17. Paulsson, N.; Larsson, E.; Winquist, F. Extraction and selection of parameters for evaluation of breath alcohol measurement with an electronic nose. Sens. Actuators A Phys. 2000, 84, 187–197.
  18. Li, J.; Gutierrez-Osuna, R.; Hodges, R.D.; Luckey, G.; Crowell, J. Using Field Asymmetric Ion Mobility Spectrometry for Odor Assessment of Automobile Interior Components. IEEE Sens. J. 2016, 16, 5747–5756.
  19. Li, Y.; Shi, X.; Tong, L.; Tu, J. Manifold Regularized Multi-View Feature Selection for Web Image Annotation. Adv. Multimed. Inf. Process. PCM 2014, 204, 103–112.
  20. Anwar, K.; Harjoko, A.; Suharto, S. Feature Selection Based on Minimum Overlap Probability (MOP) in Identifying Beef and Pork. Int. J. Adv. Comput. Sci. Appl. 2016, 7, 316–322.
  21. Osanaiye, O.; Cai, H.; Choo, K.K.R.; Dehghantanha, A.; Xu, Z. Ensemble-based multi-filter feature selection method for DDoS detection in cloud computing. Eurasip J. Wirel. Commun. Netw. 2016, 1, 130.
  22. Giungato, P.; Laiola, E.; Nicolardi, V. Evaluation of Industrial Roasting Degree of Coffee Beans by Using an Electronic Nose and a Stepwise Backward Selection of Predictors. Food Anal. Method 2017, 10, 3424–3433.
  23. Zhi, R.; Zhao, L.; Zhang, D. A Framework for the Multi-Level Fusion of Electronic Nose and Electronic Tongue for Tea Quality Assessment. Sensors 2017, 17, 1007.
  24. Hong, X.; Wang, J. Detection of adulteration in cherry tomato juices based on electronic nose and tongue: Comparison of different data fusion approaches. J. Food Eng. 2014, 126, 89–97.
  25. Buratti, S.; Malegori, C.; Benedetti, S. E-nose, e-tongue and e-eye for edible olive oil characterisation and shelf life assessment: A powerful data fusion approach. Talanta 2018, 182, 131–141.
  26. Rodriguez-Mendez, M.L.; Apetrei, C.; Gay, M.; Medina-Plaza, C.; De Saja, J.A.; Vidal, S.; Aagaard, O.; Ugliano, M.; Wirth, J.; Cheynier, V. Evaluation of Oxygen Exposure Levels and Plyphenolic Content of Red Wines Using an Electronic Panel Formed by an Electronic Nose and an Electronic Tongue. Food Chem. 2014, 155, 91–97.
  27. Dang, L.; Tian, F.; Zhang, L.; Kadri, C.; Yin, X.; Peng, X.; Liu, S. A novel classifier ensemble for recognition of multiple indoor air contaminants by an electronic nose. Sens. Actuators A Phys. 2014, 207, 67–74.
  28. Miao, J.; Zhang, T.; Wang, Y.; Li, G. Optimal Sensor Selection for Classifying a Set of Ginsengs Using Metal-Oxide Sensors. Sensors 2015, 15, 16027–16039.
  29. Kim, K.S.; Choi, H.H.; Moon, C.S.; Mun, C.W. Comparison of k-nearest neighbor, quadratic discriminant and linear discriminant analysis in classification of electromyogram signals based on the wrist-motion directions. Curr. Appl. Phys. 2011, 11, 740–745.
  30. Yesilbudak, M.; Sagiroglu, S.; Colak, I. A new approach to very short-term wind speed prediction using k-nearest neighbor classification. Energy Convers. Manag. 2013, 69, 77–86.
  31. Feng, J.; Tian, F.; Yan, J.; He, Q.; Shen, Y.; Pan, L. A background elimination method based on wavelet transform in wound infection detection by electronic nose. Sens. Actuators B Chem. 2011, 157, 395–400.
  32. Vergara, A.; Vembu, S.; Ayhan, T.; Ryan, M.A.; Homer, M.L.; Huerta, R. Chemical gas sensor drift compensation using classifier ensembles. Sens. Actuators B Chem. 2012, 166, 320–329.
  33. Saini, I.; Singh, D.; Khosla, A. QRS detection using K-Nearest Neighbor algorithm (KNN) and evaluation on standard ECG databases. J. Adv. Res. 2013, 4, 331–344.
Figure 1. The flow chart of the feature selection and fusion framework.
Figure 2. Separability index of maximum value (MV) for Dataset I and difference of the maximal resistance change and the baseline (DR) for Dataset II.
Figure 3. Visualization of the dissimilarity matrix of MV and DR.
Figure 4. The classification accuracies of seven features for Dataset I.
Figure 5. The classification accuracies of eight features for Dataset II.
Figure 6. The 3D plot of classification results of the three feature fusion methods for Dataset I. (a) The conventional feature fusion method without feature selection; (b) the proposed feature fusion method without feature selection; (c) the proposed feature fusion method with feature selection. (Note: 1, No infection; 2, S. aureus; 3, E. coli; 4, P. aeruginosa.)
Figure 7. The 3D plot of classification results of the three feature fusion methods for Dataset II. (a) The conventional feature fusion method without feature selection; (b) the proposed feature fusion method without feature selection; (c) the proposed feature fusion method with feature selection. (Note: 1, ethanol; 2, ethylene; 3, ammonia; 4, acetaldehyde; 5, acetone; 6, toluene.)
Table 1. Number of samples in Dataset I.

Group | Training Set | Test Set
No infection | 20 | 20
Pseudomonas aeruginosa | 20 | 20
Escherichia coli | 20 | 20
Staphylococcus aureus | 20 | 20
Total | 80 | 80
Table 2. Data structure of seven features.

Features | MV | FFT | Db1 | Db2 | Db3 | Db4 | Db5
Feature structure | 15 × 80 | 30 × 80 | 30 × 80 | 60 × 80 | 90 × 80 | 120 × 80 | 150 × 80
Note: MV, maximum value; FFT, the DC component and first order harmonic component of the coefficients of fast Fourier transformation; Db1, Db2, Db3, Db4, Db5, the approximation coefficients of discrete wavelet transformation based on wavelets Db1, Db2, Db3, Db4, and Db5, respectively.
Table 3. Concentration ranges of analytes in Dataset II.

Analytes | Ammonia | Acetaldehyde | Acetone | Ethylene | Ethanol | Toluene
Concentration Range (ppm) | 50–1000 | 5–300 | 10–300 | 10–300 | 10–600 | 10–100
Table 4. Experimental long-term sensor drift big data.

Batch ID | Month | Ethanol | Ethylene | Ammonia | Acetaldehyde | Acetone | Toluene
Batch 1 | 1, 2 | 83 | 30 | 70 | 98 | 90 | 74
Batch 2 | 3~10 | 100 | 109 | 532 | 334 | 164 | 5
Batch 3 | 11, 12, 13 | 216 | 240 | 275 | 490 | 365 | 0
Batch 4 | 14, 15 | 12 | 30 | 12 | 43 | 64 | 0
Batch 5 | 16 | 20 | 46 | 63 | 40 | 28 | 0
Batch 6 | 17~20 | 110 | 29 | 606 | 574 | 514 | 467
Batch 7 | 21 | 360 | 744 | 630 | 662 | 649 | 568
Batch 8 | 22, 23 | 40 | 33 | 143 | 30 | 30 | 18
Batch 9 | 24, 30 | 100 | 75 | 78 | 55 | 61 | 101
Batch 10 | 36 | 600 | 600 | 600 | 600 | 600 | 600
Note: the gas columns give the number of measurements of each analyte in the batch.
Table 5. Classification results of seven features based on different values of k and distance metrics for Dataset I (%).

Distance | k | MV | FFT | Db1 | Db2 | Db3 | Db4 | Db5
EU | 1 | 68.75 | 73.75 | 90.00 | 91.25 | 87.50 | 88.75 | 83.75
EU | 3 | 63.75 | 72.50 | 75.00 | 78.75 | 78.75 | 81.25 | 80.00
EU | 5 | 46.25 | 45.00 | 66.25 | 70.00 | 70.00 | 71.25 | 73.75
EU | 7 | 51.25 | 53.75 | 60.00 | 68.75 | 70.00 | 73.75 | 75.00
EU | 9 | 43.75 | 60.00 | 57.50 | 66.25 | 66.25 | 66.25 | 70.00
CB | 1 | 66.25 | 70.00 | 91.25 | 91.25 | 86.25 | 86.25 | 82.50
CB | 3 | 60.00 | 62.50 | 71.25 | 75.00 | 76.25 | 77.50 | 75.00
CB | 5 | 41.25 | 47.50 | 61.25 | 71.25 | 70.00 | 72.50 | 72.50
CB | 7 | 52.50 | 53.75 | 57.50 | 71.25 | 71.25 | 72.50 | 73.75
CB | 9 | 48.75 | 48.75 | 55.00 | 65.00 | 63.75 | 65.00 | 67.50
COS | 1 | 77.50 | 77.50 | 90.00 | 92.50 | 91.25 | 91.25 | 86.25
COS | 3 | 72.50 | 78.75 | 80.00 | 82.50 | 82.50 | 82.50 | 82.50
COS | 5 | 57.50 | 60.00 | 68.75 | 68.75 | 72.50 | 73.75 | 80.00
COS | 7 | 58.75 | 53.75 | 63.75 | 65.00 | 67.50 | 71.25 | 75.00
COS | 9 | 46.25 | 43.75 | 55.00 | 62.50 | 61.25 | 61.25 | 66.25
COR | 1 | 78.75 | 77.50 | 93.75 | 92.50 | 91.25 | 92.50 | 87.50
COR | 3 | 71.25 | 76.25 | 81.25 | 85.00 | 85.00 | 85.00 | 85.00
COR | 5 | 56.25 | 57.50 | 66.25 | 76.25 | 75.00 | 76.25 | 82.50
COR | 7 | 52.50 | 61.25 | 63.75 | 66.25 | 67.50 | 71.25 | 76.25
COR | 9 | 51.25 | 58.75 | 50.00 | 58.75 | 63.75 | 65.00 | 67.50
Note: EU, Euclidean distance; CB, cityblock distance; COS, cosine distance; COR, correlation distance. The bold numbers are the highest accuracies for each distance metric.
Table 6. Classification results of eight features based on different values of k and distance metrics for Dataset II (%).

Distance | k | DR | NDR | EMAi1 | EMAi2 | EMAi3 | EMAd1 | EMAd2 | EMAd3
EU | 1 | 53.53 | 60.00 | 36.31 | 53.61 | 59.06 | 36.31 | 43.56 | 48.78
EU | 3 | 54.25 | 58.89 | 37.28 | 54.03 | 58.42 | 36.61 | 43.47 | 49.39
EU | 5 | 54.06 | 59.42 | 38.47 | 54.28 | 58.00 | 37.11 | 43.47 | 49.00
EU | 7 | 53.47 | 59.53 | 38.61 | 54.08 | 57.97 | 37.39 | 43.25 | 48.42
EU | 9 | 53.50 | 59.50 | 38.11 | 53.64 | 57.61 | 37.03 | 43.00 | 48.31
CB | 1 | 57.33 | 61.97 | 38.22 | 54.03 | 60.25 | 36.64 | 44.36 | 50.50
CB | 3 | 60.58 | 62.19 | 37.50 | 55.78 | 58.89 | 36.75 | 44.42 | 51.36
CB | 5 | 60.00 | 61.28 | 38.69 | 55.58 | 59.36 | 36.81 | 44.31 | 51.28
CB | 7 | 60.03 | 61.78 | 38.11 | 54.53 | 59.61 | 37.03 | 44.36 | 51.03
CB | 9 | 59.97 | 62.17 | 37.86 | 53.69 | 60.28 | 36.89 | 44.31 | 51.50
COS | 1 | 49.42 | 60.42 | 37.39 | 52.22 | 57.25 | 34.50 | 43.39 | 48.58
COS | 3 | 51.33 | 59.81 | 37.97 | 50.47 | 55.28 | 35.00 | 43.56 | 48.69
COS | 5 | 51.31 | 59.22 | 38.28 | 50.42 | 55.58 | 35.64 | 43.19 | 48.50
COS | 7 | 51.53 | 59.31 | 38.08 | 50.28 | 55.42 | 36.17 | 42.94 | 48.42
COS | 9 | 52.00 | 59.00 | 37.78 | 50.06 | 55.42 | 36.19 | 42.58 | 48.19
COR | 1 | 49.56 | 59.72 | 37.89 | 51.03 | 56.44 | 35.72 | 40.86 | 46.69
COR | 3 | 49.94 | 59.97 | 37.75 | 49.50 | 54.58 | 35.61 | 41.06 | 47.31
COR | 5 | 50.25 | 59.53 | 37.86 | 49.94 | 55.00 | 35.08 | 40.56 | 47.78
COR | 7 | 50.36 | 59.14 | 37.36 | 49.11 | 54.69 | 35.67 | 40.47 | 47.50
COR | 9 | 50.22 | 59.28 | 36.97 | 48.81 | 55.69 | 36.28 | 40.14 | 47.14
Note: the bold numbers are the highest accuracies for each distance metric.
Table 7. Optimal numbers of different features after selection for Dataset I.

Distance metric \ Features | MV | FFT | Db1 | Db2 | Db3 | Db4 | Db5
COS | 15 | 18 | 21 | 10 | 10 | 36 | 74
COR | 14 | 26 | 25 | 23 | 18 | 49 | 109
Table 8. Optimal numbers of different features after selection for Dataset II.

Distance metric \ Features | DR | NDR | EMAi1 | EMAi2 | EMAi3 | EMAd1 | EMAd2 | EMAd3
COS | 13 | 16 | 7 | 13 | 8 | 7 | 11 | 12
COR | 13 | 9 | 7 | 10 | 8 | 16 | 11 | 12
Table 9. Classification accuracy of Dataset I with/without feature selection.

Features | MV | FFT | Db1 | Db2 | Db3 | Db4 | Db5
Without selection, COS
Dimension | 15 | 30 | 30 | 60 | 90 | 120 | 150
1 | 80.00 | 85.00 | 90.00 | 90.00 | 95.00 | 95.00 | 95.00
2 | 80.00 | 80.00 | 90.00 | 95.00 | 85.00 | 85.00 | 80.00
3 | 65.00 | 70.00 | 95.00 | 90.00 | 90.00 | 90.00 | 90.00
4 | 85.00 | 75.00 | 85.00 | 95.00 | 95.00 | 95.00 | 80.00
Average | 77.50 | 77.50 | 90.00 | 92.50 | 91.25 | 91.25 | 86.25
Without selection, COR
1 | 75.00 | 85.00 | 95.00 | 90.00 | 90.00 | 95.00 | 95.00
2 | 85.00 | 85.00 | 100.00 | 90.00 | 90.00 | 90.00 | 85.00
3 | 70.00 | 65.00 | 90.00 | 90.00 | 90.00 | 90.00 | 90.00
4 | 85.00 | 75.00 | 90.00 | 100.00 | 95.00 | 95.00 | 80.00
Average | 78.75 | 77.50 | 93.75 | 92.50 | 91.25 | 92.50 | 87.50
With selection, COS
Dimension | 15 | 18 | 21 | 10 | 10 | 36 | 74
1 | 80.00 | 85.00 | 95.00 | 95.00 | 95.00 | 95.00 | 95.00
2 | 80.00 | 80.00 | 90.00 | 90.00 | 95.00 | 90.00 | 85.00
3 | 65.00 | 75.00 | 95.00 | 95.00 | 90.00 | 95.00 | 85.00
4 | 85.00 | 80.00 | 95.00 | 95.00 | 90.00 | 90.00 | 90.00
Average | 77.50 | 80.00 | 93.75 | 93.75 | 92.50 | 92.50 | 88.75
With selection, COR
Dimension | 14 | 26 | 25 | 23 | 18 | 49 | 109
1 | 85.00 | 85.00 | 95.00 | 95.00 | 90.00 | 90.00 | 95.00
2 | 90.00 | 85.00 | 100.00 | 95.00 | 95.00 | 100.00 | 85.00
3 | 70.00 | 70.00 | 95.00 | 95.00 | 95.00 | 95.00 | 90.00
4 | 80.00 | 80.00 | 95.00 | 100.00 | 95.00 | 95.00 | 85.00
Average | 81.25 | 80.00 | 96.25 | 96.25 | 93.75 | 95.00 | 88.75
Note: 1, No-infection; 2, S. aureus; 3, E. coli; 4, P. aeruginosa.
Table 10. Classification accuracy of Dataset II with/without feature selection.

Features | DR | NDR | EMAi1 | EMAi2 | EMAi3 | EMAd1 | EMAd2 | EMAd3
Without selection, COS
Dimension | 16 | 16 | 16 | 16 | 16 | 16 | 16 | 16
1 | 61.00 | 26.00 | 73.17 | 98.67 | 65.00 | 61.17 | 69.67 | 85.00
2 | 85.17 | 98.67 | 67.17 | 83.33 | 84.50 | 54.00 | 62.83 | 67.67
3 | 90.33 | 91.83 | 10.00 | 37.17 | 89.33 | 27.33 | 58.50 | 79.50
4 | 6.67 | 17.33 | 40.67 | 59.50 | 67.67 | 1.33 | 5.83 | 15.00
5 | 46.17 | 58.17 | 30.83 | 23.83 | 27.50 | 23.33 | 55.50 | 40.67
6 | 7.17 | 70.50 | 2.50 | 10.83 | 9.50 | 39.83 | 8.00 | 3.67
Average | 49.42 | 60.42 | 37.39 | 52.22 | 57.25 | 34.50 | 43.39 | 48.58
Without selection, COR
1 | 53.00 | 25.50 | 73.33 | 98.83 | 65.50 | 60.83 | 70.67 | 83.00
2 | 85.17 | 98.83 | 67.67 | 83.83 | 83.83 | 54.33 | 63.17 | 68.67
3 | 90.17 | 90.17 | 6.33 | 29.50 | 86.33 | 27.33 | 45.67 | 78.67
4 | 5.17 | 14.83 | 43.67 | 58.17 | 67.67 | 0.83 | 4.50 | 11.50
5 | 51.83 | 59.83 | 32.17 | 22.67 | 24.00 | 20.67 | 51.50 | 34.67
6 | 12.00 | 69.17 | 4.17 | 13.17 | 11.33 | 50.33 | 9.67 | 3.67
Average | 49.56 | 59.72 | 37.89 | 51.03 | 56.44 | 35.72 | 40.86 | 46.69
With selection, COS
Dimension | 13 | 16 | 7 | 13 | 8 | 7 | 11 | 12
1 | 47.17 | 26.00 | 75.00 | 87.50 | 82.00 | 52.50 | 77.83 | 82.50
2 | 92.00 | 98.67 | 67.17 | 84.33 | 90.17 | 53.00 | 66.00 | 79.33
3 | 85.83 | 91.83 | 31.83 | 32.83 | 75.33 | 31.67 | 58.67 | 76.50
4 | 31.67 | 17.33 | 16.50 | 71.33 | 43.00 | 1.50 | 30.50 | 20.00
5 | 93.67 | 58.17 | 88.17 | 49.33 | 62.83 | 41.00 | 70.00 | 47.00
6 | 76.33 | 70.50 | 19.67 | 29.50 | 59.67 | 43.67 | 47.17 | 62.00
Average | 71.11 | 60.42 | 49.72 | 59.14 | 68.83 | 37.22 | 58.36 | 61.22
With selection, COR
Dimension | 13 | 9 | 7 | 10 | 8 | 16 | 11 | 12
1 | 41.00 | 51.00 | 73.33 | 92.17 | 85.83 | 60.83 | 74.00 | 79.50
2 | 92.67 | 99.17 | 72.33 | 87.50 | 91.17 | 54.33 | 64.00 | 79.33
3 | 86.67 | 98.67 | 24.17 | 60.50 | 59.33 | 27.33 | 44.83 | 76.67
4 | 31.33 | 0.00 | 9.50 | 5.33 | 43.67 | 0.83 | 16.00 | 9.33
5 | 93.17 | 30.33 | 81.83 | 43.33 | 65.33 | 20.67 | 69.00 | 42.17
6 | 81.33 | 95.67 | 37.17 | 52.83 | 91.00 | 50.33 | 47.17 | 60.33
Average | 71.03 | 62.47 | 49.72 | 56.94 | 72.72 | 35.72 | 52.50 | 57.89
Note: 1, ethanol; 2, ethylene; 3, ammonia; 4, acetaldehyde; 5, acetone; 6, toluene.
