Abstract
Gene expression profiling plays an important role in many areas of biological science, particularly in dealing with cancer. Data generated in microarray experiments often contain many missing expression values, which results in a loss of valuable information from the dataset. The proposed method first partitions the genes without missing values using a clustering algorithm, then measures the similarity between each gene with missing values and the centroid of each cluster, and finally estimates the missing values from the corresponding expression values of the centroid that gives the maximum similarity factor. The method depends explicitly on expression values to impute missing values, and it completes the input dataset with low error for subsequent data analysis and knowledge discovery. The method is compared with prominent approaches, such as zero-impute, row-average-impute, and KNN-impute, in terms of normalized root mean square error (NRMSE) to support its claim of novelty.
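The three steps described in the abstract can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, the similarity measure, or the exact NRMSE normalization, so everything specific here is an assumption: plain k-means stands in for the clustering step, similarity is taken as the negative Euclidean distance over the observed columns, and NRMSE is normalized by the standard deviation of the true values. The function names `kmeans`, `impute_by_centroid`, and `nrmse` are hypothetical and not from the paper.

```python
import numpy as np


def kmeans(rows, k, iters=50):
    """Plain Lloyd's k-means (assumed stand-in for the paper's clustering step)."""
    centroids = rows[:k].copy()  # deterministic init: first k rows (assumption)
    for _ in range(iters):
        # Squared Euclidean distance from every row to every centroid.
        dists = ((rows[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = rows[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids


def impute_by_centroid(data, k=2):
    """Fill NaNs in each incomplete gene from its most similar cluster centroid."""
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data)
    incomplete = missing.any(axis=1)

    # Step 1: cluster only the genes (rows) that have no missing values.
    centroids = kmeans(data[~incomplete], k)

    filled = data.copy()
    for i in np.flatnonzero(incomplete):
        observed = ~missing[i]
        # Step 2: similarity to each centroid, using observed columns only.
        # Negative Euclidean distance is an assumed similarity measure.
        diffs = centroids[:, observed] - data[i, observed]
        similarity = -np.sqrt((diffs ** 2).sum(axis=1))
        best = similarity.argmax()
        # Step 3: copy the best centroid's values into the missing positions.
        filled[i, missing[i]] = centroids[best, missing[i]]
    return filled


def nrmse(true_values, estimates, mask):
    """RMS error over the masked entries divided by the std of the true entries.

    This is one common NRMSE variant; the paper's normalization may differ.
    """
    diff = estimates[mask] - true_values[mask]
    return np.sqrt(np.mean(diff ** 2)) / np.std(true_values[mask])
```

To evaluate such a scheme, one would typically hide known entries of a complete matrix, run `impute_by_centroid` on the masked copy, and pass the original matrix, the filled matrix, and the boolean mask of hidden entries to `nrmse`. The sketch assumes there are at least `k` complete genes and that every incomplete gene has at least one observed value.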
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pati, S.K., Das, A.K. (2012). Missing Value Estimation of Microarray Data Using Similarity Measurement. In: Panigrahi, B.K., Das, S., Suganthan, P.N., Nanda, P.K. (eds) Swarm, Evolutionary, and Memetic Computing. SEMCCO 2012. Lecture Notes in Computer Science, vol 7677. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35380-2_70
DOI: https://doi.org/10.1007/978-3-642-35380-2_70
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35379-6
Online ISBN: 978-3-642-35380-2
eBook Packages: Computer Science, Computer Science (R0)