Projection Based Clustering of Gene Expression Data

Abstract

DNA microarray technologies have given researchers the ability to examine, discover, and monitor thousands of genes in a single experiment. Nonetheless, the sheer volume of data obtained from microarray studies presents a challenge for data analysis, mainly because of its very high dimensionality. A particular class of clustering algorithms has been very successful in dealing with such data by utilising information derived from Principal Component Analysis. In this paper, we investigate the application of recently proposed projection based hierarchical clustering algorithms to gene expression microarray data. Apart from identifying the clusters present in a data set, these algorithms also calculate their number and thus require no special knowledge about the data.
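
As a rough illustration of the general idea only, and not the algorithms studied in the paper, a projection based divisive scheme can be sketched as follows: project the samples onto their first principal direction, split them by the sign of the projection, and repeat on the cluster with the largest scatter. The function names, the scatter-based choice of which cluster to split, and the fixed `n_clusters` argument are assumptions made for this sketch; the abstract states that the investigated algorithms estimate the number of clusters themselves, which this simplification does not do.

```python
import numpy as np


def principal_direction_split(X):
    """Bisect the rows of X by the sign of their projection onto the
    first principal direction of the mean-centered data."""
    centered = X - X.mean(axis=0)
    # The first right singular vector of the centered data matrix is the
    # first principal direction (the leading PCA component).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    projection = centered @ vt[0]
    return projection <= 0.0  # True marks samples kept in the parent cluster


def divisive_clustering(X, n_clusters):
    """Repeatedly bisect the cluster with the largest total scatter.

    n_clusters is a hypothetical stopping rule used only in this sketch.
    The sketch also assumes each chosen cluster contains distinct points.
    """
    labels = np.zeros(X.shape[0], dtype=int)
    for new_label in range(1, n_clusters):
        # Total squared deviation from the centroid, per existing cluster.
        scatter = [
            ((X[labels == k] - X[labels == k].mean(axis=0)) ** 2).sum()
            for k in range(new_label)
        ]
        target = int(np.argmax(scatter))
        idx = np.flatnonzero(labels == target)
        keep = principal_direction_split(X[idx])
        labels[idx[~keep]] = new_label  # move the other half to a new cluster
    return labels
```

Rows of `X` are taken to be samples and columns to be genes, so each split costs one singular value decomposition of the current cluster and no pairwise distance matrix is formed in the full gene space.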

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tasoulis, S.K., Plagianakos, V.P., Tasoulis, D.K. (2010). Projection Based Clustering of Gene Expression Data. In: Masulli, F., Peterson, L.E., Tagliaferri, R. (eds) Computational Intelligence Methods for Bioinformatics and Biostatistics. CIBB 2009. Lecture Notes in Computer Science, vol 6160. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14571-1_17

  • DOI: https://doi.org/10.1007/978-3-642-14571-1_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-14570-4

  • Online ISBN: 978-3-642-14571-1

  • eBook Packages: Computer Science, Computer Science (R0)
