Abstract
Mining “meaningful” clues from vast amounts of expression profiling data remains a challenge for biologists. After the statistical tests are done, biologists often struggle to decide what to do next with a large list of genes that has no obvious mechanistic theme, partly because most statistical analyses do not incorporate prior understanding of biological systems. Here, we develop a novel method, “gene-pair difference within a sample”, to identify phenotype-defining gene signatures, based on the hypothesis that a biological state is governed by the relative differences among biological processes. For gene expression, this is the relative difference among the genes within a sample (an individual, a cell, etc.); the frequency with which a gene occurs among the pairs contributing to the within-sample difference indicates that gene's contribution to defining the biological state. We tested the method on three datasets and identified the most important gene pairs driving the phenotypic differences.
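The idea of within-sample gene-pair differences can be illustrated with a minimal sketch. The abstract does not state how pairs are scored or selected, so everything below is an assumption made for illustration: the function names `gene_pair_differences` and `rank_genes_by_pair_frequency` are hypothetical, pairs are scored by the absolute difference between the two phenotype groups' mean within-sample difference, and the clustering step named in the paper title is not reproduced.

```python
from collections import Counter
from itertools import combinations

import numpy as np


def gene_pair_differences(expr):
    """Within-sample differences for every gene pair.

    expr: array of shape (n_genes, n_samples).
    Returns (pairs, diffs), where pairs[k] = (i, j) with i < j and
    diffs[k, s] = expr[i, s] - expr[j, s].
    """
    n_genes = expr.shape[0]
    pairs = list(combinations(range(n_genes), 2))
    i_idx = np.array([p[0] for p in pairs])
    j_idx = np.array([p[1] for p in pairs])
    diffs = expr[i_idx, :] - expr[j_idx, :]
    return pairs, diffs


def rank_genes_by_pair_frequency(expr, labels, top_k):
    """Select the top_k gene pairs and count how often each gene appears.

    labels: boolean-like sequence, True for one phenotype group.
    Scoring rule (an assumption, not taken from the paper): absolute
    difference between the group means of the within-sample differences.
    """
    labels = np.asarray(labels, dtype=bool)
    pairs, diffs = gene_pair_differences(expr)
    score = np.abs(diffs[:, labels].mean(axis=1) - diffs[:, ~labels].mean(axis=1))
    top = np.argsort(score)[::-1][:top_k]
    top_pairs = [pairs[k] for k in top]
    counts = Counter()
    for i, j in top_pairs:
        counts[i] += 1
        counts[j] += 1
    return top_pairs, counts


if __name__ == "__main__":
    # Toy data: gene 0 shifts between the two groups, genes 1-3 stay constant.
    expr = np.array([
        [5.0, 5.0, 5.0, 1.0, 1.0, 1.0],
        [2.0, 2.0, 2.0, 2.0, 2.0, 2.0],
        [3.0, 3.0, 3.0, 3.0, 3.0, 3.0],
        [4.0, 4.0, 4.0, 4.0, 4.0, 4.0],
    ])
    labels = [True, True, True, False, False, False]
    top_pairs, counts = rank_genes_by_pair_frequency(expr, labels, top_k=3)
    print(top_pairs)
    print(counts.most_common())
```

Because each score is built from differences taken inside a single sample, any per-sample additive offset (for example, an array-wide intensity shift on a log scale) cancels out, which is one plausible motivation for working with pairs rather than single genes. Enumerating all pairs grows quadratically with the number of genes, so a real dataset would likely need pre-filtering.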
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lee, CW., Li, S.D., Su, E.W., Liao, B. (2006). Identification of Phenotype-Defining Gene Signatures Using the Gene-Pair Matrix Based Clustering. In: Dalkilic, M.M., Kim, S., Yang, J. (eds) Data Mining and Bioinformatics. VDMB 2006. Lecture Notes in Computer Science, vol 4316. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11960669_10
DOI: https://doi.org/10.1007/11960669_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68970-6
Online ISBN: 978-3-540-68971-3
eBook Packages: Computer Science (R0)