Multi-channel Partial Graph Integration Learning of Partial Multi-omics Data for Cancer Subtyping | Bentham Science

Current Bioinformatics

ISSN (Print): 1574-8936
ISSN (Online): 2212-392X

Research Article

Multi-channel Partial Graph Integration Learning of Partial Multi-omics Data for Cancer Subtyping

Author(s): Qing-Qing Cao, Jian-Ping Zhao* and Chun-Hou Zheng*

Volume 18, Issue 8, 2023

Published on: 12 July, 2023

Page: [680 - 691] Pages: 12

DOI: 10.2174/1574893618666230519145545

Abstract

Background: The existence of cancer subtypes with different clinical significance reflects the high heterogeneity of cancer. Multi-omics integration methods have steadily matured. In practice, however, some samples are missing data for one or more omics.

Objective: The purpose of this study is to establish a deep learning model that can effectively integrate and represent partial multi-omics data in order to accurately identify cancer subtypes.

Methods: We propose MPGIL (Multi-channel Partial Graph Integration Learning), a novel partial multi-omics learning model for cancer subtyping. MPGIL has two main components. First, multi-channel graph autoencoders based on high-order proximity capture more lateral adjacency information between samples within each omics; to reduce the negative impact of missing samples, a weighted fusion layer replaces the concatenation layer to learn the consensus representation across omics. Second, a classifier is introduced to ensure that the consensus representation is suitable for clustering. Finally, subtypes are identified by K-means.
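The two ideas named in the Methods, high-order proximity and a weighted fusion layer that tolerates missing samples, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names `high_order_proximity` and `weighted_fusion` are hypothetical, the proximity is assumed to be an average of powers of the row-normalized adjacency matrix, and the fusion weights are assumed to be fixed 0/1 availability masks rather than learned parameters.

```python
import numpy as np

def high_order_proximity(adj, order=2):
    """Average the first `order` powers of the row-normalized adjacency
    matrix (an assumed definition, chosen for illustration only)."""
    adj = np.asarray(adj, dtype=float)
    row_sums = adj.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0  # isolated samples keep an all-zero row
    trans = adj / row_sums
    power = np.eye(adj.shape[0])
    total = np.zeros_like(trans)
    for _ in range(order):
        power = power @ trans
        total += power
    return total / order

def weighted_fusion(embeddings, masks):
    """Mask-weighted average of per-omics embeddings.

    embeddings: list of (n_samples, dim) arrays, one per omics.
    masks: list of (n_samples,) 0/1 arrays; 1 means the sample was
           observed in that omics (assumption: weights are not learned).
    """
    stacked = np.stack(embeddings)            # (n_omics, n_samples, dim)
    weights = np.stack(masks).astype(float)   # (n_omics, n_samples)
    denom = weights.sum(axis=0)
    if np.any(denom == 0):
        raise ValueError("each sample must be observed in at least one omics")
    weights = weights / denom                 # weights sum to 1 per sample
    return (weights[:, :, None] * stacked).sum(axis=0)

# Toy data: three samples, two omics; sample 1 lacks omics 2, sample 2 lacks omics 1.
z1 = np.array([[1.0, 1.0], [2.0, 2.0], [0.0, 0.0]])
z2 = np.array([[3.0, 3.0], [0.0, 0.0], [4.0, 4.0]])
m1 = np.array([1, 1, 0])
m2 = np.array([1, 0, 1])
consensus = weighted_fusion([z1, z2], [m1, m2])

# Second-order proximity for a two-sample graph with a single edge.
prox = high_order_proximity(np.array([[0, 1], [1, 0]]), order=2)
```

Unlike concatenation, the masked average never lets a placeholder embedding for a missing omics enter the consensus representation: a sample observed in only one omics simply inherits that omics' embedding. The resulting `consensus` matrix could then be passed to any K-means implementation to assign subtypes.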

Results: MPGIL was compared with other multi-omics integration methods on 16 datasets. The clinical and survival results show that MPGIL can effectively identify subtypes. Three ablation experiments were designed to highlight the importance of each component of MPGIL. In a case study of AML, the differentially expressed gene profiles among its subtypes reveal the high heterogeneity of cancer.

Conclusion: MPGIL can effectively learn a consistent representation of partial multi-omics datasets and discover subtypes, and it outperforms the state-of-the-art methods it was compared with.

Keywords: Partial multi-omics data, high-order proximity, cancer data, multi-channel, classifier, graph autoencoder.

