Multi-channel Partial Graph Integration Learning of Partial Multi-omics
Data for Cancer Subtyping

Qing-Qing Cao; Jian-Ping Zhao; Chun-Hou Zheng

doi:10.2174/1574893618666230519145545

Abstract

Background: Cancer subtypes with distinct clinical significance reflect the high heterogeneity of cancer. Multi-omics integration methods have matured considerably. In practice, however, some samples lack data for one or more omics types.

Objective: This study aims to establish a deep learning model that can effectively integrate and represent partial multi-omics data in order to identify cancer subtypes accurately.

Methods: We propose a novel partial multi-omics learning model for cancer subtyping, MPGIL (Multi-channel Partial Graph Integration Learning). MPGIL has two main components. First, multi-channel graph autoencoders based on high-order proximity capture more adjacency information between samples within each omics type. To reduce the negative impact of missing samples, a weighted fusion layer replaces the concatenation layer when learning the consensus representation across omics. Second, a classifier is introduced to ensure that the consensus representation is suitable for clustering. Finally, subtypes are identified by K-means.
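As an illustration only, the two ideas named above (high-order proximity within one omics graph, and a fusion step that ignores missing omics) can be sketched in NumPy. This is not the MPGIL implementation: the function names, the `decay` parameter, the use of a decayed sum of transition-matrix powers as "high-order proximity", and the uniform weighting over observed omics are all assumptions made for this sketch, and the toy embeddings stand in for what graph autoencoders would produce.

```python
import numpy as np

def high_order_proximity(adj, order=2, decay=0.5):
    """Decayed sum of powers of the row-normalised adjacency matrix.

    Assumed form: T + decay*T^2 + ... + decay^(order-1)*T^order.
    """
    adj = np.asarray(adj, dtype=float)
    row_sums = adj.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0          # avoid division by zero for isolated samples
    trans = adj / row_sums                 # one-step transition matrix T
    prox = np.zeros_like(trans)
    power = np.eye(adj.shape[0])
    for k in range(1, order + 1):
        power = power @ trans              # T^k
        prox += decay ** (k - 1) * power
    return prox

def weighted_fusion(embeddings, masks):
    """Average per-omics embeddings over the omics each sample actually has.

    embeddings: list of (n_samples, dim) arrays, one per omics channel
    masks:      list of (n_samples,) 0/1 arrays, 1 = omics observed for that sample
    """
    emb = np.stack([np.asarray(e, dtype=float) for e in embeddings])  # (n_omics, n, d)
    m = np.stack([np.asarray(x, dtype=float) for x in masks])         # (n_omics, n)
    counts = m.sum(axis=0)                                            # observed omics per sample
    if np.any(counts == 0):
        raise ValueError("every sample needs at least one observed omics type")
    weights = m / counts                   # uniform weights over observed omics (assumption)
    return (weights[:, :, None] * emb).sum(axis=0)

# Toy example: a 3-sample path graph and two omics channels.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
prox = high_order_proximity(adj, order=2, decay=0.5)

z_mrna = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
z_meth = [[3.0, 3.0], [0.0, 0.0], [5.0, 5.0]]   # row 1 is a placeholder: sample missing
fused = weighted_fusion([z_mrna, z_meth], masks=[[1, 1, 1], [1, 0, 1]])
```

In this sketch, the second sample has no methylation data, so its fused vector comes from the mRNA channel alone, whereas plain concatenation would carry the placeholder zeros forward. The fused matrix is what a clustering step such as K-means would then receive.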

Results: We compared MPGIL with other multi-omics integration methods on 16 datasets. Clinical and survival analyses show that MPGIL effectively identifies subtypes. Three ablation experiments highlight the importance of each component of MPGIL. In a case study of acute myeloid leukemia (AML), the differentially expressed gene profiles among the identified subtypes reflect the high heterogeneity of cancer.

Conclusion: MPGIL effectively learns a consistent representation of partial multi-omics datasets and discovers subtypes, outperforming state-of-the-art methods.

Keywords: Partial multi-omics data, high-order proximity, cancer data, multi-channel, classifier, graph autoencoder.




Journal: Current Bioinformatics
Publisher: Bentham Science Publisher
Print ISSN: 1574-8936
Online ISSN: 2212-392X
DOI: https://dx.doi.org/10.2174/1574893618666230519145545
