Multi-omic Data Integration and Feature Selection for Survival-Based Patient Stratification via Supervised Concrete Autoencoders

  • Conference paper
Machine Learning, Optimization, and Data Science (LOD 2022)

Abstract

Cancer is a complex disease with significant social and economic impact. Advancements in high-throughput molecular assays and the reduced cost of performing high-quality multi-omic measurements have fuelled insights through machine learning. Previous studies have shown promise in using multiple omic layers to predict survival and stratify cancer patients. In this paper, we develop and report a Supervised Autoencoder (SAE) model for survival-based multi-omic integration, which improves upon previous work, as well as a Concrete Supervised Autoencoder model (CSAE), which uses feature selection to jointly reconstruct the input features and predict survival. Our results show that our models either outperform or are on par with some of the most commonly used baselines, while either providing a better survival separation (SAE) or being more interpretable (CSAE). Feature selection stability analysis on our models shows a power-law relationship with features commonly associated with survival. The code for this project is available at: https://github.com/phcavelar/coxae.
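The repository linked above is the authoritative implementation. As a rough illustration only, the sketch below shows in NumPy two ingredients that a concrete supervised autoencoder of this kind plausibly combines: a Concrete (Gumbel-softmax) selection layer in the spirit of reference [2], and a Cox negative partial log-likelihood as the survival loss. The function names, the temperature handling, and the choice of a Cox loss without tie correction are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def concrete_select(x, logits, temperature, rng):
    """Relaxed feature selection (hypothetical helper, not the authors' code).

    x:      (n_samples, n_features) input matrix
    logits: (n_selected, n_features) selection parameters, one row per selector
    Returns the (n_samples, n_selected) selected features and the weights.
    """
    # Gumbel(0, 1) noise, one value per (selector, feature) pair
    u = rng.uniform(low=1e-10, high=1.0, size=logits.shape)
    gumbel = -np.log(-np.log(u))
    z = (logits + gumbel) / temperature
    z -= z.max(axis=1, keepdims=True)              # numerical stability
    weights = np.exp(z)
    weights /= weights.sum(axis=1, keepdims=True)  # each row sums to 1
    return x @ weights.T, weights

def cox_neg_partial_log_likelihood(risk, time, event):
    """Negative Cox partial log-likelihood, averaged over observed events.

    Assumes no tied event times (no tie correction is applied).
    risk:  (n,) predicted log-risk scores
    time:  (n,) follow-up times
    event: (n,) 1 if the event was observed, 0 if censored
    """
    order = np.argsort(-time)                      # descending follow-up time
    risk, event = risk[order], event[order]
    # log-sum-exp of risk over the risk set {j : t_j >= t_i}
    log_risk_set = np.logaddexp.accumulate(risk)
    n_events = max(event.sum(), 1)
    return -np.sum((risk - log_risk_set) * event) / n_events
```

In a full model the selected features would feed a decoder that reconstructs the input and a survival head that outputs `risk`, with the two losses summed; as the temperature is lowered, each row of `weights` approaches a one-hot vector, so each selector settles on a single input feature.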

References

  1. Asada, K., et al.: Uncovering prognosis-related genes and pathways by multi-omics analysis in lung cancer. Biomolecules 10(4), 524 (2020)

  2. Balın, M.F., Abid, A., Zou, J.: Concrete autoencoders: differentiable feature selection and reconstruction. In: International Conference on Machine Learning, pp. 444–453. PMLR (2019)

  3. Bingham, E., Mannila, H.: Random projection in dimensionality reduction: applications to image and text data. In: Proceedings of the 7th ACM SIGKDD, KDD 2001, pp. 245–250. Association for Computing Machinery, New York (2001). https://doi.org/10.1145/502512.502546

  4. Bode, A.M., Dong, Z.: Precision oncology-the future of personalized cancer medicine? NPJ Precis. Oncol. 1(1), 1–2 (2017). https://doi.org/10.1038/s41698-017-0010-5

  5. Cantini, L., et al.: Benchmarking joint multi-omics dimensionality reduction approaches for the study of cancer. Nat. Commun. 12(1), 1–12 (2021)

  6. Chaudhary, K., Poirion, O.B., Lu, L., Garmire, L.X.: Deep learning-based multi-omics integration robustly predicts survival in liver cancer. Clin. Cancer Res. 24(6), 1248–1259 (2018)

  7. Ching, T., Zhu, X., Garmire, L.X.: Cox-nnet: an artificial neural network method for prognosis prediction of high-throughput omics data. PLoS Comput. Biol. 14(4), e1006076 (2018)

  8. Huang, Z., et al.: SALMON: survival analysis learning with multi-omics neural networks on breast cancer. Front. Genet. 10, 166 (2019)

  9. Katzman, J.L., Shaham, U., Cloninger, A., Bates, J., Jiang, T., Kluger, Y.: DeepSurv: personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Med. Res. Methodol. 18(1), 1–12 (2018)

  10. Kingma, D.P., Welling, M.: Auto-encoding variational bayes. In: Bengio, Y., LeCun, Y. (eds.) 2nd ICLR, Banff, AB, Canada, 14–16 April 2014, Conference Track Proceedings (2014)

  11. Koch, C.M., et al.: A beginner’s guide to analysis of RNA sequencing data. Am. J. Respir. Cell Mol. Biol. 59(2), 145–157 (2018)

  12. Korsunsky, I., et al.: Fast, sensitive and accurate integration of single-cell data with harmony. Nat. Methods 16(12), 1289–1296 (2019)

  13. Lamb, J., et al.: The connectivity map: using gene-expression signatures to connect small molecules, genes, and disease. Science 313(5795), 1929–1935 (2006). https://doi.org/10.1126/science.1132939

  14. Lee, T.Y., Huang, K.Y., Chuang, C.H., Lee, C.Y., Chang, T.H.: Incorporating deep learning and multi-omics autoencoding for analysis of lung adenocarcinoma prognostication. Comput. Biol. Chem. 87, 107277 (2020)

  15. Maddison, C.J., Mnih, A., Teh, Y.W.: The concrete distribution: a continuous relaxation of discrete random variables. arXiv:1611.00712 [cs, stat] (2017)

  16. Nicora, G., Vitali, F., Dagliati, A., Geifman, N., Bellazzi, R.: Integrated multi-omics analyses in oncology: a review of machine learning methods and tools. Front. Oncol. 10, 1030 (2020)

  17. Poirion, O.B., Jing, Z., Chaudhary, K., Huang, S., Garmire, L.X.: DeepProg: an ensemble of deep-learning and machine-learning models for prognosis prediction using multi-omics data. Genome Med. 13(1), 1–15 (2021)

  18. King’s College London e Research Team: King’s Computational Research, Engineering and Technology Environment (CREATE) (2022). https://doi.org/10.18742/RNVF-M076. https://docs.er.kcl.ac.uk/

  19. Ronen, J., Hayat, S., Akalin, A.: Evaluation of colorectal cancer subtypes and cell lines using deep learning. Life Sci. Alliance 2(6) (2019)

  20. Tong, L., Mitchel, J., Chatlin, K., Wang, M.D.: Deep learning based feature-level integration of multi-omics data for breast cancer patients survival analysis. BMC Med. Inform. Decis. Mak. 20(1), 225 (2020). https://doi.org/10.1186/s12911-020-01225-8

  21. Uyar, B., Ronen, J., Franke, V., Gargiulo, G., Akalin, A.: Multi-omics and deep learning provide a multifaceted view of cancer. bioRxiv (2021)

  22. Wissel, D., Rowson, D., Boeva, V.: Hierarchical autoencoder-based integration improves performance in multi-omics cancer survival models through soft modality selection. Technical report, bioRxiv (2022). https://doi.org/10.1101/2021.09.16.460589

  23. Zhang, L., et al.: Deep learning-based multi-omics data integration reveals two prognostic subtypes in high-risk neuroblastoma. Front. Genet. 9, 477 (2018)

Acknowledgements

We would like to thank Dr Jonathan Cardoso-Silva for fruitful conversations, and João Nuno Beleza Oliveira Vidal Lourenço for designing the diagrams. P.H.C.A. acknowledges that during his stay at KCL and A*STAR he was partly funded by King’s College London and the A*STAR Research Attachment Programme (ARAP). The research was also supported by the National Institute for Health Research Biomedical Research Centre based at Guy’s and St Thomas’ NHS Foundation Trust and King’s College London (IS-BRC-1215-20006). The authors are solely responsible for study design, data collection, analysis, decision to publish, and preparation of the manuscript. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health. This work used King’s CREATE compute cluster for its experiments [18]. The results shown here are in whole or part based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga.

Author information

Corresponding author

Correspondence to Pedro Henrique da Costa Avelar.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

da Costa Avelar, P.H., Laddach, R., Karagiannis, S.N., Wu, M., Tsoka, S. (2023). Multi-omic Data Integration and Feature Selection for Survival-Based Patient Stratification via Supervised Concrete Autoencoders. In: Nicosia, G., et al. Machine Learning, Optimization, and Data Science. LOD 2022. Lecture Notes in Computer Science, vol 13811. Springer, Cham. https://doi.org/10.1007/978-3-031-25891-6_5

  • DOI: https://doi.org/10.1007/978-3-031-25891-6_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-25890-9

  • Online ISBN: 978-3-031-25891-6

  • eBook Packages: Computer Science, Computer Science (R0)
