Abstract
Glioblastoma, the most malignant brain cancer in adults, exhibits vast heterogeneity in prognosis, clinicopathological features, immune landscapes, and immunotherapeutic responses, which calls for the development of personalized therapeutic approaches. The identification of long- and short-term survivors, along with their associated gene expression markers, opens promising avenues for tailored treatments. However, modeling omics data is particularly challenging due to its high dimensionality. Our study aimed to create survival models using gene expression data retrieved from tumour tissue, with the goal of detecting outlier observations. These observations correspond to glioblastoma patients whose survival time is much greater or smaller than predicted. To assist in dimensionality reduction and select relevant genes, elastic net and network-based regularization were applied. Each method yielded a different set of outlier observations. The rank product test was used as a consensus method, enabling the identification of observations whose martingale residuals were consistently large across different models, thus producing a consensus list of outliers.
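The consensus step described in the abstract can be illustrated with a minimal sketch. It assumes that martingale residuals have already been computed for each observation under each regularized Cox model, and that "large" is judged by absolute value (an assumption made here for illustration; the abstract does not specify the ranking direction). The function names `rank_product` and `permutation_pvalues`, the toy residual matrix, and the permutation-based null distribution are all hypothetical choices for this example, not the authors' implementation; the paper may rely on a different p-value approximation.

```python
import numpy as np


def rank_product(residuals):
    """Geometric mean of per-model ranks for each observation.

    residuals: array of shape (n_observations, n_models) holding one
    martingale residual per observation and per fitted model.
    Rank 1 is assigned to the largest absolute residual in each model
    (assumption: outlyingness is measured by absolute value).
    """
    abs_res = np.abs(np.asarray(residuals, dtype=float))
    n_obs, n_models = abs_res.shape
    ranks = np.empty((n_obs, n_models), dtype=float)
    for j in range(n_models):
        order = np.argsort(-abs_res[:, j], kind="stable")
        ranks[order, j] = np.arange(1, n_obs + 1)
    return np.exp(np.log(ranks).mean(axis=1))


def permutation_pvalues(residuals, n_perm=2000, seed=0):
    """Approximate p-values for small rank products.

    The null distribution is built by drawing independent random
    rankings for each model, i.e. assuming no agreement across models.
    """
    rng = np.random.default_rng(seed)
    residuals = np.asarray(residuals, dtype=float)
    n_obs, n_models = residuals.shape
    observed = rank_product(residuals)
    null = np.empty((n_perm, n_obs))
    for b in range(n_perm):
        perm = np.column_stack(
            [rng.permutation(n_obs) + 1 for _ in range(n_models)]
        )
        null[b] = np.exp(np.log(perm).mean(axis=1))
    null_sorted = np.sort(null.ravel())
    # A small relative tolerance guards against floating-point ties.
    counts = np.searchsorted(null_sorted, observed * (1 + 1e-12), side="right")
    return (counts + 1) / (null_sorted.size + 1)


# Toy data: 5 observations, 3 models; observation 0 is extreme in all models.
toy_residuals = np.array([
    [-3.1, -2.8, -3.4],
    [0.2, 0.1, -0.3],
    [0.9, 0.7, 0.8],
    [-0.4, 0.3, 0.2],
    [0.1, -0.2, 0.5],
])
rp = rank_product(toy_residuals)
pvals = permutation_pvalues(toy_residuals, n_perm=500, seed=1)
```

An observation that ranks near the top in every model receives a small rank product and hence a small p-value, whereas an observation flagged by only one model does not. In practice the resulting p-values would typically be adjusted for multiple testing before a consensus list is drawn up.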
Acknowledgments
This work was financed by Fundação para a Ciência e a Tecnologia: UIDB/00006/2020 (DOI: 10.54499/UIDB/00006/2020), PTDC/CCI-BIO/4180/2020 (“MONET - Multi-omic networks in gliomas”, DOI: 10.54499/PTDC/CCI-BIO/4180/2020), UIDB/00297/2020 (DOI: 10.54499/UIDB/00297/2020) and UIDP/00297/2020 (DOI: 10.54499/UIDP/00297/2020) (NOVA Math), UIDB/00667/2020 (DOI: 10.54499/UIDB/00667/2020) and UIDP/00667/2020 (DOI: 10.54499/UIDP/00667/2020) (UNIDEMI), CEECINST/00042/2021.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Brandão, J., Lopes, M.B., Carrasquinha, E. (2024). Refining Gene Selection and Outlier Detection in Glioblastoma Based on a Consensus Approach for Regularized Survival Models. In: Rojas, I., Ortuño, F., Rojas, F., Herrera, L.J., Valenzuela, O. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2024. Lecture Notes in Computer Science(), vol 14848. Springer, Cham. https://doi.org/10.1007/978-3-031-64629-4_2
DOI: https://doi.org/10.1007/978-3-031-64629-4_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-64628-7
Online ISBN: 978-3-031-64629-4
eBook Packages: Computer Science (R0)