Abstract
Random sampling of the feasible region defined by knowledge-based and data-driven constraints is increasingly employed for the analysis of metabolic networks. The aim is to identify the set of reactions that are used to a significantly different extent between two conditions of biological interest, such as physiological and pathological conditions. A reference constraint-based model, incorporating knowledge-based constraints on reaction stoichiometry and a reasonable mass balance constraint, is differentially constrained for the two conditions according to different types of -omics data, such as transcriptomics and/or proteomics. The hypothesis that two samples randomly obtained from the two models come from the same distribution is then rejected or retained according to standard statistical tests. However, the impact of under-sampling on false discoveries has not been investigated so far. To this end, we evaluated the presence of false discoveries by comparing samples obtained from the very same feasible region, for which the null hypothesis holds by construction. We compared different sampling algorithms and sampling parameters. Our results indicate that established sampling convergence tests are not sufficient to prevent high false discovery rates. We propose some best practices to reduce the false discovery rate: we advocate the use of the CHRR algorithm, a large value of the thinning parameter, and a threshold on the fold-change between the averages of the sampled flux values.
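The same-region experiment described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a toy feasible region {x >= 0, sum(x) <= 1} in place of a metabolic model, plain coordinate hit-and-run without the rounding step of CHRR, a two-sample Kolmogorov-Smirnov test with an asymptotic p-value as the "standard statistical test", and a fold-change defined as the ratio of the larger to the smaller mean. All function names and parameter values are hypothetical.

```python
import math
import random
from statistics import mean


def coordinate_hit_and_run(n_samples, thinning, dim=3, seed=0):
    """Toy coordinate hit-and-run on {x >= 0, sum(x) <= 1} (no rounding step)."""
    rng = random.Random(seed)
    x = [1.0 / (dim + 1)] * dim  # interior starting point
    samples = []
    for step in range(n_samples * thinning):
        i = rng.randrange(dim)
        others = sum(x[k] for k in range(dim) if k != i)
        # Along coordinate i the feasible segment is [0, 1 - others].
        x[i] = rng.uniform(0.0, max(0.0, 1.0 - others))
        if (step + 1) % thinning == 0:  # keep one point every `thinning` steps
            samples.append(list(x))
    return samples


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic (largest gap between ECDFs)."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        v = min(a[i], b[j])
        while i < len(a) and a[i] <= v:
            i += 1
        while j < len(b) and b[j] <= v:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_pvalue(d, n, m):
    """Asymptotic p-value; assumes independent draws, which a chain violates."""
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.2:  # the series converges slowly here and the p-value is ~1
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, 101))
    return min(1.0, max(0.0, 2.0 * total))


def count_false_discoveries(thinning, n_samples=500, n_pairs=20, dim=3,
                            alpha=0.05, min_fold_change=1.0):
    """Count rejections when both samples come from the SAME region."""
    hits = 0
    for pair in range(n_pairs):
        a = coordinate_hit_and_run(n_samples, thinning, dim, seed=2 * pair)
        b = coordinate_hit_and_run(n_samples, thinning, dim, seed=2 * pair + 1)
        for r in range(dim):
            fa = [s[r] for s in a]
            fb = [s[r] for s in b]
            p = ks_pvalue(ks_statistic(fa, fb), len(fa), len(fb))
            lo, hi = sorted((mean(fa), mean(fb)))
            fold = hi / lo if lo > 0 else math.inf
            # Every rejection is a false discovery by construction.
            if p < alpha and fold >= min_fold_change:
                hits += 1
    return hits


if __name__ == "__main__":
    for thinning in (1, 10, 100):
        plain = count_false_discoveries(thinning)
        filtered = count_false_discoveries(thinning, min_fold_change=1.2)
        print(thinning, plain, filtered)
```

Because consecutive points of a chain are correlated, the test's independence assumption is violated when thinning is low, so one would expect more rejections there than the nominal level suggests. The fold-change filter discards rejections whose mean fluxes barely differ. The actual counts depend on the seeds and parameters chosen here and carry no quantitative meaning for real metabolic models.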
Notes
1. Note that it was not possible to compute the Raftery-Lewis test when n was less than 3,746.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Galuzzi, B.G., Milazzo, L., Damiani, C. (2023). Best Practices in Flux Sampling of Constrained-Based Models. In: Nicosia, G., et al. Machine Learning, Optimization, and Data Science. LOD 2022. Lecture Notes in Computer Science, vol 13811. Springer, Cham. https://doi.org/10.1007/978-3-031-25891-6_18
DOI: https://doi.org/10.1007/978-3-031-25891-6_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25890-9
Online ISBN: 978-3-031-25891-6
eBook Packages: Computer Science, Computer Science (R0)