Discovering Biochemical Reaction Models by Evolving Libraries

  • Conference paper
Computational Methods in Systems Biology (CMSB 2024)

Abstract

In a time of data abundance, automatic methods increasingly support manual modeling. To this end, the Sparse Identification of Non-linear Dynamics (SINDy) provides a solid foundation for identifying non-linear dynamical systems in the form of differential equations. In biochemistry, reaction networks imply coupled differential equations. It has recently been demonstrated how this intrinsic coupling can be achieved within the SINDy framework, providing a straightforward interpretation of the learned equations as reaction systems with mass-action kinetics. However, this extension inherits from SINDy the requirement to enumerate all candidate reactions in a library, resulting in ill-posed optimization problems and long model descriptions, limiting its utility for identifying models with many species. Here, we elaborate on the recent advances in bringing SINDy to the biochemical domain by considering the sub-sampling of reaction libraries as part of an evolutionary optimization scheme. This enables the generation of parsimonious models, as well as the inclusion of model-level constraints, and allows the consideration of large numbers of candidate reactions. We evaluate the approach on two smaller case studies and the recovery of a large Wnt signaling model.
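The two ideas in the abstract, coupled mass-action regression and evolutionary sub-sampling of the reaction library, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy network (A → B with rate 1, B → C with rate 0.5), the helper names `build_library`, `design_matrix`, `fit`, and `evolve`, the penalty weight, and the simple swap mutation are all assumptions made for illustration. Derivatives are taken from the known right-hand side to keep the example deterministic; in practice they would be estimated from measurements.

```python
import itertools
import numpy as np
from scipy.optimize import nnls

def build_library(n):
    """Candidate reactions: one or two reactant molecules, one product molecule."""
    complexes = [(i,) for i in range(n)]
    complexes += list(itertools.combinations_with_replacement(range(n), 2))
    return [(r, (p,)) for r in complexes for p in range(n) if r != (p,)]

def design_matrix(X, library):
    """Column j = net stoichiometry of reaction j times its mass-action term.
    Rows stack all species at all time points, so each reaction gets ONE rate
    shared by every species equation (the coupling)."""
    T, n = X.shape
    A = np.zeros((T * n, len(library)))
    for j, (reactants, products) in enumerate(library):
        net = np.zeros(n)
        for i in reactants:
            net[i] -= 1
        for i in products:
            net[i] += 1
        propensity = np.prod(X[:, list(reactants)], axis=1)
        A[:, j] = np.outer(propensity, net).ravel()
    return A

def fit(A, dX, subset, penalty=1e-3, tol=1e-6):
    """Non-negative least squares on a sub-library, plus a parsimony penalty."""
    cols = sorted(subset)
    try:
        k, residual = nnls(A[:, cols], dX.ravel())
    except RuntimeError:  # NNLS did not converge for this sub-library
        return np.inf, np.zeros(len(cols))
    return residual + penalty * np.count_nonzero(k > tol), k

def evolve(A, dX, size=4, pop=20, generations=40, seed=0):
    """Elitist search over fixed-size sub-libraries (hypothetical scheme)."""
    rng = np.random.default_rng(seed)
    n_lib = A.shape[1]
    population = [frozenset(rng.choice(n_lib, size, replace=False).tolist())
                  for _ in range(pop)]
    for _ in range(generations):
        children = []
        for parent in population:
            child = set(parent)
            child.remove(int(rng.choice(sorted(child))))      # drop one reaction
            outside = sorted(set(range(n_lib)) - parent)
            child.add(int(rng.choice(outside)))               # add a new one
            children.append(frozenset(child))
        pool = set(population) | set(children)
        population = sorted(pool, key=lambda s: fit(A, dX, s)[0])[:pop]
    best = population[0]
    loss, k = fit(A, dX, best)
    return best, loss, k

# Toy data from the analytic solution of A -> B (k=1), B -> C (k=0.5).
t = np.linspace(0.0, 5.0, 50)
a = np.exp(-t)
b = 2.0 * (np.exp(-0.5 * t) - np.exp(-t))
X = np.column_stack([a, b, 1.0 - a - b])
dX = np.column_stack([-a, a - 0.5 * b, 0.5 * b])  # exact derivatives (assumption)

library = build_library(3)
A = design_matrix(X, library)

# Sanity check: fitting only the two true reactions recovers their rates.
true_subset = {library.index(((0,), (1,))), library.index(((1,), (2,)))}
loss_true, k_true = fit(A, dX, true_subset)

best, best_loss, best_k = evolve(A, dX)
print("rates on true sub-library:", k_true)
print("best evolved sub-library:", [library[i] for i in sorted(best)], best_loss)
```

Because each candidate model only sees a small sub-library, every individual regression stays small even when the full library is large, and model-level preferences such as parsimony can be expressed in the fitness rather than in the regression itself.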

Notes

  1. https://zenodo.org/doi/10.5281/zenodo.11654439

Acknowledgments

JNK and AU acknowledge the funding of the DFG Project GrEASE (grant number 320435134). KB acknowledges the funding of the ARC Centre of Excellence for Plant Success in Nature and Agriculture CE 200100015. The authors thank Fiete Haack for many helpful discussions, particularly over his model of the Wnt pathway, and his feedback on the evaluation.

Author information

Corresponding author

Correspondence to Justin N. Kreikemeyer.

Appendices

A Complete List of Experiment (Hyper-)parameters

(See Table 1).

Table 1. Hyperparameters used for evolib. The random search and c-SINDy use the same parameters where applicable. In particular, for the non-negative least squares, the maximum number of iterations was limited to \(10^8\). Abbreviations: X extended, cstr. constrained.

B Learned Model’s Trajectories for the Wnt Pathway

Fig. 5. Results from simulating the models learned with evolib for the Wnt pathway. The + symbols mark measurement points, and the lines are the trajectories simulated for the 19 species. Note that, for clarity, we omit the labeling of species, and only every second measurement point is shown. The integration of the (large) models produced by c-SINDy resulted in errors due to numerical problems.

C Learned Models for the Wnt Pathway

(See Figs. 6 and 7).

Fig. 6. The models inferred in the Wnt case study (unconstrained) compared to the ground truth model. For the extension, the reactions above the line are fixed. Only reactions with a rate above \(10^{-6}\) are shown, and, if applicable, the number of excluded reactions is shown in the lower left. Bolded reactions indicate an overlap with the ground truth reactions. Note in particular how in (a) the Axin-induced degradation of \(\beta \)-catenin and in (b) the synthesis of Ros were recovered, both of which are central components of the ground truth model of the Wnt pathway.

Fig. 7. The models inferred in the Wnt case study (constrained) compared to the ground truth model. For the extension, the reactions above the line are fixed. Only reactions with a rate above \(10^{-6}\) are shown, and, if applicable, the number of excluded reactions is shown in the lower left. Bolded reactions indicate an overlap with the ground truth reactions. Note in particular how in (a) the shuttling of \(\beta \)-catenin in and out of the nucleus and in (b) the production of TCF were recovered.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Kreikemeyer, J.N., Burrage, K., Uhrmacher, A.M. (2024). Discovering Biochemical Reaction Models by Evolving Libraries. In: Gori, R., Milazzo, P., Tribastone, M. (eds) Computational Methods in Systems Biology. CMSB 2024. Lecture Notes in Computer Science, vol 14971. Springer, Cham. https://doi.org/10.1007/978-3-031-71671-3_10

  • DOI: https://doi.org/10.1007/978-3-031-71671-3_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-71670-6

  • Online ISBN: 978-3-031-71671-3

  • eBook Packages: Computer Science (R0)
