Abstract
Nucleic acid kinetic simulators aim to predict the kinetics of interacting nucleic acid strands. Many simulators model the kinetics of interacting nucleic acid strands as continuous-time Markov chains (CTMCs). States of the CTMCs represent a collection of secondary structures, and transitions between the states correspond to the forming or breaking of base pairs and are determined by a nucleic acid kinetic model. The number of states these CTMCs can form may be exponentially large in the length of the strands, making two important tasks challenging, namely, mean first passage time (MFPT) estimation and parameter estimation for kinetic models based on MFPTs. Gillespie’s stochastic simulation algorithm (SSA) is widely used to analyze nucleic acid folding kinetics, but could be computationally expensive for reactions whose CTMC has a large state space or for slow reactions. It could also be expensive for arbitrary parameter sets that occur in parameter estimation. Our work addresses these two challenging tasks, in the full state space of all non-pseudoknotted secondary structures of each reaction. In the first task, we show how to use a reduced variance stochastic simulation algorithm (RVSSA), which is adapted from SSA, to estimate the MFPT of a reaction’s CTMC. In the second task, we estimate model parameters based on MFPTs. To this end, first, we show how to use a generalized method of moments (GMM) approach, where we minimize a squared norm of moment functions that we formulate based on experimental and estimated MFPTs. Second, to speed up parameter estimation, we introduce a fixed path ensemble inference (FPEI) approach, that we adapt from RVSSA. We implement and evaluate RVSSA and FPEI using the Multistrand kinetic simulator. In our experiments on a dataset of DNA reactions, FPEI speeds up parameter estimation compared to inference using SSA, by more than a factor of three for slow reactions. Also, for reactions with large state spaces, it speeds up parameter estimation by more than a factor of two.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
A pseudoknot is a secondary structure that has at least two base pairs in which one nucleotide of a base pair is intercalated between the two nucleotides of the other base pair.
- 2.
For our purpose here, we are only interested in the MFPT, so the smaller variance is good. In other contexts, the full distribution of FPTs will be of interest, and for that purpose only SSA, but not RVSSA, will be appropriate.
References
Andrieu, C., Roberts, G.O.: The pseudo-marginal approach for efficient Monte Carlo computations. Ann. Stat. 37, 697–725 (2009)
Andronescu, M., Aguirre-Hernandez, R., Condon, A., Hoos, H.H.: RNAsoft: a suite of RNA secondary structure prediction and design software tools. Nucleic Acids Res. 31, 3416–3422 (2003)
Andronescu, M., Condon, A., Hoos, H.H., Mathews, D.H., Murphy, K.P.: Computational approaches for RNA energy parameter estimation. RNA 16(12), 2304–2318 (2010)
Barton, R.R., Ivey Jr., J.S.: Nelder-Mead simplex modifications for simulation optimization. Manag. Sci. 42(7), 954–973 (1996)
Bonnet, G., Krichevsky, O., Libchaber, A.: Kinetics of conformational fluctuations in DNA hairpin-loops. Proc. Natl. Acad. Sci. 95(15), 8602–8606 (1998)
Cherry, K.M., Qian, L.: Scaling up molecular pattern recognition with DNA-based winner-take-all neural networks. Nature 559(7714), 370 (2018)
Cisse, I.I., Kim, H., Ha, T.: A rule of seven in Watson-Crick base-pairing of mismatched sequences. Nat. Struct. Mol. Biol. 19(6), 623 (2012)
Doucet, A., Johansen, A.M.: A tutorial on particle filtering and smoothing: fifteen years later. Handb. Nonlinear Filter. 12(656–704), 3 (2009)
Flamm, C., Fontana, W., Hofacker, I.L., Schuster, P.: RNA folding at elementary step resolution. RNA 6, 325–338 (2000)
Georgoulas, A., Hillston, J., Sanguinetti, G.: Unbiased Bayesian inference for population Markov jump processes via random truncations. Stat. Comput. 27(4), 991–1002 (2017)
Gillespie, D.T.: Exact stochastic simulation of coupled chemical reactions. J. Phys. Chem. 81(25), 2340–2361 (1977)
Golightly, A., Wilkinson, D.J.: Bayesian parameter inference for stochastic biochemical network models using particle Markov chain Monte Carlo. Interface Focus 1(6), 807–820 (2011)
Hajiaghayi, M., Kirkpatrick, B., Wang, L., Bouchard-Côté, A.: Efficient continuous-time Markov chain estimation. In: International Conference on Machine Learning, pp. 638–646 (2014)
Hansen, L.P.: Large sample properties of generalized method of moments estimators. Econ. J. Econ. Soc. 50, 1029–1054 (1982)
Hansen, L.P., Heaton, J., Yaron, A.: Finite-sample properties of some alternative GMM estimators. J. Bus. Econ. Stat. 14(3), 262–280 (1996)
Hata, H., Kitajima, T., Suyama, A.: Influence of thermodynamically unfavorable secondary structures on DNA hybridization kinetics. Nucleic Acids Res. 46(2), 782–791 (2017)
Hofacker, I.L.: Vienna RNA secondary structure server. Nucleic Acids Res. 31(13), 3429–3431 (2003)
Hordijk, A., Iglehart, D.L., Schassberger, R.: Discrete time methods for simulating continuous time Markov chains. Adv. Appl. Probab. 8(4), 772–788 (1976)
Horváth, A., Manini, D.: Parameter estimation of kinetic rates in stochastic reaction networks by the EM method. In: 2008 International Conference on BioMedical Engineering and Informatics, vol. 1, pp. 713–717. IEEE (2008)
Lehmann, E.L., Casella, G.: Theory of Point Estimation. Springer, New York (2006)
Loskot, P., Atitey, K., Mihaylova, L.: Comprehensive review of models and methods for inferences in bio-chemical reaction networks. arXiv preprint arXiv:1902.05828 (2019)
Lück, A., Wolf, V.: Generalized method of moments for estimating parameters of stochastic reaction networks. BMC Syst. Biol. 10(1), 98 (2016)
Mathews, D.H., Sabina, J., Zuker, M., Turner, D.H.: Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J. Mol. Biol. 288(5), 911–940 (1999)
Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., Teller, E.: Equation of state calculations by fast computing machines. J. Chem. Phys. 21(6), 1087–1092 (1953)
Morrison, L.E., Stols, L.M.: Sensitive fluorescence-based thermodynamic and kinetic measurements of DNA hybridization in solution. Biochemistry 32, 3095–3104 (1993)
Nelder, J.A., Mead, R.: A simplex method for function minimization. Comput. J. 7(4), 308–313 (1965)
Ouldridge, T.E., Louis, A.A., Doye, J.P.: Structural, mechanical, and thermodynamic properties of a coarse-grained DNA model. J. Chem. Phys. 134(8), 02B627 (2011)
Schaeffer, J.M.: Stochastic simulation of the kinetics of multiple interacting nucleic acid strands. Ph.D. thesis, California Institute of Technology (2012)
Schaeffer, J.M., Thachuk, C., Winfree, E.: Stochastic simulation of the kinetics of multiple interacting nucleic acid strands. In: Phillips, A., Yin, P. (eds.) DNA 2015. LNCS, vol. 9211, pp. 194–211. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-21999-8_13
Schnoerr, D., Sanguinetti, G., Grima, R.: Approximation and inference methods for stochastic biochemical kinetics–a tutorial review. J. Phys. Math. Theor. 50(9), 093001 (2017)
Schreck, J.S., et al.: DNA hairpins destabilize duplexes primarily by promoting melting rather than by inhibiting hybridization. Nucleic Acids Res. 43(13), 6181–6190 (2015)
Srinivas, N., et al.: On the biophysics and kinetics of toehold-mediated DNA strand displacement. Nucleic Acids Res. 41, 10641–10658 (2013)
Suhov, Y., Kelbert, M.: Probability and Statistics by Example: Volume 2, Markov Chains: A Primer in Random Processes and Their Applications, vol. 2. Cambridge University Press, Cambridge (2008)
Šulc, P., Romano, F., Ouldridge, T.E., Rovigatti, L., Doye, J.P., Louis, A.A.: Sequence-dependent thermodynamics of a coarse-grained DNA model. J. Chem. Phys. 137(13), 135101 (2012)
Tang, X., Kirkpatrick, B., Thomas, S., Song, G., Amato, N.M.: Using motion planning to study RNA folding kinetics. J. Comput. Biol. 12(6), 862–881 (2005)
Wackerly, D., Mendenhall, W., Scheaffer, R.L.: Mathematical Statistics with Applications. Cengage Learning, Boston (2014)
Wetmur, J.G.: Hybridization and renaturation kinetics of nucleic acids. Annu. Rev. Biophys. Bioeng. 5(1), 337–361 (1976)
Xu, Z.Z., Mathews, D.H.: Experiment-assisted secondary structure prediction with RNAstructure. In: Turner, D., Mathews, D. (eds.) RNA Structure Determination: Methods and Protocols, pp. 163–176. Humana Press, New York (2016)
Zadeh, J.N., et al.: NUPACK: analysis and design of nucleic acid systems. J. Comput. Chem. 32, 170–173 (2011)
Zhang, D.Y., Winfree, E.: Control of DNA strand displacement kinetics using toehold exchange. J. Am. Chem. Soc. 131, 17303–17314 (2009)
Zhang, J.X., et al.: Predicting DNA hybridization kinetics from sequence. Nat. Chem. 10(1), 91 (2018)
Zolaktaf, S., et al.: Inferring parameters for an elementary step model of DNA structure kinetics with locally context-dependent arrhenius rates. In: Brijder, R., Qian, L. (eds.) DNA 2017. LNCS, vol. 10467, pp. 172–187. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-66799-7_12
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Zolaktaf, S., Dannenberg, F., Winfree, E., Bouchard-Côté, A., Schmidt, M., Condon, A. (2019). Efficient Parameter Estimation for DNA Kinetics Modeled as Continuous-Time Markov Chains. In: Thachuk, C., Liu, Y. (eds) DNA Computing and Molecular Programming. DNA 2019. Lecture Notes in Computer Science(), vol 11648. Springer, Cham. https://doi.org/10.1007/978-3-030-26807-7_5
Download citation
DOI: https://doi.org/10.1007/978-3-030-26807-7_5
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-26806-0
Online ISBN: 978-3-030-26807-7
eBook Packages: Computer ScienceComputer Science (R0)