Abstract
We describe an improvement of an algorithm for detecting frequently occurring patterns and modules in biological networks. The improvement is based on the observation that the problem of finding frequent network parts can be reduced to the problem of finding maximal frequent item sets (MFI). The MFI problem is a classical problem in the data mining community and there exist numerous efficient tools for it, most of them publicly available. We apply MFI tools to find frequent subgraphs in metabolic pathways from the KEGG database. Our experimental results show that, compared to the existing specialized tools for frequent subgraphs detection, the MFI tools coupled with an adequate postprocessing are much more efficient with regard to the running time and the size of the frequent graphs.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Cook, D.J., Holder, L.B.: Graph Based Data Mining. IEEE Intell. Syst. 15, 32–41 (2000)
Frequent Itemset Mining Implementations Repository, http://fimi.cs.helsinki.fi/
Gouda, K., Zaki, M.J.: Efficiently Mining Maximal Frequent Itemsets. In: IEEE International Conference on Data Mining ICDM 2001, pp. 163–170 (2001)
Hu, J., Shen, X., Shao, Y., Bysstoff, C., Zaki, M.J.: Mining Protein Contact Maps. In: BIOKDD, pp. 3–10 (2002)
Huan, J., Wang, W., Prins, J., Yang, J.: SPIN: Mining Maximal Frequent Subgraphs from Graph Databases. In: KDD 2004, pp. 581–586 (2004)
Inokuchi, A., Wahio, T., Okada, T., Motoda, H.: Applying a priori-based graph mining method to mutagenesis data analysis. J. Comput. Aided Chem. 2, 87–92 (2001)
Karp, P.D., Mavrovouniotis, M.L.: Representing, Analyzing, and Synthesizing Biochemical Pathways. IEEE Expert, 11–21 (1994)
Koyutürk, M., Grama, A., Szpankowski, W.: An efficient algorithm for detecting frequent subgraphs in biological networks. Bioinformatics 20(suppl. 1), i200–i207 (2004)
Koyutürk, M., Kim, Y., Subramaniam, S., Szpankowski, W., Grama, A.: Detecting Conserved Interaction Patterns in Biological Networks. Journal of Computational Biology 13(7), 1299–1322 (2006)
Krishnamurthy, L., Nadeau, J., Özosyoğlu, G., Özosyoğlu, M., Schaeffer, G., Taşan, M., Xu, W.: em Pathways Database System: An Integrated System for Biological Pathways. Bioinformatics 19, 930–937 (2003)
Kuramochi, M., Karypis, G.: Frequent subgraph discovery. In: IEEE International Conference on Data Mining ICDM 2001, pp. 313–320 (2001)
Liu, G., Lu, H., Yu, J.X., Wang, W., Xiao, X.: AFOPT: An Efficient Implementation of Pattern Growth Approach. In: Proc. of the ICDM 2003 Workshop on Frequent Itemset Mining Implementations, FIMI 2003, Melbourne, Florida, USA, December 19 (2003)
Rivest, R.L., Leiserson, C.E.: Introduction to Algorithms. McGraw-Hill, New York (1990)
Thomas, L.T., Valluri, S.R., Karalaplem, K.: MARGIN: Maximal Frequent Subgraph Mining. In: Proc. of the 6th International Conference on Data Mining ICDM 2006, IEEE, Los Alamitos (2006)
Yan, X., Han, J.: gSpan: graph-based substructure pattern mining. In: IEEE International Conference on Data Mining ICDM 2002, pp. 721–724 (2002)
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zantema, H., Wagemans, S., Bošnački, D. (2008). Finding Frequent Subgraphs in Biological Networks Via Maximal Item Sets. In: Elloumi, M., Küng, J., Linial, M., Murphy, R.F., Schneider, K., Toma, C. (eds) Bioinformatics Research and Development. BIRD 2008. Communications in Computer and Information Science, vol 13. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70600-7_23
Download citation
DOI: https://doi.org/10.1007/978-3-540-70600-7_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70598-7
Online ISBN: 978-3-540-70600-7
eBook Packages: Computer ScienceComputer Science (R0)