A Novel Algorithm for Classifying Protein Structure Familiar by Using the Graph Mining Approach

  • Conference paper
Intelligent Computing Theories and Methodologies (ICIC 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9225)

Abstract

Protein structural classification is critical in bioinformatics. In this study, a simple and connected graph was used to represent a 3D protein structure in which each node represented an amino acid and each edge represented a contact distance between two amino acids. The B-factor (atomic displacement parameters) was then used to substantially reduce the number of nodes and edges in each graph representation. A graph mining approach was applied to determine the critical subgraphs among these graphs, which can be applied to classify protein structural families. An experimental study was conducted in which characteristic substructural patterns were identified in several protein families in the SCOP database.
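The graph representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Residue` record, the function name `build_contact_graph`, the 8 Å contact cutoff, the B-factor cutoff of 30, and the choice to keep low-B-factor residues are all assumptions made for illustration, since the abstract states only that the B-factor was used to reduce the number of nodes and edges.

```python
from dataclasses import dataclass
from itertools import combinations
from math import dist
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Residue:
    """Hypothetical per-residue record (not from the paper)."""
    index: int                       # position in the chain
    name: str                        # amino acid code, used as the node label
    ca: Tuple[float, float, float]   # C-alpha coordinates in angstroms
    b_factor: float                  # atomic displacement parameter


def build_contact_graph(
    residues: List[Residue],
    contact_cutoff: float = 8.0,     # assumed contact distance threshold
    b_factor_cutoff: float = 30.0,   # assumed B-factor threshold
) -> Tuple[Dict[int, str], Dict[Tuple[int, int], float]]:
    """Return (nodes, edges) for a labeled, undirected contact graph.

    nodes maps residue index -> amino acid label.
    edges maps (i, j) with i < j -> C-alpha distance in angstroms.
    """
    # Assumption: residues with a high B-factor are discarded,
    # which shrinks the graph before any mining step.
    kept = [r for r in residues if r.b_factor <= b_factor_cutoff]
    kept.sort(key=lambda r: r.index)

    nodes = {r.index: r.name for r in kept}
    edges: Dict[Tuple[int, int], float] = {}
    for a, b in combinations(kept, 2):
        d = dist(a.ca, b.ca)
        if d <= contact_cutoff:
            # One edge per residue pair within the contact distance.
            edges[(a.index, b.index)] = d
    return nodes, edges
```

Graphs built this way for all members of a family could then be passed to a frequent subgraph miner; that step is outside the scope of this sketch.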

Notes

  1. In the test protein graph G, nodes with the same label are enumerated only once.

  2. In general, a candidate subgraph has between 10 and 15 nodes.


Author information

Corresponding author

Correspondence to Sun-Yuan Hsieh.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Hsieh, SY., Lee, CW., Yang, ZY., Wang, HW., Yu, JH. (2015). A Novel Algorithm for Classifying Protein Structure Familiar by Using the Graph Mining Approach. In: Huang, DS., Bevilacqua, V., Premaratne, P. (eds) Intelligent Computing Theories and Methodologies. ICIC 2015. Lecture Notes in Computer Science, vol 9225. Springer, Cham. https://doi.org/10.1007/978-3-319-22180-9_72

  • DOI: https://doi.org/10.1007/978-3-319-22180-9_72

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-22179-3

  • Online ISBN: 978-3-319-22180-9

  • eBook Packages: Computer Science, Computer Science (R0)
