Optimization of breast mass classification using sequential forward floating selection (SFFS) and a support vector machine (SVM) model | International Journal of Computer Assisted Radiology and Surgery

  • Original Article

Abstract

Purpose

Improving radiologists’ performance in classifying breast lesions as malignant or benign is important for increasing cancer detection sensitivity and reducing false-positive recalls. For this purpose, the development of computer-aided diagnosis schemes has attracted research interest in recent years. In this study, we investigated a new feature selection method for the task of breast mass classification.

Methods

We initially computed 181 image features based on mass shape, spiculation, contrast, presence of fat or calcifications, texture, isodensity, and other morphological features. From this large image feature pool, we used a sequential forward floating selection (SFFS)-based feature selection method to select relevant features and analyzed their performance using a support vector machine (SVM) model trained for the classification task. We evaluated the scheme with tenfold cross-validation on a database of 600 benign and 600 malignant mass regions of interest. Feature selection and optimization of the SVM parameters were conducted on the training subsets only.
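
The floating search described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `sffs`, the criterion `toy_score`, and the toy relevance values are all hypothetical. In the study the criterion would be the performance of an SVM on the training subset of each cross-validation fold; here a simple integer score with one redundant feature pair stands in for it.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple


def sffs(n_features: int, k_target: int,
         score: Callable[[FrozenSet[int]], float]) -> List[int]:
    """Sketch of sequential forward floating selection.

    `score` is any criterion to maximize (assumption: higher is better).
    """
    current: FrozenSet[int] = frozenset()
    best: Dict[int, Tuple[float, FrozenSet[int]]] = {}  # size -> (score, subset)
    while len(current) < k_target:
        # Forward step: add the single feature that maximizes the criterion.
        candidates = [current | {f} for f in range(n_features) if f not in current]
        current = max(candidates, key=score)
        s = score(current)
        if len(current) not in best or s > best[len(current)][0]:
            best[len(current)] = (s, current)
        # Floating steps: drop a feature as long as the reduced subset beats
        # the best subset recorded so far at that smaller size.
        while len(current) > 2:
            reduced = max((current - {f} for f in current), key=score)
            s_red = score(reduced)
            if s_red > best[len(reduced)][0]:
                current = reduced
                best[len(reduced)] = (s_red, reduced)
            else:
                break
    return sorted(best[k_target][1])


# Hypothetical toy criterion: per-feature relevance minus a penalty when the
# two redundant features 0 and 1 are selected together.
RELEVANCE = [90, 85, 30, 60, 10]


def toy_score(subset: FrozenSet[int]) -> float:
    total = sum(RELEVANCE[f] for f in sorted(subset))
    if 0 in subset and 1 in subset:
        total -= 80
    return total


selected_two = sffs(5, 2, toy_score)
selected_three = sffs(5, 3, toy_score)
```

To mirror the protocol in the text, `sffs` would be called once per fold with a criterion computed from that fold's training data only, so that the held-out tenth never influences which features are chosen.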

Results

An area under the receiver operating characteristic curve (AUC) of \(0.805 \pm 0.012\) was obtained for the classification task. The features most frequently selected by the SFFS-based algorithm across the tenfold iterations were those related to mass shape, isodensity, and presence of fat, which is consistent with the image features radiologists frequently use for mass classification in clinical practice. The study also indicated that mass spiculation features were difficult to compute accurately from projection mammograms and performed poorly in the mass classification task because of tissue overlap within the benign mass regions.
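
For readers unfamiliar with the two quantities reported above, the following sketch shows one common way to compute an AUC (the pairwise, Mann-Whitney form) and to tally how often each feature is selected across folds. All labels, scores, and fold contents below are invented toy values for illustration and are not results from the study; the names `auc` and `selected_per_fold` are hypothetical.

```python
from collections import Counter
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (malignant, benign) pairs ranked correctly.

    Labels: 1 = malignant, 0 = benign (assumption). Ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy classifier outputs (illustrative only).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
toy_auc = auc(labels, scores)

# Toy record of which feature groups were selected in each fold.
selected_per_fold = [
    ["shape", "isodensity"],
    ["shape", "fat"],
    ["shape", "isodensity", "texture"],
]
frequency = Counter(name for fold in selected_per_fold for name in fold)
```

Sorting `frequency` with `most_common()` gives the kind of ranking the text refers to when it names the most frequently selected feature groups.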

Conclusion

This comprehensive feature analysis study provided new and valuable information for optimizing computerized mass classification schemes, which may be useful as a “second reader” in future clinical practice.



Acknowledgments

This work was supported in part by the National Cancer Institute, National Institutes of Health, under Grant R01CA160205.

Conflict of interest

Maxine Tan, Jiantao Pu and Bin Zheng declare they have no conflict of interest.

Author information

Corresponding author

Correspondence to Maxine Tan.

About this article

Cite this article

Tan, M., Pu, J. & Zheng, B. Optimization of breast mass classification using sequential forward floating selection (SFFS) and a support vector machine (SVM) model. Int J CARS 9, 1005–1020 (2014). https://doi.org/10.1007/s11548-014-0992-1
