Abstract
Purpose
Improving radiologists’ performance in distinguishing malignant from benign breast lesions is important for increasing cancer detection sensitivity and reducing false-positive recalls. To this end, the development of computer-aided diagnosis schemes has attracted research interest in recent years. In this study, we investigated a new feature selection method for breast mass classification.
Methods
We initially computed 181 image features based on mass shape, spiculation, contrast, presence of fat or calcifications, texture, isodensity, and other morphological features. From this large feature pool, we used a sequential forward floating selection (SFFS)-based feature selection method to select relevant features and analyzed their performance using a support vector machine (SVM) model trained for the classification task. The study was performed on a database of 600 benign and 600 malignant mass regions of interest using tenfold cross-validation. Feature selection and optimization of the SVM parameters were conducted on the training subsets only.
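The floating search described above can be illustrated with a minimal sketch. It follows the general SFFS scheme (a forward inclusion step followed by conditional backward exclusion) and is not the authors' implementation. The function name `sffs`, the `floating` flag, and the toy criterion with its scores are all hypothetical and chosen only for illustration. In a study like this one, the criterion would presumably be a classifier performance estimate computed on the training subset alone.

```python
from typing import Callable, Dict, FrozenSet, Tuple

Criterion = Callable[[FrozenSet[int]], float]


def sffs(n_features: int, criterion: Criterion, k_target: int,
         floating: bool = True) -> FrozenSet[int]:
    """Return the best subset of size k_target found by floating forward search."""
    if not 1 <= k_target <= n_features:
        raise ValueError("k_target must be between 1 and n_features")
    selected: FrozenSet[int] = frozenset()
    # Best (score, subset) seen so far for each subset size.
    best: Dict[int, Tuple[float, FrozenSet[int]]] = {}
    while len(selected) < k_target:
        # Inclusion: add the feature that maximizes the criterion.
        remaining = [f for f in range(n_features) if f not in selected]
        add = max(remaining, key=lambda f: criterion(selected | {f}))
        selected = selected | {add}
        score = criterion(selected)
        if len(selected) not in best or score > best[len(selected)][0]:
            best[len(selected)] = (score, selected)
        # Conditional exclusion: drop the least useful feature while doing so
        # beats the best subset already known at the smaller size.
        while floating and len(selected) > 2:
            drop = max(selected, key=lambda f: criterion(selected - {f}))
            reduced = selected - {drop}
            reduced_score = criterion(reduced)
            if reduced_score <= best[len(reduced)][0]:
                break
            selected = reduced
            best[len(reduced)] = (reduced_score, reduced)
    return best[k_target][1]


# Hypothetical toy criterion: individual feature values plus pairwise terms.
# Features 0 and 3 are redundant; feature 2 is only useful next to 1 or 3.
VALUE = {0: 10, 1: 8, 2: 3, 3: 9}
PAIR_TERM = {frozenset({0, 3}): -9, frozenset({1, 2}): 12, frozenset({2, 3}): 4}


def toy_criterion(subset: FrozenSet[int]) -> float:
    score = sum(VALUE[f] for f in subset)
    score += sum(term for pair, term in PAIR_TERM.items() if pair <= subset)
    return score


print(sorted(sffs(4, toy_criterion, 3)))                  # [1, 2, 3]
print(sorted(sffs(4, toy_criterion, 3, floating=False)))  # [0, 1, 2]
```

In this toy case, plain forward selection keeps feature 0 because it is the best single feature, while the floating step discards it once features 1 and 2 prove more useful together, which lets the search reach a higher-scoring subset of the same size.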
Results
An area under the receiver operating characteristic curve (AUC) of \(0.805 \pm 0.012\) was obtained for the classification task. The features most frequently selected by the SFFS-based algorithm across the tenfold iterations were those related to mass shape, isodensity, and presence of fat, which is consistent with the image features that radiologists frequently use in the clinical environment for mass classification. The study also indicated that accurately computing mass spiculation features from projection mammograms was difficult, and that these features performed poorly in the mass classification task because of tissue overlap within the benign mass regions.
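For readers less familiar with the AUC figure of merit, the sketch below computes an empirical AUC as the probability that a randomly chosen malignant case receives a higher classifier score than a randomly chosen benign case, with ties counted as one half. The function name `empirical_auc` and the example scores are hypothetical. This is one common nonparametric estimate and may differ from the ROC fitting procedure used in the study.

```python
from typing import Sequence


def empirical_auc(malignant_scores: Sequence[float],
                  benign_scores: Sequence[float]) -> float:
    """Fraction of (malignant, benign) pairs ranked correctly; ties count 0.5."""
    if not malignant_scores or not benign_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for m in malignant_scores:
        for b in benign_scores:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(malignant_scores) * len(benign_scores))


# Hypothetical classifier outputs: 8 of the 9 pairs are ranked correctly.
print(empirical_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2]))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two classes.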
Conclusion
This comprehensive feature analysis provided new and valuable information for optimizing computerized mass classification schemes, which may be useful as a “second reader” in future clinical practice.
Acknowledgments
This work was supported in part by the National Cancer Institute, National Institutes of Health, under Grant R01CA160205.
Conflict of interest
Maxine Tan, Jiantao Pu and Bin Zheng declare they have no conflict of interest.
Cite this article
Tan, M., Pu, J. & Zheng, B. Optimization of breast mass classification using sequential forward floating selection (SFFS) and a support vector machine (SVM) model. Int J CARS 9, 1005–1020 (2014). https://doi.org/10.1007/s11548-014-0992-1
DOI: https://doi.org/10.1007/s11548-014-0992-1