Abstract
For the diagnosis of pathological voices, it is particularly important to examine the dynamic properties of the underlying vocal fold (VF) movements, which occur at a fundamental frequency of 100–300 Hz. To this end, a patient’s laryngeal oscillation patterns are captured with state-of-the-art endoscopic high-speed (HS) camera systems capable of recording 4000 frames/second. To date, the clinical analysis of these HS videos is commonly performed subjectively via slow-motion playback. Hence, the resulting diagnoses are inherently error-prone and exhibit high inter-rater variability. This paper presents an objective method for overcoming this drawback, which employs a quantitative description and classification approach based on a novel image analysis strategy called Phonovibrography. By extracting the relevant VF movement information from HS videos, the spatio-temporal patterns of laryngeal activity are captured using a set of specialized features. As a performance reference, conventional voice analysis features are also computed. The derived features are analyzed with different machine learning (ML) algorithms on clinically meaningful classification tasks. The applicability of the approach is demonstrated using a clinical data set comprising individuals with normophonic and paralytic voices. The results indicate that the presented approach is promising for providing reliable diagnosis support in the future.
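To make the sampling figures in the abstract concrete, the following minimal sketch computes how many video frames fall within one VF oscillation cycle at the stated frame rate of 4000 frames/second and fundamental frequencies of 100 Hz and 300 Hz. The function name `frames_per_cycle` and the constants are illustrative assumptions, not part of the paper's method.

```python
def frames_per_cycle(frame_rate_hz: float, fundamental_hz: float) -> float:
    """Return the number of frames recorded during one oscillation cycle."""
    if frame_rate_hz <= 0 or fundamental_hz <= 0:
        raise ValueError("frame rate and fundamental frequency must be positive")
    return frame_rate_hz / fundamental_hz


# Values taken from the abstract; the names are hypothetical.
HS_FRAME_RATE_HZ = 4000.0
F0_BOUNDS_HZ = (100.0, 300.0)

for f0 in F0_BOUNDS_HZ:
    n = frames_per_cycle(HS_FRAME_RATE_HZ, f0)
    print(f"{f0:.0f} Hz -> {n:.1f} frames per cycle")
```

Under these figures, each oscillation cycle is covered by roughly 13 to 40 frames, which is what allows individual cycles to be resolved in the recordings.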
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Voigt, D., Döllinger, M., Yang, A., Eysholdt, U., Lohscheller, J. (2009). Voice Pathology Classification by Using Features from High-Speed Videos. In: Combi, C., Shahar, Y., Abu-Hanna, A. (eds) Artificial Intelligence in Medicine. AIME 2009. Lecture Notes in Computer Science, vol 5651. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02976-9_44
DOI: https://doi.org/10.1007/978-3-642-02976-9_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02975-2
Online ISBN: 978-3-642-02976-9
eBook Packages: Computer Science (R0)