Abstract
Losing VoIP packets or speech frames degrades perceptual speech quality. The statistical relation between randomly lost speech frames and speech quality is well known. For bursty and rate-distortion optimized losses, however, a more precise quality model is required to relate losses to quality. In this paper, we present a model based on the loss impact, or importance, of single speech frames. We introduce a novel metric that calculates the impact of losing multiple frames by adding the importance of the respective single frames. This metric shows a high prediction accuracy for distant losses. For losses that follow each other closely, we present an aggregation function that models the psychoacoustic post-masking effect. Our model helps to develop networking algorithms that control the packet dropping process in networks. For example, we show that a proper packet dropping strategy can significantly increase the drop rate while maintaining the same level of speech quality.
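To make the additive idea concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes each frame already has an importance score (the values below are invented for illustration) and sums the scores of the lost frames. Because the abstract claims high accuracy of the additive metric only for distant losses, the hypothetical `choose_drops` helper keeps a minimum gap between dropped frames; the function names and the `min_gap` parameter are assumptions. The post-masking aggregation function for closely spaced losses is not modelled here, since the abstract does not give its form.

```python
from typing import List, Sequence


def additive_impact(importance: Sequence[float], lost: Sequence[int]) -> float:
    """Estimate the impact of losing several frames by summing the
    importance of each lost frame (the additive metric, assumed valid
    only when the losses are far apart)."""
    return sum(importance[i] for i in lost)


def choose_drops(importance: Sequence[float], n_drops: int, min_gap: int) -> List[int]:
    """Hypothetical dropping strategy: greedily pick the least important
    frames while keeping at least `min_gap` frames between any two drops,
    so that the additive estimate stays plausible."""
    # Visit frame indices from least to most important.
    order = sorted(range(len(importance)), key=lambda i: importance[i])
    chosen: List[int] = []
    for i in order:
        if len(chosen) == n_drops:
            break
        # Accept the frame only if it is far enough from every chosen drop.
        if all(abs(i - j) >= min_gap for j in chosen):
            chosen.append(i)
    return sorted(chosen)


if __name__ == "__main__":
    # Invented importance scores for eight consecutive frames.
    importance = [0.9, 0.1, 0.2, 0.8, 0.05, 0.7, 0.3, 0.6]
    drops = choose_drops(importance, n_drops=3, min_gap=2)
    print("dropped frames:", drops)
    print("estimated impact:", additive_impact(importance, drops))
    # For comparison: dropping three high-importance frames instead.
    print("impact of frames 0, 3, 5:", additive_impact(importance, [0, 3, 5]))
```

With the same number of drops, selecting low-importance frames yields a much smaller estimated impact than dropping high-importance ones, which is the intuition behind raising the drop rate without lowering speech quality.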
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hoene, C., Wiethölter, S., Wolisz, A. (2005). Calculation of Speech Quality by Aggregating the Impacts of Individual Frame Losses. In: de Meer, H., Bhatti, N. (eds) Quality of Service – IWQoS 2005. IWQoS 2005. Lecture Notes in Computer Science, vol 3552. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11499169_12
DOI: https://doi.org/10.1007/11499169_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26294-7
Online ISBN: 978-3-540-31659-6
eBook Packages: Computer Science (R0)