Abstract
In this paper, we present a new sign language machine translation approach based on a regression method. The aim of this work is to improve the translation quality and accuracy of existing regularized regression methods. Our approach provides a methodological foundation for small-scale corpus domains such as sign language machine translation. The method is based on elastic net regularization, which uses a linear combination of the L1 and L2 penalties of the lasso and ridge methods. We show that combining the de Bruijn graph with the Latent Semantic Analysis technique in the decoding process improves the translation results. The system is evaluated on American Sign Language parallel corpora containing 300 sentences and assessed with the BLEU, METEOR, NIST and F1-measure machine translation evaluation metrics. We obtained good experimental results compared with the classical phrase-based approach, i.e., the MOSES framework. Our approach also improved the translation results compared with the lasso and ridge regression approaches.
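To make the penalty described in the abstract concrete, the following is a minimal sketch of an elastic net objective, not the authors' implementation. The function names and the (`lam`, `alpha`) parameterization are assumptions chosen for illustration; the paper may weight the L1 and L2 terms differently.

```python
from typing import Sequence


def elastic_net_penalty(w: Sequence[float], lam: float, alpha: float) -> float:
    """Return lam * (alpha * ||w||_1 + (1 - alpha) * ||w||_2^2).

    alpha = 1 reduces to the lasso (L1) penalty and alpha = 0 to the
    ridge (L2) penalty. This parameterization is an illustrative assumption.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    l1 = sum(abs(x) for x in w)        # L1 norm of the weights
    l2_squared = sum(x * x for x in w)  # squared L2 norm of the weights
    return lam * (alpha * l1 + (1.0 - alpha) * l2_squared)


def elastic_net_objective(
    X: Sequence[Sequence[float]],
    y: Sequence[float],
    w: Sequence[float],
    lam: float,
    alpha: float,
) -> float:
    """Residual sum of squares of the linear model X w, plus the penalty."""
    residuals = [
        yi - sum(xij * wj for xij, wj in zip(row, w))
        for row, yi in zip(X, y)
    ]
    rss = sum(r * r for r in residuals)
    return rss + elastic_net_penalty(w, lam, alpha)
```

Setting `alpha` strictly between 0 and 1 mixes the two penalties, which is the linear combination the abstract refers to: the L1 term encourages sparse weights while the L2 term keeps the solution stable when features are correlated.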
Cite this article
Boulares, M., Jemni, M. Learning sign language machine translation based on elastic net regularization and latent semantic analysis. Artif Intell Rev 46, 145–166 (2016). https://doi.org/10.1007/s10462-016-9460-3
DOI: https://doi.org/10.1007/s10462-016-9460-3