JCP 2016 Vol.11(1): 33-40 ISSN: 1796-203X
doi: 10.17706/jcp.11.1.33-40
Enhancements in Statistical Spoken Language Translation by De-normalization of ASR Results
Agnieszka Wołk1, Krzysztof Wołk2, Krzysztof Marasek2
1Department of Cybernetics, Military University of Technology, Kaliskiego 2, Warsaw, Poland.
2Polish-Japanese Institute of Information Technology, Koszykowa 86, Warsaw, Poland.
Abstract—Spoken language translation (SLT) has become very important in an increasingly globalized world. Machine translation (MT) for automatic speech recognition (ASR) systems is a major challenge of great interest. This research investigates automatic sentence segmentation of speech, which is important for enriching speech recognition output and for aiding downstream language processing. The article focuses on automatic sentence segmentation of speech and on improving MT results. We explore the problem of identifying sentence boundaries in transcriptions produced by ASR systems for the Polish language. We also experiment with reverse normalization (de-normalization) of the recognized speech samples.
Index Terms—Machine translation, de-normalization, NLP.
Cite: Agnieszka Wołk, Krzysztof Wołk, Krzysztof Marasek, "Enhancements in Statistical Spoken Language Translation by De-normalization of ASR Results," Journal of Computers vol. 11, no. 1, pp. 33-40, 2016.
General Information
ISSN: 1796-203X
Abbreviated Title: J.Comput.
Frequency: Bimonthly
Editor-in-Chief: Prof. Liansheng Tan
Executive Editor: Ms. Nina Lee
Abstracting/Indexing: DBLP, EBSCO, ProQuest, INSPEC, ULRICH's Periodicals Directory, WorldCat, etc.
E-mail: jcp@iap.org