Abstract
This paper describes work carried out in the Subspeak project [1], in which we are investigating the use of speech recognition in live television subtitling. Research to date has shown that with current speech recognition technology it is not possible to achieve a satisfactory level of accuracy in the direct transcription of broadcast material. To circumvent this problem, in our system the broadcast speech is respoken by a native English speaker in a quiet environment. Recognition rates of up to 98% can be achieved by a trained speaker when there are no out-of-vocabulary words. However, using conventional keyboard input, subtitlers can currently achieve close to 100%, with typically only minor errors of spelling or punctuation. The challenge is therefore to provide a speech-based subtitling system that matches the conventional systems in accuracy and speed, but requires far less time to train subtitlers to use. Subtitles must typically be output at between 150 and 180 words per minute, and the delay between the broadcast speech and the appearance of the subtitle must be at most 8 seconds. In the prototype system, output from the speech recognition system is passed into a custom-built editor, from where it can be corrected and passed on to an existing subtitling system.
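The abstract states two quantitative targets: an output rate of 150–180 words per minute and an end-to-end delay of at most 8 seconds. A minimal sketch of how such constraints might be checked for a subtitle stream is shown below; the function and parameter names are illustrative, not part of the Subspeak system.

```python
def words_per_minute(word_count: int, duration_seconds: float) -> float:
    """Output rate of a subtitle segment in words per minute."""
    return word_count * 60.0 / duration_seconds


def meets_constraints(word_count: int,
                      duration_seconds: float,
                      delay_seconds: float,
                      min_wpm: float = 150.0,
                      max_wpm: float = 180.0,
                      max_delay: float = 8.0) -> bool:
    """True if a segment satisfies the rate and latency targets
    quoted in the abstract (150-180 wpm, delay <= 8 s)."""
    wpm = words_per_minute(word_count, duration_seconds)
    return min_wpm <= wpm <= max_wpm and delay_seconds <= max_delay


# A 60-second segment carrying 165 words with a 5 s delay is acceptable.
print(meets_constraints(165, 60, 5.0))   # True
# The same segment with a 10 s delay violates the 8 s latency target.
print(meets_constraints(165, 60, 10.0))  # False
```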
| Original language | English |
|---|---|
| Title of host publication | 6th International Conference on Spoken Language Processing, ICSLP 2000 |
| Publisher | International Speech Communication Association |
| Pages | 29-32 |
| Number of pages | 4 |
| ISBN (Electronic) | 7801501144, 9787801501141 |
| Publication status | Published - 1 Oct 2000 |
| Event | 6th International Conference on Spoken Language Processing, ICSLP 2000 - Beijing, China. Duration: 16 Oct 2000 → 20 Oct 2000 |
Conference

| Conference | 6th International Conference on Spoken Language Processing, ICSLP 2000 |
|---|---|
| Country/Territory | China |
| City | Beijing |
| Period | 16/10/00 → 20/10/00 |