Real-time speech-generated subtitles: Problems and solutions

J. Hewitt, A. Bateman, A. Lambourne, A. Ariyaeeinia, P. Sivakumaran

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution



This paper reports on work carried out in the Subspeak project [1], in which we are investigating the use of speech recognition in live television subtitling. Research to date has shown that, with current speech recognition technology, it is not possible to achieve a satisfactory level of accuracy in the direct transcription of broadcast material. To circumvent this problem, in our system the broadcast speech is respoken by a native English speaker in a quiet environment. A trained speaker can achieve recognition rates of up to 98% where there are no out-of-vocabulary words. However, using conventional keyboard input, subtitlers can currently achieve close to 100%, typically with only minor errors of spelling or punctuation. The challenge is therefore to provide a speech-based subtitling system that matches the conventional systems in accuracy and speed, but that requires far less time to train subtitlers to use. Subtitles must typically be output at between 150 and 180 words per minute, and the delay between the broadcast speech and the appearance of the subtitle must be at most 8 seconds. In the prototype system, output from the speech recognition system is passed into a custom-built editor, where it can be corrected and passed on to an existing subtitling system.
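To make the constraints in the abstract concrete, the short sketch below works through the arithmetic they imply. It is illustrative only: the function names are hypothetical, and treating the 98% recognition rate as a per-word accuracy is an assumption, since the abstract does not say how the rate is measured.

```python
# Figures taken from the abstract; helper names are hypothetical.
MIN_WPM, MAX_WPM = 150, 180   # required subtitle output rate (words per minute)
MAX_DELAY_S = 8               # maximum delay between speech and subtitle (seconds)
BEST_ACCURACY = 0.98          # best-case recognition rate for a trained respeaker


def words_in_flight(wpm: float, delay_s: float) -> float:
    """Number of words spoken during the allowed delay window."""
    return wpm * delay_s / 60


def errors_per_minute(wpm: float, accuracy: float) -> float:
    """Expected misrecognised words per minute.

    Assumption: accuracy is interpreted as a per-word rate.
    """
    return wpm * (1 - accuracy)


if __name__ == "__main__":
    for wpm in (MIN_WPM, MAX_WPM):
        print(
            f"{wpm} wpm: {words_in_flight(wpm, MAX_DELAY_S):.0f} words within the delay, "
            f"about {errors_per_minute(wpm, BEST_ACCURACY):.1f} errors per minute"
        )
```

Under these assumptions, roughly 20 to 24 words are spoken within the 8-second window, and even at the best-case rate about 3 to 4 words per minute would need correcting in the editor before output.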

Original language: English
Title of host publication: 6th International Conference on Spoken Language Processing, ICSLP 2000
Publisher: International Speech Communication Association
Number of pages: 4
ISBN (Electronic): 7801501144, 9787801501141
Publication status: Published - 1 Oct 2000
Event: 6th International Conference on Spoken Language Processing, ICSLP 2000 - Beijing, China
Duration: 16 Oct 2000 to 20 Oct 2000



