A Comparative Study on Enhancing the Accuracy of Chinese Speech-to-Text in Instructional Videos Using Large Language Models

Yang, Chih Chang; Chou, Tzren-Ru; Liu, Shu Wei

Posted by James Alexander Gordon on 30th August 2024

Author Information

Chih Chang Yang, National Taiwan Normal University, Taiwan
Tzren-Ru Chou, National Taiwan Normal University, Taiwan
Shu Wei Liu, National Taiwan University of Science and Technology, Taiwan

Abstract

With the rapid development of speech recognition technology, Chinese speech-to-text (STT) systems have come to play an important role in subtitle production and are often used for instructional videos. However, because of the complexity of the Chinese language and its large number of homophones, the accuracy of existing STT systems still leaves significant room for improvement. In this study, we propose two optimization methods based on large language models (LLMs) to improve the accuracy of Chinese STT: language model-assisted editing and fine-tuned language model-assisted text editing. We verified both methods by producing subtitles for instructional videos in various domains and calculating the Levenshtein distance between text strings using dynamic programming. The results indicate that the fine-tuned language model-assisted text editing approach is significantly better than the language model-assisted editing approach in terms of text accuracy, and that it can yield fine-tuning strategies targeting specific language characteristics, recognizing language nuances more efficiently and thus significantly improving the accuracy of Chinese STT systems.
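
As an illustration of the evaluation metric named in the abstract, the following is a minimal sketch of the Levenshtein distance computed with dynamic programming. The abstract does not give the authors' implementation or say how the distance is normalized, so the function names, the pairing of a reference transcript with an STT hypothesis, the character error rate helper, and the sample strings are all assumptions made for illustration.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Edit distance between two strings via dynamic programming.

    Insertions, deletions, and substitutions each cost 1. Only two rows of
    the DP table are kept in memory at any time.
    """
    # previous[j] = distance between the first i-1 characters of `reference`
    # and the first j characters of `hypothesis`.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]  # turning i reference characters into "" takes i deletions
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Hypothetical normalization (an assumption, not taken from the paper):
    edit distance divided by the length of the reference text."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical homophone error: 再 (zài, "again") transcribed as 在 (zài, "at").
    reference = "我們再見"
    hypothesis = "我們在見"
    print(levenshtein(reference, hypothesis))           # 1
    print(character_error_rate(reference, hypothesis))  # 0.25
```

Because Chinese text is not separated by spaces, comparing at the character level, as this sketch does, is a common choice; a single homophone substitution then counts as exactly one edit.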

Category: Design, Implementation & Assessment of Innovative Technologies in Education
