Development of an ASR-Based Subtitle Generation System for Lecture Videos to Improve Searchability

Abstract

This study focused on improving the searchability of video-based online learning materials, specifically Japanese lecture videos. A prototype semantic search API was developed to enhance search functionality using automatic speech recognition (ASR) and text embeddings. The system employs OpenAI Whisper to generate subtitles from uploaded videos, and text embeddings of the resulting subtitles were generated using two models. The embeddings were stored in a vector database, enabling semantic search by calculating the similarity between query embeddings and the stored embeddings. The system was evaluated on macOS using mlx-whisper, a version of Whisper optimized for Apple silicon. The preliminary evaluation demonstrated high ASR accuracy and efficient embedding generation, with the ruri-large model in particular providing more relevant search results for Japanese lecture videos.
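The similarity-ranking step described in the abstract can be illustrated with a minimal sketch. The toy vectors, subtitle texts, and function names below are illustrative assumptions, not the paper's implementation; in the actual system the stored vectors would be embeddings of Whisper-generated subtitles produced by a model such as ruri-large, and a vector database would perform the ranking.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def semantic_search(query_vec, store, top_k=3):
    # Rank stored (subtitle_text, embedding) pairs by similarity to the query.
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in store]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]

# Toy 3-dimensional embeddings standing in for real subtitle embeddings.
store = [
    ("Introduction to neural networks", [0.9, 0.1, 0.0]),
    ("Gradient descent explained",      [0.7, 0.6, 0.1]),
    ("Course administration notes",     [0.0, 0.2, 0.9]),
]
query = [0.8, 0.2, 0.0]  # embedding of a hypothetical search query
results = semantic_search(query, store, top_k=2)
```

Because ranking depends only on vector similarity rather than exact keyword matches, a query can retrieve lecture segments that are semantically related even when the wording differs from the subtitles.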



Author Information
Koichi Yoshizaki, Oita University, Japan

Paper Information
Conference: SEACE2025
Stream: Innovation & Technology

This paper is part of the SEACE2025 Conference Proceedings.


To cite this article:
Yoshizaki, K. (2025). Development of an ASR-Based Subtitle Generation System for Lecture Videos to Improve Searchability. In The Southeast Asian Conference on Education 2025: Official Conference Proceedings (pp. 591-596). ISSN: 2435-5240. https://doi.org/10.22492/issn.2435-5240.2025.49
To link to this article: https://doi.org/10.22492/issn.2435-5240.2025.49
