Whisper AI: Live English Subtitles for 96 Languages
- Level: beginner
- Room: terrace 2b
- Start:
- Duration: 30 minutes
Abstract
Whisper, OpenAI's speech recognition model, has been largely overlooked despite its impressive ability to accurately transcribe and translate human speech.
In this talk I will explore the model's architecture and explain why it works so well. I will also give a live demo of its capabilities in three languages, showing how you can use it on your own computer to generate English subtitles for a wide range of content.
Description
From this talk you will gain a basic understanding of OpenAI's Whisper AI, how it works, how it was trained, and how to run it and experiment with it yourself.
I will demonstrate how you can use Whisper to generate real-time English subtitles for 96 spoken languages. Not only are the subtitles displayed with minimal delay, but the solution is also suitable for meetings with sensitive content, as it runs entirely locally on your PC without relying on any third-party cloud services.
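To make "run locally" concrete, here is a minimal sketch using the open-source openai-whisper package; the model size and the audio file name are placeholder assumptions, not the exact setup used in the demo.

```python
# Minimal sketch: translate speech in an audio file to English, entirely offline,
# with the open-source `openai-whisper` package (pip install openai-whisper).
# The model size and file name below are illustrative assumptions.
import whisper

# Multilingual model; larger sizes translate more accurately but run slower.
model = whisper.load_model("small")

# task="translate" asks Whisper to output English regardless of the spoken language.
result = model.transcribe("meeting_recording.mp3", task="translate")

# Each segment carries start/end timestamps, which is exactly what subtitles need.
for segment in result["segments"]:
    print(f"[{segment['start']:7.2f}s -> {segment['end']:7.2f}s] {segment['text'].strip()}")
```

The same transcribe call with task="transcribe" produces same-language text instead of an English translation.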
I'll also share what I learned about Python's threading and queue modules, and how to use parallel processing to achieve real-time performance.
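For a flavour of that pattern, here is a hedged producer-consumer sketch built on the standard-library threading and queue modules; the chunk length, model size, and the use of the sounddevice package for microphone capture are my own illustrative choices rather than the talk's actual implementation.

```python
# Sketch of the producer-consumer pattern behind near-real-time subtitles:
# one thread records fixed-length audio chunks, another runs Whisper on them.
# Chunk length, model size, and the `sounddevice` package are assumptions
# made for illustration only.
import queue
import threading

import numpy as np
import sounddevice as sd
import whisper

SAMPLE_RATE = 16_000   # Whisper expects 16 kHz mono float32 audio
CHUNK_SECONDS = 5      # trade-off: shorter chunks = lower latency, less context

audio_chunks: "queue.Queue[np.ndarray]" = queue.Queue()

def record_loop() -> None:
    """Producer: capture microphone audio in fixed-length chunks."""
    while True:
        chunk = sd.rec(int(CHUNK_SECONDS * SAMPLE_RATE),
                       samplerate=SAMPLE_RATE, channels=1, dtype="float32")
        sd.wait()                      # block until the chunk is fully recorded
        audio_chunks.put(chunk.flatten())

def subtitle_loop(model: whisper.Whisper) -> None:
    """Consumer: translate each chunk to English and print it as a subtitle line."""
    while True:
        chunk = audio_chunks.get()     # blocks until the producer supplies audio
        result = model.transcribe(chunk, task="translate", fp16=False)
        text = result["text"].strip()
        if text:
            print(text)

if __name__ == "__main__":
    whisper_model = whisper.load_model("small")
    threading.Thread(target=record_loop, daemon=True).start()
    subtitle_loop(whisper_model)       # consumer runs in the main thread
```

In this arrangement the recording thread mostly waits on audio I/O while the other thread runs compute-heavy PyTorch code, so the two loops can overlap and new audio keeps arriving while the previous chunk is being translated.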