Whisper, OpenAI's open-source speech-recognition model, has been largely overlooked despite its impressive ability to accurately transcribe and translate human speech from audio.
In this talk I will explore the model's architecture and explain why it works so well. I will also give a live demo of its capabilities in three languages, showing how you can use it on your own computer to generate English subtitles for a wide range of content.
From this talk you will gain a basic understanding of Whisper: how it works, how it was trained, and how to run it and experiment with it yourself.
I will demonstrate how you can use Whisper to generate real-time English subtitles for 97 spoken languages. Not only are the subtitles displayed with minimal delay, but this solution is also suitable for meetings with sensitive content, as it runs locally on your PC without relying on any third-party cloud service.
I'll also share what I learned about Python's threading and queue modules, and how to use them to overlap audio capture and transcription for real-time performance.
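As a rough sketch of that idea (not the exact code from the talk), a capture thread can push audio chunks onto a queue.Queue while a consumer thread transcribes them, so the two stages run concurrently. The transcribe function below is a hypothetical stand-in for the real Whisper call:

```python
import queue
import threading

def transcribe(chunk):
    # Hypothetical stand-in for a real Whisper transcription call;
    # here it just tags the chunk so the pipeline is runnable.
    return f"subtitle for {chunk}"

audio_chunks = queue.Queue()  # hands chunks from capture to transcription
subtitles = []

def capture():
    # In the real pipeline this would read from the microphone;
    # here we enqueue dummy chunk labels instead.
    for i in range(3):
        audio_chunks.put(f"chunk-{i}")
    audio_chunks.put(None)  # sentinel: no more audio

def consume():
    while True:
        chunk = audio_chunks.get()
        if chunk is None:  # sentinel reached, stop consuming
            break
        subtitles.append(transcribe(chunk))

producer = threading.Thread(target=capture)
consumer = threading.Thread(target=consume)
producer.start()
consumer.start()
producer.join()
consumer.join()

print(subtitles)  # subtitles appear in capture order
```

Because Whisper's heavy lifting happens in native code that releases the GIL, this thread-based pipeline keeps the capture side responsive while transcription runs.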