Converting spoken words into text

Transcription is one of the most useful techniques to apply to audio. It’s also fiendishly difficult, although models have gotten much better in recent years.

Use cases

If speech can be accurately converted into text, then anything we can do with text (search, summarization, translation and so on) we can also do with speech.

Try it out

Here we will try the large version of the Whisper model, which should give excellent results across many languages.
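As a minimal sketch of what this could look like, the snippet below loads a large Whisper checkpoint through the Hugging Face `transformers` pipeline and prints a timestamped transcript. Several things here are assumptions rather than part of the original text: the choice of the `transformers` library, the checkpoint name `openai/whisper-large-v3`, the file name `sample.wav`, and the helper names `transcribe` and `format_chunks`. Decoding a file from a path also typically requires `ffmpeg` to be installed.

```python
from typing import Any, Dict


def transcribe(audio_path: str, model_name: str = "openai/whisper-large-v3") -> Dict[str, Any]:
    """Transcribe an audio file with a Whisper checkpoint.

    The checkpoint name is an assumption; any Whisper checkpoint on the
    Hugging Face Hub should work. The model is downloaded on first use.
    """
    # Imported lazily so that format_chunks below can be used without
    # transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_name)
    # return_timestamps=True asks for segment-level timestamps; the result
    # then contains a "chunks" list next to the full "text".
    return asr(audio_path, return_timestamps=True)


def format_chunks(result: Dict[str, Any]) -> str:
    """Render the pipeline's "chunks" as one '[start-end] text' line each."""
    lines = []
    for chunk in result.get("chunks", []):
        start, end = chunk["timestamp"]
        # A timestamp can be None, e.g. when the audio is cut off mid-segment.
        start_label = f"{start:.1f}" if start is not None else "?"
        end_label = f"{end:.1f}" if end is not None else "?"
        lines.append(f"[{start_label}-{end_label}] {chunk['text'].strip()}")
    return "\n".join(lines)


if __name__ == "__main__":
    # "sample.wav" is a hypothetical file name; replace it with your own recording.
    result = transcribe("sample.wav")
    print(result["text"])
    print(format_chunks(result))
```

The large checkpoint is several gigabytes and runs slowly on CPU, so a smaller checkpoint such as `openai/whisper-tiny` may be more practical for a first experiment, at some cost in accuracy.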


State of the art

