Although Whisper’s transcription is highly accurate, jargon (such as “GPT”) and non-standard spellings still introduce errors into the transcript. For example, “Dave Prior” is a podcast host, and the transcription spells his last name “Pryor.” What are some ways to improve the transcription?
There are three usual ways to improve Whisper’s transcription:
I’d suggest the options above are listed in order of increasing difficulty. If Whisper is having trouble with your accent or with how you pronounce acronyms, then fine-tuning will be the best solution. The first two options are attractive because the prompts can be built dynamically.
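To illustrate what building a prompt dynamically could look like, here is a minimal sketch. It assumes the prompt in question is the `prompt` parameter of the OpenAI transcription endpoint, and that you keep your own list of correct spellings per recording. The helper names `build_whisper_prompt` and `transcribe`, and the `max_chars` budget, are hypothetical and only for illustration.

```python
from typing import Iterable


def build_whisper_prompt(terms: Iterable[str], max_chars: int = 800) -> str:
    """Join known-correct spellings into a short glossary-style prompt.

    max_chars is an arbitrary illustrative budget, not an API limit.
    """
    seen = set()
    kept = []
    used = 0
    for term in terms:
        term = term.strip()
        key = term.lower()
        # Skip blanks and case-insensitive duplicates.
        if not term or key in seen:
            continue
        # Account for the ", " separator on every term after the first.
        cost = len(term) + (2 if kept else 0)
        if used + cost > max_chars:
            break
        seen.add(key)
        kept.append(term)
        used += cost
    return ", ".join(kept)


def transcribe(path: str, terms: Iterable[str]) -> str:
    """Send one audio file to the API with a per-recording prompt."""
    # Imported lazily so the prompt builder works without the package.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with open(path, "rb") as audio:
        result = client.audio.transcriptions.create(
            model="whisper-1",
            file=audio,
            prompt=build_whisper_prompt(terms),
        )
    return result.text
```

The term list could come from anywhere that varies per recording, such as the episode's guest list or a company glossary, for example `transcribe("episode.mp3", ["Dave Prior", "GPT"])`. Whisper only considers a limited amount of prompt text, so it is worth keeping the list short and putting the most important spellings first.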
With long recordings (up to 40 minutes so far), I’ve gotten option 3 to transcribe with 100% accuracy, with company names, people’s names, and acronyms all spelled correctly.