Hello Transcribe 2.4 — Timestamps and more

29 May 2023

Introduction

I’ve had a couple of requests to add timestamps to transcriptions, and they are now available in version 2.4 on the App Store 🥳.

There are some other changes like new toolbars, on-demand model loading, and a rewritten underlying Whisper framework, which you can read more about below.

Timestamps

I’ve had to rewrite quite a bit of the underlying Whisper framework code to support timestamps, but this was also a good time to do some refactoring. The code is much cleaner now and ready for a macOS version of the app.

Pro users can show timestamps by default in Settings, or toggle them on demand with the “clock” button on the toolbar.

If you have timestamps enabled when sharing, they will be included in the output text:

[00:00.0-00:03.9] Today, I want to tell you three stories from my life.
[00:03.9-00:04.9] That's it.
[00:04.9-00:05.9] No big deal.
[00:05.9-00:08.4] Just three stories.
[00:08.4-00:13.4] The first story is about connecting the dots.
[00:13.4-00:17.9] I dropped out of Reed College after the first six months, but then stayed around as a drop-in
[00:17.9-00:22.0] for another 18 months or so before I really quit.
[00:22.0-00:24.8] So why'd I drop out?
[00:24.8-00:27.8] It started before I was born.
[00:27.8-00:33.0] My biological mother was a young, unwed graduate student, and she decided to put me up for
[00:33.0-00:34.9] adoption.
[00:34.9-00:39.4] She felt very strongly that I should be adopted by college graduates, so everything was all
[00:39.4-00:44.4] set for me to be adopted at birth by a lawyer and his wife.
[00:44.4-00:48.4] Except that when I popped out, they decided at the last minute that they really wanted
[00:48.4-00:50.4] a girl.
[00:50.4-00:55.1] So my parents, who were on a waiting list, got a call in the middle of the night asking,
[00:55.1-00:57.8] "We've got an unexpected baby boy.
[00:57.8-00:60.0] Do you want him?"
[00:60.0-01:01.4] They said, "Of course."
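
For anyone curious how lines in this shape can be produced, here is a minimal sketch in Swift. It is not the app's actual code: the `Segment` type and the `formatTimestamp` and `formatLine` functions are hypothetical names made up for illustration. The sketch rounds to tenths of a second before splitting into minutes and seconds, which is one way to keep a value such as 59.96 s from printing as 00:60.0.

```swift
// Hypothetical segment type; the real app's types are not shown in this post.
struct Segment {
    let start: Double  // seconds from the start of the audio
    let end: Double    // seconds from the start of the audio
    let text: String
}

// Pads a non-negative integer to at least two digits.
func twoDigits(_ n: Int) -> String {
    return n < 10 ? "0\(n)" : "\(n)"
}

// Formats seconds as mm:ss.t, working in whole tenths of a second
// so that rounding can carry over into the minutes field.
func formatTimestamp(_ seconds: Double) -> String {
    let totalTenths = Int((max(seconds, 0) * 10).rounded())
    let minutes = totalTenths / 600
    let secs = (totalTenths % 600) / 10
    let tenths = totalTenths % 10
    return "\(twoDigits(minutes)):\(twoDigits(secs)).\(tenths)"
}

// Builds one output line: [start-end] text
func formatLine(_ segment: Segment) -> String {
    return "[\(formatTimestamp(segment.start))-\(formatTimestamp(segment.end))] \(segment.text)"
}
```

Calling `formatLine(Segment(start: 3.9, end: 4.9, text: "That's it."))` gives a line in the same layout as the sample above.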

I haven’t implemented .vtt or .srt export of captions yet; that will come in version 2.5.
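
To give an idea of what those two caption formats look like, here is a rough Swift sketch that turns timed segments into SubRip (.srt) and WebVTT (.vtt) text. It is only an illustration of the file formats, not the planned implementation: `Cue`, `cueTime`, `makeSRT`, and `makeVTT` are hypothetical names. The main difference between the two is that SRT numbers each cue and uses a comma before the milliseconds, while WebVTT starts with a `WEBVTT` header and uses a period.

```swift
// Hypothetical cue type used only for this sketch.
struct Cue {
    let start: Double  // seconds
    let end: Double    // seconds
    let text: String
}

// Left-pads a non-negative integer with zeros to the given width.
func pad(_ n: Int, _ width: Int) -> String {
    let s = String(n)
    return String(repeating: "0", count: max(0, width - s.count)) + s
}

// Formats seconds as HH:MM:SS<separator>mmm, working in whole milliseconds.
func cueTime(_ seconds: Double, separator: Character) -> String {
    let ms = Int((max(seconds, 0) * 1000).rounded())
    let h = ms / 3_600_000
    let m = (ms % 3_600_000) / 60_000
    let s = (ms % 60_000) / 1000
    let f = ms % 1000
    return "\(pad(h, 2)):\(pad(m, 2)):\(pad(s, 2))\(separator)\(pad(f, 3))"
}

// SRT: a 1-based index, a timing line with comma separators, then the text.
func makeSRT(_ cues: [Cue]) -> String {
    var blocks: [String] = []
    for (index, cue) in cues.enumerated() {
        let start = cueTime(cue.start, separator: ",")
        let end = cueTime(cue.end, separator: ",")
        blocks.append("\(index + 1)\n\(start) --> \(end)\n\(cue.text)\n")
    }
    return blocks.joined(separator: "\n")
}

// WebVTT: a "WEBVTT" header, then cues with period separators.
func makeVTT(_ cues: [Cue]) -> String {
    var blocks: [String] = []
    for cue in cues {
        let start = cueTime(cue.start, separator: ".")
        let end = cueTime(cue.end, separator: ".")
        blocks.append("\(start) --> \(end)\n\(cue.text)\n")
    }
    return "WEBVTT\n\n" + blocks.joined(separator: "\n")
}
```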

New toolbars & UI updates

You will also notice new toolbars:

I’ve consolidated the controls on the left and options on the right on all the toolbars: file transcribe, dictate, and the main toolbar. These toolbars will be shared (I think) with the upcoming macOS version.

On-demand model loading

The transcription model will load on-demand instead of loading during app startup. This drastically reduces the memory footprint of the app while not transcribing, and makes the workflows simpler and less bug-prone, especially when working with Shortcuts or other external app launches.
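
The general idea of on-demand loading can be sketched in a few lines of Swift. This is not the app's code: `TranscriptionModel` and `ModelProvider` are hypothetical stand-ins, and the model name `"base"` in the usage note is just a placeholder. The provider holds nothing at startup, loads the model the first time it is asked for one, and reuses that instance afterwards.

```swift
// Hypothetical stand-in for a loaded transcription model.
final class TranscriptionModel {
    let name: String
    init(name: String) { self.name = name }
}

// Loads the model on first use and caches it for later requests.
// Not thread-safe as written; a real app would need to guard the cache,
// for example by making this an actor.
final class ModelProvider {
    private var cached: TranscriptionModel?
    private let load: () -> TranscriptionModel

    init(load: @escaping () -> TranscriptionModel) {
        self.load = load
    }

    // True once a model is held in memory.
    var isLoaded: Bool { return cached != nil }

    // Returns the cached model, loading it first if needed.
    func model() -> TranscriptionModel {
        if let existing = cached {
            return existing
        }
        let fresh = load()
        cached = fresh
        return fresh
    }

    // Drops the cached model so its memory can be reclaimed.
    func unload() {
        cached = nil
    }
}
```

With something like `let provider = ModelProvider { TranscriptionModel(name: "base") }`, nothing is loaded until the first call to `provider.model()`, so entry points such as Shortcuts can launch the app without paying the loading cost up front.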

Conclusion

Timestamps are available, and .vtt and .srt export is coming soon. The UI is simpler and uses more standard SwiftUI components, which will help with the macOS version.

I’ll also be looking into supporting the bigger Whisper models in the near future as there have been several requests.

P.S. If you don’t want to pay $6.99 for the Pro version, I’ll add you to the Beta list, where it’s free.

P.P.S. I’ve had reports of Shortcuts being broken on iOS 16.5. I’m looking into it now!