Hello Transcribe 2

09 Feb 2023

Introduction

Hello Transcribe 2 is on the App Store 🥳. If you updated recently, please update again to 2.0.5, because 2.0.4 had a bug :/

This is a major update, with the addition of 5 more Whisper models, better multilingual support, transcription persistence (including iCloud sync), and in-app purchases.

Let’s go…

New models and multilingual support

New Models

Version 1 shipped with one Whisper model: “tiny”.

Version 2 has 5 more models: “tiny.en”, “base”, “base.en”, “small”, and “small.en”. The “.en” variant of each model is the same size and runs at the same speed as its multilingual counterpart, but is English-only. For English speech, the “.en” models will give better accuracy than the multilingual versions.

Even though “tiny” is a multilingual model, it doesn’t perform very well on languages other than English. Here are some WER (word error rate) scores from the Whisper paper for comparison:

The “tiny” model is included in the app. If you want to use the other models, they have to be downloaded. All these new options are under the “model” icon on the toolbar:

Which will bring up this screen:

Above you will see three of the six models:

The “tiny” model. This model is included and cannot be deleted.
The “tiny.en” model. This model has been downloaded and is currently active. You can delete it if it’s not currently active.
The “base” model. This model is a “Pro” model and must be downloaded to use.

What is a “Pro” model? I have added In-App Purchases for “Pro” models. You can read more under the “In-App Purchases” section below.

The models are progressively slower and bigger, but more accurate. I use the “base” model on my iPhone 14 Pro and it gives good results in a reasonable amount of time.
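The rules above (which models are bundled, which are “Pro”, and when a model can be deleted) can be summarised in a small sketch. This is illustrative Python, not the app’s actual Swift code; the `Model` type, the `CATALOG` list and the `can_delete` helper are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    bundled: bool  # shipped inside the app, so no download is needed
    pro: bool      # needs Pro minutes or a subscription


# The six models named in this post (hypothetical representation).
CATALOG = [
    Model("tiny", bundled=True, pro=False),
    Model("tiny.en", bundled=False, pro=False),
    Model("base", bundled=False, pro=True),
    Model("base.en", bundled=False, pro=True),
    Model("small", bundled=False, pro=True),
    Model("small.en", bundled=False, pro=True),
]


def is_english_only(model: Model) -> bool:
    """The ".en" variants are English-only; the rest are multilingual."""
    return model.name.endswith(".en")


def can_delete(model: Model, downloaded: set[str], active: str) -> bool:
    """A model can be deleted only if it is downloaded, not bundled, and not active."""
    return (not model.bundled) and model.name in downloaded and model.name != active
```

With “tiny.en” downloaded and active, `can_delete` returns `False` for both “tiny” (bundled) and “tiny.en” (active), which matches the screen described above.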

Multilingual support

There are two new options for multilingual support:

Audio language: This can be set to “auto”, and Whisper will try to detect the language for you. You can change it to a specific language (e.g. “German”) if the detection doesn’t work for you. I’ve actually just realised the detected language isn’t shown anywhere… I will have to add it in the next release.
Translate to English: If this option is enabled it will use Whisper’s built-in translation feature. If it is left disabled the output will be in the audio language.

If you receive audio or a voice note in a foreign language and you only speak English, you can translate it!
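To make the two options concrete, here is a minimal sketch of how they could map onto Whisper’s decoding settings. The `language` and `task` names follow the keyword arguments of the open-source `whisper` Python package’s `transcribe` function; the app itself is built on Whisper.cpp, so how it wires these settings internally is an assumption here. `LanguageOptions` and `to_whisper_kwargs` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LanguageOptions:
    audio_language: str = "auto"       # "auto", or a specific language such as "German"
    translate_to_english: bool = False


def to_whisper_kwargs(opts: LanguageOptions) -> dict:
    """Turn the two UI options into Whisper-style decoding arguments."""
    # language=None asks Whisper to detect the spoken language itself.
    language: Optional[str] = None if opts.audio_language == "auto" else opts.audio_language
    # "translate" produces English output; "transcribe" keeps the audio language.
    task = "translate" if opts.translate_to_english else "transcribe"
    return {"language": language, "task": task}
```

For a German voice note that you want to read in English, `to_whisper_kwargs(LanguageOptions("German", True))` gives `{"language": "German", "task": "translate"}`.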

If you speak a language other than English, please let me know how it goes. I can speak Afrikaans, but Whisper’s performance on Afrikaans is not great. I’m particularly interested in how it does on the languages listed in the Whisper paper mentioned above, i.e. Dutch, French, German, Italian, Polish, Portuguese, and Spanish.

Save and iCloud sync

You can now save your transcriptions by tapping the “save” icon in the result view:

This will save the transcription on your device, which also backs it up to iCloud. If you choose not to save the result, it never leaves your device. If you have iCloud encryption enabled, the result is encrypted.

The home screen will now show all your saved transcriptions:

This view shows the date and time, which model was used, which language options were used, and the length of the audio.

You can use the standard iOS “Edit” action to delete, or swipe left to delete. To see the whole transcription, tap on the summary bubble, where you can also share, copy, or delete it:

In-App Purchases

This is the section I am most uncertain about.

Several people have encouraged me to add In-App Purchases (IAP) so I can continue improving the app. This would be great; I would happily work more on this if it makes financial sense. I also want to support Georgi Gerganov for his amazing work on Whisper.cpp, even though he hasn’t asked for anything. So I’ve added payments.

The IAP strategy is fairly simple: Add payments for using larger or “Pro” models, i.e. “base”, “base.en”, “small”, “small.en”.

“tiny” remains 100% free. “tiny.en” is also free.

There are two payment options:

Pay-per-hour: You can buy 60 minutes for $1. You also get 60 Pro minutes for free.
Subscribe: Unlimited Pro usage for $2 per month, or the equivalent in your location, with a 2-week free trial.

The alternative I could implement is a once-off unlimited Pro purchase for N dollars, which I’m considering. Thinking about it now, there’s nothing to prevent me from adding that option, so maybe I should just add it. Again, feedback is very welcome.
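As a rough guide to choosing between the two existing options, here is a small break-even sketch using the US prices quoted above. It assumes minutes are bought in whole 60-minute packs, and it ignores the free 60 minutes, the free trial, and any leftover minutes carried into the next month; the function names are made up for this example.

```python
PAY_PER_HOUR_USD = 1.00  # price of one 60-minute pack
SUBSCRIPTION_USD = 2.00  # price of one month of unlimited Pro usage
PACK_MINUTES = 60


def monthly_pack_cost(pro_minutes: int) -> float:
    """Cost of covering one month's Pro usage with whole 60-minute packs."""
    packs = -(-pro_minutes // PACK_MINUTES)  # ceiling division
    return packs * PAY_PER_HOUR_USD


def cheaper_option(pro_minutes: int) -> str:
    """Compare buying packs against the monthly subscription."""
    cost = monthly_pack_cost(pro_minutes)
    if cost < SUBSCRIPTION_USD:
        return "pay-per-hour"
    if cost > SUBSCRIPTION_USD:
        return "subscribe"
    return "either"
```

Under these assumptions, up to an hour of Pro transcription per month is cheaper with pay-per-hour, and anything over two hours is cheaper with the subscription.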

The toolbar will show how many minutes you have left:

If you have a free model selected, or you have a subscription, it will show ∞. If you tap on that icon, it will bring up the store view where you can select an option:

If you are willing to be a Beta tester then you can have everything for free - just email me and ask!
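The toolbar rule described above (∞ for a free model or a subscription, otherwise the minutes left) can be sketched like this. It is an illustrative Python version rather than the app’s Swift code, and `toolbar_label` and its parameters are hypothetical names.

```python
def toolbar_label(model_is_pro: bool, subscribed: bool, minutes_left: int) -> str:
    """Text shown next to the store icon on the toolbar."""
    # Free models and subscriptions are not limited by purchased minutes.
    if not model_is_pro or subscribed:
        return "∞"
    # Otherwise show the remaining Pro minutes, never a negative number.
    return str(max(minutes_left, 0))
```

For example, with a Pro model selected, no subscription, and 42 minutes left, the label is `"42"`; switching to “tiny” changes it to `"∞"`.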

Conclusion and next

The app is maturing, I am getting great feedback from users, and I’m enjoying working in the Apple/Swift/SwiftUI ecosystem. I hope I can generate some revenue and work on it more.

Here are some suggestions for features to add; let me know what you want:

A voice keyboard so you can input text using your voice.
Support for Shortcuts.
An Apple Watch app.
Speaker labelling (e.g. in a meeting scenario).

Aside from that, I will continue to work on performance improvements and improvements to the core transcription engine.

Follow me on Twitter for updates.