Talking is faster than typing. In 2016, a study by researchers from Stanford University, the University of Washington, and Baidu found that speech dictation was around three times faster than touchscreen typing on mobile devices. The experimenters used short phrases in both English and Mandarin Chinese for their test.
Accuracy matters too, and speech recognition accuracy has improved significantly since 2015, when Google touted a word error rate of 8%. In late 2016, Microsoft claimed an error rate of 6.3%. A few months later, in March 2017, IBM announced that it had reduced its recognition error rate to 5.5%. Then, in May 2017, Google announced a speech recognition error rate of 4.9%. Amazon, Apple, Baidu, Nuance, and others are also competing to build the most accurate speech recognition.
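For readers unfamiliar with the metric, word error rate is commonly computed as the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference text, divided by the number of words in the reference. The Python sketch below is a minimal illustration of that general definition, not the scoring method any of the vendors above used. The function name `word_error_rate` and the sample sentences are hypothetical and chosen only for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference word count.

    Illustrative only: real evaluations also normalize punctuation,
    numerals, and casing in ways this sketch does not.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical sample sentences (not the phrase used in the article's test).
reference_text = "we met at noon and left as planned"
transcript = "we mad at noon and left is planned"
print(f"WER: {word_error_rate(reference_text, transcript):.1%}")
```

By this measure, an 8% word error rate means roughly one wrong word in every twelve or thirteen, while 4.9% is closer to one in twenty.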
I'd done my own (informal) speech recognition test in 2015, when I tested the native speech dictation capabilities on an iPhone and in Google Docs with both Google's own voice-typing system and a third-party service.
In June 2017, I tested four different speech recognition systems using the same two-sentence phrase I'd used in 2015: Apple's Siri voice dictation system on iOS, Google's Gboard keyboard voice typing, the "Voice typing" option in Google Docs (used on a Chromebook), and Nuance Communications' Dragon Anywhere app. All of these systems are free, except Dragon Anywhere, which costs $15 per month. The Dragon Anywhere app supports voice edits to a document on a mobile device, as does Google's "Voice typing" on a desktop.
I spoke the same phrase I used in 2015 (see image), which included spoken punctuation: I said "...interest colon science period," hoping to see "interest: science." appear. Both Gboard's voice dictation option and Dragon Anywhere captured and transcribed the sentences accurately. The only difference was that Gboard's dictation system displayed the word "twelve" instead of "12" following the year 1660. The word may be a better choice than the number, since two numbers in sequence might confuse a reader. Google Docs voice typing on a Chromebook made one error ("Mint" instead of "met"), while the native iOS dictation system produced two ("mad" and "is" instead of "met" and "as").
I experimented with dictating other phrases, as well. None of the transcription systems delivered 100% accuracy all the time, but all of the options created usable transcriptions with only a few minor errors.
Touchscreen input was slowest for me, with little difference between tapping on Apple's native keyboard and swiping words with Gboard. The touchscreen keyboard methods took me a little over a minute and a half to enter the 41 words of text accurately. Apple's autocomplete performed well, while I needed some extra time to correct word errors when swiping with Gboard.
I typed the text with a physical keyboard in about 35 seconds, roughly one-third the time it took me to enter it on a touchscreen keyboard. A proficient typist could likely type it even faster.
Talking was the fastest way to input text. That surprised me. I tried several different phrases to make sure it wasn't an anomaly. It wasn't. Every time, speech dictation was the fastest way to enter text: it took me about 18 seconds to say the sentences, roughly half the time I needed on a physical keyboard.
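To put those timings on a common scale, here is a small Python sketch that converts them to words per minute. The 41-word count and the 35- and 18-second figures come from the test described above. The 95-second touchscreen figure is an assumption standing in for "a little over a minute and a half," and the names `words_per_minute` and `timings_seconds` are made up for this illustration.

```python
WORD_COUNT = 41  # length of the test phrase

# Seconds needed to enter the phrase with each method.
# The touchscreen value is an assumed approximation, not a measured figure.
timings_seconds = {
    "touchscreen keyboard": 95,
    "physical keyboard": 35,
    "speech dictation": 18,
}


def words_per_minute(words: int, seconds: float) -> float:
    """Convert a word count and an elapsed time in seconds to words per minute."""
    return words / seconds * 60


slowest = max(timings_seconds.values())
for method, seconds in timings_seconds.items():
    wpm = words_per_minute(WORD_COUNT, seconds)
    speedup = slowest / seconds
    print(f"{method:22s} {wpm:6.1f} wpm  ({speedup:.1f}x the slowest method)")
```

Under these assumptions, dictation works out to roughly 137 words per minute, compared with about 70 on a physical keyboard and about 26 on a touchscreen.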
Of course, voice input isn't always optimal. If you're trying to figure out what to say, then speed doesn't really matter. And in a noisy or public environment, many people may prefer to swipe or tap.
Talk, then edit
If you currently use a touchscreen keyboard for email and/or messaging on your phone, a switch to speech input might save you a significant amount of time.
My experiment convinced me I need to type less on my phone. For long documents, I'll reach for a physical keyboard. But for most email and messages, I should tap the microphone, talk, then review and correct any errors.
Have you recently tested how long it takes you to talk vs. type text? (Try it!) Let me know how accurate and fast you've found speech recognition systems to be for you.
Andy Wolber helps people understand and leverage technology for social impact. He resides in Ann Arbor, MI with his wife, Liz, and daughter, Katie.