You’re trying to pass off the greasy bean burritos at your desk as a working lunch; the final edit on the new software documentation is due this afternoon. Your fingers are covered in—what is that, anyway?—and you’ve just spotted an error. Napkin, napkin, who’s got the napkin?
It might not matter as much as you think.
Voice recognition software (VRS), once a clunky, cumbersome product that caused more aggravation than it was worth, has evolved into a sleek tool that can increase your productivity significantly. If you’re a sloppy eater, a slow typist, or just into cool new gadgets, add VRS to your bag of tricks and watch the magic happen.
Extremely fast typists—those who type more than 120 words a minute—may not benefit as much from VRS as other users, but there are still advantages to using the software. Power multitasking is easier; the latest release of ViaVoice offers complete desktop navigation. Switch between applications, tab between fields, enter data, and edit your documents without touching your keyboard or mouse. Your hands are free to dial the phone, jot a note in your planner, or unwrap your Big Mac—and your work is still getting done.
“Not everyone who uses ViaVoice is looking to replace the keyboard and mouse,” said Toby Maners, program director in the retail voice segment for IBM pervasive computing. “Often, our users are looking to give themselves another option, an additional way to input information.”
Maners said IBM has worked on advancing speech recognition technology since the early 1970s, although the first commercial products didn’t become available until the early ’90s. Just a decade ago, the limited technology available required “discrete dictation”; users had to pause between words.
Maners herself began using the software in 1994. The biggest leap, she said, happened three years later when the product went to continuous dictation, eliminating the need for constant pauses. “It’s a much more natural way to speak,” she said.
VRS has continued to improve dramatically. IBM added more vocabulary (the “basic” version of ViaVoice recognizes about 160,000 words but can be taught up to half a million), and increased processor speeds and memory provide ever-enhanced accuracy (95 percent to 98 percent for users who complete an initial “enrollment” session).
The enrollment session used to be a big turnoff, and many would-be users today are deterred by what they expect to be a lengthy process. “We’ve significantly reduced the amount of training time needed,” Maners said. “It takes about 15 minutes to dictate an initial document, and it takes the software 10 to 15 minutes to process your enrollment. That’s it.”
Part of the training, she added, is really on the human side: Dictation is slightly different from speaking. Users have to be taught to say, “period, new paragraph, tab,” because the software doesn’t automatically add them. “Just as people once had to learn to think while composing e-mail, rather than first writing a longhand draft, now we’re training them to think without typing,” she said.
Maners, who calls herself a “four-finger typist,” said the software cuts the time she spends on “writing” by about a third. “You can truly close your eyes and just talk to the person you’re writing to. It makes communication so much easier,” she said. “In essence, we’re finding that using voice as an interface will improve the way people interact with computers and drive the evolution of transparent computing, making the human-to-machine interaction easier and more natural.”
If you’re looking for “cool factor” tricks, try this: The latest version of ViaVoice includes transcription support for digital handheld recorders, so you can dictate text, upload your dictation as an audio file, and make your final edits in any text application.
The software is modeless, too, meaning that users can switch from dictation to commands and back again without a second thought. For example, say you’re in Word, dictating the table of contents for your presentation. You’re manually shuffling through the pages you’ve printed, and you realize that you forgot to include one of your spreadsheets. “Switch to Excel. Open presentation.xls. Print that. Switch to Word. Page 5. Tab. Office productivity spreadsheet.” It’s like having your own personal assistant to boss around all day.
If speaking is something you do naturally, using VRS can probably save you some time. You can make it your primary input mode or simply use it as one of your weapons against working late nights. And if you spend a little time using the power user tips and tricks below, you should be able to enjoy significantly more daylight hours.
Power user tips and tricks
- Use the Analyze My Documents feature to help the software learn the words and phrases you use most frequently.
- Use the phonetic alphabet on the Quick Reference Card to improve accuracy when dictating proper names, acronyms, or words that the software might not recognize.
- To distinguish between dictated text and commands, speak commands forcefully—as if you’re teaching a dog to do a trick. For example, say “go to sleep,” with only short pauses between the words, to set the microphone to sleep mode. On the other hand, say “go to sleep” using your natural speech pattern if you want the words to appear as text in a document.
- The more naturally you dictate, the better your accuracy will be. Dictate documents as if speaking to a person. Enunciate well, but keep speaking with your natural cadence.
- Learn how to make macros for dictation and command control. If you have a standard closing in typed letters and e-mails, you’ll be able to say, “Standard closing,” and the software will know what you mean.
- If you’re working in a new environment—from a home office or on the floor of a noisy trade show—take the time to run a new Audio Setup. The software will be better able to ignore background noise.
- Resist the urge to correct errors using the keyboard and mouse. Use the software to correct mistakes so that it can learn and continually improve its accuracy.