How to convert images to text with the LetterSnap app, powered by Google machine learning

This iOS app extracts text from images and turns it into an editable document. It also demonstrates the capabilities of Google's machine learning services.

Photo of iPhone with LetterSnap app open, pointed at text
Image: Andy Wolber / TechRepublic

At first glance, the LetterSnap iOS app, released on November 17, 2016, appears fairly straightforward. You open the app, point your phone at some text, and tap to take a picture. LetterSnap identifies the text in the image and returns it in a form you can edit.

To test the app, I used LetterSnap to convert characters in photos to text for three common tasks.

First, I took a picture of a page of text in a book: page 136 of "Impossible to Ignore" by Carmen Simon. The app tried to convert some of the illustration into text, but otherwise captured much of the text accurately.

Photo of book page, with resulting LetterSnap recognized text

LetterSnap uses Google's Cloud Vision API to convert a photo of text into text you can edit.

Next, I captured a picture of information displayed on a computer screen. I often take a photo of system information to avoid writing it down, but then end up looking at the image and typing the data later. LetterSnap gave me text I could copy and paste, without typing.

Photo of Windows system info screen, with LetterSnap recognized text

The app identifies text in more complex layouts, too, such as system information on a screen.

Then, I wrote on a whiteboard and took a photo of the text. Again, while not perfect, LetterSnap captured and converted most of what I wrote--and ignored the stars I scribbled around the title.

Photo of handwritten text on a whiteboard, with LetterSnap recognized text output

LetterSnap recognizes handwritten text -- as long as you write neatly.

Note that all of these are complicated tasks. I took the photo of the book while holding it. The laptop screen photo included a logo, uneven spacing, and text displayed in various parts of the image. And the whiteboard photo, like all the others, intentionally included both glare and shadows. But text recognition still worked, for the most part.

You can use LetterSnap to take and extract text from 10 photos for free. After that, you pay per bundle of photos: $0.99 buys 200 conversions, $1.99 buys 450, and $3.99 buys 950.
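To put those bundles in perspective, here is a quick sketch of the arithmetic in Python. The prices and conversion counts come from the paragraph above; the function name and the choice to report cents are my own, for illustration only.

```python
# Bundle prices quoted above: (price in US dollars, number of conversions).
BUNDLES = [(0.99, 200), (1.99, 450), (3.99, 950)]


def cost_per_conversion(price, conversions):
    """Return the cost of a single conversion in US cents."""
    return price / conversions * 100


for price, conversions in BUNDLES:
    cents = cost_per_conversion(price, conversions)
    print(f"${price:.2f} for {conversions} conversions: "
          f"{cents:.3f} cents each")
```

In every bundle, a conversion costs a little under half a cent, and the largest bundle is the cheapest per photo.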

But the most interesting part of LetterSnap isn't in the app. It's that the app uses Google Cloud Vision to deliver optical character recognition (OCR). In other words, the OCR happens on Google's servers, through the Vision API and Google's image machine learning models, rather than on the phone.
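For a sense of what a Cloud Vision OCR call involves, here is a minimal Python sketch. It is not LetterSnap's code, which I haven't seen. It only shows the general shape of a text detection request to the public Cloud Vision REST endpoint and how the recognized text might be read from the reply. The helper names, the placeholder image bytes, and the sample response are hypothetical. Actually sending the request would also require a Google Cloud API key, which the sketch leaves out.

```python
import base64
import json

# Public REST endpoint for Cloud Vision image annotation.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_ocr_request(image_bytes):
    """Build the JSON body for a TEXT_DETECTION request (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "requests": [
            {
                "image": {"content": encoded},
                "features": [{"type": "TEXT_DETECTION"}],
            }
        ]
    }


def extract_text(response):
    """Pull the recognized text out of a parsed response (hypothetical helper)."""
    results = response.get("responses", [])
    if not results:
        return ""
    annotations = results[0].get("textAnnotations", [])
    # The first annotation holds the full block of detected text.
    return annotations[0]["description"] if annotations else ""


# Placeholder bytes and a hand-written sample reply, for illustration only.
body = build_ocr_request(b"placeholder image bytes")
sample_response = {
    "responses": [{"textAnnotations": [{"description": "Hello whiteboard"}]}]
}

print(json.dumps(body, indent=2))
print(extract_text(sample_response))
```

The app's job is mostly to capture the photo, send it off, and present the text that comes back. The recognition itself is Google's.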

To date, Google's own applications have tended to feature the benefits of machine learning, such as image recognition in Google Photos, language translation in Google Translate, and smarter related topic suggestions with Google Explore. But LetterSnap shows that an independent mobile app developer can leverage the power of Google Cloud Vision machine learning, too.

As more developers integrate support for machine learning cloud services, I expect app pricing and update cycles to change.

First, app pricing may move toward a use-more, pay-more model. For example, frequent users of LetterSnap will need to pay for additional conversions. I expect to see more apps with pay-per-use or tiered-pricing models. (For another example, take a look at Evernote, which offers character search at a premium price--and which recently announced a transition to the Google Cloud Platform.)

Second, you can expect apps to improve without app updates. LetterSnap's recognition accuracy will get better as Google Cloud Vision's character recognition capabilities improve. That's a change from most current mobile apps, which improve only when a new version is released and installed. Apps that build in Cloud Platform capabilities will improve as the back-end services get better, much as Google Search results improve over time.

So, if you'd like to convert photos to text you can edit, give LetterSnap a try. That's exactly what the app does. But LetterSnap also demonstrates the power of Google's machine learning delivered as a service, in an app that most people can understand.

What about you?

Aside from Google Photos, Google Translate, and LetterSnap, what apps do you use to demonstrate the power of machine learning? Tell us in the comments.
