
Google Cloud Speech API gets enterprise upgrade with new tools and 30 more languages

Google is adding support for a new feature called word-level timestamps, files up to three hours long, and more languages for its Cloud Speech API.


On Monday, Google announced updates to its Cloud Speech API that could make it a more effective tool for business users. According to a Google blog post, the API is getting a new feature called word-level timestamps, along with support for 30 new languages and files up to three hours long.

For those unfamiliar, the Google Cloud Speech API uses neural network models to let developers convert audio to text. It's powered by machine learning and can return its results in real time.

Word-level timestamps, the post said, were the feature developers requested most for the API. Essentially, the feature attaches a timestamp to each word the API identifies in a transcription. "Word-level timestamps let users jump to the moment in the audio where the text was spoken, or display the relevant text while the audio is playing," the post said.
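To make the two uses in that quote concrete, here is a minimal Python sketch. It does not call the Cloud Speech API; the `TimedWord` class, the sample `TRANSCRIPT` data, and the helper names `seek_time_for` and `word_at` are all hypothetical, and the real response format may differ. It only assumes each transcribed word carries a start and an end time in seconds, sorted by start time.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class TimedWord:
    """One transcribed word with its timing (hypothetical shape)."""
    word: str
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio


# Made-up sample data for illustration, not real API output.
TRANSCRIPT = [
    TimedWord("hello", 0.0, 0.4),
    TimedWord("and", 0.5, 0.7),
    TimedWord("welcome", 0.7, 1.3),
    TimedWord("back", 1.4, 1.8),
]


def seek_time_for(words, query):
    """Return the start time of the first occurrence of query, or None."""
    for w in words:
        if w.word.lower() == query.lower():
            return w.start
    return None


def word_at(words, playback_time):
    """Return the word spoken at playback_time, or None during a gap.

    Assumes words is sorted by start time.
    """
    starts = [w.start for w in words]
    i = bisect_right(starts, playback_time) - 1
    if i >= 0 and playback_time <= words[i].end:
        return words[i].word
    return None
```

With data in this shape, `seek_time_for` covers the "jump to the moment" case, and `word_at` covers displaying the relevant text while the audio plays.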


One of the customers cited in the post, Happy Scribe, uses word-level timestamps to reduce the time it takes to proofread the transcriptions it offers its customers. Another firm, VoxImplant, uses them to better analyze recorded phone conversations between two parties.

As part of a broader announcement around Google's voice input capabilities, the Cloud Speech API will now offer support for 30 additional languages, bringing the total number supported up to 119. The languages will first be offered to Cloud Speech API customers, but will eventually be supported on other Google products, like Gboard, as well.

As noted by ZDNet's Stephanie Condon, the extended language support could help Google win over some customers in emerging markets.

The full list of languages that work with the Cloud Speech API can be found here.

Additionally, the post said, the Cloud Speech API will now support files up to three hours in duration, an increase from the previous limit of 80 minutes. Files that are longer than three hours can be supported on a "case-by-case basis by applying for a quota extension through Cloud Support."

The 3 big takeaways for TechRepublic readers

  1. Google Cloud Speech API now supports 119 languages and files up to three hours long, and offers a new word-level timestamp feature.
  2. Word-level timestamps were the most-requested feature, letting users jump to the moment in the audio where a given word was spoken.
  3. The features could help make the API more business-friendly, and the language support could win Google some business in emerging markets.


About Conner Forrest

Conner Forrest is a Senior Editor for TechRepublic. He covers enterprise technology and is interested in the convergence of tech and culture.
