Peter Cochrane's Blog: Lost in translation?

Today's machine translation is good, but not quite up to Star Trek standards...

Written at Cambridge University and dispatched via a 3G service because my hotel demanded £7 per hour for wi-fi. I won't be staying there again.

We are truly a species divided by tongues, but technology looks set to rescue us.

In 1979, I travelled across China on a lecture tour with stops in Beijing, Chengdu and Shanghai. All my material was on 35mm slides and overhead projector foils. There were no laptops and definitely no mobile phones.

At that time, the country had only about 500,000 phone lines, and fax communication was the favoured mode for international links.

Looking back, it was an epic tour and a delightful visit to a China that no longer exists. The transition from an agrarian society to one dominated by technology and rapid change could not be more pronounced. My most recent visits have been culturally far less dramatic.

I am no longer seen as a strange-looking being who speaks some weird and incomprehensible tongue. Everyone - in the cities at least - now seems to have watched Western movies and TV, and many speak reasonably good English.

Today could not be more different as I present to a Chinese delegation with the aid of my laptop, augmenting text with pictures, graphics, movies and animations. But best of all, I have been able to translate my words into Chinese for free with an online service.

Interestingly, the free online service is significantly better than a commercial package. This superiority is due, I suspect, to the contributions made by intelligent users who have added successive corrections and improvements.

I have to say that the gulf between the late 1970s and today is even bigger than I was forecasting at the time. I well remember that when I postulated good machine translation by the year 2000, people would smile sympathetically and shake their heads.

So how are we doing? How good is machine translation today? While I'm seriously impressed, I still see some space for improvement.

Let's try the acid test - a translation back and forth.

Original phrase
As a young man I researched electrical impulses


How free online software coped with repeatedly translating the same phrase from English to Chinese and back.
(Screenshot: Peter Cochrane)

This is interesting for its initial accuracy and its gentle degradation over successive operations. The machine translation is even more impressive on a few lines of William Wordsworth:

I wandered lonely as a cloud,
That floats on high o'er vales and hills,
When all at once I saw a crowd,
A host, of golden daffodils;
Beside the lake, beneath the trees,
Fluttering and dancing in the breeze.


The software's first attempt at rendering a Wordsworth verse into Chinese
(Screenshot: Peter Cochrane)

Second translation
I wandered lonely as a cloud
It floats, your valleys and hills high
When all the time I saw a group of people,
Host, golden daffodils;
The lake, beneath the trees,
Under the trees, dancing in the breeze.


The software's second attempt at the Wordsworth stanza
(Screenshot: Peter Cochrane)

Fourth translation
I wandered lonely as a cloud
It floats, your valleys and high hills
When all the time, I saw a group of people,
Host, golden daffodils;
In the lake, trees,
In the trees, dancing in the breeze.

Not perfect, but pretty good given the abstract nature and wording of the verse. Most impressive of all is that we - humans, that is - can correct the errors with a modest knowledge of poetry, and we can get the gist without any knowledge at all.
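The back-and-forth test is easy to sketch in code. The Python below is only a minimal illustration, not the online service used for the screenshots: `round_trip`, `toy_forward` and `toy_backward` are hypothetical names, and the toy translators simply lose one word per round to mimic degradation. A real test would plug calls to an actual translation service into the two translator slots. Similarity to the original is scored with the standard library's `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher
from typing import Callable, List, Tuple

# A translator is assumed to be any function that maps text to text.
Translator = Callable[[str], str]


def round_trip(
    text: str,
    forward: Translator,
    backward: Translator,
    rounds: int = 4,
) -> List[Tuple[str, float]]:
    """Translate `text` forward and back again, `rounds` times over.

    Returns one (round_trip_text, similarity) pair per round, where
    similarity is difflib's ratio against the original (0.0 to 1.0).
    """
    results = []
    current = text
    for _ in range(rounds):
        current = backward(forward(current))
        score = SequenceMatcher(None, text.lower(), current.lower()).ratio()
        results.append((current, score))
    return results


def toy_forward(text: str) -> str:
    """Hypothetical stand-in for English to Chinese: reverses the string."""
    return text[::-1]


def toy_backward(text: str) -> str:
    """Hypothetical stand-in for Chinese to English.

    Restores the string but drops the last word, mimicking the way
    meaning leaks away on each pass.
    """
    words = text[::-1].split()
    return " ".join(words[:-1]) if len(words) > 1 else " ".join(words)


if __name__ == "__main__":
    phrase = "As a young man I researched electrical impulses"
    for number, (version, score) in enumerate(
        round_trip(phrase, toy_forward, toy_backward, rounds=3), start=1
    ):
        print(f"Round {number}: {version!r} (similarity {score:.2f})")
```

A falling similarity score from round to round is the numerical counterpart of the gentle degradation seen in the screenshots, although a character-level ratio says nothing about whether the meaning survived.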

Can machines do that today? Not quite all of it, but it won't be long before they can do it 100 per cent. The artificial intelligence required to achieve this will not take another 30 years, and the addition of speech I/O is the next obvious step to realise a Star Trek-type translation device.

The biggest challenge is actually the inclusion of context and cognition, which may see the technology surpass our abilities.

In the meantime, I can report that my Chinese audience were both impressed and, at times, amused. A win all round.