Tech firms slug it out to create ultimate AI translator

The universal translator is a staple of science fiction, but Google, Meta and others are locked in a battle to get as many languages as possible working with their AI models. Picture: geralt/Pixabay

Published Jul 13, 2022

by Laurent Barthelemy

Paris: A man from South Africa speaks Sepedi to a Peruvian woman who knows only Quechua, yet they can understand each other.

The universal translator is a staple of science fiction, but Google, Meta and others are locked in a battle to get as many languages as possible working with their AI models.

Meta chief Mark Zuckerberg announced on Wednesday that his firm's models can now translate between 200 languages, double the number of just two years ago.

Meta's innovation, trumpeted in 2020, was to break the link with English – long a conduit language because of the vast availability of sources.

Instead, Meta's models go direct from, say, Chinese to French without going through English.
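Meta has released the model behind the announcement, NLLB-200, publicly. As a rough illustration only, and not a description of Meta's own pipeline, the sketch below shows how a distilled checkpoint published on Hugging Face can be asked to translate Chinese straight into French, with no English pivot. The model name, language codes and library calls are taken from the public documentation; treat it as an assumption-laden sketch rather than a reference implementation.

```python
# Illustrative sketch: direct Chinese-to-French translation with the
# publicly released NLLB-200 distilled checkpoint (no English pivot step).
# Assumes the Hugging Face "transformers" library and the
# "facebook/nllb-200-distilled-600M" model card.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "facebook/nllb-200-distilled-600M"
tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang="zho_Hans")
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

text = "今天天气很好。"  # "The weather is nice today."
inputs = tokenizer(text, return_tensors="pt")

# Force the decoder to start generating in French (FLORES-200 code "fra_Latn").
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("fra_Latn"),
    max_length=64,
)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```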

In May, Google announced its great leap forward, adding 24 languages to Google Translate after pioneering techniques to reduce noise in the samples of lesser-used languages.

Sepedi and Quechua, of course, were among them, so the Peruvian and the South African could communicate, but only with text.

Researchers warn that the dream of a real-time conversation translator is some way off.

Quantity versus quality

Google and Meta have business motivations for their research, not least because the more people using their tools, the better the data to feed back into the AI loop.

They are also in competition with the likes of Microsoft, which has a paid-for translator, and DeepL, a popular web-based tool that focuses on fewer languages than its rivals.

The challenge of automatic translation is "particularly important" for Facebook because of the hate speech and inappropriate content it needs to filter, researcher Francois Yvon said.

The tool would help English-speaking moderators, for example, to identify such content in many other languages.

Meta's promotional videos, however, focus on the liberating aspects of the technology – amateur chefs with recipes from far and wide at their fingertips.

But both companies are also at the forefront of AI research, and both accompanied their announcements with academic papers that highlight their ambitions.

The Google paper, titled "Building Machine Translation Systems for the Next Thousand Languages", makes clear that the firm is not satisfied with the 133 languages it features on Google Translate.

However, as the cliché goes, quantity does not always mean quality.

European primacy

"We should not imagine that the 200x200 language pairs will be at the same level of quality," said Yvon of Facebook's model.

European languages, for example, would probably always have an advantage simply because there are more reliable sources.

As regular users of tools such as Google Translate and other automatic programmes will attest, the text produced can be robotic and mistakes are not uncommon.

While this might not be a problem for day-to-day uses like restaurant menus, it does limit the utility of the tools.

"When you're working on the translation of an assembly manual for a fighter jet, you can't afford a single mistake," said Vincent Godard, who runs French tech firm Systran.

And the ultimate nut to crack is inventing a tool that can seamlessly translate the spoken word.

"We're not there yet, but we're working on it," said Antoine Bordes, who runs Fair, Meta's AI research lab.

He said Meta's speech translation project works on far fewer languages.

"But the interest will be in connecting the two projects, so that one day we will be able to speak in 200 languages while retaining intonations, emotions, accents," he said.

AFP