Facebook researchers use maths for better translations

Facebook, Google and Microsoft as well as Russia’s Yandex, China’s Baidu and others are constantly seeking to improve their translation tools. (File/AFP)
Updated 13 October 2019

  • Facebook researchers say rendering words into figures and exploiting mathematical similarities between languages is a promising avenue
  • Allowing as many people as possible worldwide to communicate is not just an altruistic goal, but also good business

PARIS: Designers of machine translation tools still mostly rely on dictionaries to make a foreign language understandable. But now there is a new way: numbers.

Facebook researchers say rendering words into figures and exploiting mathematical similarities between languages is a promising avenue — even if a universal communicator a la Star Trek remains a distant dream.

Powerful automatic translation is a big priority for Internet giants. Allowing as many people as possible worldwide to communicate is not just an altruistic goal, but also good business.

Facebook, Google and Microsoft as well as Russia’s Yandex, China’s Baidu and others are constantly seeking to improve their translation tools.

Facebook has artificial intelligence experts on the job at one of its research labs in Paris. Up to 200 languages are currently used on Facebook, said Antoine Bordes, European co-director of fundamental AI research for the social network.

Automatic translation is currently based on having large databases of identical texts in both languages to work from. But for many language pairs there just aren’t enough such parallel texts.

That’s why researchers have been looking for another method, like the system developed by Facebook which creates a mathematical representation for words.

Each word becomes a “vector” in a space of several hundred dimensions. Words that have close associations in the spoken language also find themselves close to each other in this vector space.

“For example, if you take the words ‘cat’ and ‘dog’, semantically, they are words that describe a similar thing, so they will be extremely close together physically” in the vector space, said Guillaume Lample, one of the system’s designers.

“If you take words like Madrid, London, Paris, which are European capital cities, it’s the same idea.”
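The idea of words sitting "close together" in a vector space can be illustrated with a small sketch. The three-dimensional vectors below are invented for illustration and are not taken from Facebook's system, which the article says uses several hundred dimensions. Cosine similarity is used here as one common way to measure closeness; the article does not say which measure the researchers use.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only.
# Real systems learn vectors with several hundred dimensions from text.
TOY_VECTORS = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "Paris": [0.1, 0.2, 0.9],
    "London": [0.2, 0.1, 0.8],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(word, vectors):
    """Return the other word whose vector is most similar to `word`."""
    others = (w for w in vectors if w != word)
    return max(others, key=lambda w: cosine_similarity(vectors[word], vectors[w]))


print(nearest("cat", TOY_VECTORS))
print(nearest("Paris", TOY_VECTORS))
```

With these made-up numbers, the animal words end up nearest to each other, and so do the capital cities, which mirrors the examples Lample gives.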

These language maps can then be linked to one another using algorithms — at first roughly, but eventually becoming more refined, until entire phrases can be matched without too many errors.
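One simple way to picture the linking of two language maps is as a rotation that lines one map up with the other. The sketch below is an assumption made for illustration, not Facebook's algorithm: it uses invented two-dimensional vectors, a hypothetical French map that is an exact rotation of the English one, and a small seed list of known word pairs to find the angle. The approach described in the article aims to work without such parallel data.

```python
import math

# Invented 2-D "language maps"; real embeddings have hundreds of dimensions.
ENGLISH = {"cat": (1.0, 0.0), "dog": (0.9, 0.3), "city": (0.0, 1.0)}


def rotate(point, theta):
    """Rotate a 2-D point counter-clockwise by theta radians."""
    x, y = point
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))


# Hypothetical French map: the same layout, rotated by 40 degrees.
FRENCH = {fr: rotate(ENGLISH[en], math.radians(40))
          for en, fr in [("cat", "chat"), ("dog", "chien"), ("city", "ville")]}


def best_rotation(pairs, source, target):
    """Least-squares angle that rotates source points onto target points."""
    cross = dot = 0.0
    for src_word, tgt_word in pairs:
        sx, sy = source[src_word]
        tx, ty = target[tgt_word]
        dot += sx * tx + sy * ty
        cross += sx * ty - sy * tx
    return math.atan2(cross, dot)


def translate(word, theta, source, target):
    """Rotate the source word's vector and return the nearest target word."""
    mapped = rotate(source[word], theta)
    return min(target, key=lambda w: math.dist(mapped, target[w]))


# Seed pairs are an assumption of this sketch; "dog" is deliberately left out.
SEED = [("cat", "chat"), ("city", "ville")]
theta = best_rotation(SEED, ENGLISH, FRENCH)
print(translate("dog", theta, ENGLISH, FRENCH))
```

Because the toy French map is a perfect rotation of the English one, the word left out of the seed list still lands on its counterpart. Real language maps only match approximately, which is why the article describes the linking as rough at first and refined over time.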

Lample said results are already promising. For the English-Romanian language pair, Facebook’s current machine translation system is “equal or maybe a bit worse” than the word vector system, he said.

But for the rarer language pair of English-Urdu, where Facebook’s traditional system doesn’t have many bilingual texts to reference, the word vector system is already superior, he said.

But could the method allow translation from, say, Basque into the language of an Amazonian tribe? In theory, yes, said Lample, but in practice a large body of written texts is needed to map the language, something lacking in Amazonian tribal languages.

“If you have just tens of thousands of phrases, it won’t work. You need several hundreds of thousands,” he said.

Experts at France’s CNRS national scientific center said the approach Lample has taken for Facebook could produce useful results, even if it doesn’t result in perfect translations.

Thierry Poibeau of CNRS’s Lattice laboratory, which also does research into machine translation, called the word vector approach “a conceptual revolution.”

He said “translating without parallel data” — dictionaries or versions of the same documents in both languages — “is something of the Holy Grail” of machine translation.

“But the question is what level of performance can be expected” from the word vector method, said Poibeau. The method “can give an idea of the original text” but the capability for a good translation every time remains unproven.

Francois Yvon, a researcher at CNRS’s Computer Science Laboratory for Mechanics and Engineering Sciences, said “the linking of languages is much more difficult” when they are far removed from one another.

“The manner of denoting concepts in Chinese is completely different from French,” he added.

However, even imperfect translations can be useful, said Yvon, and could prove sufficient to track hate speech, a major priority for Facebook.


Israeli court overturns conviction of officer who assaulted Palestinian journalist, citing ‘Oct. 7 PTSD’

Updated 25 February 2026

  • Judge sentenced Yitzhak Sofer to 300 hours of community service, saying officer “devoted his life to Israel’s security” and conviction was “disproportionate to severity of his actions”
  • Footage shows Sofer throwing photojournalist Mustafa Alkharouf to the ground, and repeatedly beating and kicking him while he covered Palestinian gatherings near Al-Aqsa Mosque

LONDON: An Israeli court overturned the conviction of a border police officer who assaulted a Palestinian journalist, ruling his actions were influenced by post-traumatic stress disorder from serving during the Oct. 7, 2023 attacks.

On Tuesday, the Jerusalem Magistrate’s Court sentenced officer Yitzhak Sofer to 300 hours of community service for assaulting Anadolu Agency photojournalist Mustafa Alkharouf in occupied East Jerusalem in December 2023.

Footage shows Sofer and other officers drawing weapons, throwing Alkharouf to the ground, and repeatedly beating and kicking him while he covered Palestinian gatherings near Al-Aqsa Mosque amid heavy restrictions.

Alkharouf was hospitalized with facial and body injuries. His cameraman, Faiz Abu Ramila, was also attacked.

Sofer had been convicted in September 2024 of assault causing bodily harm (acquitted of threats) and initially faced six months’ community service, as recommended by Mahash, the Justice Ministry’s police misconduct unit.

Judge Amir Shaked accepted the defense request to cancel the conviction, replacing it with community service.

He cited Sofer’s PTSD from responding to the Oct. 7 Hamas-led attack, noting the officer had “no prior criminal record” and had “devoted his life to Israel’s security.”

“The court cannot ignore this when considering whether the defendant’s conviction should stand,” he said, adding that while the incident is “serious and does cross the criminal threshold,” the conviction in place could cause Sofer harm “disproportionate to the severity of his actions.”

The ruling comes amid surging attacks on journalists in the West Bank, East Jerusalem and Gaza since Israel’s war on Gaza began.

The Committee to Protect Journalists reported that Israel was responsible for two-thirds of the 129 media workers killed worldwide in 2025, the deadliest year on record, citing a “persistent culture of impunity” and lack of transparent probes.

Reporters Without Borders called the Israeli army the “worst enemy of journalists” in its 2025 report, with nearly half of global reporter deaths occurring in Gaza.

Foreign journalists face raids, arrests and intimidation. In late January 2026, Israel’s Supreme Court granted a delay in ruling on a ban on foreign media access to Gaza.