How Gulf-developed large language models like Jais are bringing Arabic into the AI mainstream

As Gulf states aim to become AI leaders by investing in R&D and startups (Supplied/MBZUAI)
Updated 09 October 2023

  • ChatGPT understands inquiries in Arabic, but answers can sound unnatural or fail to convey the right message
  • Now homegrown LLMs can capture linguistic nuances and even comprehend dialects and cultural references

DUBAI: When ChatGPT made its debut last year, the artificial intelligence program caused a global sensation, as users found themselves communicating with a machine that could pass as another human being.

However, the enthusiasm among techies in the Arab world was somewhat diminished by ChatGPT’s limited grasp of Arabic, in part the result of the language’s complexity, diacritical markings, inflection system and regional dialects.

Although ChatGPT, which is based on a large language model, or LLM, can understand inquiries in Arabic and is able to translate, especially when using Modern Standard Arabic, answers can come across as unnatural, while literal translations do not always convey the right message.

That is why Jais, an LLM designed to support Arabic, was unveiled in August, bringing one of the world’s most widely spoken, though occasionally overlooked, languages into the AI mainstream.

Jais, a name that recalls the UAE’s highest peak in Ras Al-Khaimah, is the brainchild of a team of academics and engineers who embarked on the project because they felt too few LLMs were credibly multilingual.




The Ameca humanoid robot greets visitors at Dubai's Museum of the Future. (AFP)

Downloadable on the machine learning platform Hugging Face, Jais is the result of a collaboration between Cerebras Systems, Mohamed bin Zayed University of Artificial Intelligence, or MBZUAI, and a subsidiary of the Abu Dhabi-based G42 called Inception.

“It is vital that large language models are developed for languages other than English to ensure that innovation is accessible to everyone,” Andy Jackson, CEO of Inception, told Arab News.

“A quality Arabic LLM is critical for all sectors, businesses and organizations, as well as individuals. Innovation thrives when we collaborate, and Jais sets a new standard for AI advancement in the Middle East, ensuring that the Arabic language, with its depth and heritage, finds its voice within the AI landscape.

“Jais demonstrates our commitment to excellence, and our dedication to democratizing AI and promoting innovation.”

LLMs are machine learning models that use deep learning algorithms to process and understand natural human language. They are trained on large amounts of text data to learn patterns in the language.

These programs, which are rapidly proliferating in the wake of ChatGPT’s success, are capable of generating text on a seemingly endless array of subjects, producing everything from academic papers to poetry.

What is especially impressive about them is their ability to create convincingly human-like responses to questions in almost any language, including programming languages.

But in order to make those languages sound convincing, native-speaking human programmers are often required to provide a critical layer of context and understanding that can enhance accuracy and reliability.
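The idea of a model "learning patterns" from text, described above, can be made concrete with a toy sketch. The example below is an assumption-laden illustration, not how Jais or ChatGPT work: real LLMs use deep neural networks with billions of parameters, whereas this simply counts which word tends to follow which in a tiny made-up corpus. The function names and the sample text are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def most_likely_next(model, word):
    """Return the word seen most often after `word`, or None if unseen."""
    candidates = model.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Made-up training text; a real model would see billions of tokens.
corpus = (
    "the model reads text . the model learns patterns . "
    "the model reads more text ."
)
model = train_bigrams(corpus)
print(most_likely_next(model, "model"))
```

Even this crude counter picks up that "model" is most often followed by "reads" in its training text. The more text such a system sees in a given language, the better its predictions in that language, which is why the volume of Arabic training data matters.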

“Jais is purpose-built for the Arabic language and excels in capturing its intricacies and nuances, ensuring highly accurate and contextually relevant responses — a distinct advantage over general-purpose models,” said Jackson.




AI programs that are responsive to the Arabic language could widen access to a transformational new technology. (MBZUAI)

“This specialization is a pivotal development, opening up opportunities for governments, industries, and individuals across the Arab world to tap into the potential of generative AI.”

Currently considered among the foremost Arabic LLMs, Jais is a 13-billion-parameter model trained on a newly developed 395-billion-token dataset comprising 116 billion Arabic tokens and 279 billion English tokens. Training was carried out on Condor Galaxy, one of the largest cloud AI supercomputers in the world, launched by G42 and Cerebras in July.
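As a quick check on the reported dataset figures, the short sketch below adds up the two token counts and works out each language's share. The numbers come from the article; the function name and layout are illustrative assumptions, not anything published by the Jais team.

```python
# Token counts as reported in the article, in billions.
counts = {"Arabic": 116, "English": 279}


def token_mix(token_counts):
    """Return each language's fraction of the total token count."""
    total = sum(token_counts.values())
    return {lang: n / total for lang, n in token_counts.items()}


total = sum(counts.values())
shares = token_mix(counts)
for lang, share in shares.items():
    print(f"{lang}: {counts[lang]}B tokens, {share:.1%} of {total}B")
```

The two counts sum to the stated 395 billion, with Arabic making up a little under 30 percent of the training data and English a little over 70 percent.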

“Jais was born in Abu Dhabi and offers more than 400 million Arabic speakers the opportunity to harness the potential of generative AI,” Preslav Nakov, professor and deputy department chair of Natural Language Processing at MBZUAI, told Arab News.

“It will facilitate and expedite innovation, highlighting Abu Dhabi’s leading position as a hub for AI, innovation, culture preservation and international collaboration.”

As an open-source model, Jais is expected to engage scientists, academics and developers to accelerate the growth of an Arabic-language AI ecosystem. It could also serve as a model for other languages now underrepresented in mainstream AI.

FASTFACTS

• Large language models, or LLMs, are a type of AI that can mimic human intelligence.

• Arabic is spoken by 400m people, but accounts for 1 percent of total global online content.

• Jais was created by Cerebras, MBZUAI, and a subsidiary of G42 called Inception.

“Jais outperforms existing Arabic models by a sizable margin,” said Nakov. “It is also competitive with English models of similar size despite being trained on significantly less English data.

“This exciting result shows that the model’s English component learned from the Arabic data and vice versa, opening a new era in LLM development and training.”

In Jais’s development, significant attention was devoted to pre-processing Arabic text, enhancing support for the language’s unique features, including its writing style and word order.

Jais also maintains a balanced Arabic-English dataset focus for optimal performance, offering a marked improvement over models with a limited Arabic text presence.

Its developers say Jais, unlike other models, captures linguistic nuances and even comprehends various Arabic dialects and cultural references.

“Jais facilitates faster customization for specific Arabic-focused use cases and addresses data ownership concerns by being based in the UAE, offering a reassuring solution for local enterprises,” said Inception CEO Jackson.




LLMs are functional machine learning models that use deep learning algorithms to process and understand natural human language. (Supplied)

The UAE’s Ministry of Foreign Affairs and Ministry of Industry and Advanced Technology, Abu Dhabi’s National Oil Company and Department of Health, Etihad Airways, First Abu Dhabi Bank, and global technology group e& are planning to utilize Jais, offering valuable insights to enhance the model and its applications across their industries.

Given the strong digital transformation efforts by several of the Arab Gulf governments, accompanied by huge investments in high-tech industries and homegrown tech startups, AI programs that are responsive to the Arabic language could widen access to a transformational new technology and challenge the monopoly of a clutch of Silicon Valley companies.

Last month, the Technology Innovation Institute, an Emirati research center in Abu Dhabi, released Falcon 180B, an open-source AI model. Established in 2020, TII released Falcon 40B, the first version of its flagship open-source AI model, in May this year, after unveiling Noor, an Arabic-language AI model, last year.

According to a report in The Economist magazine, TII is the applied-research arm of the Advanced Technology Research Council, a government agency that employs an 800-strong multinational staff working on subjects from biotechnology and robotics to quantum computing.

“We are entering the game to disrupt the core players,” Faisal Al-Bannai, secretary-general of the ATRC, told The Economist, adding that TII will build new proprietary models and applications catering for specific fields such as medicine and law.

For its part, Saudi Arabia launched its National Strategy for Data and Artificial Intelligence in October 2020, aiming to become a global leader in the field as it seeks to attract $20 billion in foreign and local investments by 2030.

The Kingdom is also determined to future-proof its workforce, initially by training and developing a pool of 20,000 AI and data specialists. In May this year, Deloitte’s AI Institute was officially launched at the Experience Analytics conference in Riyadh.

Just last week Saudi Arabia launched a National Olympiad for Programming and Artificial Intelligence open to all middle- and high-school pupils. An estimated 300,000 students will be selected from 3 million participants for training in programming and AI, according to media reports.




The hope is that the advent of AI and the automation of rapid translation will be a game changer for Arabic content. (LEAP)

The initiative is a collaboration between the Saudi Data and Artificial Intelligence Authority, the Ministry of Education, and the King Abdulaziz and His Companions Foundation for Giftedness and Creativity (Mawhiba).

Saudi Arabia’s adoption of digitalization and emerging technologies is forecast to contribute about 2.4 percent to its gross domestic product by 2030, according to a recent report by global consultancy firm PwC.

The contribution of AI to Saudi Arabia’s economy is expected to grow at an average annual rate of 31.3 percent between 2018 and 2030, the PwC report added.

“AI is developing rapidly, and its impact will be felt more and more across all sectors and areas of life,” said MBZUAI’s Nakov. “In this context, it is vital that the Arab world has access to an advanced LLM that can be adapted and utilized across all sectors.

“The rapid advancement of AI means that organizations that fail to adapt and start using AI sooner rather than later will be left behind, which makes it even more essential for the Arab world to have access to quality LLMs.”

Beyond its business applications, however, a crucial aspect of a program such as Jais is its ability to champion neglected languages, preserve them in a fast-changing economy, and promote digital inclusivity.

Although Arabic is an official language in 22 countries and is partly spoken in 11 others, it accounts for just 1 percent of total global online content, according to Jais’s creators. The hope is that the advent of AI and the automation of rapid translation will be a game changer.

By placing the language at the forefront of the AI revolution, Jais and its successors could help to maintain Arabic’s global prominence and its distinctive cultural significance in the digital age.


Beirut blast investigator charges 10 more people: judicial official


BEIRUT: Beirut blast investigator charges 10 more people, judicial official says. 


Palestinian president meets Red Cross chief in Ramallah


  • Mirjana Spoljaric assessed the humanitarian needs of Palestinians in the Gaza Strip
  • Mahmoud Abbas underlined the significance of the upcoming ICRC conference in Switzerland

LONDON: Palestinian President Mahmoud Abbas met with Mirjana Spoljaric, the president of the International Committee of the Red Cross, at the Palestinian Authority’s headquarters in Ramallah on Thursday.

Abbas expressed gratitude to Spoljaric for visiting the Gaza Strip this week to assess the humanitarian needs of nearly 2 million Palestinians who have endured 15 months of war with Israel.

Younis Al-Khatib, president of the Palestinian Red Crescent Society, attended the meeting.

The PA is dedicated to allowing Red Cross teams to deliver humanitarian relief materials to the Gaza Strip without restrictions, the Palestine News & Information Agency reported.

Abbas outlined to Spoljaric the significance of the ICRC conference in Switzerland in March, which will address issues concerning Palestine, including the treatment of prisoners in Israeli jails and the occupation policies in the Palestinian territories.


US envoys working to resolve last-minute dispute over Gaza deal, US official says


  • The dispute was over the identities of several prisoners that Hamas is demanding to be released
  • Working on the issue is President Joe Biden’s Middle East envoy, Brett McGurk

WASHINGTON: A last-minute glitch surfaced on Thursday in the details of the Gaza ceasefire-for-hostages deal and US envoys are working to resolve it, a US official said.
The dispute was over the identities of several prisoners that Hamas is demanding to be released, the official said. The official said the issue is expected to be resolved soon.
Working on the issue is President Joe Biden’s Middle East envoy, Brett McGurk, and President-elect Donald Trump’s envoy, Steve Witkoff. They are both in Doha with Qatari and Egyptian negotiators, the official said.
“We’re aware of these issues and we are working through them with the Israeli government, as well as other partners in the region. We are confident these implementing details can be hammered out and that the deal will move forward this weekend,” White House national security spokesperson John Kirby said separately.
The agreement, reached on Wednesday, is supposed to begin to be implemented on Sunday.


Bootleg alcohol claims lives of at least 30 people in Turkiye


  • Six people were detained for allegedly selling the counterfeit drinks and two suspects were charged with "deliberate murder"
  • Many people resort to cheaper alternatives or homemade spirits as the prices of alcoholic beverages continue to rise

ANKARA: At least 30 people have died in Istanbul over the past three days after drinking bootleg alcohol, Turkiye’s state-run news agency reported Thursday, as authorities intensified a crackdown on counterfeit drinks.
The dead were among some 80 people who sought treatment in hospitals around Istanbul, Anadolu Agency reported. At least 31 patients were in intensive care units.
Deaths from counterfeit alcohol have become increasingly frequent in Turkiye, where the prices of alcoholic beverages continue to rise. Many people, confronted with ever-increasing costs, resort to cheaper alternatives or homemade spirits, increasing the risk of poisoning from toxic substances.
A combination of soaring inflation and government taxes has driven beverage prices to all-time highs.
On Wednesday, six people were detained for allegedly selling the counterfeit drinks while two other suspects were charged with “deliberate murder,” the Istanbul governor’s office said in a statement.
Authorities have also seized 29 tons of bootleg alcohol in raids around Istanbul since Jan. 1 and revoked the licenses of 64 businesses for allegedly selling counterfeit or smuggled alcohol, according to the statement.
“We consider those who cause the death of dozens of our citizens by producing or selling fake alcohol to be no different from the terrorists who kill people,” the statement said. “Our fight against the scoundrels who attempt to kill our people for material gains will continue unabated.”


Netanyahu bets on political survival with Gaza ceasefire

Updated 16 January 2025

  • Parents of soldiers fighting in Gaza have accused Netanyahu of derailing months-long efforts to end the fighting for political gain
  • Far-right members of Netanyahu’s coalition have threatened to quit his administration over any ceasefire deal

JERUSALEM: Israel’s Prime Minister Benjamin Netanyahu has faced pressure for months from political allies and the families of hostages and soldiers to end the Gaza war, but analysts say he now hopes the ceasefire will help him stay in power.
The ceasefire and hostage release deal announced by mediators Qatar and the United States on Wednesday represents a pivotal moment for the Israeli leader.
Since Hamas’s October 7, 2023, attack on Israel, Netanyahu has faced sharp public criticism for not securing the release of hostages sooner.
Parents of soldiers fighting in Gaza have accused Netanyahu of derailing months-long efforts to end the fighting for political gain, as he battles corruption charges in a lengthy trial.
Some 800 parents of soldiers earlier this month sent him a letter saying they could no longer “allow you to continue sacrificing our children as cannon fodder.”
More than 400 troops have been killed in the Palestinian territory since the start of the war.
But far-right members of Netanyahu’s coalition have threatened to quit his administration over any ceasefire deal and pushed for an even harder response in Gaza.
Despite the conflicting pressures, analysts say that the obstacles clouding his mandate in recent months are unlikely to bring down the leader long seen as a political survivor.
After the October 7 attack, which resulted in the deaths of 1,210 people, mostly civilians, Netanyahu vowed to crush Hamas and bring home the hostages.
During their assault, militants took 251 people hostage, 94 of whom are still being held in Gaza, including 34 the Israeli military says are dead.
While Hamas has not been defeated, Israel has decimated its leadership and its military structure.
It has also massively weakened its Lebanese foe Hezbollah in a parallel war to the north that took out the Iran-backed group’s longtime leader Hassan Nasrallah and a string of other commanders.
Netanyahu could now seek a way to use the ceasefire agreement to his advantage, potentially by pivoting away from the far-right coalition partners he has relied on since 2022.
The deal could even pave the way to a long-sought normalization deal with Saudi Arabia, backed by incoming US president Donald Trump.
“The key is not the situation but how you play the game, and the bottom line is that (Netanyahu) is the best player of the game there is,” said Jonathan Rynhold, head of the political studies department at Bar-Ilan University in Tel Aviv.
Before the Hamas attack, Israeli ally the United States was close to clinching a normalization deal between Saudi Arabia and Israel.
“The question is what is Netanyahu getting out of the deal beyond the hostage release and the ceasefire and that is where we get into the Saudi question,” said Anshel Pfeffer, a journalist and author of a 2018 biography of Netanyahu.
He said it was possible that the agreement “could be part of something much bigger... Trump wants a deal” between Saudi Arabia and Israel.
While Netanyahu’s far-right partners have vowed to oppose the ceasefire, Pfeffer said it was unlikely any disagreements in the ruling coalition would bring him down.
Still, the ceasefire will be “a moment of truth” for Netanyahu, where he might try to “pivot away from the far right in the coalition to some sort of legacy-defining deal with the Saudis.”
After all but crushing his enemies in Hamas and Lebanon, Gayil Talshir, a political scientist at the Hebrew University of Jerusalem, said Netanyahu may no longer need to rely on the far right.
Bezalel Smotrich, the finance minister, and Itamar Ben Gvir, the security minister, are both far-right members of Netanyahu’s cabinet and have expressed their opposition to the deal.
“It may well be that both Smotrich and Ben Gvir will not be part of such a deal, which means that behind heavy curtains, it may be the case that Netanyahu is preparing for that day,” Talshir said.
She noted that former defense minister Benny Gantz, opposition leader Yair Lapid and other figures have already indicated they would work with Netanyahu if he reaches an agreement to free the hostages or if he strikes a deal with Saudi Arabia.
Aviv Bushinsky, a political commentator and Netanyahu’s former chief of staff, said that despite some turbulence sparked by the ceasefire, “politically, it’s not a game changer.”
Nonetheless, the October 7 attack would continue to cast a shadow over Netanyahu, he said.
The prime minister “will want people to remember the ones he has managed to bring back but not the ones he was unable to bring back,” Bushinsky said.
“But this thing will continue to haunt him... It will be the first time since Israel was established” that its military was unable to rescue missing civilians, he added.