Teaching machines to speak Arabic

Updated 08 November 2025

  • Innovation is helping AI understand the region’s language, culture, and voice

JEDDAH: As developers across the Arab world work to formalize Arabic for artificial intelligence — grappling with its many dialects, limited datasets, and deep cultural nuance — English-based AI systems have continued to surge ahead. Now, industry experts say it’s time for Arabic users to gain the same technological momentum.

The performance gap between Arabic and English natural language processing is most visible in speech recognition, where pronunciation, rhythm, and vocabulary differ sharply across dialects. These variations make it challenging for one model to understand spoken Arabic with consistent accuracy.

Despite these hurdles, progress is accelerating. With rising investment and government-backed initiatives led by Saudi Arabia and other regional powers, Arabic AI is steadily closing in on English in sophistication and accessibility.




As Arabic AI evolves, experts emphasize the importance of cultural nuance and dialect diversity in future language models. (aramcoworld.com)

Amsal Kapetanovic, head of KSA at Infobip, told Arab News: “While written NLP tasks like basic chatbots can be managed with additional work, speech recognition really exposes the limitations of current models. It requires even more fine-tuning and adaptation to handle the diversity of spoken Arabic effectively. This is where the gap between Arabic and English NLP is most pronounced.”

Infobip’s recent collaborations with telecom and private sector partners across the Gulf reveal a similar pattern: Arabic chatbots and virtual assistants often require greater oversight in their early stages than English systems. However, once they are retrained using region-specific conversational data and Gulf dialects, both accuracy and customer satisfaction rise sharply.

Arabic remains one of AI’s greatest linguistic challenges. Unlike English, it is not a single unified language but a family of dialects stretching from Asia to Africa. Its complex morphology — with prefixes, suffixes, gender and number agreement, and the absence of short-vowel diacritics — poses major obstacles for tokenization and model training.
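To make the diacritics point concrete, here is a minimal Python sketch using only the standard library. It assumes the marks of interest are the Unicode code points U+064B through U+0652; the name `strip_diacritics` is illustrative and does not come from any system mentioned in this article. It shows that "kataba" (he wrote) and "kutub" (books) become the same three-letter string once their vowel marks are dropped.

```python
# Arabic short-vowel and related marks (tanwin, fatha, damma, kasra,
# shadda, sukun) occupy U+064B..U+0652. This range is an assumption
# chosen for illustration; real pipelines may handle more marks.
HARAKAT = {chr(cp) for cp in range(0x064B, 0x0653)}


def strip_diacritics(text: str) -> str:
    """Return text with the short-vowel marks removed."""
    return "".join(ch for ch in text if ch not in HARAKAT)


# "kataba" (he wrote): kaf+fatha, ta+fatha, ba+fatha
KATABA = "\u0643\u064E\u062A\u064E\u0628\u064E"
# "kutub" (books): kaf+damma, ta+damma, ba
KUTUB = "\u0643\u064F\u062A\u064F\u0628"

print(KATABA == KUTUB)                                      # False
print(strip_diacritics(KATABA) == strip_diacritics(KUTUB))  # True
```

Because everyday Arabic text usually omits these marks, a model typically sees only the shared bare form and has to recover the intended word from context.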

Kapetanovic referenced a 2025 study published in JMIR Medical Informatics (“InfectA-Chat: An Arabic Large Language Model for Infectious Diseases”), which tested instruction-tuned models like GPT-4 in both English and Arabic. The research found that Arabic models still trail English by 10–20 percent in complex tasks.

“Arabic models still lag slightly behind English ones, particularly in areas like accuracy and sentiment analysis,” he said. “This is primarily due to the smaller size of Arabic training datasets and the complexity of Arabic dialects.”

He added: “Arabic itself is a family of languages and dialects — much richer and more complex than many others. This diversity adds another layer of challenge.”




Amsal Kapetanović, head of KSA unit at Infobip. (Supplied)

Yet optimism remains strong. “The good news is that there is significant investment happening, especially in the MENA region, with countries like Saudi Arabia leading the way,” Kapetanovic said. “Initiatives like Vision 2030 are accelerating progress, and we’re seeing more focus on localizing AI for Arabic speakers.”

Speech recognition continues to represent the most visible gap. “A Lebanese speaker and a Saudi speaker might use different words and speak at different speeds, making it challenging for a single model to recognize and process spoken Arabic accurately,” he said.

Localization, Kapetanovic explained, extends far beyond translation. “At Infobip, we are defining the evolution of communications in co-creation with our customers and partners throughout the region. Gartner has recognized us as a Leader in their 2025 Magic Quadrant for CPaaS. We are committed to delivering the next generation of AI-powered customer conversations to unlock seamless, high-impact engagement for MENA businesses. That’s why we put a strong emphasis on localizing our AI-driven platforms and tools to serve Arabic-speaking users effectively.”




Technical, cultural, and ethical challenges shape the future of Arabic AI, as developers strive for inclusion and linguistic parity. (aramcoworld.com)

Real-world applications are already bearing fruit. “For example, Nissan Saudi Arabia rolled out a WhatsApp chatbot (‘Kaito’) that handles customer queries in both Arabic and English,” he said. “These bots leverage Infobip’s Answers platform, which includes built-in NLP capabilities for Arabic — such as right-to-left text support and Arabic stop-word recognition — to interpret queries and intent.”

“For Saudi Arabia and the Gulf, we’ve gone beyond simple translation by implementing features and partnerships tailored to the region,” he continued.

“We’ve partnered with Lucidia, a leading Saudi tech company, to co-develop solutions that address local business needs and integrate with popular regional channels like WhatsApp and X.

“We’ve also built language models that recognize Gulf-specific dialects and cultural expressions, making our chatbots and automation tools more intuitive for users. Additionally, our platform supports local payment integrations and business workflows unique to the region. These initiatives reflect our commitment to delivering genuinely localized technology, not just Arabic language support.”

DID YOU KNOW?

• Saudi Arabia is leading investment in Arabic AI through its Vision 2030 initiatives.

• AI can become biased and exclusionary if it does not speak or understand Arabic well.

• Infobip’s Arabic chatbots now ‘think’ in Gulf dialects, improving accuracy.

Cultural understanding, he added, is key to truly human-like AI. “Culturally aware AI should ideally be AI that understands the why behind the what,” he said. “It’s about deep research and understanding the background — not just giving straight answers to straight questions.”

“At Infobip, we integrate with multiple large language models and do so in an agnostic way,” he said. “We combine them and see which ones serve which purpose, giving us the flexibility to avoid pitfalls like AI hallucination or unwanted replies.”

The ethics of language and inclusion

Kapetanovic cautioned that neglecting Arabic in AI development poses not only technical risks but ethical ones.

“The ethical risk is that AI can become biased and exclusionary if it doesn’t speak or understand Arabic well,” he said. “If AI systems don’t handle certain languages or dialects properly, or if they lack enough regional data, they can exclude parts of the narrative or reinforce bias.”

“It’s essential for everyone in the AI ecosystem to contribute to making AI as inclusive and democratized as possible. Otherwise, we risk reinforcing disparities in services, information, and opportunities.”
 

 


Closing Bell: Saudi main index closes in red at 10,709


RIYADH: Saudi Arabia’s Tadawul All Share Index dipped on Thursday, losing 138.89 points, or 1.28 percent, to close at 10,709.04.

The total trading turnover of the benchmark index was SR6.59 billion ($1.75 billion), as 102 of the listed stocks advanced, while 154 retreated.

The MSCI Tadawul Index fell 22.40 points, or 1.52 percent, to close at 1,450.58.

The Kingdom’s parallel market Nomu lost 123.85 points, or 0.54 percent, to close at 22,792.98. This came as 30 of the listed stocks advanced, while 40 retreated.
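The percentage moves reported above can be cross-checked from the point losses and closing levels. The short Python sketch below assumes each percentage is measured against the previous close, taken here as the closing level plus the points lost; the helper name `percent_change` is illustrative.

```python
def percent_change(points_lost: float, close: float) -> float:
    """Percentage decline relative to the previous close."""
    previous_close = close + points_lost  # assumption: loss is vs. prior close
    return round(points_lost / previous_close * 100, 2)


print(percent_change(138.89, 10709.04))  # TASI: 1.28
print(percent_change(22.40, 1450.58))    # MSCI Tadawul: 1.52
print(percent_change(123.85, 22792.98))  # Nomu: 0.54
```

Under that assumption, all three results match the figures in the report.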

The best-performing stock was Al-Rajhi Co. for Cooperative Insurance, whose share price surged 9.96 percent to SR74.50.

Other top performers included Jazan Development and Investment Co., which saw its share price rise by 9.89 percent to SR8.33, and Gulf Insurance Group, which saw a 7.48 percent increase to SR23.

On the downside, City Cement Co. and Al Gassim Investment Holding Co. saw declines, with their shares dropping by 5.51 percent and 4.22 percent to SR11.50 and SR13.15, respectively.

On the announcement front, Almoosa Health Co. has signed a construction contract with Almajal Alarabi Group valued at SR608.85 million to complete the electrical, mechanical, and architectural finishing works for the new Almoosa Specialized Hospital in AlHofuf City. 

The agreement, finalized on Feb. 26, covers all complementary internal and external works based on approved engineering designs to ensure the facility is fully operational upon completion.

According to a Tadawul statement, work on the project will commence immediately, with an expected completion timeline of 16 months. 

Almoosa Health intends to finance the development through a combination of its own resources and long-term Shariah-compliant facilities secured from local banks, with the financial impact anticipated to begin following the hospital’s completion and commissioning.

Almoosa’s share price rose 4.24 percent to SR147.50.