Teaching machines to speak Arabic

Updated 08 November 2025

  • Innovation is helping AI understand the region’s language, culture, and voice

JEDDAH: As developers across the Arab world work to formalize Arabic for artificial intelligence — grappling with its many dialects, limited datasets, and deep cultural nuance — English-based AI systems have continued to surge ahead. Now, industry experts say it’s time for Arabic users to gain the same technological momentum.

The performance gap between Arabic and English natural language processing is most visible in speech recognition, where pronunciation, rhythm, and vocabulary differ sharply across dialects. These variations make it challenging for one model to understand spoken Arabic with consistent accuracy.

Despite these hurdles, progress is accelerating. With rising investment and government-backed initiatives led by Saudi Arabia and other regional powers, Arabic AI is steadily closing in on English in sophistication and accessibility.




As Arabic AI evolves, experts emphasize the importance of cultural nuance and dialect diversity in future language models. (aramcoworld.com)

Amsal Kapetanovic, head of KSA at Infobip, told Arab News: “While written NLP tasks like basic chatbots can be managed with additional work, speech recognition really exposes the limitations of current models. It requires even more fine-tuning and adaptation to handle the diversity of spoken Arabic effectively. This is where the gap between Arabic and English NLP is most pronounced.”

Infobip’s recent collaborations with telecom and private sector partners across the Gulf reveal a similar pattern: Arabic chatbots and virtual assistants often require greater oversight in their early stages than English systems. However, once they are retrained using region-specific conversational data and Gulf dialects, both accuracy and customer satisfaction rise sharply.

Arabic remains one of AI’s greatest linguistic challenges. Unlike English, it is not a single unified language but a family of dialects stretching from Asia to Africa. Its complex morphology, with prefixes, suffixes, and gender and number agreement, poses major obstacles for tokenization and model training, as does the absence of short-vowel diacritics in most written text.

Kapetanovic referenced a 2025 study published in JMIR Medical Informatics (“InfectA-Chat: An Arabic Large Language Model for Infectious Diseases”), which tested instruction-tuned models like GPT-4 in both English and Arabic. The research found that Arabic models still trail English by 10–20 percent in complex tasks.

“Arabic models still lag slightly behind English ones, particularly in areas like accuracy and sentiment analysis,” he said. “This is primarily due to the smaller size of Arabic training datasets and the complexity of Arabic dialects.”

He added: “Arabic itself is a family of languages and dialects — much richer and more complex than many others. This diversity adds another layer of challenge.”




Amsal Kapetanović, head of KSA unit at Infobip. (Supplied)

Yet optimism remains strong. “The good news is that there is significant investment happening, especially in the MENA region, with countries like Saudi Arabia leading the way,” Kapetanovic said. “Initiatives like Vision 2030 are accelerating progress, and we’re seeing more focus on localizing AI for Arabic speakers.”

Speech recognition continues to represent the most visible gap. “A Lebanese speaker and a Saudi speaker might use different words and speak at different speeds, making it challenging for a single model to recognize and process spoken Arabic accurately,” he said.

Localization, Kapetanovic explained, extends far beyond translation. “At Infobip, we are defining the evolution of communications in co-creation with our customers and partners throughout the region. Gartner has recognized us as a Leader in their 2025 Magic Quadrant for CPaaS. We are committed to delivering the next generation of AI-powered customer conversations to unlock seamless, high-impact engagement for MENA businesses. That’s why we put a strong emphasis on localizing our AI-driven platforms and tools to serve Arabic-speaking users effectively.”




Technical, cultural, and ethical challenges shape the future of Arabic AI, as developers strive for inclusion and linguistic parity. (aramcoworld.com)

Real-world applications are already bearing fruit. “For example, Nissan Saudi Arabia rolled out a WhatsApp chatbot (‘Kaito’) that handles customer queries in both Arabic and English,” he said. “These bots leverage Infobip’s Answers platform, which includes built-in NLP capabilities for Arabic — such as right-to-left text support and Arabic stop-word recognition — to interpret queries and intent.”

“For Saudi Arabia and the Gulf, we’ve gone beyond simple translation by implementing features and partnerships tailored to the region,” he continued.

“We’ve partnered with Lucidia, a leading Saudi tech company, to co-develop solutions that address local business needs and integrate with popular regional channels like WhatsApp and X.”

“We’ve also built language models that recognize Gulf-specific dialects and cultural expressions, making our chatbots and automation tools more intuitive for users. Additionally, our platform supports local payment integrations and business workflows unique to the region. These initiatives reflect our commitment to delivering genuinely localized technology, not just Arabic language support.”

DID YOU KNOW?

• Saudi Arabia is leading investment in Arabic AI through its Vision 2030 initiatives.

• AI can become biased and exclusionary if it does not speak or understand Arabic well.

• Infobip’s Arabic chatbots now ‘think’ in Gulf dialects, improving accuracy.

Cultural understanding, he added, is key to truly human-like AI. “Culturally aware AI should ideally be AI that understands the why behind the what,” he said. “It’s about deep research and understanding the background — not just giving straight answers to straight questions.”

“At Infobip, we integrate with multiple large language models and do so in an agnostic way,” he said. “We combine them and see which ones serve which purpose, giving us the flexibility to avoid pitfalls like AI hallucination or unwanted replies.”

The ethics of language and inclusion

Kapetanovic cautioned that neglecting Arabic in AI development poses not only technical risks but ethical ones.

“The ethical risk is that AI can become biased and exclusionary if it doesn’t speak or understand Arabic well,” he said. “If AI systems don’t handle certain languages or dialects properly, or if they lack enough regional data, they can exclude parts of the narrative or reinforce bias.”

“It’s essential for everyone in the AI ecosystem to contribute to making AI as inclusive and democratized as possible. Otherwise, we risk reinforcing disparities in services, information, and opportunities.”
 

 


BYD Americas CEO hails Middle East as ‘homeland for innovation’

Updated 21 January 2026

  • In an interview on the sidelines of Davos, Stella Li highlighted the region’s openness to new technologies and opportunities for growth

DAVOS: BYD Americas CEO Stella Li described the Middle East as a “homeland for innovation” during an interview with Arab News on the sidelines of the World Economic Forum.

The executive of the Chinese electric vehicle giant highlighted the region’s openness to new technologies and opportunities for growth.

“The people (are) very open. And then from the government, from everybody there, they are open to enjoy the technology,” she said.

BYD has accelerated its expansion of battery electric vehicles and plug-in hybrids across the Middle East and North Africa region, with a strong focus on Gulf Cooperation Council countries like the UAE and Saudi Arabia.

GCC EV markets, led by the UAE and Saudi Arabia, rank among the world’s fastest-growing. Saudi Arabia’s Public Investment Fund has been aggressively investing in the EV sector, backing Lucid Motors, launching its own brand, Ceer, and supporting charging infrastructure development.

However, EVs still account for just over 1 percent of total car sales, as high costs, limited charging infrastructure, and extreme weather remain challenges.

In summer 2025, BYD announced it was aiming to triple its Saudi footprint following Tesla’s entry, targeting 5,000 EV sales and 10 showrooms by late 2026.

“We commit a lot of investment there (in the region),” Li noted, adding that the company is building a robust dealer network and introducing cutting-edge technology.

Discussing growth plans, she envisioned Saudi Arabia and the wider Middle East as a potential “dreamland” for innovation — what she described as a regional “Silicon Valley.” 

Talking about the EV ambitions of the Saudi government, she said: “If they set up (a) target, they will make (it) happen. Then they need a technology company like us to support their … 2030 Vision.”