In recent weeks, Kenya has witnessed a rapid rise in civic tech development, driven by Gen Z activists protesting the Kenya Finance Bill 2024. Alongside the street protests, a digital movement emerged to break down the bill into digestible, easy-to-understand formats. This effort empowered Kenyans by explaining the reasons behind the protests using short videos on TikTok, X (formerly Twitter), and Instagram.
These videos quickly spread across WhatsApp groups, initially in English but soon expanding into more than 10 local languages to reach Gen Z and parents in rural areas. This grassroots translation effort emphasized the voluntary spirit behind the movement. However, it also revealed a significant gap: the lack of an inclusive AI ecosystem that caters to Africa’s linguistic diversity.
A Large Language Model (LLM) released during the protests supported only English and Kiswahili, further highlighting the underrepresentation of African languages in AI tools.
Kiswahili’s Role in Digital Transformation and AI Inclusion
Coincidentally, Kenya hosted the World Kiswahili Language Day celebrations from July 5-7, 2024, marking a significant moment for the language’s promotion. This annual celebration, established by UNESCO in November 2021, recognizes the role of Kiswahili in cultural preservation, social participation, and awareness creation. With over 200 million speakers in more than 14 countries, Kiswahili is not only the most widely spoken language in Sub-Saharan Africa but also ranks among the top 10 most spoken languages in the world.
Despite its prominence, Kiswahili’s digital presence remains limited, placing it among low-resource languages. This lack of digital resources reflects a broader trend affecting over 2,000 African languages—most of which are either underrepresented or completely absent in the AI landscape.
The Digital Divide and the Exclusion of African Languages
The limited availability of digitized African language data creates several challenges:
- Limited access to AI-powered tools such as virtual assistants (e.g., Siri, Alexa) and machine translation services.
- Barriers to AI innovation due to the lack of text and speech data required to train models for African languages.
- Exacerbation of the digital divide, excluding millions of Africans from economic and technological opportunities.
Bringing AI tools to any language involves three critical steps:
- Language data collection – gathering text and speech resources for training.
- Model development – building AI tools using these datasets.
- Tool deployment – making these tools accessible to users.
The first step is the bottleneck: without adequate datasets, Natural Language Processing (NLP) researchers cannot create the AI tools needed for African languages, limiting access to essential technology.
An African Language Readiness Index: Paving the Way Forward
Given the importance of datasets for bespoke African AI models, there is an urgent need for a systematic approach to language data creation. An African Language Readiness Index—similar to the Government AI Readiness Index 2023—could be a game-changer. Such an index would provide a framework to:
- Identify gaps in data availability for African languages.
- Set priorities for data collection and expansion efforts.
- Design strategies to efficiently collect and expand language datasets.
Through collaborative efforts across the AI and NLP sectors, such an index could assess how prepared a language is for machine learning applications. This systematic approach would accelerate the development of African language tools, ensuring speakers of these languages can access digital services in their native tongues.
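To make the idea concrete, here is a minimal sketch of how such an index might score a language. Everything in it is an assumption made for illustration: the three dimensions, the target volumes, the log scaling, and the names `LanguageResources`, `readiness`, `biggest_gap`, and `prioritize` are hypothetical, and the sample figures are placeholders rather than real measurements for any language.

```python
import math
from dataclasses import dataclass

# Hypothetical targets: the data volume at which a dimension counts as
# fully "ready". Illustrative assumptions, not published thresholds.
TARGETS = {
    "text_tokens": 1_000_000_000,
    "speech_hours": 10_000,
    "parallel_sentences": 10_000_000,
}


@dataclass
class LanguageResources:
    name: str
    text_tokens: int
    speech_hours: float
    parallel_sentences: int


def dimension_score(value, target):
    """Log-scaled score in [0, 1]: 0 with no data, 1 at or above the target."""
    if value <= 0:
        return 0.0
    return min(1.0, math.log10(1 + value) / math.log10(1 + target))


def readiness(lang):
    """Per-dimension scores plus an unweighted average as the overall score."""
    scores = {
        dim: dimension_score(getattr(lang, dim), target)
        for dim, target in TARGETS.items()
    }
    scores["overall"] = sum(scores.values()) / len(TARGETS)
    return scores


def biggest_gap(lang):
    """Identify gaps: the dimension with the lowest score."""
    scores = readiness(lang)
    return min(TARGETS, key=lambda dim: scores[dim])


def prioritize(langs):
    """Set priorities: least-ready languages first."""
    return sorted(langs, key=lambda lang: readiness(lang)["overall"])


if __name__ == "__main__":
    # Placeholder inputs, not real statistics.
    sample = [
        LanguageResources("lang_a", 1_000_000_000, 10_000, 10_000_000),
        LanguageResources("lang_b", 1_000_000, 0, 5_000),
    ]
    for lang in prioritize(sample):
        print(lang.name, round(readiness(lang)["overall"], 2), biggest_gap(lang))
```

The log scaling is one possible design choice: it rewards the first thousand hours of speech data more than the tenth thousand, which matches the intuition that early data collection matters most for low-resource languages. A real index would need its dimensions and weights agreed on by the research community.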
Building an Inclusive Digital Future for Africa
The recent surge in digital civic engagement across Kenya is a clear indication of Africa’s growing interest in technology-driven activism. However, to ensure that all Africans benefit from this digital wave, the tools and platforms being created must cater to all languages spoken across the continent.
Addressing linguistic exclusion in AI will:
- Bridge the digital divide, ensuring more Africans have access to technology.
- Incorporate African voices in the global digital space.
- Boost economic growth, as people can interact with AI tools in their own languages.
The digital transformation unfolding in Kenya and across Africa offers an opportunity to reshape the AI landscape to include the diverse languages of the continent. By prioritizing language inclusion, Africa can build a more equitable and innovative future for all.