NLP of conversations in Swahili Slang (Sheng) (Well Told Story)

Continuing our partnership with Well Told Story, we are helping to develop their in-house data infrastructure.

In 2018, we will support WTS’ data analysis work with natural language processing tools and expertise that can simplify and deepen analysis of the unprecedented volume of audience engagement WTS receives via SMS and social media.

Our partnership with Well Told Story (WTS) stretches back to 2014.

Through a rich multi-media platform, Well Told Story engages young people in Kenya and Tanzania around social and economic issues.

A core aim for Well Told Story is engaging young East Africans in conversations around contraception to better understand and shift the social norms that underpin their decisions. Kenya in particular registers high adolescent pregnancy rates (World Bank, 2016) and over 1.5 million Kenyans live with HIV (Avert, 2016). Better reproductive health and contraceptive use have great potential to mitigate these public health issues, with profound implications for the economic, social and educational outcomes of young people. There is a pressing need to understand youth attitudes towards contraception and how their discourse is changing over time to identify the barriers to use and increase safe practices.

Shujaaz, the WTS media series that spans comic books, radio shows and social media messaging, provides an accessible and candid platform for youth to discuss and learn about contraception, a topic that is usually taboo. Shujaaz content reaches over six million youth in East Africa, who engage predominantly by sending messages in Sheng, an urban youth slang that combines Swahili, English and other local languages. These messages create rich but complex data that represent how youth communicate and understand their decisions around contraception.

Over the years, AVF has worked with WTS to create and test tools that bring big data analytics, with a high degree of accuracy, to a rapidly evolving, low-resource language.

What we're doing

Thus far, AVF has developed technology systems to support single projects or test new approaches to textual data analysis. In 2018, we are consolidating and scaling what we have learned by building tools that can systematically process high volumes of data (WTS currently receives nearly 100,000 messages a month) and provide audience analytics on topics and sentiment that WTS can incorporate into the new media content it designs.

As WTS’ user base grows, its manual data analytics processes are becoming untenable. The time spent producing regular reports on campaign performance and user participation leaves little time or resources for deeper qualitative exploration of the data. Thanks to several successful pilot projects, AVF is well positioned to provide algorithmic tools that can automate certain parts of the analysis process. These tools can also benefit from the growing volume of data, gaining accuracy as they are used over time.

AVF is developing two algorithmic tools: one will classify messages by topic, and the other will analyze the sentiments expressed in the messages. These tools will be integrated into the existing WTS data workflow and will support planned investments in further data management systems. Once both tools have been built using data relating to contraception, they can be expanded to cover other topics relevant to WTS.
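To give a rough sense of what classifying messages by topic can involve, here is a minimal sketch in Python. It is not the tool AVF is building: the Naive Bayes method, the class name `NaiveBayesTopicClassifier`, the labels and the example messages are all assumptions made for illustration, and the messages are invented English placeholders rather than real Sheng data.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a message and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesTopicClassifier:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.label_counts = Counter()            # label -> number of messages
        self.vocabulary = set()

    def train(self, labelled_messages):
        """Count words per label from (text, label) pairs."""
        for text, label in labelled_messages:
            self.label_counts[label] += 1
            for token in tokenize(text):
                self.word_counts[label][token] += 1
                self.vocabulary.add(token)

    def predict(self, text):
        """Return the label with the highest log-probability score."""
        total_messages = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # Log prior plus the smoothed log likelihood of each token.
            score = math.log(count / total_messages)
            label_total = sum(self.word_counts[label].values())
            for token in tokenize(text):
                numerator = self.word_counts[label][token] + 1
                denominator = label_total + len(self.vocabulary)
                score += math.log(numerator / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy messages in English; real input would be Sheng SMS text.
TRAINING = [
    ("where can i get condoms near me", "contraception"),
    ("is the pill safe to use", "contraception"),
    ("i want to start a small business", "other"),
    ("how do i save money for school", "other"),
]

classifier = NaiveBayesTopicClassifier()
classifier.train(TRAINING)
print(classifier.predict("are condoms safe"))
```

The same structure could in principle be trained on sentiment labels instead of topic labels. A classifier of this kind relies on hand-labelled examples, which is one reason such tools can gain accuracy as more labelled data becomes available.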

The methods of textual analysis used in both tools have rarely been applied to low-resource languages like Sheng and Swahili, making the academic contributions of this project just as impactful as the increased research capabilities for WTS.

With these tools, WTS will be able to track SMS conversations in response to their media in near real time and measure the sentiments in these conversations, allowing them to ascertain whether increased engagement is leading to positive shifts in social norms. As well as improving internal capacity for analysis, the near-real-time insights from these tools will enable WTS to be more responsive to their audience by adjusting their content accordingly. As a result, WTS will be able to better engage the youth of East Africa and facilitate interactive and productive discussions about contraception.

To be continued...

Our work with WTS is underway, but check back here soon for updates on our progress!