A Look at the Gender Distribution of Tweets about Global Development

Becky Band Jain and René Clausen Nielsen
May 19, 2016

Moments after the Sustainable Development Goals were adopted, UN Secretary-General Ban Ki-moon said, “these Goals are a blueprint for a better future. Now we must use the goals to transform the world. We will do that through partnership and through commitment. We must leave no-one behind.” The Post-2015 Sustainable Development Agenda is one of inclusion, where all voices are heard. Social media is a central means to hear these voices, understand the opinions of people around the world, and empower them to play a role in enacting this agenda.

Complementing the United Nations’ MY World global survey -- which proactively asked over 7 million people to choose the development topics that matter most to them -- Global Pulse and the Millennium Campaign also developed an interactive visualisation showing how much people tweet about the same topics.

The visualisation shows the result of filtering over 500 million daily Twitter posts for 25,000 keywords relevant to 16 development topics, and illustrates which countries proportionally talk most about those topics on social media. Using keywords in English, French, Spanish and Portuguese, the filter yielded about 10 million new tweets each month, reaching a total of over 350 million between May 2012 and July 2015. To see how this was done, take a look at the taxonomy of keywords here, read a summary of the project here, and further background context here.

Taking this project a step further, the original data has now been sex-disaggregated and visualised, thanks to the support of Data2X, and technical help from the Centre for Innovation at Leiden University. This makes it possible to analyse how vocal women and men are on Twitter on those same topics. (The code used to sex-disaggregate tweets is available on GitHub and you can test the functionality on the demo site.)

Explore the interactive visualization at:

Read on for a selection of preliminary insights from the gender distribution, and for some very important caveats about the method.

Global-Level Insights

Overall, the top 5 topics tweeted about by both men and women were: “An honest and responsive government,” “Better job opportunities,” “Freedom from discrimination and persecution”, “A good education,” and “Political freedoms.” In the data collected through this analysis, women proved to be much more likely to tweet about "equality between men and women," and also more likely to tweet about "education" and "freedom from discrimination." Women were also more likely to tweet about (the lack of) phone and internet access. Women are more likely to tweet about "Protection against crime and violence" than men in almost all geographies.

A significantly higher percentage of tweets written by men were about “An honest and responsive government”, though it is also the most talked about topic by women. 

Country-Level Analysis

Countries where we identified more women than men tweeting about development topics overall included Kyrgyzstan, Mali, Libya, Chad, Central African Republic, Republic of Congo, Eritrea, South Korea, Myanmar, Indonesia, and the Philippines. This incidence can be attributed to a variety of demographic and technological differences. (Review the Caveats section below for more.)

Overall, the distribution revealed that both women and men were vocal about development-related topics across both the Global North and Global South. For this summary, we focus mostly on countries with a larger volume of tweets, for the sake of finding strong stories.


Brazil

In Brazil, men tweeted much more than women about “Better job opportunities,” more about "Political freedoms", and roughly the same about “Freedom from discrimination”. Women, particularly in 2015, more often tweeted about “Protection against crime and violence” and “Phone and internet access.” When talking about the former, they most often mentioned fear of gender-based violence, while tweets about "Phone and internet access" were usually about poor phone or internet signal. Of the 193 countries analysed, Brazil ranks 8th in how likely its users are to tweet about "Phone and internet access".

There was a spike in tweets with keywords related to "honest and responsive government" by both men and women before the October 2014 elections, during which there were widespread allegations of corruption against the incumbent government, and again in March 2015, when there were wide-scale protests against rising prices, corruption, and slow economic growth. Surprisingly, the "protection of forests, rivers and oceans" did not appear as a prominent theme of discussion on Twitter overall.

South Africa

"Freedom from discrimination and persecution" was overwhelmingly the most discussed topic among both men and women in South Africa during the period analysed. In the first half of 2015, the prevalence of tweets related to discrimination reached nearly 50%, linked to xenophobic attacks in Durban.

Women tended to discuss "Equality between men and women" and "Protection against crime and violence" more often than men. Women also tweeted more about "Protecting forests, rivers and oceans" than men, although overall the gender distribution across topics in South Africa seems to be quite equal.



Nigeria

In April 2015, "Freedom from discrimination and persecution" received more tweets in Nigeria than any topic in any other month. This appeared to be a response to the xenophobic attacks in South Africa mentioned above. Around that time, tweets related to ensuring peaceful and participatory elections were also common. As the election approached and conversations on a range of topics surrounding it intensified, male-female participation came closer to being even. In the same year, there was a drop in currency value in Nigeria, potentially causing social media content related to “An honest and responsive government” to rise. Around the time of the passage of Nigeria’s Same Sex Prohibition Act in early 2014, conversations surrounding "Freedom from discrimination and persecution" also rose.


India

In India, women tweet more often than men about "equality between men and women" and "freedom from discrimination." By 2015, it seemed as if men and women had started to tweet about those topics in roughly equal measure, potentially indicating that they had become mainstream topics of discourse.

Among women, the topic of "An honest and responsive government" was the most discussed, followed by access to a good education and equality between men and women.

There was also a spike in conversations about "protecting forests, rivers and oceans" during the summer of 2015, driven by @Gurmeetramrahim's #MSGTreePlantationDrive campaign.


Nepal

In Nepal, women are more likely than men to tweet about "Equality between men and women", "Protection against crime and violence" and "Freedom from discrimination". The most mentioned topic among women was indeed gender equality, while the second most discussed theme among women was "education."

In May 2015, following the April 25 earthquake, the discussions were mainly around “Support for people who can’t work". The focus of most tweets in this category was unsurprisingly on disaster relief. Over the period of the study, only the UK surpassed Nepal in the percentage of tweets about "Support for people who can't work". Nepalis are also more likely to write about "Protecting forests, rivers and oceans" than people in most other countries.



Caveats

It is important to acknowledge that on Twitter, "global" data is highly skewed towards the US and other high-income countries. In the dataset analysed, we estimate that roughly a third of the tweets are from the US. The graph below shows the disparity of representation across income levels and regions in the approximately 300 million tweets captured in this project:

Number of Tweets by Income Group

A related caveat is that news originating from the United States tends to travel more widely on social media than news from most other countries. (For example, the same-sex marriage debates in the US were reflected in tweets from many other countries.) When people from any given country tweet about a topic, they are not necessarily talking about issues in their own country. 

Men tend to be more active users of Twitter -- and the internet in general -- than women, except in a handful of countries. Nonetheless, disaggregating the data by sex allows us to hear women’s voices more clearly.

Another point worth mentioning is that certain topics related to the Post-2015 development agenda are inherently gendered, especially the one dedicated to “Equality between men and women.” Another more subtle example is that "Protection against crime and violence" includes gender-based crime, so that is an underlying factor influencing discussions.

As mentioned above, the dataset used for this project was generated by filtering through millions of tweets using a categorized taxonomy of 25,000 keywords, which were developed manually with the support of many volunteers. The study only filtered for Tweets in French, English, Spanish and Portuguese. No keyword taxonomy can ever be perfect (there are nuances related to language and slang, as well as word choice and combinations of words used by people on social media that are not always precise).
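To make the filtering step concrete, here is a minimal Python sketch of matching tweets against a topic taxonomy. It is an illustration only, not the project's actual code: the tiny `TAXONOMY` dictionary, its keywords, and the function names are all hypothetical stand-ins for the real taxonomy of roughly 25,000 keywords.

```python
import re

# Hypothetical miniature taxonomy: topic -> keywords.
# These keywords are invented for illustration and are not from the real taxonomy.
TAXONOMY = {
    "A good education": ["school fees", "teachers", "escuela"],
    "Phone and internet access": ["no signal", "internet access", "sin internet"],
}


def compile_taxonomy(taxonomy):
    """Build one case-insensitive, whole-word pattern per topic."""
    compiled = {}
    for topic, keywords in taxonomy.items():
        alternatives = "|".join(re.escape(keyword) for keyword in keywords)
        compiled[topic] = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)
    return compiled


def match_topics(text, compiled):
    """Return every topic with at least one keyword appearing in the text."""
    return [topic for topic, pattern in compiled.items() if pattern.search(text)]


compiled = compile_taxonomy(TAXONOMY)
print(match_topics("No signal again and the teachers are on strike", compiled))
```

A sketch like this also shows why no taxonomy is perfect: a tweet that uses slang or an unexpected spelling matches nothing, while a keyword used in an unrelated sense still counts as a match.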

Finally, a note on the method for sex-disaggregating tweets employed for this project. Tweets do not come with structured demographic information about the person who is posting. However, each Twitter user profile contains these elements:

  • Name of the User
  • Profile Photo
  • User Description
  • User Location
  • User URL
  • Birthday (only since July 2015; this field is only seldom filled out by users)


What this means is that a given Twitter user's sex must be inferred by drawing from these nuggets of information. In this study, a lexicon of first names, each with an associated sex, was built. If a Twitter user's name is listed as Jacqueline, the system infers that the user is a woman, while if the name is listed as John, the system infers that the user is a man. If a person is called Alex, the name is categorised as unisex. When a user's sex cannot be inferred from their name, the algorithm then uses Face++ face recognition technology to see if it can guess the sex of the user from their profile photo. Organisational Twitter handles were categorised as "organisation."
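The cascade described above (name lexicon first, profile photo second) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the code published on GitHub: the three-entry `NAME_LEXICON`, the `is_organisation` flag, the `"unknown"` label, and the `photo_classifier` callable (a stand-in for a face recognition service, not the Face++ API) are all hypothetical.

```python
# Hypothetical miniature lexicon; the project's real lexicon is far larger.
NAME_LEXICON = {"jacqueline": "female", "john": "male", "alex": "unisex"}


def lookup_name(display_name, lexicon=NAME_LEXICON):
    """Look up the first token of the display name; None if it is not listed."""
    tokens = display_name.strip().lower().split()
    if not tokens:
        return None
    return lexicon.get(tokens[0])


def infer_sex(display_name, photo_url=None, photo_classifier=None, is_organisation=False):
    """Name lexicon first, then an optional photo-based guess as a fallback."""
    # Assumption: organisational accounts are flagged by some earlier step.
    if is_organisation:
        return "organisation"

    name_label = lookup_name(display_name)
    if name_label in ("female", "male"):
        return name_label

    # The name is unisex or missing from the lexicon: try the profile photo.
    if photo_classifier is not None and photo_url:
        guess = photo_classifier(photo_url)  # assumed to return "female", "male" or None
        if guess in ("female", "male"):
            return guess

    # Assumption: keep the "unisex" label if the photo did not help either.
    return name_label if name_label == "unisex" else "unknown"


print(infer_sex("Jacqueline Smith"))
print(infer_sex("Alex Doe"))
```

Putting the lexicon lookup first means the slower and less certain photo step only runs for profiles whose names are ambiguous or unlisted.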

The method works on many profiles but comes with all the imperfections of inferred data.


By understanding the role gender plays in people's global development priorities, policymakers can better address the needs of their diverse constituents. To get there, it is necessary to get more and better sex-disaggregated data. Gender inequalities are often subtle and difficult to pinpoint.

Data is never neutral, but rather a product of social processes and biases, and as such we should always work to get better at understanding these biases. One step in that direction is disaggregating by gender, age, abilities, income levels, ethnicity and more where appropriate.

This experimental project aimed at developing a methodology for sex-disaggregating tweets is just one small step, and we encourage others to build upon and improve the methodology, or suggest different techniques. There is still much to do, but every step forward increases the chances of better understanding crucial aspects of the lives of women and girls.


This effort is part of a partnership with Data2X, which has spearheaded a number of research pilots to explore how different methods of collecting and analyzing big data could potentially close global gender gaps.



