As part of the 2020 JournalismAI Collab, an international team of journalists from news organisations in Europe, Asia, and Latin America came together to explore a big challenge: How might we leverage the power of AI to understand, identify, and mitigate newsroom biases? To find answers, the AIJO Project conducted two experiments using AI technology to uncover the binary gender representations in various news publications. The results show that we still have a long way to go to reach a balanced gender representation in media.
We wanted to explore what AI can do for us, not to us.
Failure to represent and report on different people, points of view, and lived realities feeds the social divisions we see around us all too often. If the moral argument isn’t enough, publishers can build a business case for inclusivity.
Although reports on the pitfalls of AI have often been justified, AI can also be leveraged to understand, identify, and mitigate human biases in the newsroom.
Challenges in terms of bias differ across countries, but many are shared across cultures, such as gender-related bias: that’s why the AIJO project decided to focus on how gender bias can manifest itself in the journalistic process.
In total, 28,051 images were analysed by the AI system. The results show that the average ratio of women represented in images used by news media stood between 22.9% and 27.2%. In contrast, that number ranged between 77.1% and 72.8% for men.
In recent years, AI’s potential harm in media settings has been subject to much public debate, with racist facial recognition algorithms, rabbit holes of radicalisation, and questionable use of automation. Many of the critical reports are justified. But what happens to the innovation power of the media industry if this is the only narrative? The AIJO team – eight news organisations that joined forces under the JournalismAI Collab – wanted to shift the focus on what AI can do for us, not to us. They decided to explore how we can leverage AI technologies to understand, identify, and mitigate biases within newsrooms that influence both organisational structures and the journalistic content itself.
Experimenting with Computer Vision and Natural Language Processing (NLP)
As the concept of bias differs across countries and contexts, the team decided to focus on gender representation, a bias shared across cultures.
The machine learning model deployed by the AIJO team processed more than 28,000 pictures sourced from the participating media outlets to count the images depicting women and men. The results were predictable: men are significantly more represented in the media than women.
The process involved four steps: face detection, gender classification, filtering, and human review. Before the review step, the average ratio stood at 22.9% women to 77.1% men. When the model identifies the gender of a face in an image, it also outputs a confidence score. This allows the team to manually review only the images with the lowest confidence scores, achieving higher accuracy with the least amount of effort. After the human review, the average ratio was 27.2% women to 72.8% men.
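The confidence-based review step described above can be sketched in a few lines. This is an illustrative example, not the AIJO team's actual code: the model output format, the threshold value, and the label names are all assumptions.

```python
# Hypothetical sketch of confidence-based filtering for human review.
# Assumes a gender classifier that returns (image_id, label, confidence)
# per detected face; all names and the 0.9 threshold are illustrative.

def split_for_review(predictions, threshold=0.9):
    """Separate predictions into auto-accepted and flagged-for-review."""
    accepted, needs_review = [], []
    for image_id, label, confidence in predictions:
        if confidence >= threshold:
            accepted.append((image_id, label))
        else:
            needs_review.append((image_id, label, confidence))
    return accepted, needs_review

def women_ratio(labels):
    """Share of faces classified as women among all classified faces."""
    women = sum(1 for _, label in labels if label == "woman")
    return women / len(labels) if labels else 0.0

# Toy usage with made-up model output:
preds = [
    ("img1", "man", 0.98),
    ("img2", "woman", 0.95),
    ("img3", "woman", 0.62),  # low confidence -> flagged for manual review
]
accepted, review = split_for_review(preds)
```

Reviewing only the low-confidence tail is what makes the manual pass tractable at the scale of 28,000 images.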
The AIJO project decided to focus their joint effort on exploring gender bias through images for their first experiment. Language-based AI solutions would have struggled to provide consistent accuracy across all of the languages in which the eight participants work and publish.
In a second experiment, the English-language news organisations involved in the project used the momentum to assess gender representation in textual content as well. With the help of the Gender Gap Tracker, provided by Simon Fraser University, the team was able to measure the proportion of men and women mentioned and quoted in articles. On average, women represented about 21% of the people mentioned in articles, compared to around 73% for men (the model could not identify the remainder). As for quoted sources, only 22% were women. The editorial priorities and the news events that took place during the analysis period impacted the results: many of the focus points of international news stories were men.
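Aggregating the per-article counts into the overall shares reported above is straightforward. The sketch below is in the spirit of the Gender Gap Tracker's output but is not its API; the field names and sample numbers are assumptions.

```python
# Illustrative tally of gender shares among people mentioned in articles.
# Each article is a dict of counts; keys and data are made up for the example.

def mention_shares(articles):
    """Aggregate per-article counts into overall percentage shares."""
    totals = {"woman": 0, "man": 0, "unknown": 0}
    for counts in articles:
        for key in totals:
            totals[key] += counts.get(key, 0)
    grand_total = sum(totals.values())
    return {key: round(100 * n / grand_total, 1) for key, n in totals.items()}

# Toy corpus of two articles:
articles = [
    {"woman": 2, "man": 7, "unknown": 1},
    {"woman": 1, "man": 4},
]
shares = mention_shares(articles)  # e.g. {"woman": 20.0, "man": 73.3, "unknown": 6.7}
```

Keeping an explicit "unknown" bucket matters: people the model cannot classify should reduce both shares rather than be silently dropped.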
The project provided immensely valuable results, but one of the main takeaways of this collaboration is the need for diverse datasets. AI learns from datasets, and without properly diverse training data, we’ll end up building AI solutions that underperform for parts of our audiences.
Looking ahead, the AIJO team suggests that it would be valuable to explore how NLP can help uncover nuances on how we write and talk about different groups in society, beyond gender bias – for example, looking at what adjectives are most frequently associated with any given minority. This could provide important insights for any newsroom looking to make their journalism more inclusive.
- Explore the AIJO project
- Download the team’s presentation
- Read about the team’s Collab experience on the blog