An AI Ethics Case Study
In January 2019, IBM Research published a blog post about a new dataset that the company had just released. It began with a question: “Have you ever been treated unfairly?” It then added, “[m]ost people generally agree that a fairer world is a better world…. That’s why we are harnessing the power of science to create AI systems that are more fair and accurate.” The blog explained:
Today, IBM Research is releasing a new large and diverse dataset called Diversity in Faces (DiF) to advance the study of fairness and accuracy in facial recognition technology. The first of its kind available to the global research community, DiF provides a dataset of annotations of 1 million human facial images. Using publicly available images from the YFCC-100M Creative Commons data set, we annotated the faces using 10 well-established and independent coding schemes from the scientific literature …. We believe by extracting and releasing these facial coding scheme annotations on a large dataset of 1 million images of faces, we will accelerate the study of diversity and coverage of data for AI facial recognition systems to ensure more fair and accurate AI systems.
Initially, media outlets such as CNET, CNBC, and VentureBeat covered the announcement as an advance in the effort to combat various algorithms’ uneven levels of accuracy in identifying individuals from different groups. As VentureBeat explained, researchers had pointed out that “facial recognition made by Microsoft, IBM, and Chinese company Megvii misidentified gender in up to 7 percent of lighter-skinned females, up to 12 percent of darker-skinned males, and up to 35 percent of darker-skinned females”; in response, employees at many tech companies were working to reduce those misidentifications.
In March, however, an NBC News report focused on a different aspect of the database: the fact that the “YFCC-100M Creative Commons data set” consisted of pictures that people had uploaded to Flickr.
Reporter Olivia Solon noted that some of the Flickr users whose photographs had been selected by IBM for its dataset “were surprised and disconcerted when NBC News told them that their photographs had been annotated with details including facial geometry and skin tone and may be used to develop facial recognition algorithms.” She added that it is “almost impossible to get photos removed [from the dataset]. IBM requires photographers to email links to photos they want removed, but … there is no easy way of finding out whose photos are included.”
Moreover, according to IBM, even if a photo were removed from DiF upon request, it would not be removed from versions of the dataset already shared with researchers; about 250 organizations had requested access up to that point.
Following the NBC News article, a number of outlets published similar stories about the ways in which DiF had been created and distributed. This time, the headline in CNET read, “IBM Stirs Controversy by Sharing Photos for AI Facial Recognition.”
Several other databases that had already been used to train facial recognition algorithms had also been collected by scraping photographs from internet sites (YouTube, Google Images, Wikipedia, mug shot collections, and more). In June, after a critical article in the Financial Times, Microsoft took down MS Celeb, a different dataset compiled and made available for training facial recognition algorithms. Duke and Stanford also took down similar datasets.
As of January 2020, the IBM Research website still included a page for the “Diversity in Faces” paper under “Publications”; however, clicking on what looked like a link to download the dataset led to a more general page titled “Trusting AI.”
Discussion Questions
Who are the stakeholders involved—the people whose interests were directly or indirectly impacted by the creation and release of this database? Who should be consulted about such a project’s goals and development?
How might the development and deployment of this database be evaluated through the ethical lenses of rights, justice/fairness, utilitarianism, common good, and virtue ethics?
In this project, which moral values potentially conflict with one another? Are there ways to reconcile them, or to respect all of the relevant interests and values? If so, how?
For additional guidance, see the Markkula Center for Applied Ethics’ Framework for Ethical Decision-Making.