Findings
Usability Research and Participatory Design on the Algorithmic Equity Toolkit:
We had CAIR, Densho, the ACLU, and grassroots civil rights activists complete a survey to assess their baseline knowledge, understanding, concerns, and comfort regarding ADS's, and then had them complete the same survey again afterwards.
We found a wide range of baseline understanding of automated decision system (ADS) technologies. Of the 9 people surveyed, a third said they had a limited understanding of ADS technologies, a third said they had a basic understanding, and a third said they had an advanced understanding. The leaders of civil rights organizations were the only respondents with an advanced understanding of ADS technologies. Two-thirds of civil rights organizers had a basic understanding of ADS technologies, and two-thirds of civil rights activists had a limited understanding of them.
As expected, understanding of ADS technologies increased with position in the leadership hierarchy, and concern about these technologies increased with level of understanding. A person's comfort with testifying on ADS technologies was influenced by their baseline knowledge of these systems and their experience testifying on government bills and policies. From these findings, we determined that our toolkit may be read by organizers and civil rights activists with mixed baseline understandings of ADS technologies. We therefore moved more complex information in the toolkit to the appendix so that the main material remained accessible to both audiences. Readers interested in more advanced concepts, such as in-depth explanations of ADS's or answers to follow-up questions, can turn to the appendix for additional information.
Most of the people surveyed said that the toolkit significantly or substantially improved their knowledge about ADS technologies. The toolkit also slightly improved comfort in testifying on ADS technologies among all the people surveyed. That the toolkit improved everyone's comfort with publicly testifying on ADS technologies demonstrated that it was proving useful for our primary audience.
The feedback we received from civil rights advocates demonstrated that our toolkit needed to improve in many ways to increase its clarity. Users were asked how easily they could understand how the government technology examples used in the surveillance and ADS identification guide work; the vast majority responded with a 3 out of 5, with 5 being extremely easy to understand. We therefore made many adjustments, such as labeling the icons, rearranging concepts, and redefining terms in the identification guide. We also received survey comments asking about laws currently regulating artificial intelligence, so we made sure to include that information in our toolkit. Through our follow-up meetings, we identified specific pain points and suggestions for how to improve these lagging areas of the toolkit. Many of the surveyed civil rights advocates expressed confusion about the use-case scenario for the toolkit. This feedback demonstrated that the toolkit had to improve in clarity not only in its concepts but also in its instructions on how to actually use it. When developing the toolkit, we had made implicit assumptions about how it would be used and the use-case scenario it would be situated in without making these assumptions explicit within the toolkit itself.
Interactive Web Demo:
Our analysis involved running ten celebrity photos through OpenFace's model against a database of 60 celebrity photos collected from Labeled Faces in the Wild and Google image searches. We then selected the top 8 closest images for each of the ten celebrity photos to include in our demo. Across all ten celebrity photos, the minimum similarity score among the top 8 closest images was 0.15, between a photo of Aaron Peirsol and Ai Sugiyama, and the maximum similarity score was 1.384, between two different photos of LeBron James. Overall, celebrities with lighter skin tones had lower similarity scores than celebrities with darker skin tones. Our findings of differences in similarity scores along the lines of skin tone are consistent with the literature on facial recognition software and its accuracy across skin tones (Buolamwini and Gebru 2018). That work demonstrates that increased diversity in the training set can increase a model's accuracy for intersectional identities, such as people of color and women of color. OpenFace trains its model on 13 public datasets and 1 private dataset. The scope and time limitations of our project prevented us from exploring the breakdown of these datasets with regard to skin tone and gender. Future work could explore this breakdown to determine whether a lack of diversity in the training datasets may be responsible for the higher similarity scores for people of color in our demo analysis.
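To make the ranking step concrete, the following is a minimal sketch in Python. It assumes a hypothetical get_embedding() helper that wraps OpenFace's 128-dimensional face embeddings, and the distance-to-similarity conversion shown is illustrative only, not necessarily the exact metric used in our demo.

import numpy as np

def similarity(embedding_a, embedding_b):
    # Map the Euclidean distance between two face embeddings to a score
    # where a higher value means the model sees the faces as more alike.
    # This particular conversion is illustrative only.
    distance = np.linalg.norm(embedding_a - embedding_b)
    return 1.0 / (1.0 + distance)

def top_matches(query_path, database_paths, k=8):
    # Return the k database photos most similar to the query photo.
    query = get_embedding(query_path)  # hypothetical OpenFace wrapper
    scored = [(similarity(query, get_embedding(p)), p) for p in database_paths]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # most similar first
    return scored[:k]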
Diverse Voices:
The Diverse Voices panels revealed that many disadvantaged communities have noticed and felt the disproportionate impact of surveillance technologies within their communities. Many of the panelists thought that the information presented in the toolkit was helpful, and they identified other information they would like to see in it. There was a desire for the toolkit to include the impacts of the technology examples in the identification guide, and for the guide to include information about where human decisions affect these government technologies. Panelists also wanted the questionnaire to include resources on how to take action and how to get connected to organizations working on some of the issues identified with these technologies. The panels also provided feedback to improve the clarity of the facial recognition web demo, such as removing or fading facial images that do not match the selected image. We are currently transcribing the audio recordings from the panels, identifying major themes, and compiling concrete recommendations for changes to the Algorithmic Equity Toolkit.
Deliverables
We designed an Algorithmic Equity Toolkit, a set of three tools for identifying and auditing algorithmic processes used in the public sector, especially regarding ADS technologies. The toolkit has three components:
Surveillance and ADS Identification Guide, for distinguishing between surveillance tools and ADS's and understanding their different functions. This will help civil rights advocates ask the right questions depending on the technology type.
Questionnaire, for surfacing the social context of a given system, its technical failure modes (i.e., potential for not working correctly, such as false positives), and its social failure modes (i.e. its potential for discrimination when working correctly). The questionnaire is a list of sample questions that users can use to inquire about the potential harms of surveillance or ADS technologies.
An interactive facial recognition false positive web demo that illustrates the underlying harms and mechanics of facial recognition technology, one of the technologies in the questionnaire. Our interactive demo illustrates false positives, disparities in accuracy along lines of race and gender, and the harms that can result from choosing too low a threshold to determine matches. The demo aims to illustrate an algorithmic harm in an accessible way and uses interactivity for an engaging user experience.
The primary users of the Algorithmic Equity Toolkit will be community members, including civil rights advocacy and grassroots organizations and anyone interested in algorithmic equity. We hope community members will be better empowered with this toolkit to hold government agencies accountable for the technologies they implement in their communities. The toolkit can be used when engaging with policymakers or government representatives, or when users want to learn more about surveillance and ADS technologies and their potential harms. With the toolkit, civil rights activists, grassroots organizers, and community members can identify government surveillance and ADS technologies, understand how they work, and recognize the potential social justice implications and harms of these public sector systems. An ADS is a computerized implementation of algorithms to assist in decision-making. ADS's are increasingly used in our society to analyze data and make decisions more quickly and efficiently; however, their increasing use decreases transparency and accountability because of their complexity and the lack of awareness about how they work.
We hope that with this toolkit civil rights activists can distinguish surveillance tools from ADS tools and be empowered to challenge the implementation and expansion of both surveillance and ADS technologies by asking the right questions.
Steps for using the toolkit:
Step 1: Start with the Surveillance and ADS Identification Guide. This guide should be used to help you determine whether a government technology is a surveillance or ADS tool or system. It will also help you understand the different functions of surveillance and ADS tools and systems. With this Surveillance and ADS ID Guide, civil rights advocates can better detect the presence of algorithms and understand what those features do.
Step 2: Questionnaire. Use the questionnaire to inquire about the potential harms of surveillance or ADS technologies when engaging with policymakers and other public officials.
Step 3: Interactive facial recognition web demo. Use the interactive demo to explore how facial recognition technology matches faces to identities based on a minimum similarity score, which we refer to as the threshold. The demo also helps explain false positives, illustrates bias against people of color, especially women of color, and draws attention to the philosophical problems with employing facial recognition technology regardless of accuracy rates. The sketch below illustrates the threshold mechanic.
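As a rough illustration of the threshold mechanic, the sketch below reports as a match every database photo whose similarity score meets the threshold, so lowering the threshold produces more false positives. The candidate names and scores are made up for illustration, apart from the 1.38 score, which approximates the LeBron James self-match reported in our analysis.

# Illustrative only: candidate names and scores are invented, except the
# 1.38 score, which approximates the LeBron James self-match from our analysis.
scored_matches = [
    ("LeBron James (second photo)", 1.38),  # same person: a true match
    ("Different celebrity A", 0.62),        # different person
    ("Different celebrity B", 0.41),        # different person
]

def report_matches(scored_matches, threshold):
    # Return every candidate whose similarity score meets the threshold.
    return [name for name, score in scored_matches if score >= threshold]

# A strict threshold returns only the true match...
print(report_matches(scored_matches, threshold=1.0))
# ...while a permissive threshold also returns false positives.
print(report_matches(scored_matches, threshold=0.4))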
The ACLU of Washington, as the primary stakeholder, has provided connections to other community organizations, such as Densho and CAIR, that are members of the Tech Fairness Coalition. These organizations have provided insights, feedback, and suggestions on the toolkit design and on how to make it accessible for non-technical community members. They have also expressed interest in using the toolkit and sharing it on their websites and social media accounts.
Outcomes
The primary goal of this project is to empower community members with a toolkit that helps them ask their elected officials questions about algorithmic technologies and biases. In addition, we hope the toolkit will help inform local and national technology policy changes and lead to algorithmic equity.
The City of Seattle and Washington State are both world leaders in technology policy. The Washington State House has drafted a tech fairness bill (HB 1655), a first step in the direction of broad algorithmic regulation. However, previous research indicates that even expert policymakers are not prepared to understand the particular risks of algorithmic systems as such. We anticipate that the toolkit will be adopted both within government and by policy advocates such as the ACLU to strengthen HB 1655 and other existing, ongoing, and future regulatory efforts.
References
Buolamwini, J., & Gebru, T. (2018). Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification. Proceedings of the 1st Conference on Fairness, Accountability and Transparency, PMLR 81:77-91.