Introducing DSSG Fellow Mitchell Goist
I’m coming to DSSG after finishing the fourth year of my program in Political Science at Penn State. My dissertation is focused on state-sponsored death squads in El Salvador and South Africa. I use Natural Language Processing (NLP) techniques to extract information from Truth and Reconciliation testimony in order to analyze the organization of death squads. The social sciences often have a breadth of qualitative, textual evidence, but a dearth of high-quality data, so I am bullish on the contribution that information extraction techniques, such as the ones I use for my dissertation, can make in studying social phenomena.
Data Science for the Social Good often sounds like buzzword alphabet soup to my non-academic friends, so perhaps I can point them to this blog after it’s published for a little clarification. To me, data science is the more ambiguous term, but it refers to computationally intensive techniques to solve statistical problems. Most of these techniques, such as neural networks or Bayesian modelling, have been around for decades, but researchers haven’t had the computing resources to implement them. This has particularly impacted the field of statistics, as most modern statistical techniques are designed to minimize the computing power or amount of data necessary to arrive at some inference. Now that neither of those concerns is typically valid for current researchers, data science presents a departure from typical methods, while being wholly in line with statistical theory.
Social good is even trickier to pin down, and it’s tempting to go with the cliched Supreme Court pornography definition: you know it when you see it. However, despite what we may think, we often don’t know what the social good is when we see it, and we certainly don’t know how to achieve it. History is replete with examples of well-intentioned people making horrible and consequential decisions in the name of the social good. We could, of course, dismiss these people as misguided idiots, and, since (ostensibly) we don’t think of ourselves as misguided idiots, problem solved! But as people who think probabilistically (being data scientists and all), we should recognize that we’re likely at the same position in both the intellect and common sense distributions as our blundering forebears.
Take, for example, international development, a subject that I spent my first two years of graduate school studying. In the immediate aftermath of decolonization, the largest recipients of foreign aid were Congo and Somalia. Foreign donors continued to gush about Rwanda as a development success story up until the genocide. Not exactly a picture of economic and political success!
The consistent failure of otherwise well-intentioned individuals to achieve social good outcomes points us to an inescapable fact. Social processes are hard to study! Think of how difficult it is to create an accurate model of decision-making processes in a single individual. Now imagine the complexity of measuring the collective decision-making processes of multitudes of individuals, acting under an intricate web of institutions, affected by long-running historical factors, and then projected over time. Yikes!
There has been some (very legitimate) concern about “technological solutionism,” or the potential for data science approaches to social good problems to obscure political realities or blind researchers to their own biases. However, data science done right should lead to an enormous and abiding sense of deep skepticism. To return to the development example, this skepticism changes the questions being asked by researchers from “can countries enact necessary economic reforms under democratic governments?” (an awfully complex question that is hopeless to answer scientifically) to “if I give someone 1000 USD in cash, does this make them less poor a year later when I visit?”
I recently finished a book about a Zen monk’s search for enlightenment called “An Ongoing Lesson in the Extent of My Own Stupidity.” In many ways, this is a perfect title for DSSG, although it probably won’t help me explain it to my friends.