I’m a senior at the University of Washington finishing up my B.S. in Economics and B.A. in Information Systems. My primary research experience is in labor and development economics. Specifically, I’m studying the effects of female labor force participation on domestic violence in the household. My experience made me excited about the power of economics to capture and model tragedies like violence against women, which are often dismissed as deeply ingrained, cultural, and thus unsolvable problems. More broadly, my interest in modeling and answering questions about social phenomena with large amounts of data led me to explore the relatively new field of data science, where I’ve found that my background in information systems plays as big a role as my skills in statistics and econometrics.

I have two general goals that I hope to accomplish during my summer in the DSSG program. First, I simply want to be useful. I am not unique in saying that I want to lend myself to social good initiatives, and in the past I’ve volunteered in Bangladesh, teaching English and working at a health clinic in an effort to help. However, I learned that while my experience served as an excellent exercise in empathy, I ultimately did very little to help those in need. I was inevitably using more resources than I was contributing, partially because of the steep learning curve I experienced as a foreigner, doing work that a local volunteer could do more efficiently and effectively. I realized that my contributions do not always have to be global, and in fact may be more useful if I channel them towards issues here in Seattle, my hometown. The eScience Institute provides an opportunity to contribute to a social good initiative using skills that I already have, in an environment that supports further development of those skills without hindering project goals. I hope that in this environment, my contributions will help the team deliver useful, accurate and functional memos and open source code for the counties and the Gates Foundation.

My second goal is to gain a better understanding of the challenges, limitations, and best practices in the process of teasing out meaningful information from large amounts of messy, noisy data. I am incredibly excited by the potential of big data projects, but I realize the importance of maintaining a healthy amount of disillusionment in order to recognize biases (inherent in both the data and my mindset) and set realistic goals in any data-driven research project.

The DSSG program has a strong professional development component, so my second learning goal is already in the works. Over the course of the summer we’re lucky enough to participate in workshops on big data analytics and statistical software, and to attend high-level talks on the various facets of data science. Already, we’ve heard a number of insightful, relevant talks, a few of which directly addressed some key issues regarding data objectivity and bias that we are beginning to face in our project, a very interesting topic for another blog post. For now, I’ll finish by saying that I’m super excited to be working with the amazing data scientists, domain experts, and fellow DSSG interns that the eScience Institute has brought together this summer.