A unique quality of the Bill & Melinda Gates Foundation is that it can bring together many different stakeholders to work toward a common goal. In other words, it often acts as a facilitator of multi-stakeholder collaboration, a network in which “actors from business, civil society and governmental or supranational institutions come together in order to find a common approach to an issue that affects them all and that is too complex to be addressed effectively without collaboration” (Roloff, 2008, p.3). A good case in point is the Data Solutions Work Group, which is co-led by the foundation and the non-profit Building Changes and includes data experts from King, Snohomish, and Pierce Counties. The group aims to develop a more efficient way to use data to drive policy and decision-making on issues that affect family homelessness.

##What does this mean for the Data Science for Social Good (DSSG) project?

Independently of the Data Solutions Work Group, the Gates Foundation applied to the DSSG program (whose research theme this year is urban science) with a research project on family homelessness, aiming to supplement the efforts of the Work Group. Thus, the foundation is the official stakeholder of this DSSG project. However, the research questions and the data required to conduct the DSSG project naturally concern the other members of the Data Solutions Work Group, so all of these members are also crucial stakeholders of the DSSG project.

As a result, the DSSG project is a more elaborate form of multi-stakeholder collaboration that includes the Work Group—itself already a multi-stakeholder platform—and the DSSG team, with the foundation acting as the bridge between the two. How has this worked in real time? Borrowing from Candace Faber’s talk on multi-stakeholder collaboration, I describe the collaboration in four aspects:

###Get the data owners on board

The immediate result of this multi-stakeholder collaboration is that the DSSG team, working under a formal agreement of strict use and privacy protections, is able to analyze de-identified data on family homelessness from the three counties. The counties’ data experts bring their data to the foundation, and the foundation then provides the DSSG team with secure access to the data through a portal. This illustrates the foundation’s bridging and “gatekeeper” role in the collaboration, “controlling the flow of resources” between the actors that it connects (Hawe, Webster & Shiell, 2004, p.974). In this instance of secure data sharing, the “gatekeeper” role has not impeded the collaboration but catalyzed it.

All in all, the data owners (i.e. the counties) are supportive of the DSSG project thanks to a good rapport between them and the foundation, as well as a close working relationship between the foundation’s project leads and the DSSG team.

###Put the numbers in context: engage domain experts first

The de-identified data in question are collected and managed through a federally mandated system called the Homeless Management Information System (HMIS) from the U.S. Department of Housing and Urban Development (HUD). But given the idiosyncrasies of each county, such as population demographics and data collection methods, the de-identified HMIS data from one county are represented and interpreted very differently from those of another county. On top of this, HUD changed some HMIS policies during the time frame of the data, leading to further discrepancies between records in the same data set.

Fortunately, the DSSG student interns are not only analyzing the de-identified data but also have direct contact with the county data experts and Building Changes. By attending two of the Work Group meetings hosted by the Gates Foundation, the student interns have had a chance to report their progress and receive feedback directly from all stakeholders. Outside of these in-person meetings, the student interns can also email the county data experts with pressing questions about the data. While we would prefer more frequent in-person contact with the data experts, finding time and space is a challenge in any multi-stakeholder collaboration.

This direct relationship between the student interns and the data experts has proven to be incredibly valuable. There have been several cases where puzzling findings from the data are clarified when the data experts provide additional context. This reinforces the understanding that results of data analysis must be put into context to have real implications. And the only way to get the relevant contextual information, in this case, is to engage with all members of the Work Group and develop trust among these stakeholders.

###Develop a shared language

Developing trust with these stakeholders involves developing a shared language. This shared language is not just about the appropriate vocabulary to use when discussing the homelessness data. It also pertains to a mutual understanding of certain procedures that are not mere formalities.

A good example is the procedure of approval for publicly disclosed information. Due to the sensitive nature of the subject matter and the significant implications of the research results, each stakeholder of the DSSG project needs to approve any information meant to be publicly disclosed. For instance, this blog post requires an email approval from all three counties’ data experts, a representative from Building Changes, and the co-leaders of the Work Group from the foundation and Building Changes. It makes for a long process, but it is absolutely vital to the integrity of the project.

When it comes to presenting analysis results to a public audience, this procedural formality is necessary to ensure that all stakeholders know exactly what is being disclosed about their data. On one occasion, miscommunication between the DSSG team, the eScience Institute, and the Work Group on the content of a mid-way presentation prevented the project from being highlighted in a newspaper article. A lesson learned is that in order to present valuable and relevant information to the public, the DSSG team must give the stakeholders ample time to understand and approve such information.

###Involve them in the design of the project

Needless to say, the best kind of multi-stakeholder collaboration happens when researchers involve the stakeholders as much as they can in the project. At every in-person meeting with the stakeholders, the DSSG student interns present their research strategies and the assumptions made for each strategy. In this way, the stakeholders are continuously reassured of the following:

- The student interns are using the data to answer questions the data can answer.
- The student interns understand the points of view of these stakeholders.
- The student interns understand the complexity and limitations of the data.

For example, the data used for this research project cover families that entered the HMIS-participating homeless housing system during a set time frame. However, not all families who become homeless seek or receive services from this homeless housing system, and homeless housing services that are not publicly funded do not always participate in HMIS. The DSSG student interns understand that the data cannot be used to make inferences about the factors associated with any family becoming homeless, but the data can help us understand the factors associated with an HMIS-included homeless family successfully “exiting homelessness” by getting permanent housing.

Although the unit of analysis for the bulk of the project is a family, the definition of families is not clearly presented in the de-identified data. The definition used by the DSSG team, as explained in Chris’s blog post, is the product of an in-person consultation with the stakeholders.

The same goes for defining an “episode” of homelessness for a family. The same family could be enrolled multiple times in different homeless programs. For example, some families enroll in an emergency shelter before enrolling in another housing program. The DSSG team decided, with stakeholder approval, that enrollments that are close together in time would be treated as one episode.
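To make the episode rule concrete, here is a minimal Python sketch of how enrollments that are close together could be merged into one episode. It is only an illustration, not the DSSG team's actual code: the 30-day `MAX_GAP` threshold, the function name `group_into_episodes`, and the sample dates are all hypothetical, since this post does not specify how close "close together" is. The sketch also assumes every enrollment has both an entry date and an exit date.

```python
from datetime import date, timedelta

# Hypothetical threshold: the post does not say how close "close together" is.
MAX_GAP = timedelta(days=30)


def group_into_episodes(enrollments, max_gap=MAX_GAP):
    """Merge one family's enrollments into episodes of homelessness.

    enrollments: list of (entry_date, exit_date) tuples for a single family.
    An enrollment joins the current episode when it starts no more than
    max_gap after the latest exit date seen so far in that episode.
    Returns a list of (episode_start, episode_end) tuples.
    """
    episodes = []
    for entry, exit_ in sorted(enrollments):
        if episodes and entry - episodes[-1][1] <= max_gap:
            # Close enough to the previous enrollment: extend that episode.
            start, end = episodes[-1]
            episodes[-1] = (start, max(end, exit_))
        else:
            # Too far apart (or the first enrollment): start a new episode.
            episodes.append((entry, exit_))
    return episodes


# Made-up example: a shelter stay followed shortly by another program,
# then a separate enrollment many months later.
family = [
    (date(2013, 1, 5), date(2013, 2, 1)),    # emergency shelter
    (date(2013, 2, 10), date(2013, 6, 30)),  # another housing program
    (date(2014, 3, 1), date(2014, 4, 15)),   # later, unrelated enrollment
]
episodes = group_into_episodes(family)
```

With these made-up dates, the first two enrollments fall within the assumed gap and collapse into a single episode, while the third starts a new one. In practice the threshold is exactly the kind of choice that needs stakeholder input, because it changes how many episodes each family appears to have.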

Last but not least, any contextual information, assumptions, and potential biases will be clearly stated along with the analysis results. In this way, the stakeholders can be well informed and feel comfortable making policy decisions based on the scientifically sound research of the DSSG team. For instance, the data experts will hopefully be able to share the analysis with their county leadership, who can subsequently work with service providers, housing program staff, and other relevant stakeholders to improve services for homeless families.


To conclude, multi-stakeholder collaboration certainly brings some difficulties of coordination and communication. However, as Innes & Booher (2014) assert, “collaboration is about conflict … as efforts to overcome it can lead to a range of new options and outcomes that were not previously foreseen” (p.9). As a DSSG intern, I firmly believe that dialogue with the stakeholders makes this research more meaningful and impactful.