Selecting Participants for Hack Weeks

All of our hack weeks have been oversubscribed by at least a factor of 2. This brings up an important question: who do we invite to the hack week, and who do we leave out? The interactive nature of a hack week necessitates a careful procedure for participant selection. Selection committees will likely want to pick participants who are on board with and willing to advance the vision and values of the hack week model around collaboration and inclusivity. It may be tempting and straightforward to select candidates who are known personally to the committee as good citizens of the community, but we warn organizers that taking this approach might run counter to several goals that a hack week might have: organizers should ask themselves how diverse their own networks are in practice. Do these networks contain predominantly researchers from privileged backgrounds and/or institutions? If so, they might want to consider strategies to broaden their reach and draw in researchers from outside of their networks.

General Notes on the Selection Strategy

Participant selection is one of the most involved and fraught, but also often overlooked, aspects of organizing a participant-driven workshop. Organizers should start the process early, ideally at the same time as the team formulates their goals for organizing the hack week: a strategy for selecting participants starts with the goals and objectives of the workshop, along with the core values that organizers are aiming to implement and advance. From there, organizers may ask how participant selection might serve these goals and values. Possible goals are:

* teaching data science methods to a wide range of researchers in your domain
* encouraging collaborations between researchers of different fields
* encouraging collaborations between researchers in different stages of their careers
* improving access to modern methods and tools for researchers from underrepresented groups

These are just examples: the goals of your workshop will necessarily depend on the particular community the hack week is embedded in and its needs. In the ideal case, the formulated goals lead to a clear strategy for selecting participants. For example, a workshop that targets early-career researchers and is more teaching-oriented may wish to preferentially admit graduate students and postdocs, whereas a workshop aimed at collaboration across career stages may prefer to mix career stages more broadly. In practice, however, it is often not trivial to match up workshop goals with the specific choices to be made during selection.

Similarly, organizers should articulate a strategy for assessing whether candidates have the appropriate skill level and are likely to participate in a way that creates a welcoming community for the group as a whole. This is generally a hard problem: research from the hiring literature suggests that there is an irreducible variance when trying to predict future performance from applications and interviews (see e.g. Highhouse, 2008). This is also the place where unconscious biases are most likely to creep into the process. Our best advice is to think critically about the selection process, ensure that all committee members are aware of common unconscious biases and are reminded of them throughout the selection process, and to evaluate and re-evaluate every part of the selection process before, during and after each workshop. For all of our hack weeks, we have been continuously learning and aiming to improve each year, and we continue to incorporate our own observations and external feedback in our workshops.

Designing and Assessing Application Forms

Because our hack weeks are oversubscribed, we generally ask participants to apply using a form, and then go through an internal selection procedure to choose participants from among the applicants. What information candidates are asked to provide during the application differs quite widely among the hack weeks, but for all hack weeks (and indeed all workshops), it is important to keep in mind that selection can only draw on information that has actually been collected on the application form. This may seem obvious, but experience has taught us in several instances that omissions on the form led to serious difficulties during selection later. Especially for questions that aim to probe values and traits such as collaborativeness, designing questions that elicit useful information about the trait in question is difficult, and on many occasions a question we thought would do so did not, in fact, provide us with useful information.

It is therefore worth the time and effort to design application forms carefully and intentionally. In the process, especially for open-ended questions, organizers should interrogate their own expectations about how participants might answer, and the biases embedded in those expectations. For example, a bias against non-native English speakers might lead reviewers to rate responses by these candidates lower. Similarly, for questions around diversity and inclusion, it is worth considering how differences across cultures and countries might affect what terminology and concepts candidates may be familiar with. There may also be effects related to seniority and familiarity with university environments and culture: more senior participants, especially those from more privileged institutions, might have a better sense of what information organizers are trying to elicit, simply because they have had more training in responding to questions like those likely asked on hack week application forms, and because they have been embedded in the particular culture of academic departments.

While many of the hack weeks use Google Forms for application forms, it is worth critically examining that choice on the basis of respondent privacy and data rights, especially when forms may ask for sensitive demographic information.

Assessing Qualifications

Some hack weeks have requirements about the proficiency with programming or tools that participants are expected to have at the start of the hack week. Proficiency could be assessed, as in admission contexts, through university transcripts. However, grades do not necessarily reflect proficiency, and the grading systems employed in different countries may make an assessment difficult. Organizers could also simply ask participants whether they believe themselves to be at a certain skill level, e.g. whether they are beginners, intermediate users or experts. Because there is no objective scale for what an "expert programmer" might be, and different people might give different answers to that question, assessing proficiency that way may lead to biases in the selection.

In the hack weeks, we generally aim to assess proficiency with questions that tie skills to particular milestone achievements. For example, when asking about proficiency with machine learning, we might ask participants whether they've only encountered machine learning in their course work, whether they have used a machine learning algorithm for a research project, or whether they have developed and implemented machine learning algorithms themselves. One goal here is to make the questions as clear and unambiguous as possible. However, even here biases may affect the selection. In particular, participants from institutions or countries that are less well-resourced may be less likely to have encountered computational classes or trainings in their home institutions. As a result, they may be less likely to be proficient programmers, and simply selecting on coding ability may select out these participants. We suggest that organizers view these questions in the larger context and take into account the opportunities that a particular applicant may or may not have had during the selection process.
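As a minimal sketch of how such milestone-anchored responses might be handled downstream, the snippet below encodes the machine learning example from above as an ordinal scale. The exact question wording, the levels, and the function name are hypothetical illustrations, not the wording used on any actual hack week form.

```python
# Hypothetical milestone-anchored proficiency question, encoded as an
# ordinal scale so responses can be compared consistently across applicants.
ML_PROFICIENCY_LEVELS = {
    "I have only encountered machine learning in course work": 1,
    "I have used a machine learning algorithm for a research project": 2,
    "I have developed and implemented machine learning algorithms myself": 3,
}

def encode_proficiency(response: str) -> int:
    """Map a multiple-choice response to its ordinal level.

    Unrecognized responses (e.g. a skipped question) map to 0 rather than
    raising an error, so downstream analysis can treat them explicitly.
    """
    return ML_PROFICIENCY_LEVELS.get(response, 0)

# Example usage:
answer = "I have used a machine learning algorithm for a research project"
print(encode_proficiency(answer))  # -> 2
```

Tying each option to a concrete milestone, rather than to labels like "intermediate", is what makes the encoded scale at least roughly comparable across applicants.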

Assessing Core Values

Hack weeks thrive through participants who are enthusiastic, curious, collaborative and kind to others. Assessing these traits is perhaps the most difficult part of any selection procedure. In addition, some hack weeks have asked questions around the contributions that participants are likely to make, and about the impact that someone's participation might have on their local community. Our hack weeks have assessed these values and traits through a range of different methods, including open-ended questions on the application form, personal statements and recommendation letters. As mentioned above, questions should be phrased thoughtfully and intentionally with the goal of minimizing biases during this stage. Following best practices from the hiring literature (e.g. Bohnet, 2015), these responses, letters or statements should be graded by several reviewers, using a clear, unambiguous set of rubrics that has been vetted carefully to minimize biases. Some hack weeks have performed this grading on a blinded set, where the names and demographics of the applicants were hidden from the raters. Other hack weeks have instead taken other information, for example career stage, home institution and demographic information, into account at this stage in order to mitigate effects that may systematically disadvantage some candidates. Based on our current knowledge as organizers, there is no single best way to do this: while there are well-documented effects indicating that certain candidates may be disadvantaged based on their name alone in hiring contexts (Gaddis, 2017), some research also shows that committees that are careful about taking systematic effects and biases into account tend to do worse when demographic information about the candidates is removed (Behaghel et al., 2014).
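To illustrate the blinded-grading workflow described above, the sketch below strips identifying columns from a hypothetical application table before it is handed to raters, and then averages rubric scores across several reviewers. All column names, rubric criteria and values are invented for this example.

```python
import pandas as pd

# Hypothetical application table; columns are invented for this sketch.
applications = pd.DataFrame({
    "applicant_id": [101, 102, 103],
    "name": ["A. Researcher", "B. Scholar", "C. Student"],
    "institution": ["Univ. X", "Univ. Y", "Univ. Z"],
    "statement": ["...", "...", "..."],
})

# Blinding: raters see only an anonymous ID and the material they grade.
IDENTIFYING_COLUMNS = ["name", "institution"]
blinded = applications.drop(columns=IDENTIFYING_COLUMNS)
print(blinded)

# Each rater scores each applicant against a shared rubric (here two
# invented criteria on a 1-5 scale), returned in long format with one
# row per (applicant, rater) pair.
ratings = pd.DataFrame({
    "applicant_id": [101, 101, 102, 102, 103, 103],
    "rater":        ["R1", "R2", "R1", "R2", "R1", "R2"],
    "collaboration": [4, 5, 3, 3, 5, 4],
    "motivation":    [5, 4, 4, 3, 4, 4],
})

# Aggregate across raters so no single reviewer dominates the outcome.
mean_scores = ratings.groupby("applicant_id")[["collaboration", "motivation"]].mean()
print(mean_scores)
```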

Once all applications have been rated, members of the organizing committee might then take the ratings and carefully examine them for biases. For example, one may look at inter-rater reliability, or at whether specific raters tend to only give grades in a narrow subset of the available rating scale. One may also look at whether certain groups (for example junior academics) are systematically rated worse, indicating that biases have not been fully eliminated during earlier stages of the process. Data visualizations are enormously helpful here, as is a practice of multiple organizers taking an independent look at the ratings generated by the reviewers.
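One lightweight way to run these checks, assuming the ratings and self-reported career stage live in a long-format table like the one above (again with invented column names and values):

```python
import pandas as pd

# Hypothetical long-format ratings table: one row per (applicant, rater).
ratings = pd.DataFrame({
    "applicant_id": [1, 1, 2, 2, 3, 3, 4, 4],
    "rater":        ["R1", "R2"] * 4,
    "score":        [4, 5, 2, 2, 5, 3, 3, 4],
    "career_stage": ["grad", "grad", "postdoc", "postdoc",
                     "grad", "grad", "faculty", "faculty"],
})

# Per-rater summary: does a rater only ever use a narrow band of the scale?
per_rater = ratings.groupby("rater")["score"].describe()
print(per_rater)

# Per-group summary: are some groups (e.g. junior academics) rated
# systematically lower, hinting at residual bias in the rubric or raters?
per_group = ratings.groupby("career_stage")["score"].agg(["mean", "std", "count"])
print(per_group)

# A simple plot often surfaces these patterns faster than tables
# (requires matplotlib):
# ratings.boxplot(column="score", by="rater")
```

Summary tables like these cannot prove the absence of bias, but they make the most common failure modes visible before the final cohort is chosen.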

Questions about the Participant's Background

For all hack weeks, considerable thought and discussion goes into which information to elicit regarding the applicants' backgrounds. This includes, for example, their current institution, academic field of study and/or research, and career stage, but also questions around gender identity, sexual orientation, and race and ethnicity. Sensitive questions around one's identity, in particular, are difficult to phrase and to incorporate into selection procedures. Organizers may decide to simply not include them. However, if the pool of applicants is biased with respect to those categories, organizers risk that the workshop itself may reproduce those biases, or exacerbate them if other categories of assessment (for example open-ended answers) correlate with demographic categories.

There are different ways many of these questions can be phrased, and different ways that information can be included in a participant selection strategy. Questions related to demographic background should always be voluntary, and non-disclosure should not negatively affect someone's likelihood of being accepted. Workshops that are advertised globally may run into difficulties with phrasing questions around race and ethnicity, where categories may be defined differently in different countries. Hack weeks have generally done one of two things: either follow US-based categorizations of race and ethnicity, or formulate a question that asks whether someone considers themselves a minority with respect to race and ethnicity in their field of study. The former allows for a more fine-grained selection process that takes into account how different groups experience different kinds of oppression and privilege, but it may also force applicants, especially those who do not identify with the US-based categorization, into a system that does not correspond to their lived reality. The latter approach side-steps those cross-cultural issues, but collapses all applicants into minority and non-minority categories, which may disproportionately disadvantage some groups. In addition, recent feedback suggests that phrasing the question this way might induce stereotype threat in our applicants, and thus some hack weeks are moving away from that approach. One potential solution might be to move away from categorization for demographic questions altogether (including, for example, gender identity) and simply allow applicants to self-identify using a free-form response.

How should these categories---professional categories like someone's career stage or institution, and demographic ones like gender identity and race and ethnicity---be included in the selection process? As mentioned above, they could be part of a selection where individual candidates are considered on the basis of their responses as well as the information disclosed in this section of the application form (if it was disclosed). Career stage, location and demographic categories could in this scenario inform selection by providing information about the applicant's opportunities, and the privilege or oppression they may have experienced in the field.

Some of the hack weeks, motivated by the idea of diversifying the workshop across many of these categories, have used a selection process mediated by an algorithm (implemented in the software Entrofy; Huppenkothen et al., 2020) to support the selection. In this approach, the committee generally performs a pre-selection based on the open-ended answers, personal statements and recommendation letters to identify a subset of qualified candidates. The final cohort is then selected using the algorithm, with the objective that the cohort globally match a set of target values across all categories. Astro Hack Week and Geo Hack Week have both implemented procedures that followed this model. It is worth noting here that the use of an algorithm does not make the selection intrinsically less biased. The categories, the allowed values within each category, and the target values are all chosen by humans, most likely the organizers, and are thus subject to the biases these humans impose during creation of the application form or the subsequent selection on open-ended answers. As Meredith Broussard states in her book "Artificial Unintelligence", algorithms are social constructs, because they are constructed by humans. Using algorithmic mediation in selection processes does not absolve organizers of critically interrogating their procedures and the biases they might impose on them, however unintentionally.
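To make the idea of target-matching cohort selection concrete, here is a deliberately simplified greedy sketch. It illustrates the objective only; the actual Entrofy package (Huppenkothen et al., 2020) implements a more careful procedure with weights, pre-selected participants and randomized tie-breaking, so none of the function names or data below should be read as that package's API.

```python
import pandas as pd

def greedy_cohort(candidates: pd.DataFrame, n: int, targets: dict) -> list:
    """Greedily build a cohort of size n that approaches target fractions.

    `targets` maps (column, value) pairs to the desired fraction of the
    cohort with that value, e.g. {("career_stage", "grad"): 0.5}. At each
    step, the candidate who most increases how well the cohort meets the
    (still unmet) targets is added. This is a simplified illustration of
    the idea behind Entrofy, not the package itself.
    """
    selected = []
    remaining = list(candidates.index)
    while len(selected) < n and remaining:
        def score(idx):
            trial = selected + [idx]
            total = 0.0
            for (col, val), frac in targets.items():
                have = sum(candidates.loc[i, col] == val for i in trial)
                # Cap each term at its target so only unmet targets
                # contribute to the marginal gain of a candidate.
                total += min(have / n, frac)
            return total
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical pool of candidates who already passed the qualitative
# pre-selection on statements and letters.
pool = pd.DataFrame({
    "career_stage": ["grad", "grad", "postdoc", "faculty", "grad", "postdoc"],
    "field":        ["astro", "geo", "astro", "geo", "geo", "astro"],
})
targets = {("career_stage", "grad"): 0.5, ("field", "geo"): 0.5}
print(greedy_cohort(pool, n=4, targets=targets))
```

Even in this toy version, the human choices are plainly visible in the code: someone decides which columns exist, which values they may take, and what the target fractions are, which is exactly why algorithmic mediation does not remove bias on its own.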

One advantage of the approach employing Entrofy is that it allows for more straightforward transparency and accountability than more traditional approaches might. Organizers can, and in practice do, share the details of the selection procedure, the categories and targets used, and often the code employed for selection (after removal of all confidential participant data). However, the structure of the algorithm requires that applicants can fit themselves into pre-defined categories, which may be difficult given the aforementioned problems when categories are intrinsically ill-defined or not inclusive.

Across the hack weeks, we are continuously iterating on and changing our approach with the aim of making our selection procedures more equitable and fair, and we share information and experiences among one another to learn from our different events. However, none of us feels that we have a single set of best practices to share that would guarantee an equitable selection. Our best advice is: