Tutorial Design

One purpose of a hack week is to expose participants to a broad range of data science tools and methods. With this in mind, we suggest the following guidelines when developing a tutorial:

Tutorial Timeline

It is a good idea to prepare your tutorials well in advance of the event so that they can be reviewed by other tutorial developers and members of the community. In some cases, early preparation is required so that shared computing resources can be configured with the libraries and datasets needed to run the tutorials, and then tested. Early tutorial preparation also creates an opportunity to explore the overall flow of tutorial ideas and concepts across the event.

Technical Considerations

Teaching software and data science tools to a large group of people is challenging when each person arrives with a different computer and operating system configuration. To avoid these challenges we often deploy shared computing resources, such as an instance of JupyterHub on a commercial cloud platform. We then have participants log in to this system using GitHub credentials. This centralized computing architecture can be directly linked to virtual drives that store any of the sample datasets used in the tutorials. Each tutorial can also be tested in advance on this system to ensure all the correct libraries and tools are installed.
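As an illustration of the advance testing described above, here is a minimal sketch of a script a tutorial lead could run on the shared system to confirm that the libraries a tutorial needs are present. The package list and the function name find_missing_packages are hypothetical placeholders, not part of any particular hack week deployment; substitute the top-level package names your own tutorials import.

```python
import importlib.util

# Hypothetical list of top-level packages a tutorial needs.
# Replace these standard-library names with your tutorial's dependencies.
REQUIRED_PACKAGES = ["json", "csv", "sqlite3"]


def find_missing_packages(packages):
    """Return the top-level package names that cannot be found in this environment."""
    missing = []
    for name in packages:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing_packages(REQUIRED_PACKAGES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are available.")
```

This check only confirms that each package can be located; it does not verify versions or that the sample datasets are reachable, so running the full tutorial on the shared system is still worthwhile.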

There are several computing ecosystems that can be explored to support the technical resources described above. We have the most experience with Pangeo deployments, which run JupyterHub on Kubernetes. This blog post about the ICESat-2 hack week explains our typical Pangeo deployment in more detail.

Tutorial Formats

We have experimented with several different ways to construct and serve tutorial content online. These include:

Interactive content

The hack week model strives to provide participants with an interactive and immersive experience. Building interactive tutorials is a good way to achieve this goal. We have experimented with the Software Carpentry approach to teaching software, which involves inviting participants to solve a coding challenge or work through an example during the tutorial. We provide each participant with two differently colored sticky notes that they can use to indicate whether or not they need help with a particular example. This approach requires multiple assistants in the room who are familiar with the tutorial and available to answer questions. Without these assistants, it can become overwhelming for a single tutorial lead to both manage the overall pace of the teaching and answer detailed questions from individual participants.

Tutorial Scheduling and Flow

Once the learning outcomes of each tutorial are articulated, the planning committee should spend some time considering the best way to sequence the tutorials. Ideally, core concepts, such as the use of GitHub and other collaborative software tools, should be introduced early in the event. If possible, look for ways that one tutorial can build on the previous one. In some cases it may be desirable to use a common sample dataset across multiple tutorials as a way to provide a coherent theme.

The scheduling of tutorials will also affect the overall pacing of the event. Consider the energy of the participants at different stages, and their capacity to absorb new knowledge. For example, scheduling too many dense, complex tutorials on a given day might not leave people with enough energy to take part in projects or hacking later in the day.